BRIEF

  1. I have added all the relevant R code to make the graph fully reproducible, including the TA's fix for the issue where switching from geom_col to geom_line produced a disjointed graph. The way the data was wrangled and plotted can now be followed from start to finish.
  2. This is a LINE graph.
  3. The DATA was cleaned by the instructors and given to us as a CSV file titled hot_dog_contest_with_affiliation.csv. It contains 5 attributes covering 37 years' worth of championship information from Nathan’s Hot Dog Eating Competition. The qualitative attributes are the champion’s name, gender, and affiliation status. The quantitative attributes are the championship year and the number of hot dogs and buns eaten by the champion that year. The gender and affiliation columns were converted into factors. Based on EDA, 11 champions were currently affiliated with IFOCE, 6 were formerly affiliated, and 20 were unaffiliated.
  4. The audience is a general one with no specialized background: people who may be interested in fun statistics about Nathan’s hot dog championship over the years.
  5. This graph depicts the number of hot dogs and buns eaten by the male champions of Nathan’s hot dog eating competition from 1981 to 2017, colored by each champion's affiliation with IFOCE.
  6. The line graph uses color to show at a glance that champions prior to 2005 were unaffiliated, whereas every champion since has been either a former or a current member of IFOCE.
  7. A negative aspect of this line graph is that one cannot tell the exact years in which the line transitions from one category to another. A tooltip over the line indicating the year, or a vertical grid with an x-axis tick for each year, would resolve that issue.
  8. A common variation of this line graph is a line graph with tooltips, which shows where the transitions between categories occurred. An alternative would be a bar graph.
  9. The graph required me to first filter the data for male champions from 1981 onward. I then created a new logical column, post_ifoce, flagging whether the year is 1997 or later. Finally, I used geom_line’s grouping functionality (aes(group = 1)) in ggplot2 to draw the line graph above.
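The grouping behavior mentioned in item 9 can be sketched in isolation. The example below is a minimal sketch on made-up toy data (the names toy, x, y, and status are hypothetical and are not part of the contest data set), and it assumes ggplot2 is installed. Mapping color to a factor makes geom_line draw one separate line per level, which is the likely cause of a disjointed graph, while adding aes(group = 1) keeps all points in one connected line whose color still changes by category.

```r
library(ggplot2)

# Made-up toy data for illustration only (not the contest results).
toy <- data.frame(
  x = 1:6,
  y = c(3, 5, 4, 6, 8, 7),
  status = factor(c("a", "a", "a", "b", "b", "b"))
)

# Color mapped to a factor: ggplot2 groups by that factor, so each level
# becomes its own line and the line breaks where the category changes.
p_split <- ggplot(toy, aes(x = x, y = y, color = status)) +
  geom_line()

# Overriding the grouping with a constant puts every point in one group,
# so the line stays connected while the color still follows `status`.
p_joined <- ggplot(toy, aes(x = x, y = y, color = status)) +
  geom_line(aes(group = 1))

# Number of groups ggplot2 actually uses in each version.
n_groups_split <- length(unique(layer_data(p_split)$group))
n_groups_joined <- length(unique(layer_data(p_joined)$group))
```

Printing p_split and p_joined side by side shows the difference: the first has a gap between the two categories, the second does not.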
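# Load packages (readr, dplyr, and ggplot2 all come with the tidyverse)
library(tidyverse)
# Affiliation data: read the CSV, parsing gender and affiliation as factors
hot_dogs2 <- read_csv(here::here("data", "hot_dog_contest_with_affiliation.csv"),
    col_types = cols(
      gender = col_factor(levels = NULL),
      affiliated = col_factor(levels = NULL)
    ))
# Keep male champions from 1981 onward and flag years from 1997 onward
hot_dogs3 <- hot_dogs2 %>%
  filter(year >= 1981, gender == "male") %>%
  mutate(post_ifoce = year >= 1997)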
hot_dogs3
## # A tibble: 37 x 6
##     year gender name           num_eaten affiliated post_ifoce
##    <int> <fct>  <chr>              <dbl> <fct>      <lgl>     
##  1  2017 male   Joey Chestnut         72 current    TRUE      
##  2  2016 male   Joey Chestnut         70 current    TRUE      
##  3  2015 male   Matthew Stonie        62 current    TRUE      
##  4  2014 male   Joey Chestnut         61 current    TRUE      
##  5  2013 male   Joey Chestnut         69 current    TRUE      
##  6  2012 male   Joey Chestnut         68 current    TRUE      
##  7  2011 male   Joey Chestnut         62 current    TRUE      
##  8  2010 male   Joey Chestnut         54 current    TRUE      
##  9  2009 male   Joey Chestnut         68 current    TRUE      
## 10  2008 male   Joey Chestnut         59 current    TRUE      
## # ... with 27 more rows
# Exploratory Data Analysis (EDA)
hot_dogs3 %>%
  dplyr::distinct(affiliated)
## # A tibble: 3 x 1
##   affiliated    
##   <fct>         
## 1 current       
## 2 former        
## 3 not affiliated
hot_dogs3 %>%
  dplyr::count(affiliated, sort=TRUE)
## # A tibble: 3 x 2
##   affiliated         n
##   <fct>          <int>
## 1 not affiliated    20
## 2 current           11
## 3 former             6
# Changed geom from "col" to "line", yet preserving color
# group = 1 keeps the line connected across affiliation categories
affil_plot2 <- ggplot(hot_dogs3, aes(x = year, y = num_eaten, color = affiliated)) +
  geom_line(aes(group = 1)) +
  labs(x = "Year", y = "Hot Dogs and Buns Consumed") +
  ggtitle("Nathan's Hot Dog Eating Contest Results, 1981-2017") +
  # The mapped aesthetic is color, so the manual palette goes on the color scale
  scale_color_manual(values = c('#E9602B', '#2277A0', '#CCB683'),
                     name = "IFOCE-affiliation")
affil_plot2
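
The limitation noted in item 7 (transition years are hard to read off the line) could be addressed along the following lines. This is a minimal sketch under stated assumptions, not the method used for the graph above: it runs on made-up toy data (the values in toy are hypothetical, not contest results), it assumes dplyr and ggplot2 are installed, and the names transitions and p_ticks are introduced here for illustration. It finds the years in which the category differs from the previous year, marks them with dashed vertical lines, and puts an x-axis tick on every year.

```r
library(dplyr)
library(ggplot2)

# Made-up toy data for illustration only (not the contest results).
toy <- tibble(
  year = 2001:2008,
  num_eaten = c(10, 12, 11, 15, 18, 17, 20, 22),
  affiliated = factor(c("not affiliated", "not affiliated", "not affiliated",
                        "former", "former", "current", "current", "current"))
)

# A transition year is one whose category differs from the previous year's.
# The first year has no predecessor (lag gives NA), so filter() drops it.
transitions <- toy %>%
  arrange(year) %>%
  mutate(status = as.character(affiliated),
         changed = status != lag(status)) %>%
  filter(changed) %>%
  pull(year)

# Same line graph, with a tick per year and a dashed marker per transition.
p_ticks <- ggplot(toy, aes(x = year, y = num_eaten, color = affiliated)) +
  geom_line(aes(group = 1)) +
  geom_vline(xintercept = transitions, linetype = "dashed", color = "grey50") +
  scale_x_continuous(breaks = seq(min(toy$year), max(toy$year), by = 1)) +
  labs(x = "Year", y = "Hot Dogs and Buns Consumed")
```

With the full 37-year range, one tick per year may crowd the axis; a wider step in seq() or rotated tick labels would be a reasonable adjustment.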