TDM 10100: Project 13 — 2023
Motivation: This semester we took a deep dive into R
and its packages. Let’s take a second to pat ourselves on the back for surviving a long semester and review what we have learned!
Dataset(s)
The project will use the following dataset:
-
/anvil/projects/tdm/data/icecream/combined/products.csv
Questions
Questions 1 (2 pts)
For question 1, read the dataset into a data.frame called orders
-
Create a plot that shows, for each
brand
of ice cream, the total number ofrating_count
(in other words, thesum
of thoserating_count
values) for eachbrand
of icecream. There are 4 brands, so your solution should have 4 values altogether.
|
Before solving Question 2, please build a data frame called bigDF
from these three files
/anvil/projects/tdm/data/icecream/bj/reviews.csv
/anvil/projects/tdm/data/icecream/breyers/reviews.csv
/anvil/projects/tdm/data/icecream/talenti/reviews.csv
using this code:
mybrands <- c("bj", "breyers", "talenti")
myfiles <- paste0("/anvil/projects/tdm/data/icecream/", mybrands, "/reviews.csv")
bigDF <- do.call(rbind, lapply(myfiles, fread))
Use this data frame bigDF
to answer Questions 2, 3, and 4:
Question 2 (2 pts)
-
In which month-and-year pair were the most reviews given? (There is one review per line of this data frame
bigDF
. -
Make a plot that shows, for each year, the average number of stars in that year.
Question 3 (2 pts)
-
Which key has the lowest average number of stars?
-
There is one entry in which the text review has more than 2500 characters! Print the text of this review.
Question 4 (2 pts)
-
Consider all of the authors of the reviews. Which author wrote the most reviews altogether? (Note: there are many blank authors, and there are a lot of Anonymous authors, but please ignore blank authors and Anonymous authors in this question.)
-
Considering the 43 reviews written by the author that you found in question 4a, this author is usually happy and gives high ratings. BUT this author gave one review that only had 1 star. Print the text of that 1 star review from the author you found in question 4a.
Project 13 Assignment Checklist
-
Jupyter Lab notebook with your code, comments and output for the assignment
-
firstname-lastname-project13.ipynb
-
-
R code and comments for the assignment
-
firstname-lastname-project13.R
.
-
-
Submit files through Gradescope
Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted. In addition, please review our submission guidelines before submitting your project. |