Movie Ratings Analysis

May 29, 2019

By Tanvi Pati and Vivek Menon

Motivation

With the rapid increase of review-based websites and apps, the way we make decisions has changed drastically over the years. When we want something to eat, we can now compare every restaurant near us, see exactly how far away each one is, what its reviews say, and what its prices are before we decide, rather than just walking into any restaurant that looks interesting. The same decision-making process has entered the movie sphere: the reviews a movie gets in its initial release, and more specifically its Rotten Tomatoes score, have a significant effect on whether people actually go to the theatre to watch it. While many critics agree that a number can’t be assigned to a movie to determine its worth or quality, the simplicity of the concept appeals to general moviegoers. A movie listed with a high percentage and a fresh tomato beside it is much more enticing than one with a low score and a rotten tomato.

But what many people don’t realize is that Rotten Tomatoes calculates two scores for a given movie: a critics score and an audience score. The critics score is more prominent: it is often attached to a movie as a promotional tool (if the score is high) and is usually the score people see first when researching whether to watch a movie. This intrigued us: why do general audiences usually refer to critics scores, rather than audience scores, in their decisions? Do audiences and critics tend to rate movies similarly? Are there certain genres for which we should rely on the audience score rather than the critics score? Using a dataset of 426 randomly sampled movies from 1970 to 2012, we were able to answer some of these questions. Here’s what we found:

Plotting critics scores against audience scores, we can see a pretty large spread across the data. The correlation coefficient is 0.628, which, while higher than we expected, is not as high as we would have hoped: squaring it gives about 0.39, so a linear fit on critics scores accounts for only about 39% of the variation in audience scores. The visible spread and the moderate correlation suggest that audience scores and critics scores are not aligned closely enough for us to rely on a critics score as an accurate judgement of whether we’ll enjoy a given movie. After concluding this, we decided to dig a little deeper and see whether we could generalize where the most drastic differences occur.
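For readers who want to see what a correlation coefficient like 0.628 measures, here is a minimal sketch of the Pearson correlation in plain Python. It is not the code behind the analysis in this post: the function name `pearson_r` and the six pairs of scores are made up for illustration, and we are assuming the reported figure is a Pearson coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator) and the two sums of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)

# Invented scores (0-100) for illustration only; these are NOT from the dataset.
critics_scores = [95, 40, 72, 15, 88, 60]
audience_scores = [80, 55, 70, 45, 75, 50]

r = pearson_r(critics_scores, audience_scores)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

A value of 1 means the two scores rise together perfectly along a line, 0 means no linear relationship, and -1 means one falls exactly as the other rises.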

What about genres?

Now that we had seen the overall association between audience scores and critics scores, we were curious about how it varies across the different genres. When we summarized the number of movies per genre in our dataset, this is what we found:

As we can see, dramas were the most common genre and documentaries the least common.

To look at the differences between critics and audience scores, we decided to make a bar graph of average score by genre. Each genre had two bars: one for the average audience score and one for the average critics score. We used the ggplot2 package in R to create the bar graph. This was our result:

We can see that for the genres ‘Action & Adventure’, ‘Comedy’ and ‘Mystery & Suspense’, the average audience scores are higher than the average critics scores. This seems to make sense, as these genres would be more enjoyable to a general audience than documentaries, which have a higher average critics score. Critics consider cinematography, dialogue, direction, production, acting and so on, whereas audiences tend to focus more on the entertainment value of the storyline. This may explain the difference in results for several genres.
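The numbers behind a chart like this are just per-genre counts and averages. The sketch below shows that grouping step in plain Python rather than in R with ggplot2, so it is not the code used for this post. The helper name `genre_averages`, the row layout, and every score in it are invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (genre, critics_score, audience_score). All values are invented.
movies = [
    ("Comedy", 50, 64),
    ("Comedy", 62, 70),
    ("Documentary", 92, 84),
    ("Documentary", 88, 80),
    ("Horror", 58, 44),
    ("Horror", 42, 40),
]

def genre_averages(rows):
    """Return {genre: (count, mean critics score, mean audience score)}."""
    grouped = defaultdict(list)
    for genre, critics, audience in rows:
        grouped[genre].append((critics, audience))
    summary = {}
    for genre, scores in grouped.items():
        n = len(scores)
        mean_critics = sum(c for c, _ in scores) / n
        mean_audience = sum(a for _, a in scores) / n
        summary[genre] = (n, mean_critics, mean_audience)
    return summary

summary = genre_averages(movies)
for genre, (n, mean_critics, mean_audience) in sorted(summary.items()):
    # A positive gap means audiences rated the genre higher than critics did.
    gap = mean_audience - mean_critics
    print(f"{genre:12s} n={n} critics={mean_critics:.1f} "
          f"audience={mean_audience:.1f} gap={gap:+.1f}")
```

Keeping the count next to each average is a deliberate choice: an average over a handful of movies in a small genre deserves less weight than one over a large genre.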

The most interesting result was ‘Horror’. We expected audiences, who would be entertained by the jump scares and the scary story, to rate these movies higher than critics did, but we saw the opposite. Critics probably looked at the visuals, the storyline, the direction and more, all of which a well-made horror movie has to get right in order to have the scariest effect on its audience.

So, overall, the average audience scores are higher than the average critics scores for the more entertaining genres, with the surprising exception of ‘Horror’.

From the 70s to now…

Our dataset has records for movies from 1970 to 2012. We thought it would be interesting to see whether there were any trends in each genre’s audience and critics scores over this period, instead of just looking at the averages as in the previous graph. We expected to see some trends, such as Comedies and Action & Adventure movies having high audience ratings throughout.

Using the ggplot2 package in R, we created a facet grid of line charts for all the genres. The blue lines represent audience scores whereas the red represent critics scores. This was our final result:

To our surprise, we couldn’t detect any clear trends. The lines fluctuated erratically, with few interesting highs or lows. We speculated that this could be because the number of movies, and the ratings for them, varied greatly from year to year, which resulted in a quite haphazard data visualization.
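One way to check this speculation is to count how many movies sit behind each point on the lines. With 426 movies spread over the 43 years from 1970 to 2012, there are only about 10 movies per year across all genres combined, so many genre-year points may rest on one or two films. The plain-Python sketch below (not the R code used for this post) shows the idea. The helper name `yearly_means`, the threshold `MIN_MOVIES`, and all of the rows are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (genre, release_year, audience_score). All values are invented.
movies = [
    ("Drama", 1994, 81),
    ("Drama", 1994, 67),
    ("Drama", 1994, 74),
    ("Drama", 1995, 59),
    ("Horror", 1994, 35),
    ("Horror", 1995, 78),
]

def yearly_means(rows):
    """Return {(genre, year): (count, mean score)} for each genre-year cell."""
    cells = defaultdict(list)
    for genre, year, score in rows:
        cells[(genre, year)].append(score)
    return {key: (len(scores), sum(scores) / len(scores))
            for key, scores in cells.items()}

MIN_MOVIES = 3  # arbitrary cut-off chosen for this illustration

cells = yearly_means(movies)
for (genre, year), (n, mean) in sorted(cells.items()):
    # Flag cells where the yearly "average" rests on very few movies.
    note = "" if n >= MIN_MOVIES else "  <- few movies, mean is noisy"
    print(f"{genre:8s} {year} n={n} mean={mean:.1f}{note}")
```

When a cell holds a single movie, its "average" is just that movie's score, so the line can jump sharply from one year to the next without any real change in how the genre is received.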

Conclusion

In the end, deciding whether or not you want to watch a movie is subjective: you need to decide what type of moviegoer you are and why you want to go to the movies, then make your decision from there. If you want to relax and turn your brain off for a few hours, don’t waste your time trying to find the action/comedy/mystery movie with the highest ratings. The Rotten Tomatoes score mainly serves today as a promotional tool, and in general, you’re better off ignoring it and focusing on what your preferences are instead.

Written by DataRes at UCLA