Decoding Rotten Tomatoes Scores

DataRes at UCLA

Authors: Deshna Govil (Project Lead), Stella Koh, Maddy Yip, Julian Sandjaja, Henry Zhao

Introduction

In today’s digital age, the opinions of critics are crucial to a movie’s success. Rotten Tomatoes, a popular website for gauging a movie’s quality, brands each film as either Fresh or Rotten based on a critic score from 0 to 100%. Yet the score frequently leaves us puzzled.

We’re often eager to watch an anticipated movie only to be taken aback by its unexpectedly low Rotten Tomatoes score or, alternatively, left wondering about a critically acclaimed film that just did not work for us. This article takes a deep dive into what influences ratings and into the relationship between critic and audience scores to decode what’s behind the numbers.

How have trends changed over time?

Over the years, mainstream trends have shifted along with consumers’ tastes. We aim to examine the evolution of consumer and critic reception of movies on Rotten Tomatoes by analyzing box office results and critic consensus. For ease of analysis, we have broken the movie timeline into three roughly even periods: before 1998 (the year Rotten Tomatoes was created), 1998–2010, and 2010–2020.

The plot above shows the top-grossing movies, defined as those earning more than $500 million worldwide in inflation-adjusted 2008 dollars, broken down by audience and critic rating over the years. Prior to 1998, the large majority of top-earning movies were classified as “Fresh” (at least 60% positive reviews), particularly among critics. Several of these movies rank within the top 10 grossing movies of all time, such as Gone With The Wind and Titanic. Post-1998, the ratings of these top earners varied significantly, with many classified as “Rotten” (less than 60% positive reviews).
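As a rough illustration, here is how this bucketing and classification might look in pandas. The file name and column names (year, critic_score, gross_2008_usd) are placeholders for whatever the actual dataset uses; this is a sketch, not the exact pipeline behind the plot:

```python
import pandas as pd

# Hypothetical schema: one row per film with 'year', 'critic_score' (0-100),
# and 'gross_2008_usd' (worldwide box office in inflation-adjusted 2008 dollars).
movies = pd.read_csv("rotten_tomatoes_movies.csv")

# Bucket films into the three eras used throughout this analysis.
movies["era"] = pd.cut(
    movies["year"],
    bins=[-float("inf"), 1997, 2010, 2020],
    labels=["pre-1998", "1998-2010", "2010-2020"],
)

# Rotten Tomatoes labels a film Fresh when at least 60% of reviews are positive.
movies["status"] = movies["critic_score"].apply(
    lambda s: "Fresh" if s >= 60 else "Rotten"
)

# "Top grossing" here means more than $500M worldwide in 2008 dollars.
top_grossing = movies[movies["gross_2008_usd"] > 500_000_000]
print(top_grossing.groupby(["era", "status"], observed=True).size())
```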

Now, a potential factor at play could be selection bias in which pre-1998 movies were added to the site after the fact. However, Rotten Tomatoes actually catalogs a substantial number of movies from the 1980s and 90s, and we still do not see the same pattern of low-rated but high-earning movies there. Thus, it seems that the explosion of movie releases in the early 2000s, likely driven by lower barriers to entry as technology advanced, came with lower standards of movie quality.

To gain further insight into the changing rating-to-earning dynamic, we analyzed the content of the critic consensus. Many adjectives of choice have remained the same across the periods, but with different frequencies. Movies created before 1998, prior to the creation of Rotten Tomatoes, were overwhelmingly described by critics as “classic”; these older films tended to withstand the passage of time and were recognized for their quality. In the period between 1998 and 2010, we begin to see a debate emerge between “predictable” and “original” movies. Again, the rapid-fire release of movies around this time, likely due to better technology and a growing industry, may have lowered the standards and quality of movies. In the most recent decade (2010–2020), critic consensus has moved away from the originality debate, focusing instead on execution: plots described as “strong” and casts as “talented.” What has remained consistent over time, however, is humor: across all three periods, “funny” has stayed within the top three most commonly used adjectives, indicating that no matter the era, people enjoy humor.
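One way such adjective counts could be produced is with NLTK’s part-of-speech tagger. This sketch assumes a consensus column of critic-consensus blurbs and the era labels from the earlier snippet; it is an illustration of the approach rather than the exact method used:

```python
from collections import Counter
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def top_adjectives(consensus_texts, n=3):
    """Return the n most common adjectives across critic-consensus blurbs."""
    counts = Counter()
    for text in consensus_texts:
        tokens = nltk.word_tokenize(text.lower())
        # Keep only adjective tags (JJ, JJR, JJS).
        counts.update(word for word, tag in nltk.pos_tag(tokens)
                      if tag.startswith("JJ"))
    return counts.most_common(n)

# e.g. top_adjectives(movies.loc[movies["era"] == "pre-1998", "consensus"].dropna())
```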

What content rating is preferred?

Additionally, we wanted to determine how a movie’s content rating affects its score. The bar graph above illustrates the percentage distribution of each Tomatometer status by content rating. Tomatometer status is assigned to a movie based on its Tomatometer score, which reflects critic ratings alone. The special title of “Certified Fresh” is reserved for films that sustain a critic score of 75% or above and meet minimum review-count requirements. As seen in the graph, NC-17 (no children 17 and under) rated films have the highest “Certified Fresh” percentage, as well as a fairly high percentage of Fresh movies. Interestingly, *NR (non-rated) films have the highest percentage of Fresh movies. NR, G, and NC-17 films all show high Certified Fresh and Fresh percentages, indicating a preference from critics. In contrast, PG, PG-13, and R-rated films have the highest percentages of “Rotten” films and the lowest percentages of “Certified Fresh” and “Fresh” films, suggesting that critics receive these films more poorly.

*It is important to note that although NR films are technically non-rated, they are generally uncut versions of films that may contain more mature content than what was initially released.
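A normalized cross-tabulation is a natural way to build this kind of percentage breakdown. The snippet below assumes hypothetical content_rating and tomatometer_status columns in the same movies frame sketched earlier:

```python
import pandas as pd

# Percentage of each Tomatometer status within each content rating.
# Assumed columns: 'content_rating' (G, PG, PG-13, R, NC-17, NR) and
# 'tomatometer_status' (Certified Fresh / Fresh / Rotten).
status_by_rating = pd.crosstab(
    movies["content_rating"], movies["tomatometer_status"], normalize="index"
) * 100  # convert row proportions to percentages

print(status_by_rating.round(1))  # each row sums to ~100%
```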

In addition to Tomatometer status, we also analyzed the differences between average audience and critic scores by content rating. As seen above, critics tend to prefer non-rated films; however, these films received only the third-highest average audience rating, while audiences tended to prefer G-rated movies. As a rule of thumb, a Rotten Tomatoes rating reads as reliable when the audience and critic scores sit within about 5–7 percentage points of each other. Here, the gap for non-rated and PG-13 rated films is almost 10 points, indicating that audiences and critics tend to disagree somewhat when rating these two types of movies. In contrast, G, PG, NC-17, and R-rated films all maintain fairly similar average Tomatometer and audience scores, suggesting these films are received similarly by both groups. Analyzing trends by content rating shows what each type of viewer tends to prefer, letting us predict how differently (or similarly) critics and audiences will receive a film and revealing part of what drives a film’s popularity.
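The same comparison can be expressed as a quick groupby. Here audience_score is again an assumed column name, on the same 0–100 scale as the critic score:

```python
# Average critic vs. audience score per content rating, and the gap between them.
avg = movies.groupby("content_rating")[["critic_score", "audience_score"]].mean()
avg["gap"] = (avg["critic_score"] - avg["audience_score"]).abs()

# Flag ratings that fall outside the 5-7 point rule of thumb mentioned above.
print(avg[avg["gap"] > 7].round(1))
```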

How do audiences and critics differ?

In analyzing scores, it is also important to recognize the discrepancies between critic and audience scores. Looking at the trends of average scores over time, it is apparent that audience scores are almost always lower than critics’. A few potential explanations:

  • Inflated expectations for highly anticipated movies.
  • Different demographics between audiences and critics.
  • Review bombing, where audience scores are skewed by intentionally low ratings.
  • Critics’ more analytical perspective versus audiences’ more emotional one.

However, although audience scores are almost always lower than critics’, the two follow the same trends, typically rising and falling in the same years. This is likely due to blockbuster movies being released in particular years, and to generally good or bad years within the film industry as a whole.

Taking a closer look at the distribution of all critic and audience scores, we can see that both follow a left-skewed distribution (most scores bunch toward the high end), with audience scores having a greater interquartile range. Similar shapes appear across nearly all movie genres, except for the Western, Gay and Lesbian, and Sports and Fitness genres.
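For readers who want the shape statistics behind that description, here is a minimal sketch using scipy; it assumes the movies frame from the earlier snippets:

```python
from scipy.stats import skew

# Shape statistics for the two score distributions. A negative skew value
# corresponds to the left skew described above (scores piled toward the high end).
for col in ["critic_score", "audience_score"]:
    scores = movies[col].dropna()
    iqr = scores.quantile(0.75) - scores.quantile(0.25)
    print(f"{col}: skewness = {skew(scores):.2f}, IQR = {iqr:.1f}")
```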

As shown in the diagrams above, the distributions for the Western and Gay and Lesbian genres deviate significantly from the standard left-skewed shape seen in all other genres. The Western genre shows similar trends between critic and audience scores, but a much wider range of scores than is typical. The Gay and Lesbian genre, on the other hand, has audience scores that are much lower on average than critic scores.

From a broader stance, critic and audience ratings can be visualized for every movie on the site. This visual represents all of Rotten Tomatoes, with each point marking a film at its corresponding critic and audience rating; darker shades indicate higher frequencies of a particular score combination. The dotted line marks films where audiences and critics gave the same score, so the further a point sits from this line, the larger the disparity between the two groups’ ratings.
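A hexbin plot with a dotted identity line reproduces the idea of this visual. This is an illustrative sketch under the same assumed column names, not the exact chart shown here:

```python
import matplotlib.pyplot as plt

# Density view of critic vs. audience scores, in the spirit of the visual above.
scores = movies[["critic_score", "audience_score"]].dropna()

fig, ax = plt.subplots(figsize=(6, 6))
hb = ax.hexbin(scores["critic_score"], scores["audience_score"],
               gridsize=40, cmap="Reds", mincnt=1)  # darker = more films
ax.plot([0, 100], [0, 100], linestyle=":", color="black")  # agreement line
ax.set_xlabel("Critic score (%)")
ax.set_ylabel("Audience score (%)")
fig.colorbar(hb, label="number of films")
plt.show()
```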

In the visual, most points cluster near the dotted line, suggesting broadly consistent scores between audiences and critics; however, some ratings sit far from the line, revealing clear divides in opinion. Films that appeal primarily to fandom enjoyment appear toward the top left of the graph, indicating higher audience ratings relative to critic scores. One notable example is the movie Venom, with a 30% critic score and an 80% audience score.

On the other hand, films that appeal more to critics than audiences appear in the lower right corner of the visual: films with stronger critical structure that may lack simple enjoyment. Examples include The Last Jedi, with a 91% critic score and a 41% audience score, and, interestingly enough, Sausage Party, with an 82% critic score and a 50% audience score. These films contain plot or character choices that alienate casual viewers or devoted fans, yet offer a technical interest in their storytelling that critics appreciate. The observations in this visual can be explored further by looking at individual genres.

The “Anime” genre is one example of an inconsistent difference in ratings between critics and audiences. As shown in the heat map, critics give positive ratings more frequently than audiences do, indicated by the number of points sitting below the dotted line.

While this could be a result of a small sample size, it may also be explained by how differently critics and audiences perceive anime. The anime fandom is a large, global community connected primarily through the internet, and it typically places priority on the consistency, lore, and depth of its characters. If a film fails to portray some aspect consistently, fans may judge it more harshly than it deserves. A “hive mind” effect also exists: fans of a given anime may adopt opinions solely because those opinions are widely shared within the community. As a result, the audience carries more bias toward this genre than critics, who tend to evaluate these movies against more technical criteria.

The “LGBTQ” genre, by contrast, is one in which audience ratings are more positive than critics’. This inclusive genre represents LGBTQ ideas and experiences, and its films may especially appeal to audience members who feel represented by them, leading to higher ratings from audiences.

This finding may seem to contradict the earlier observation that audience scores run lower overall than critics’ scores. However, this visual compares ratings film by film, showing that audiences hold stronger opinions on specific films in this genre than critics do. Although critics may also show some positive bias, they generally evaluate films against technical criteria, which tempers the positive bias that might otherwise lift their scores.

What about other critic sites?

Moreover, we wanted to see how Rotten Tomatoes compares to other critic sites, so we looked at how a movie’s gross revenue relates to its scores on both Rotten Tomatoes and IMDb.

In the graph above, Rotten Tomatoes critic scores fluctuate greatly across revenue levels, while IMDb ratings are much more stable and sit lower than Rotten Tomatoes scores, especially for movies with lower gross revenues. This variability suggests greater diversity in Rotten Tomatoes critics’ opinions, and that critical approval there varies regardless of a movie’s box office performance. Beyond roughly $300 million in gross revenue, a divergence appears: Rotten Tomatoes scores remain higher and continue to fluctuate, while IMDb ratings stabilize around a lower average. This trend suggests that commercially successful movies receive a broad range of critical reviews on Rotten Tomatoes but more consistent feedback on IMDb.
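One way to reproduce this comparison is to bin films by revenue and average each site’s scores per bin. Here gross_usd and imdb_rating are assumed column names, with IMDb’s 0–10 scale rescaled to 0–100 for comparability:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Average score per $100M revenue bin, Rotten Tomatoes vs. IMDb.
bins = pd.cut(movies["gross_usd"], bins=range(0, 1_000_000_001, 100_000_000))
by_revenue = movies.groupby(bins, observed=True).agg(
    rotten_tomatoes=("critic_score", "mean"),
    imdb=("imdb_rating", lambda r: r.mean() * 10),  # rescale 0-10 -> 0-100
)
by_revenue.plot(marker="o", ylabel="average score (0-100)")
plt.show()
```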

Lastly, we explored how Rotten Tomatoes and IMDb differ with respect to a movie’s original language.

We note that IMDb scores are consistently lower than Rotten Tomatoes scores for each language, averaging approximately 8/10, which suggests IMDb may apply a more stringent bar for rating a movie highly. Persian movies score the highest on Rotten Tomatoes, with the foreign film Darbareye Elly earning 99%. The film follows a young woman’s disappearance after she is introduced to a suitor in Iran, hinting that Rotten Tomatoes critics reward movies that address complex topics. Among English-language movies, those produced in the UK score the highest on Rotten Tomatoes yet the lowest on IMDb, perhaps revealing different regional preferences.
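The language comparison follows the same groupby pattern. Here original_language is an assumed column, and we filter out languages with only a handful of films so that a single title cannot dominate the average:

```python
# Mean scores by original language, keeping only languages with enough films
# for a stable average. 'original_language' and 'imdb_rating' are assumed names.
by_lang = movies.groupby("original_language").agg(
    rotten_tomatoes=("critic_score", "mean"),
    imdb=("imdb_rating", lambda r: r.mean() * 10),  # rescale to 0-100
    n_films=("critic_score", "size"),
)
print(by_lang[by_lang["n_films"] >= 20]
      .sort_values("rotten_tomatoes", ascending=False)
      .head(10))
```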

Conclusion

Overall, we found that the following attributes correlate with higher ratings on Rotten Tomatoes:

  • NR, G, and NC-17 rated films receive the highest critic ratings.
  • Funny movies are always a hit!
  • Audiences tend to be more swayed by their emotional responses to a movie.
  • Movies that tackle hardship and other complex topics score higher.

Ratings on Rotten Tomatoes aren’t a perfect science, but we can do our best to predict their outcomes and to understand how movies can be better tailored to meet both critical and audience expectations.
