What’s the Recipe to a 5-Star Restaurant?

DataRes at UCLA
Jun 19, 2022


By: Joyce Jeon (Project Lead), Kyle Lee, Darren Sohn, Robi Chatterjee, Tracy Charles

Introduction

Yelp is one of the most popular social networking platforms, where users are able to discover crowd-sourced reviews on millions of businesses, primarily restaurants, and write personalized reviews for different businesses and establishments. Users rate businesses on a scale of 1 through 5 stars, where 1 star indicates an unsatisfactory rating of the business and 5 stars indicates a spectacular rating of the business.

Many businesses depend on these Yelp reviews to drive business, as establishments with higher and more numerous ratings can be associated with increased revenue. Naturally, businesses strive for 4- to 5-star reviews, and customers tend to go to restaurants with higher ratings.

So, what makes a business or restaurant worthy of 5 stars? Are there certain patterns and characteristics that distinguish 5-star businesses from lower-rated businesses? To find out, our team analyzed the Yelp dataset, a “subset of our businesses, reviews, and user data for use in personal, educational, and academic purposes” (yelp.com). Our team specifically cleaned and analyzed the Yelp “business” and “review” datasets to discover relationships, patterns, and factors that contribute to a 5-star business rating.

BODY:

Most Common Words in a 5-Star + 1-Star Review:

By creating a word cloud, we could easily visualize the most common words that appear in 5-star and 1-star reviews. To create the word cloud, we treated irrelevant words as stop words so that filler words such as “said”, “us”, and “go” would not rank among the top 20 words of either the 5-star or the 1-star reviews.

The most common words in 5-star reviews were “great”, “good”, “service”, and “time”. Surprisingly, the word “great” appeared far more often than any other word: it appeared 2590 times, whereas “even”, the 20th most common word in 5-star reviews, appeared only 576 times. Looking at the bar chart, 5-star reviews tend to emphasize customer service and hospitality: “service” appeared 1235 times, “friendly” appeared 821 times, “staff” appeared 785 times, and “nice” appeared 701 times. A restaurant is unlikely to earn 5 stars if it solely provides good food; customer service appears to be just as important as taste.

In contrast, the most common words in 1-star reviews were “one”, “time”, “service”, “even”, and “never”. Just as in the 5-star reviews, customers clearly value good service: service-related keywords such as “service”, “minutes”, and “experience” appeared 479, 294, and 192 times, respectively. Unlike in the 5-star reviews, no word has significantly more occurrences than the others; each appears approximately 200 to 500 times in the dataset.
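The word counts described above can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual code: the function name `top_words`, the stop-word list (beyond “said”, “us”, and “go”), and the sample reviews are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the article names only "said", "us", and "go".
STOP_WORDS = {"said", "us", "go", "the", "a", "and", "was", "to", "we", "i", "it", "of", "in", "is"}

def top_words(reviews, n=20, stop_words=STOP_WORDS):
    """Return the n most common non-stop words across a list of review texts."""
    counts = Counter()
    for text in reviews:
        # Lowercase the text and keep alphabetic tokens (apostrophes allowed).
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stop_words)
    return counts.most_common(n)

# Made-up sample reviews, for illustration only.
five_star = [
    "Great food and great service, the staff was friendly.",
    "Great time! We said we would go back.",
]
print(top_words(five_star, n=3))
```

Running the same function separately on the 5-star and the 1-star reviews would give the two ranked lists that the bar charts compare.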

Using these graphs in conjunction with each other can tell us a lot about how the cuisine types compare with one another. In terms of the number of reviews, American restaurants have by far the most, with around 13,000 reviews in total. The next closest cuisine type is Mexican, with only around 4,000 reviews, and the lowest is Korean, with about 500 reviews. This generally makes sense, as there are more American restaurants in America, while other cuisines tend to be scarce in some areas of the country.

If we look at the cuisine type against the average rating, then we see the opposite trend. Korean restaurants have an average rating of 3.973 stars and Vietnamese restaurants have an average rating of 3.856 stars. American restaurants have an average rating of 3.517 stars and the lowest average rating by cuisine type is Chinese restaurants at 3.389 stars. Cuisine types that have a lower number of reviews generally tend to have higher average ratings than cuisine types that have a higher number of reviews.
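A comparison like this boils down to grouping restaurants by cuisine and computing two numbers per group. The sketch below assumes each restaurant is a record with hypothetical `cuisine`, `review_count`, and `stars` fields, and it averages ratings per restaurant (unweighted); the article does not say which averaging the team used, and the sample values are made up.

```python
from collections import defaultdict

def cuisine_summary(restaurants):
    """Map each cuisine to (total review count, mean star rating)."""
    reviews = defaultdict(int)
    stars = defaultdict(list)
    for r in restaurants:
        reviews[r["cuisine"]] += r["review_count"]
        stars[r["cuisine"]].append(r["stars"])
    # Unweighted mean: every restaurant counts equally within its cuisine.
    return {c: (reviews[c], sum(stars[c]) / len(stars[c])) for c in stars}

# Made-up sample records, for illustration only.
sample = [
    {"cuisine": "American", "review_count": 120, "stars": 3.0},
    {"cuisine": "American", "review_count": 80, "stars": 4.0},
    {"cuisine": "Korean", "review_count": 15, "stars": 4.0},
]
for cuisine, (n_reviews, avg) in cuisine_summary(sample).items():
    print(cuisine, n_reviews, round(avg, 3))
```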

One possible explanation for this is the Law of Large Numbers. Since American restaurants have roughly 26 times as many reviews as Korean restaurants (about 13,000 versus about 500), it is difficult to compare the two fairly. In theory, the average of the American restaurant reviews should sit close to its true average rating because so many reviews were left. The Korean restaurants could simply have gotten lucky with the small number of reviews they received, resulting in a higher rating than their true average.
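One way to make the sample-size argument concrete is the standard error of a mean, sigma / sqrt(n), which shrinks as the number of reviews grows. The sketch below plugs in the approximate review counts quoted earlier; the spread of 1.0 star is an assumed value chosen for illustration, not a figure from the dataset.

```python
import math

def standard_error(sigma, n):
    """Standard error of a sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

SIGMA = 1.0  # assumed spread of individual ratings, in stars (not from the article)

se_american = standard_error(SIGMA, 13_000)  # roughly 13,000 American reviews
se_korean = standard_error(SIGMA, 500)       # roughly 500 Korean reviews

print(round(se_american, 4), round(se_korean, 4))
print(round(se_korean / se_american, 1))  # how much noisier the Korean average is
```

Under this assumption the Korean average is about five times noisier than the American one. Both uncertainties are still well below the roughly 0.46-star gap between the two averages, so sample size alone may not account for the whole difference.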

Another explanation could go beyond the numbers, such as a difference in service culture across cuisines. Many Asian cultures place a heavy emphasis on respect, which often shows in the service industry. This could explain why the top three cuisine types by average rating are all Asian.

We can take a look at the overall star ratings of each state to determine whether businesses in certain states tend to have higher ratings and, thus, more positive responses from customers. However, since the dataset covers only 27 states, many of which list only 1 business, we can only draw inferences about states with a minimum number of businesses. Therefore, we cleaned the dataset and selected the states with at least 10 businesses listed. Comparing the average star ratings of businesses in each state, we see that California has the highest average rating among the states in our limited data, at almost 4.00. It is followed by Nevada with an average of 3.73, and then closely by Louisiana and Idaho.

Because the dataset lacks information for many states, we can only tentatively assume that the rest of the states follow a similar trend. From the map, we can see that states on the West Coast have a higher average rating than those on the East Coast.
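The state-level filtering described above can be sketched as follows. This is a minimal illustration that assumes each business record carries `state` and `stars` fields; the function name and the sample data are hypothetical.

```python
from collections import defaultdict

def state_averages(businesses, min_businesses=10):
    """Mean star rating per state, keeping only states with enough businesses."""
    by_state = defaultdict(list)
    for b in businesses:
        by_state[b["state"]].append(b["stars"])
    return {
        state: sum(stars) / len(stars)
        for state, stars in by_state.items()
        if len(stars) >= min_businesses
    }

# Made-up sample: ten businesses in one state and a single business in another.
sample = [{"state": "CA", "stars": 4.0} for _ in range(10)]
sample.append({"state": "XX", "stars": 5.0})
print(state_averages(sample))  # the single-business state is dropped
```

The threshold of 10 comes from the text; it trades coverage for stability, since a state with one listed business would otherwise rank on a single rating.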

This visualization is a frequency histogram of all the businesses’ average star ratings (1–5), with one representing the worst experience and five representing the best. As every business in the dataset had been rated by at least one customer, I took the average star rating for every business. Five-star ratings hold the largest density, at around 45% of all ratings, which suggests that customers felt compelled to leave a rating on Yelp to acknowledge that the business provided them with a noticeably great experience. On the other hand, two- and three-star ratings have the lowest frequencies, at around 9,000 and 10,000 ratings, respectively. This suggests that customers tend not to leave a rating if their overall experience was neither excellent nor horrid, which makes sense, as there is little to gain from sharing an average, expected experience.
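The densities behind a histogram like this are just the share of ratings that fall at each star value. A minimal sketch, with a hypothetical function name and made-up ratings:

```python
from collections import Counter

def rating_shares(stars):
    """Share of ratings at each star value, ordered from lowest to highest."""
    counts = Counter(stars)
    total = len(stars)
    return {value: counts[value] / total for value in sorted(counts)}

# Made-up ratings, for illustration only.
sample_stars = [5.0, 5.0, 4.5, 3.0]
print(rating_shares(sample_stars))
```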

This visualization is another frequency histogram, this time of the reviews’ sentiments (negative, positive, neutral). Using the VADER lexicon, I performed sentiment analysis on each Yelp review to determine its overall sentiment from its polarity and intensity. The frequencies in this visualization also show that customers feel compelled not only to leave a 5-star rating but also to take the time to write a positive review. While the number of positive reviews is immense, the number of negative reviews is still around 100,000 times larger than that of neutral reviews. This additionally shows that customers are least likely to leave both a neutral star rating and a neutral review, compared to a positive or negative rating and review. This makes sense because an average experience is not as worth sharing with future customers as a uniquely great or poor one. All in all, these findings suggest that people tend to leave a Yelp rating if their experience was either really good or really bad, as opposed to a normal, average experience.
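To illustrate how a VADER score turns into one of the three labels, the sketch below uses the cutoffs of ±0.05 on the compound score that VADER's documentation conventionally suggests. The article does not state which cutoffs were used, so the threshold, the function names, and the sample scores are assumptions. The `vader_compound` helper relies on NLTK's `SentimentIntensityAnalyzer` and is only needed when scoring real text.

```python
from collections import Counter

def sentiment_label(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

def vader_compound(text):
    """Compound score from NLTK's VADER analyzer.

    Needs nltk installed and the lexicon fetched once with
    nltk.download("vader_lexicon").
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]

# Made-up compound scores, standing in for scored reviews.
scores = [0.82, -0.6, 0.0, 0.4]
label_counts = Counter(sentiment_label(s) for s in scores)
print(label_counts)
```

Because only scores very close to zero count as neutral under these cutoffs, a small neutral bar is partly a consequence of the labeling rule itself.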

The above visualization shows the distribution of star ratings among restaurants that offer a delivery option. The average star rating for restaurants with a delivery option is 3.41 stars.

This visualization shows the distribution of star ratings among restaurants that do not offer a delivery option. The average star rating for these restaurants is 3.67 stars. On average, restaurants with no delivery option have a higher star rating than restaurants with a delivery option.

There are several potential reasons for this difference. One is that restaurants without delivery options may have better food: many high-quality restaurants may not offer delivery because delivering the food would decrease its quality. A related reason is that delivered food may not be as good as food eaten at the restaurant. Reviews of restaurants without a delivery option are all based on food eaten at the restaurant, while reviews of restaurants with delivery may be based on delivered food, which may not receive as high a rating due to lower quality. Finally, reviews may also be based on the delivery service itself. Restaurants that offer delivery have one more aspect that can be criticized, which restaurants without delivery do not have to worry about. If a restaurant has poor delivery service, it may receive lower star reviews.
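The two averages compared above can be computed by splitting restaurants on a delivery flag. The sketch below assumes the flag is stored as the string `"True"` or `"False"` under a `RestaurantsDelivery` key inside each record's `attributes` dictionary, and it skips records where the flag is missing. The function name and sample values are hypothetical.

```python
def mean_stars_by_delivery(restaurants):
    """Mean star rating for restaurants with (True) and without (False) delivery."""
    groups = {True: [], False: []}
    for r in restaurants:
        # attributes may be absent or None, so fall back to an empty dict.
        flag = (r.get("attributes") or {}).get("RestaurantsDelivery")
        if flag in ("True", "False"):
            groups[flag == "True"].append(r["stars"])
    return {offers: sum(s) / len(s) for offers, s in groups.items() if s}

# Made-up sample records, for illustration only.
sample = [
    {"stars": 3.0, "attributes": {"RestaurantsDelivery": "True"}},
    {"stars": 4.0, "attributes": {"RestaurantsDelivery": "True"}},
    {"stars": 4.5, "attributes": {"RestaurantsDelivery": "False"}},
    {"stars": 2.0, "attributes": None},  # no flag, so it is skipped
]
print(mean_stars_by_delivery(sample))
```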

Conclusion:

In conclusion, we found that words such as “time” and “service” were used frequently in reviews, suggesting that customers care about receiving food in a timely manner and with good service. These words ranked near the top among 1-star reviews, which suggests that many reviewers’ complaints were related to poor service or late food. Comparing ratings between restaurants that offer delivery and those that do not showed that, on average, restaurants without delivery received higher star ratings. This observation is consistent with our claim that poor ratings are often caused by poor service or late delivery, as restaurants that do not offer delivery, and therefore cannot have late or poor delivery service, received more positive reviews. Overall, most reviews were positive according to both our sentiment analysis and our star rating histogram, which shows a much higher frequency of four- and five-star ratings than one- or two-star ratings. We also found that while average ratings may not vary much from state to state, California has, on average, the best ratings among the states we examined.
