Genius YouTube Advice: How to become the next Mr. Beast (REAL)

DataRes at UCLA

Authors: Justine Constantino (Project Lead), Allison Chen, Anshul Chennavaram, Kaera Mitchen, Aiden Nguyen

With the growing use of platforms such as TikTok and Instagram, social media is quickly becoming the main way we consume news and entertainment. According to a 2023 Pew Research Center survey of 5,733 U.S. adults, most U.S. adults use YouTube (83%) and Facebook (68%). YouTube is not only the most used platform among adults but also the top platform for teens, with roughly nine in ten teens saying that they use YouTube, according to a Pew Research Center survey.

As the content creation space grows, more people want to become a part of it and potentially make it big. Our project analyzes YouTube data, looking at trends and attributes of popular YouTube channels across the world, to answer the question: How can someone make it big on YouTube as a content creator?

To answer this question, we looked at it from a longitudinal perspective and broke the content creation process into three phases: Phase One: Idea Generation, Phase Two: Pre-Upload, and Phase Three: Post-Upload. For our data, we used various Kaggle datasets containing Mr. Beast thumbnails, YouTube trending videos from various years, and video titles. We also scraped the top 100 YouTube channels of each country from Socialblade.com and compiled their data into one dataset.

Phase One: Idea Generation

In the Idea Generation phase, we assumed that someone who wants to enter this space would want to know what types of videos audiences are interested in, and would then make a video in one of those categories. To explore this, we looked at the top channels from every country and what type of content they create.

The pie chart above represents the genres of the top 100 YouTube channels from every country in the world in 2024. The top five categories are as follows: music (24.1%), people (19.6%), entertainment (16.5%), games (9.65%), and film (5.08%). However, upon looking at the dataset, the music, entertainment, and film categories have more channels mainly because most of those channels belong to entities with large followings outside of YouTube. For example, the music genre consists of music artists from top labels. Therefore, we excluded those categories.

With those categories excluded, the pie chart shows that the top genres for content creation are people (39.7%), games (19.5%), education (8.46%), comedy (6.85%), and sports (6.37%).

Phase Two: Pre-Upload

After the person has made the video, they move on to the Pre-Upload phase. This phase consists of designing the video's thumbnail and writing its title.

First, we will explore the importance of thumbnails.

This histogram reveals the intriguing relationship between the presence of human faces in video thumbnails and viewer engagement, as measured by view counts. The data, collected from MrBeast’s channel from a Kaggle dataset, is presented in two distinct categories: videos with faces in their thumbnails (blue) and videos without faces (red).

The histogram demonstrates a higher frequency of videos with faces in the lower view count ranges, suggesting that while these videos are more common, they do not necessarily guarantee higher viewership. Interestingly, videos without faces show less frequency but maintain a more consistent presence across various view count segments, including the higher ranges.

This suggests that while using faces in thumbnails might be a common strategy to attract initial attention, it does not always translate to higher viewership. This could indicate that viewers prefer content clarity over mere attraction to human faces. Content creators might use these findings to design their thumbnails around the intended audience and content type, potentially balancing the inclusion of faces with clear, engaging visual cues that accurately represent the video content. However, since we only analyzed data from MrBeast’s YouTube channel, these findings should not be generalized to other channels without further analysis.

Now, we will look at titles.

Before analyzing specific attributes of the titles, the word cloud below provides a quick glimpse into the top 200 phrases among video titles.

It seems that the titles of top-performing videos often include “Official,” as in “Official Video,” “Official Trailer,” and “Official Music.” Many titles also contain the names of artists and creators such as “Cardi B,” “Nicki Minaj,” and “Jake Paul.”

Are there any specific titles that would guarantee views? This prompts us to explore and analyze title characteristics.

We used a dataset of over 100,000 videos for the following analysis of titles. For simplicity, we removed titles with accents and diacritics, as titles containing emojis may have been processed incorrectly and would have complicated the analysis. Note that, as a result of this cleaning, the dataset might contain a higher ratio of videos published with an English title.

To break down the video titles, we calculated each title's length in words, percentage of uppercase characters, percentage of punctuation, and percentage of vowels. These attributes serve as the independent variables we use to predict video views.
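As a rough illustration, the four title attributes could be computed along these lines. This is a minimal sketch rather than our exact pipeline: the function name `title_features` and the sample title are hypothetical, and it assumes each ratio is taken over all characters in the title (spaces included).

```python
import string

VOWELS = set("aeiouAEIOU")


def title_features(title: str) -> dict:
    """Compute simple surface features of a video title.

    Assumption: ratios are fractions of ALL characters, spaces included.
    """
    n_chars = len(title)
    if n_chars == 0:
        return {"title_length": 0, "uppercase_ratio": 0.0,
                "punctuation_ratio": 0.0, "vowel_ratio": 0.0}
    return {
        # Title length measured in whitespace-separated words
        "title_length": len(title.split()),
        "uppercase_ratio": sum(c.isupper() for c in title) / n_chars,
        "punctuation_ratio": sum(c in string.punctuation for c in title) / n_chars,
        "vowel_ratio": sum(c in VOWELS for c in title) / n_chars,
    }


# Hypothetical title, not taken from the dataset
features = title_features("10 COOL Hacks, Tested!")
print(features)
```

A different denominator (for example, letters only) would shift the ratios, so the exact percentages depend on that choice.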

The graph above looks to be bimodal, with a spike in the number of views when the title length is between 7 and 9 words and another spike when the title is 17 words long. The videos with the most likes were mostly top-performing videos with title lengths of 7 to 8 words.

The scatterplot above graphs the percentage of uppercase letters in titles against views. The visualization resembles a right-skewed normal distribution, with most high-view videos having approximately 10% to 20% of their title characters in uppercase.

Next, we explored the relationship between the percentage of punctuation in the title and the number of views. Again, the scatterplot resembles a right-skewed normal distribution. In most videos, punctuation makes up less than 20% of the title, and between a 3% and 10% punctuation ratio, views increase as the percentage of punctuation increases.

Lastly, in most videos, vowels seem to make up 20% to 40% of the title's characters. It is interesting to observe that the scatterplots all resemble a skewed normal distribution. However, this also shows that the selected attributes and the number of views do not have a linear relationship. To further explore how the attributes of a video title play into the number of views the video can garner, we decided to train predictive regression and classification models on the dataset.

A Gradient Boosting Regressor was trained to predict view counts based on the aforementioned title attributes. The accuracy of the model, however, was only 38.5%. Below is a graph of our feature importance.

Let’s see whether categorizing our view counts and using a classification model gives us a better machine-learning model for predicting view counts.

We categorized the views into three categories, labeled 0, 1, and 2 for low, medium, and high view counts. View counts between 0 and 100,000 were labeled as 0; view counts from 100,000 to 1,000,000 were labeled as 1; view counts higher than 1,000,000 were labeled as 2. Below are boxplots showing the distribution of each title characteristic by view category.
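The binning described above can be sketched as a small function. The name `view_category` is borrowed from the regression equation later in the article, and the handling of the boundaries is an assumption: a video with exactly 100,000 views is placed in category 1, and one with exactly 1,000,000 views stays in category 1 because category 2 requires strictly more.

```python
def view_category(views: int) -> int:
    """Map a raw view count to 0 (low), 1 (medium), or 2 (high)."""
    if views < 100_000:
        return 0
    if views <= 1_000_000:  # assumption: exactly 100,000 falls in category 1
        return 1
    return 2


# Hypothetical view counts, one per category
labels = [view_category(v) for v in (5_000, 250_000, 3_000_000)]
print(labels)
```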

Some notable aspects of the boxplots are as follows:

  • The median title length for videos with lower view counts is much higher than that of videos with higher view counts.
  • The interquartile range for the ratio of uppercase letters for lower view counts is much greater, showing more variability in uppercase ratio among videos with lower view counts.
  • Videos with higher view counts have a higher median for their ratio of punctuation.
  • Vowel ratio seems to be relatively similar across all view counts, which is interesting as vowel ratio was the most important feature in our previous model.

In addition to the aforementioned attributes, we also wanted to conduct a sentiment analysis on the titles. However, because sentiment analysis is slow to run on the full dataset, we performed it on 3,000 randomly sampled rows. A logistic regression model was then trained on 70% of those 3,000 rows, using the title attributes and sentiment scores to categorize the view counts. Here is a quick glance at the mean scores for each sentiment category based on the view category.

The “neutral” category receives the highest mean scores across the 3,000 sampled titles and within each view category. However, we want to underscore that View Category 1 has balanced mean “negative” and “positive” sentiment scores, while View Category 0 (videos with lower view counts) has a higher mean “negative” sentiment score than “positive” score. For View Category 2 (videos with higher view counts), the mean “positive” sentiment score is much higher than the mean “negative” score.

The boxplots below provide a more detailed view of the distribution of scores for each sentiment category.

The logistic regression model provides the following equation:

view_category = -1.0465 + 0.0267(title_length) + 0.7124(uppercase_ratio) - 0.5659(punctuation_ratio) - 0.0948(vowel_ratio) + 0.1181(negative) + 0.3244(neutral) - 0.4733(positive)
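To make the equation concrete, here is a minimal sketch that plugs one title's attributes into it. The coefficients are copied from the equation above, but the feature values are made up, and treating the ratios and sentiment scores as fractions between 0 and 1 is an assumption. How the resulting score is turned into one of the three view categories is not covered here, so the sketch stops at the raw score.

```python
# Coefficients copied from the equation above
INTERCEPT = -1.0465
COEFFICIENTS = {
    "title_length": 0.0267,
    "uppercase_ratio": 0.7124,
    "punctuation_ratio": -0.5659,
    "vowel_ratio": -0.0948,
    "negative": 0.1181,
    "neutral": 0.3244,
    "positive": -0.4733,
}


def linear_score(features: dict) -> float:
    """Evaluate the right-hand side of the equation for one title."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    )


# Hypothetical title: 8 words, 15% uppercase, 5% punctuation, 30% vowels,
# with made-up sentiment scores (assumed to be fractions that sum to 1)
example = {
    "title_length": 8,
    "uppercase_ratio": 0.15,
    "punctuation_ratio": 0.05,
    "vowel_ratio": 0.30,
    "negative": 0.1,
    "neutral": 0.7,
    "positive": 0.2,
}
score = linear_score(example)
print(round(score, 4))
```

Reading the signs this way shows, for instance, that a higher uppercase ratio raises the score while a higher “positive” sentiment score lowers it.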

The overall accuracy of the model is 48.89%, which is quite low. Only when attributes that are more correlated with views, such as “likes,” are added as independent variables does the model’s accuracy increase to above 70%. This is reasonable and highlights how many confounding variables we are unable to take into account when considering video titles alone. A good video title cannot guarantee a high number of views. YouTube searches are subject to timely trends, personal interests, and, most of all, the YouTube algorithm, which itself considers numerous other factors such as subscribers, country of publication, and more.

Let’s take an even closer look at titles.

Let’s try to figure out what the best keywords are to use in a given video. Here, we will parse through all of the YouTube titles for 5-Minute Crafts, a popular YouTube channel with over 80 million subscribers (they are definitely doing something right).

After deleting stop words and words like “hack” and “crafts” that are related to the channel itself, here are the results of our analysis.

We visualized the top 25 keywords with a bubble chart, where larger bubbles represent keywords with higher views and smaller bubbles represent keywords with lower views.
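The keyword ranking described above could be sketched as follows. Everything here is illustrative: the stop-word list is a tiny stand-in for a real one, the channel-word set extends the two words named above with the plural “hacks” as an assumption, the sample titles and view counts are invented, and counting a video's views once per distinct keyword is also an assumption.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "you", "your"}
# Channel-specific words to drop ("hacks" is an assumed addition)
CHANNEL_WORDS = {"hack", "hacks", "crafts"}


def keyword_views(videos, top_n=25):
    """Sum views per keyword over (title, views) pairs; return the top_n."""
    totals = Counter()
    for title, views in videos:
        # Use a set so a word repeated within one title is counted once
        words = set(re.findall(r"[a-z]+", title.lower()))
        for word in words - STOP_WORDS - CHANNEL_WORDS:
            totals[word] += views
    return totals.most_common(top_n)


# Invented sample data, not from the 5-Minute Crafts dataset
sample_videos = [
    ("Simple DIY ideas to try", 1_000),
    ("Cool crafts and DIY tricks", 500),
    ("Smart hacks for your kitchen", 200),
]
print(keyword_views(sample_videos, top_n=3))
```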

The keywords generally fall into four categories: general concepts, specific attributes, action-oriented words, and a special category I’ll reveal below.

The general concepts include words like “ideas”, “make”, “crafts”, “tricks”, and “diy”; the specific attributes are words like “simple”, “crazy”, “smart”, and “cool”; and the action-oriented words include “save”, “try”, and “know”.

Notice how many of the keywords relate to creativity and crafting, highlighting the channel’s focus on DIY and creative projects. This shows that, at least for 5-Minute Crafts, the words that correlate with the most views are on brand for what the channel's videos set out to do, which is to show DIY projects.

Now, for the BIG reveal…

Numbers.

Yes, numbers.

Here, you can see the varied use of numbers. The number “25” has nearly 2 billion combined YouTube views across the titles it appears in. This highlights how numbers can have a significant impact on attracting viewers to your videos.

Further, notice how the top ⅗ numbers are all round numbers, with a combined 5.25 billion views. These top numbers suggest that people might be drawn to titles that have numbers, especially round numbers, potentially due to psychological factors such as the appeal of round numbers or their use in popular contexts (e.g., ages, milestones, etc.).
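Combined views per number can be tallied in much the same way as keywords. The sketch below is an assumption about how this could be done, not our exact code: it treats any run of digits in a title as a number, counts each video once per distinct number, and uses invented titles and view counts.

```python
import re
from collections import defaultdict


def views_by_number(videos):
    """Sum views for every number that appears in a title."""
    totals = defaultdict(int)
    for title, views in videos:
        # set() so a number repeated within one title is counted once
        for number in set(re.findall(r"\d+", title)):
            totals[number] += views
    return dict(totals)


# Invented sample data
sample_titles = [
    ("25 kitchen tricks", 300),
    ("25 ideas in 5 minutes", 200),
    ("Easy DIY ideas", 100),
]
number_totals = views_by_number(sample_titles)
print(number_totals)
```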

Phase Three: Post-Upload

Finally, after the person uploads their YouTube video with the optimal thumbnail and title, what will the rest of YouTube think? In the Post-Upload phase, we consider audience retention and feedback.

We used a Kaggle dataset of 600,000+ YouTube comments on trending videos from 2017 to assess how audiences interact with popular videos. Comments offer great insight into how videos are perceived by an audience and the types of emotions that the content evokes. For this project, we decided to focus on the emotional makeup of comment sections so that we could apply sentiment analysis.

We created a word cloud of the most common words across comments (larger words in the cloud appear more frequently in the dataset). The most commonly used words mostly lean towards positivity (“amazing”, “good”, etc.), with “love” appearing the most frequently. Neutral descriptive words such as “video”, “people”, “make”, and “one” are also very common. Although they aren’t AS frequent as positive descriptors, negative and explicit words also appear in the cloud (“mean”, “bad”, “f**k”, “sh*t”). We also get a glimpse of how the time period of our dataset may affect our results, as “Trump” was enough of a trending topic in 2017 that it is the only name featured among the top 100 words in our dataset.

We decided to go further and applied a sentiment analysis model (the same one used in our “Pre-Upload” section) to all the comments in our dataset. We wanted to see whether there is an association between the views on a video and the ratio of positive to negative, or emotional to neutral, comments it receives. For context, the sentiment analysis model we used is a RoBERTa-base model that has been pre-trained on Twitter data. To simplify our analysis, we classified each comment according to its highest sentiment category score (“positivity”, “neutrality”, “negativity”).
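Picking the highest-scoring category amounts to an argmax over the three scores. Here is a minimal sketch; the function name `top_sentiment`, the dictionary keys, and the scores are hypothetical and not real model output.

```python
def top_sentiment(scores: dict) -> str:
    """Return the sentiment label with the highest score."""
    return max(scores, key=scores.get)


# Made-up scores for a single comment
label = top_sentiment({"negative": 0.05, "neutral": 0.25, "positive": 0.70})
print(label)
```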

For example, the text from the following comment was classified as “Positive”:

While the following is “Neutral”:

One thing to note is that “negativity” with this model does not necessarily translate into hate or “troll” comments, as a message that expresses anger, sadness, or disgust will get a high “negative” score regardless of whether the sentiment is directed towards the content creator. So while we cannot reasonably estimate the hate a video received, we can analyze how comment sections reflect view count in terms of the general emotions exhibited by commenters.

We hypothesized that videos with comment sections that are more negative than positive would have higher views, since we’d anticipate that content that brings out more negativity (polarizing or fear-provoking content, sad videos, etc.) would be more “clickbait-y”. We also theorized that comment sections that are more emotional in general (positive or negative) compared to neutral would have higher views, since content that is generally uninspiring and does not evoke any strong emotions in an audience would be less worthy of “going viral”.

Confirming what the keywords in our word cloud suggested, across all trending videos in our dataset, negative comments are not as prevalent as positive and neutral ones.

To analyze the relationship between comments and view count, we calculated proportions corresponding to the emotional makeup of the comment section for every video with at least 10 comments.

First, we looked at the proportion of positive comments out of all emotional (non-neutral) comments. A proportion of 0.0 indicates that all the comments of a video were negative, while 1.0 indicates that all the comments of a video were positive.
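For a single video, this proportion could be computed as in the sketch below. The function name and sample labels are hypothetical, and returning `None` for videos with fewer than 10 comments or with no emotional comments at all is an assumption about how such cases are handled.

```python
from collections import Counter


def positive_share(labels, min_comments=10):
    """Share of positive comments among emotional (non-neutral) ones.

    Returns None when the video has too few comments or no emotional ones.
    """
    if len(labels) < min_comments:
        return None
    counts = Counter(labels)
    emotional = counts["positive"] + counts["negative"]
    if emotional == 0:
        return None
    return counts["positive"] / emotional


# Invented comment labels for one video: 6 positive, 2 negative, 4 neutral
video_labels = ["positive"] * 6 + ["negative"] * 2 + ["neutral"] * 4
share = positive_share(video_labels)
print(share)
```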

From our plot, it does not appear that there is any relationship between whether a video has more negative or positive comments and the views it receives.

Next, we decided to look at emotions in general by calculating the proportion of emotional (positive + negative) comments out of the entire comment section. A proportion of 0 indicates that all comments on a video were neutral, while 0.5 (the highest observed for this dataset) indicates that half of the comments were neutral and the other half were negative or positive.

Since the proportions are clustered, with few observations below 0.25, we cut out those values to come up with a reasonable estimate based on the most commonly occurring proportion range. In context, it makes sense that proportions would fall within a certain range, since it would be unusual to see a comment section where less than 25% of the comments exhibit emotion, that is, anything beyond an objective fact or neutral descriptor such as “this is a video”.
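The emotional proportion and the 0.25 cutoff can be sketched like this. The video IDs and comment labels are invented, and keeping videos at exactly 0.25 is an assumption.

```python
def emotional_share(labels):
    """Share of comments that are positive or negative (not neutral)."""
    if not labels:
        return None
    emotional = sum(label != "neutral" for label in labels)
    return emotional / len(labels)


# Invented comment labels for three hypothetical videos
comments_by_video = {
    "video_a": ["positive"] * 3 + ["negative"] * 2 + ["neutral"] * 5,
    "video_b": ["positive"] * 1 + ["neutral"] * 9,
    "video_c": ["negative"] * 2 + ["positive"] * 1 + ["neutral"] * 7,
}

shares = {vid: emotional_share(labels) for vid, labels in comments_by_video.items()}

# Drop videos whose emotional share falls below the 0.25 cutoff
kept = {vid: share for vid, share in shares.items() if share >= 0.25}
print(kept)
```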

The smooth line indicates that there may be a slight relationship between the proportion of emotional comments and the view count of a video, but surprisingly, it shows that videos with more emotional comments trend lower in view count.

Thus, it appears that there is no real evidence that the emotions exhibited by a comment section are associated with view count. For a content creator, it may be worthwhile to focus on other aspects of video creation and viewer engagement rather than baiting certain emotional reactions out of their audience.

Conclusions

So, how much do these factors actually contribute to YouTube success?

Not as much as you think!

Ultimately, our analysis found no specific relationship between the different factors that we assessed throughout the article and view counts. It may sound disappointing, but it is important to factor in other sources of variation, such as psychological habits and trends. And what about other social media platforms bringing more people in from outside of YouTube?

Many of the insights we pulled are based on patterns within a certain time period. Therefore, we can attribute YouTube success to many different factors, but nothing can be 100% constant since the content on YouTube is constantly changing, people are liking different genres of content, and maybe the algorithm will push something one week and immediately bury it the next.

However, there is one constant that we believe exists! That is, one should play to the trends, meaning creating content based on what is popular. That may be a given, but once we consider the algorithm, it may not even pick your video up.

In short, we can think about and quantify these different factors all we want, but a huge aspect of YouTube success is pure luck. There is no one-size-fits-all recipe!

References:

RoBERTa Sentiment Model:

Francesco Barbieri, Jose Camacho-Collados, Luis Espinosa Anke, and Leonardo Neves. 2020. TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1644–1650, Online. Association for Computational Linguistics.
