Authors: Anya Smolentseva (Project Lead), Tony He, Kevin Ngo, Olivia Weisiger, Jordyn Fuchs
First premiering in 2000, Survivor has garnered attention from millions of fans worldwide with its dramatic backstabbing, dangerous living conditions, and entertaining tribal councils. In this reality TV show hosted by Jeff Probst, contestants are marooned in a remote location and must outwit, outplay, and outlast each other to win a cash prize. They face physical, mental, and social challenges, form alliances, and vote out competitors during Tribal Council meetings. The game revolves around adapting to the environment, strategizing, and building social relationships. It culminates at the final Tribal Council, where a jury of eliminated players votes for the winner based on gameplay and social interactions.
Unsurprisingly, thousands of applicants compete for one spot out of 20 every season of Survivor. Wanna know what it takes to be a survivor? In this article, we will explore and analyze several aspects of the show such as what it takes to win, the importance of winning challenges and more.
Predicting If a Survivor Would Return or Not
Alright, Survivors! Here’s what we’ve got. To determine whether a participant would make a comeback for another season, we dug deep into the data. We used dimensionality reduction techniques to trim down the features, keeping around 12 to 18 components that accounted for 90% of the variance.
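For anyone who wants to try this step at home, here is a minimal sketch of keeping only enough principal components to explain 90% of the variance. It assumes scikit-learn and uses a synthetic feature table, since the real dataset and column layout are not shown in this article; the array shapes and variable names are made up for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the contestant feature table (NOT the real data):
# 300 rows, 40 numeric features built from 10 hidden factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 10))
mixing = rng.normal(size=(10, 40))
features = latent @ mixing + 0.5 * rng.normal(size=(300, 40))

# Standardize first so no single feature dominates the variance.
scaled = StandardScaler().fit_transform(features)

# A float n_components asks PCA for the fewest components whose
# cumulative explained variance exceeds that fraction.
pca = PCA(n_components=0.90, svd_solver="full")
reduced = pca.fit_transform(scaled)

print("components kept:", pca.n_components_)
print("variance explained:", pca.explained_variance_ratio_.sum())
```

How many components survive depends entirely on the data, so a range like 12 to 18 would come from running this on different feature subsets rather than from the code itself.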
Now, we had to get rid of redundant stuff. For example, challenge win percentage? Gone. Challenge appearances? Outta here. We wanted to focus on what truly mattered: challenge wins. After stripping away the unnecessary, we unleashed a random forest classification model where we found the three key factors that influenced a participant’s return. It all boiled down to Challenge Wins, individual challenge reward wins, and individual challenge wins.
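Here is a rough sketch of how a random forest can rank features by importance. It assumes scikit-learn; the feature names and the data are hypothetical and synthetic, and the label is deliberately built to depend mostly on the first column so the ranking has something to find.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names and synthetic data, for illustration only.
feature_names = ["challenge_wins", "reward_wins", "individual_wins", "days_lasted"]
rng = np.random.default_rng(1)
X = rng.poisson(lam=3.0, size=(400, len(feature_names))).astype(float)

# Make the synthetic "returned" label depend mostly on the first column.
returned = (X[:, 0] + 0.5 * rng.normal(size=400) > 3).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, returned)

# Impurity-based importances are non-negative and sum to 1.
ranking = sorted(zip(feature_names, forest.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

One caveat worth keeping in mind: impurity-based importances can split credit among strongly correlated features, which is one reason to drop redundant columns before fitting.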
Exploring further, we built a Support Vector Machine (SVM) model to predict a participant’s return. This model gave us an accuracy of 81% with a standard deviation of 3.52%.
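An accuracy reported with a standard deviation usually comes from repeated evaluation. One common way to get such a pair is k-fold cross-validation, sketched below. This is an assumption about the procedure, not a description of our exact setup; the kernel, the value of C, the number of folds, and the synthetic data are all illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the Survivor dataset).
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] + 0.8 * rng.normal(size=300) > 0).astype(int)

# Scale inside the pipeline so each fold is scaled using only its training part.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))

# Ten folds give ten accuracy scores; report their mean and spread.
scores = cross_val_score(model, X, y, cv=10, scoring="accuracy")
print(f"accuracy: {scores.mean():.2%} +/- {scores.std():.2%}")
```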
Now, take a look at these visualizations of the classifier. The purple dots on the left show every survivor on the map that we predict will return. The pink dots on the right show every survivor on the map that we predict will not return. The dots that fall on the island (green area) are people who actually returned, while the dots that fall in the water (blue area) are people who actually did not return.
So, based on our analysis, we can confidently say that Challenge Wins reign supreme, followed by individual challenge reward wins and individual challenge wins when it comes to a participant’s chances of returning. The more challenges you win, the more likely we will see you again!
Using our SVM predictive model, we can give you a sneak peek into the Season 39 participants that will return in the future. Get ready because we’ll be welcoming back Tommy Sheehan and Noura Salman! It’s gonna be one epic return, folks!
Throughout the 44 seasons of Survivor, many twists have been added to keep gameplay interesting. Chief among them is the concept of exile: any number of days during which a contestant is technically still in the game but not participating in regular gameplay with their fellow contestants. Exile has had many names in Survivor, including “Exile Island,” “Redemption Island,” “Ghost Island,” and “Edge of Extinction.” This component added great nuance to an already socially and physically demanding game, giving players additional chances to find hidden advantages or to remain in the game after being voted out, with the promise of an opportunity to earn their way back into regular gameplay at some point.
This added factor raises many questions about how exile affects gameplay. Are players who spend any number of days in exile more likely to win the game, due to increased chances of finding immunity idols or other hidden advantages? Or are the benefits and additional chances that come with days spent in exile outweighed by the social isolation that accompanies exile, which is dangerous in a game that relies heavily on a player’s ability to build and maintain relationships?
To begin answering these questions, we can explore a line chart of total days spent in exile per season with an exile component. We can also color the lines by sex to see whether total days spent away from the game differ between sexes.
From this line chart, we can see that the total number of days spent in exile varies less by sex and more by season. For the majority of seasons, the total number of days spent in exile is roughly even across sexes. There are a few exceptions, as seen in seasons 22 and 27, but on average, sex does not appear to be a strong predictor of the total number of days a player will spend on Exile Island.
This plot also helps us see the extent to which Exile Island played a role in each season. We can see that in earlier seasons, the exile component was less prevalent, but as the show continued for many more seasons, Exile Island played a more significant role, as seen by the massive upward trend in total days spent in exile during seasons 38 and 40. The increased usage of Exile Island as a key component in later seasons may be connected to the phenomenon of Survivor itself becoming increasingly popular and growing its fanbase. As more “superfans” of the show begin to play, the element of Exile Island provides players the opportunity to redeem themselves after being voted out and demonstrate their devotion to winning the game they love.
Next, we can explore how spending any number of days on Exile Island relates to the number of votes cast against the player (VAP).
Though the trend of this simple logistic regression model does not strongly follow the ‘S’ shape as we would expect, VAP was a statistically significant predictor of exile status. From the plot, we can see that as the VAP increases, so does the probability of the player spending at least one day on Exile Island. This trend is expected, seeing as many players are sent to exile because they have been voted out (they received a majority of votes against them at tribal council).
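To make the model concrete: a simple logistic regression with one predictor estimates P(exile) = 1 / (1 + exp(-(b0 + b1 * VAP))). Below is a small sketch, assuming scikit-learn and synthetic data generated with made-up coefficients (the values -2.0 and 0.25 are illustrative, not our fitted estimates).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic players: votes against (VAP) and a 0/1 "spent time in exile" flag.
# The generating coefficients below are invented for illustration.
rng = np.random.default_rng(3)
vap = rng.integers(0, 15, size=500).astype(float)
true_prob = 1 / (1 + np.exp(-(-2.0 + 0.25 * vap)))
exiled = (rng.random(500) < true_prob).astype(int)

# Fit P(exiled = 1) as a logistic function of VAP.
model = LogisticRegression()
model.fit(vap.reshape(-1, 1), exiled)

# Predicted probability of exile at each VAP value from 0 to 14.
grid = np.arange(0, 15).reshape(-1, 1)
probs = model.predict_proba(grid)[:, 1]
for votes, p in zip(grid.ravel(), probs):
    print(f"VAP={votes:2d}  P(exile)={p:.2f}")
```

A fitted curve only looks like a full ‘S’ when the observed predictor range covers both the flat low end and the flat high end of the logistic function. Over a narrower range it can look close to a straight line or a gentle bend.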
However, there are other potential factors that play a role in this trend. Since Survivor is a deeply social game, the concept of Exile Island and being isolated from the rest of the group is perceived as detrimental to a player’s game. Typically, this allows other players to develop relationships and strategies without the exiled player, and this model is consistent with those fears, since non-isolated players seem to plot together and vote out players who have spent time away from the game. Additionally, since time spent in exile implies chances at finding advantages or idols, this can place a target on a player’s back if they are able to earn their way back into the game.
Now that we have done a preliminary investigation of how exile affects gameplay, we can look deeper into the winners of Survivor seasons that had an exile component, to see if there is any association between spending time in exile and winning. For our purposes, we will consider winning as placing in the top 3, since these players all go to “Final Tribal Council,” in which the jury of voted-out players casts votes for the player they deem worthy of winning.
We can see that in the seasons where exile is a component, 64.71% of winners spent no days in exile; they participated in regular gameplay the entire season. As the total number of days spent in exile increases, the percentage of winners with that number of exile days drops sharply. Although Exile Island offers players redemption and the opportunity to earn idols and advantages, most winners spent no time in exile. So, if you ever find yourself playing Survivor, avoid Exile Island and focus on building relationships, winning challenges, and securing idols and advantages in the regular game.
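The percentages in this breakdown are simple shares: count the winners at each exile-day total and divide by the number of winners. Here is a tiny sketch using only the Python standard library; the list of exile-day totals is made up for illustration and is not our data.

```python
from collections import Counter

# Made-up exile-day totals for a handful of finalists (illustrative only).
exile_days_of_finalists = [0, 0, 0, 2, 0, 5, 0, 1, 0, 0]

# Count finalists at each exile-day total, then convert counts to percentages.
counts = Counter(exile_days_of_finalists)
total = len(exile_days_of_finalists)
share_by_days = {days: 100 * n / total for days, n in sorted(counts.items())}

for days, share in share_by_days.items():
    print(f"{days} exile days: {share:.1f}% of finalists")
```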
Predicting Survivor Winners
In Survivor, factors such as strategy, social gameplay, physical strength, and sometimes a lot of luck influence who wins. Because the outcome is sometimes considered almost random, we attempted to quantify how much these factors really contribute to a survivor’s win.
For the purposes of this research question, winners were classified as contestants who placed 1st, 2nd, or 3rd. After training several logistic regression models, the variables VFB (votes for boot: the number of times a player voted for the person who was voted out), InICW (individual immunity challenge wins), TChW (total tribal/team challenge wins), nonVFB (the number of times a player voted incorrectly at Tribal Council), and TVA (votes against the player) were found to be the most significant factors in whether or not a contestant won. In the first figure, the confusion matrix depicts how accurate our model was in using these variables to predict winners.
Our model was accurate 62% of the time in predicting winners, and was accurate 99% of the time in predicting non-winners. While these metrics could be improved either through data manipulation or employing sampling techniques, for the purposes of this research question we will use this model.
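Per-class figures like these can be read straight off a confusion matrix. The sketch below assumes they are recall for each class (the share of actual winners, and of actual non-winners, that the model labeled correctly), which is one reasonable reading. The four counts are hypothetical, chosen only so the rates come out to 62% and 99%; they are not the counts from our matrix.

```python
# Hypothetical confusion-matrix counts (NOT the article's actual matrix).
true_pos = 31    # winners predicted as winners
false_neg = 19   # winners predicted as non-winners
true_neg = 297   # non-winners predicted as non-winners
false_pos = 3    # non-winners predicted as winners

# Recall per class: correct predictions divided by actual members of the class.
winner_recall = true_pos / (true_pos + false_neg)
non_winner_recall = true_neg / (true_neg + false_pos)

# Overall accuracy is pulled toward the larger class (non-winners).
total = true_pos + false_neg + true_neg + false_pos
overall_accuracy = (true_pos + true_neg) / total

print(f"winner recall:     {winner_recall:.2f}")
print(f"non-winner recall: {non_winner_recall:.2f}")
print(f"overall accuracy:  {overall_accuracy:.3f}")
```

With classes this unbalanced, overall accuracy alone would hide how often winners are missed, which is why looking at each class separately is useful.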
From this graph, we can see that the ability to vote correctly is the most important factor for winners, followed by the ability to win individual and team challenges, voting incorrectly, and the number of votes against each player. In the game of Survivor, a contestant’s ability to vote correctly is directly tied to the strength of their alliances as well as their social and strategic gameplay. As we can see, on average winners tend to vote correctly far more often than non-winners, confirming what many fans already know: it is extremely important for contestants to be in-the-know when it comes to voting.
An interesting observation is that the number of votes against each player is the least important of these factors. In the game of Survivor, contestants who receive more votes against them are usually considered “threats” and strong competitors in the game. These results suggest that it is far more important to stick with a strong alliance than to be a strong competitor.
As we can see, throughout the seasons non-winners and winners have on average had a similar number of votes cast against them. While there are certain seasons where winners had significantly more votes cast against them, the general trend shows that most winners had few votes cast against them, indicating that they were either strong competitors with tight alliances or weak competitors who were not considered threats.
Overall, if you’re planning on applying to Survivor, focus on building strong alliances to really up your game, regardless of whether you are a physical or social threat.
Challenges in Survivor play a major role in allowing players to display their individual athletic and intellectual abilities. Success in both individual and team challenges can have major benefits for competitors, including rewards and immunity from votes. However, overachieving in these competitions can also have some drawbacks, as other players may begin to see consistent winners as a threat.
We wanted to take a look at potential correlations between individual challenge wins and the number of times a player escaped tribal without a single vote. The plot to the left shows that players with 8–10 individual wins had the highest number of tribals where they received zero votes (vote-free tribals). The immunity that comes with individual wins could help raise this number. In addition, winning a middle-range number of challenges could help players do well in the game without standing out too much as a strong competitor. Players who won a higher number of challenges were less likely to have many vote-free tribal councils, as they were probably seen as a threat by other players.
We also wanted to see how challenge wins would contribute to a player’s overall survival average. This average is a quantitative measure of overall success in challenges, tribal councils, and jury votes. A higher average indicates that a competitor played a successful game.
In this plot, there appears to be a positive trend between the survival score and challenge win percentage, indicating that those who were successful in challenges also played a strong game overall. These results suggest that successful Survivor players tend to have strong athletic and intellectual abilities, two components that are tested in individual challenges. However, it is important to note that this trend is not linear. The variations in the plot reflect the overall randomness of Survivor. There is no single factor that we can use to judge a successful player. For instance, there are many players who won zero immunity challenges but still ended the season with a decent survival score. Although their scores were never as high as those of players who won multiple challenges, many were still in the 5–10 range. Thus, lack of success in individual challenges does not automatically mean a contestant cannot have a successful season. We can search the data for trends, and some may stand out, but ultimately the decisions of the game lie in the hands of the contestants. While some outcomes may be more likely than others, we cannot truly and accurately predict every aspect of a game like Survivor.
Is winning team challenges more important than winning individual challenges?
Since there are both team and individual challenges in Survivor, we wanted to investigate which type of challenge was better at determining a contestant’s placement in the competition.
In the barplot above, the team challenge wins of the top 9 contestants are roughly the same on average, at around 5 to 6 wins. After that, the total team wins of those who placed lower than 9th (10th place and beyond) gradually decrease. Individual challenge wins, on the other hand, decrease gradually starting immediately after 1st place, whereas team challenge wins remain relatively stable until 9th place. This can be explained by many factors, such as the top 9 winning together as a team to survive, meaning they would have a similar number of wins.
However, this bar graph does not show how effective individual or team wins are at predicting placement. To answer this question, we did a logistic regression to see the importance of each feature. The results of the logistic regression can be seen in the figures below.
As seen in the confusion matrix, there were a lot of false positives, meaning that the model predicted a contestant would place top 3 when in reality they didn’t. One possible explanation is that when tuning the parameters of the logistic regression, the scoring method used was F1, the harmonic mean of precision and recall, which balances false positives against false negatives rather than minimizing false positives alone. Another issue we addressed was the imbalanced data, since only 120 out of the 731 contestants placed top 3. A stratified split was done to ensure that the proportion of contestants who placed top 3 was approximately the same in the training and testing sets, but it didn’t improve the model’s performance. Although a hypothesis test for significance wasn’t conducted, InChW (individual challenge wins) had a higher coefficient than team challenge wins, suggesting that individual challenge wins are probably a better predictor of a top 3 placement than team challenge wins.
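For readers who want to reproduce the general workflow, here is a minimal sketch of a stratified split followed by F1-scored parameter tuning for a logistic regression. It assumes scikit-learn. The data is synthetic: only the class balance (120 of 731) comes from this article, while the predictor distributions, the grid of C values, and the test size are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(4)
n = 731
top3 = np.zeros(n, dtype=int)
top3[:120] = 1            # same class balance as in the article: 120 of 731
rng.shuffle(top3)

# Synthetic predictors (hypothetical): individual and team challenge wins.
individual_wins = rng.poisson(1.0 + 1.5 * top3)
team_wins = rng.poisson(4.0 + 0.5 * top3)
X = np.column_stack([individual_wins, team_wins]).astype(float)

# stratify keeps the top-3 proportion about equal in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, top3, test_size=0.25, stratify=top3, random_state=0)

# Tune the regularization strength C, scoring each candidate by F1.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
    scoring="f1",
    cv=5)
search.fit(X_train, y_train)

print(confusion_matrix(y_test, search.predict(X_test)))
coefs = dict(zip(["individual_wins", "team_wins"], search.best_estimator_.coef_[0]))
print(coefs)
```

When comparing coefficients, keep in mind that raw logistic regression coefficients depend on each feature's scale, so standardizing the features first makes a side-by-side comparison easier to interpret.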