Boom or Bust: Notable Factors in Predicting College Players’ NFL Production
Authors: Nikhil Dewitt (Project Lead), Aadi Malaviya, Satvik Lakamsani
Every April, NFL teams gather to spend a limited number of draft picks on players they think can help them win games. While some players are drafted to be the final piece of the puzzle, others are viewed less as finished products and more as long-term developmental pieces. Despite investing heavily in scouting and testing to project how a draft prospect will turn out, teams quite often end up wasting prime draft picks on players who are either mediocre or simply not ready for the NFL. This article looks at several metrics and examines how well they align with NFL production.
Predicting Whether a Quarterback will Bust
For our first point of analysis, we focused on quarterbacks, since in the modern NFL they are the primary factor in how well a team does, and even mediocre quarterback play can put a low ceiling on a team.
First, we set out to define what makes a quarterback not a bust. We focused on quarterbacks drafted in the first two rounds, as there’s too much randomness after that to draw meaningful conclusions. While players drafted earlier generally have the ability to showcase their talent if they have it, players drafted later often don’t get enough playing time to even show whether they can contribute.
We defined the following criteria to classify whether a QB is a bust. (Note: these criteria target baseline production rather than MVP-caliber play, since very few prospects would be called busts merely for falling short of an MVP level.) To be considered a “success”, a quarterback must meet at least three of these five criteria within their first five years:
- ≥ 50 NFL starts
- ≥ 50% win percentage
- ≥ 85 passer rating
- ≥ 7.0 yards per attempt
- ≥ 7.0 adjusted yards per attempt
Whether a quarterback is considered a bust or not is debatable when it comes to certain players. We knew we wanted to incorporate a wide variety of metrics and statistics while ensuring that a quarterback didn’t have to meet every single metric.
We started with two criteria showing that the quarterback was contributing to winning: a substantial number of starts (since struggling quarterbacks are typically benched well before they reach 50) and a win percentage of at least 50%. We also required a passer rating of at least 85, since a quarterback below that mark after five years is typically out of the league. Finally, we included two efficiency criteria, yards per attempt and adjusted yards per attempt, to screen out players who got far more playing time than their ability warranted, largely because their team needed a cheap system quarterback.
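The three-of-five rule can be written down directly. The sketch below is illustrative only: the `QBFiveYearLine` class, its field names, and the two stat lines are hypothetical, and win percentage is assumed to be wins divided by starts (ties ignored).

```python
from dataclasses import dataclass


@dataclass
class QBFiveYearLine:
    """First-five-years totals for one quarterback (hypothetical field names)."""
    starts: int
    wins: int
    passer_rating: float
    yards_per_attempt: float
    adjusted_yards_per_attempt: float


def criteria_met(qb: QBFiveYearLine) -> int:
    """Count how many of the five success criteria the quarterback satisfies."""
    # Assumption: win percentage = wins / starts, with ties ignored.
    win_pct = qb.wins / qb.starts if qb.starts else 0.0
    checks = [
        qb.starts >= 50,
        win_pct >= 0.50,
        qb.passer_rating >= 85.0,
        qb.yards_per_attempt >= 7.0,
        qb.adjusted_yards_per_attempt >= 7.0,
    ]
    return sum(checks)


def is_success(qb: QBFiveYearLine, required: int = 3) -> bool:
    """A quarterback is a success if at least `required` criteria are met."""
    return criteria_met(qb) >= required


# Made-up stat lines, not real players.
starter = QBFiveYearLine(starts=62, wins=33, passer_rating=86.4,
                         yards_per_attempt=6.8, adjusted_yards_per_attempt=6.6)
benched = QBFiveYearLine(starts=20, wins=7, passer_rating=74.0,
                         yards_per_attempt=6.1, adjusted_yards_per_attempt=5.2)

for name, qb in [("starter", starter), ("benched", benched)]:
    label = "success" if is_success(qb) else "bust"
    print(f"{name}: {criteria_met(qb)} of 5 criteria met, labeled {label}")
```

Keeping the threshold as a parameter makes it easy to see how many borderline quarterbacks flip labels if the rule is tightened to four of five.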
Next, we trained a simple random forest classifier to predict whether a quarterback was a bust or not. Initially, we used ten features in our model: height, weight, QB rating, completion percentage, passes attempted, passes completed, yards per attempt, interceptions, touchdowns, and passing yards. Ultimately, however, we included only five of them in our feature-importance visualization, since several of the ten reflected the same underlying characteristic.
Mutual information is a feature selection measure that quantifies how much knowing one variable reduces uncertainty about another. Here, it measures how much knowing a single feature (let’s say height) tells us about whether a quarterback busts; a score of zero means the feature and the outcome are independent. Random forest importance, by contrast, reflects how much the splits made on a feature reduce impurity across the many trees in the trained model, so features that are used often and in effective splits score higher. One known flaw of random forest importance is that it tends to favor features with many distinct values. Because our quarterback features are all numeric, we expect this bias to be limited, though it may not be eliminated entirely.
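To make the mutual information score concrete, here is a minimal sketch that computes it from counts for discrete variables. The eight-player dataset and the `high_volume` and `tall` flags are invented for illustration and are not the data used in this article; continuous features such as passing yards would need to be binned first or handled by a library estimator such as scikit-learn's `mutual_info_classif`.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y
    total = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        total += p_xy * log2(c * n / (count_x[x] * count_y[y]))
    return total


# Toy data: 1 = success, 0 = bust (made up).
outcome     = [1, 1, 1, 1, 0, 0, 0, 0]
high_volume = [1, 1, 1, 1, 0, 0, 0, 0]  # lines up perfectly with the outcome
tall        = [1, 1, 0, 0, 1, 1, 0, 0]  # unrelated to the outcome

mi_volume = mutual_information(high_volume, outcome)
mi_height = mutual_information(tall, outcome)
print(f"high_volume: {mi_volume:.3f} bits, tall: {mi_height:.3f} bits")
```

In this toy setup the volume flag carries a full bit of information about the outcome, while the height flag carries none, which is the kind of contrast the importance chart is meant to surface.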
The above visualization illustrates the mutual information and random forest importances for the five selected features. Passing yards is clearly the strongest indicator that a player won’t bust. This makes sense, since most prospects with low passing volume have very low floors and are more often drafted for a ceiling they have a low chance of reaching. By contrast, prospects who consistently pass the ball a lot but may not have the athleticism for an MVP ceiling are far more likely to contribute to the league in some way.
Next, we decided to examine how strongly draft position is correlated with Pro Bowl selections and how that correlation compares across positions.
Which Positions Provide the Most Value?
Initially, we considered using draft position relative to other players at the same position. However, we ultimately went with absolute draft position (the overall pick at which the player was selected). If a quarterback is the first one selected in a draft with no star talent at the position, he doesn’t carry the same expectations as one drafted much earlier overall.
We conducted a simple correlation analysis between draft position and Pro Bowl selections. The original dataset included additional positions: we removed fullbacks, which are not that relevant in modern football, and merged nose tackles with defensive tackles. In addition, we focused on players drafted in 2000 or later because scouting wasn’t as effective in the early days of the NFL. (Note: since a lower pick number means an earlier selection, and earlier picks are considered “better prospects”, the raw correlations were negative; the plot depicts their absolute values.)
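A minimal sketch of this per-position correlation follows, assuming a Pearson correlation between overall pick number and Pro Bowl selections. The rows, the tuple layout, and the helper names are hypothetical; the fullback removal, nose tackle merge, and 2000 cutoff mirror the filtering described above.

```python
from collections import defaultdict
from math import sqrt

MIN_DRAFT_YEAR = 2000             # keep players drafted in 2000 or later
DROPPED_POSITIONS = {"FB"}        # fullbacks are removed
POSITION_ALIASES = {"NT": "DT"}   # nose tackles are merged into defensive tackles


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def abs_correlation_by_position(rows):
    """Map each position to |corr(overall pick, Pro Bowl selections)|."""
    picks, bowls = defaultdict(list), defaultdict(list)
    for year, position, pick, pro_bowls in rows:
        position = POSITION_ALIASES.get(position, position)
        if year < MIN_DRAFT_YEAR or position in DROPPED_POSITIONS:
            continue
        picks[position].append(pick)
        bowls[position].append(pro_bowls)
    # Earlier picks have lower numbers, so the raw correlation is usually
    # negative; the absolute value is what gets plotted.
    return {pos: abs(pearson(picks[pos], bowls[pos])) for pos in picks}


# Made-up rows: (draft year, position, overall pick, Pro Bowl selections).
players = [
    (1995, "G", 5, 6),  # dropped: drafted before 2000
    (2005, "G", 10, 4), (2010, "G", 40, 3), (2012, "G", 100, 1), (2015, "G", 200, 0),
    (2003, "P", 150, 0), (2008, "P", 180, 2), (2011, "P", 210, 2), (2016, "P", 240, 0),
    (2004, "FB", 120, 1),
    (2006, "NT", 30, 2), (2009, "DT", 15, 3), (2013, "DT", 90, 0),
]

result = abs_correlation_by_position(players)
for position, value in sorted(result.items(), key=lambda kv: -kv[1]):
    print(f"{position}: {value:.2f}")
```

A rank-based measure such as Spearman correlation would be a reasonable alternative here, since Pro Bowl counts are skewed, but that is a design choice beyond what the article describes.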
Guards had the highest correlation between draft position and success. Running backs (RB) also had a strong correlation between how high they were drafted and their performance in the NFL. This makes sense because physical traits that are noticeable by scouts play a big role in success at both of these positions. Unsurprisingly, very few punters (P) were drafted and their draft position had little correlation to their outcomes. Despite being among the highest paid positions, wide receivers (WR) and quarterbacks (QB) don’t have the strongest correlations between absolute draft position and Pro Bowl selections. For quarterbacks in particular, many prospects who aren’t actually that qualified get selected high because teams are always looking for their “quarterback of the future”. In addition, many wide receivers may look better or worse than they actually are due to stellar or poor college QB play, or their Pro Bowl selections could be tied to having a great quarterback throwing to them.
A Look at the Combine
Finally, we conducted some analysis on the NFL Combine. Every March, players participate in a wide variety of drills and measurements to assess their physical traits and athleticism, and in some cases, players can move up and down draft boards a lot based on one measurement. Here, we aim to look at which physical metrics have the largest and smallest correlations with draft position.
To understand how much each measurement matters at each position, we trained a simple random forest model and then extracted each feature’s importance.
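One way such a per-position analysis could be set up is sketched below with scikit-learn. This is not the model used in the article: the data is synthetic, the three feature names are placeholders, and using overall pick number as the regression target is our assumption for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["height", "weight", "forty_yard_dash"]  # placeholder feature names


def importances_by_position(X, picks, positions):
    """Fit one random forest per position and return its feature importances."""
    result = {}
    for pos in np.unique(positions):
        mask = positions == pos
        model = RandomForestRegressor(n_estimators=100, random_state=0)
        model.fit(X[mask], picks[mask])
        result[str(pos)] = {
            name: float(score)
            for name, score in zip(FEATURES, model.feature_importances_)
        }
    return result


# Synthetic, standardized measurements. By construction, pick number depends
# on height for quarterbacks and on weight for centers.
rng = np.random.default_rng(0)
n = 200
positions = np.array(["QB"] * n + ["C"] * n)
X = rng.normal(size=(2 * n, len(FEATURES)))
noise = rng.normal(scale=2.0, size=2 * n)
picks = np.where(positions == "QB",
                 100 - 40 * X[:, 0],
                 100 - 40 * X[:, 1]) + noise

importances = importances_by_position(X, picks, positions)
for pos, scores in importances.items():
    top = max(scores, key=scores.get)
    print(f"{pos}: most important feature is {top}")
```

Fitting a separate model per position keeps the importances comparable within a position, at the cost of smaller training sets for rarely drafted positions.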
Height didn’t matter much for most prospects, except for quarterbacks. Short quarterbacks continue to get dinged for not being able to see over defenses and hence getting sacked constantly, so this makes sense. Weight was considered far more important across all positions. For some speed-based drills, there was little correlation at all for bulkier positions like linemen and centers. A center (C) running a fast 40-yard dash may make the news, but it is not going to do much for his draft stock. However, most drills had uniform importance across the spectrum of positions, showing that athleticism correlates well with high draft position for almost every prospect.
Conclusion
This article looks at a few different metrics across both college production and combine testing to see if there are meaningful correlations. While there are some noticeable patterns, there is also a lot of variance, despite teams spending more on scouting than ever. This goes to show the high variability in players’ outcomes; many players look statistically fine but simply don’t have the intangibles to make it in the league.
Despite having access to more data and metrics than ever, teams still struggle to figure out which prospects will succeed in the NFL and which ones won’t. In addition, general managers continue to get blinded by athleticism that most models show won’t translate, and they draft these raw talents anyway. Many busts had either character or injury issues that simply didn’t emerge before they got drafted.