(Warning: a number-heavy post lies ahead.)
When it comes to college football, everyone has an opinion on the most effective way to win games. Some think a tenacious rush defense is the best way to come out victorious, while others believe that a full-on aerial assault is the superior way to go. Yet for all the opinions, very little research has been done on the subject.
If you remember back to last summer, I briefly imitated the model used by Brian Burke of AdvancedNFLStats.com and came up with simple correlations to determine the best efficiency numbers (i.e. per attempt rather than per game numbers) for measuring college football teams. At that point in time, I concluded that offensive/defensive pass yards per attempt, offensive/defensive rush yards per attempt, and offensive/defensive turnover percentage were the best measures of success. However, I have since updated the data with 2012's numbers and also redefined the previous statistics to include more variables, giving a better indication of how teams win. There are 723 observations in the analysis.
First, I'll explain the change in how the variables are measured. Previously, pass yards per attempt was defined as total pass yards divided by passing attempts. However, a very important part of the passing game was left out of this traditional calculation: sacks. Since sacks are undeniably a part of the passing game, I've added them to the calculation on both sides of the ball. This new statistic, with sacks and sack yards included, is called true pass yards per attempt, and it individually correlates more strongly with team success than its basic pass yards per attempt counterpart. The second change to the original data came with offensive/defensive turnover percentage. Whereas the previous model measured it as total turnovers divided by total drives, the new model measures it as total turnovers divided by total plays. I did this because team drive data isn't available for years before 2011. The new measure also correlates slightly more strongly. Now that we've gone over the new definitions, here's the correlation summary for all six stats when measured against winning percentage.
Keep in mind that the closer a value is to 1 or -1, the stronger the correlation.
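For anyone who wants to tinker with their own numbers, here's a minimal Python sketch of how the inputs to a summary like this could be built. Two things are assumptions of the sketch rather than anything spelled out above: true pass yards per attempt is computed as (pass yards minus sack yards) divided by (pass attempts plus sacks), and turnover percentage as turnovers per 100 plays. The team totals are made up purely for illustration.

```python
import math


def true_pass_ypa(pass_yards, pass_attempts, sacks, sack_yards):
    """Pass yards per attempt with sacks folded in (assumed formula)."""
    dropbacks = pass_attempts + sacks
    if dropbacks == 0:
        raise ValueError("need at least one pass attempt or sack")
    return (pass_yards - sack_yards) / dropbacks


def turnover_pct(turnovers, plays):
    """Turnovers per 100 plays (assumed definition)."""
    if plays == 0:
        raise ValueError("need at least one play")
    return 100.0 * turnovers / plays


def pearson_r(xs, ys):
    """Pearson correlation coefficient, always between -1 and 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up season totals: (pass yards, attempts, sacks, sack yards, win pct)
teams = [
    (2400, 380, 35, 240, 0.25),
    (2900, 400, 28, 190, 0.42),
    (3100, 410, 25, 170, 0.50),
    (3400, 420, 20, 130, 0.67),
    (3900, 440, 15, 100, 0.83),
]
off_tpypa = [true_pass_ypa(y, a, s, sy) for y, a, s, sy, _ in teams]
win_pct = [t[4] for t in teams]
r = pearson_r(off_tpypa, win_pct)
print(f"Off. True PYPA vs. Win%: r = {r:.3f}")
```

Running the same function on a defensive stat such as yards allowed per attempt would be expected to give a negative value, since allowing more yards tends to go with fewer wins.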
The numbers above suggest that defense, in all three categories, is more important than offense. Also, if you rank the categories by strength of correlation, passing comes out on top, followed by rushing and turnover percentage. While these numbers give a good baseline for how teams win, simple correlations look at one statistic at a time and don't hold the others constant. To correct this, I've run a regression with all six variables to find out which are the most important when all else is held constant.
Below is the chart containing the regression coefficients along with the R-squared value.
First, every single independent variable in the regression is statistically significant at the 1% level.
The R-squared value is high, indicating that 76.8% of the variance in team winning percentage can be explained by the above variables. Naturally, there are other factors which help explain how teams win, but this model accounts for a large portion of team winning percentage. With the above coefficients, we can create a linear model to predict a team's winning percentage.
Win%= .520 + (.076 * Off. True PYPA) + (-.070 * Def. True PYPA) + (.069 * Def. TO%) + ...
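To make "all things held constant" a bit more concrete, here's a rough sketch of how a regression like this could be fit from scratch in Python by solving the normal equations. It isn't meant to reproduce the numbers above: the six sample rows are invented so that winning percentage follows an exact line in two predictors, which makes it easy to check that the fit recovers the coefficients it was built from. With real data the fit wouldn't be exact and R-squared would fall below 1.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, ys):
    """Return [intercept, b1, ..., bk] minimizing the squared error."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, ys, coefs):
    """Share of the variance in ys explained by the fitted line."""
    preds = [coefs[0] + sum(b * x for b, x in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when ys are constant")
    return 1 - ss_res / ss_tot


# Invented data: win% = 0.50 + 0.08 * off_tpypa - 0.07 * def_tpypa exactly
rows = [(6.0, 6.0), (7.0, 5.0), (5.0, 7.0), (8.0, 6.0), (6.0, 4.0), (7.0, 7.0)]
ys = [0.50 + 0.08 * o - 0.07 * d for o, d in rows]
coefs = fit_ols(rows, ys)
print([round(c, 3) for c in coefs], round(r_squared(rows, ys, coefs), 3))
```

Each fitted coefficient answers the question "what happens to winning percentage if this one stat moves by one unit and the others stay put," which is exactly what the raw correlations couldn't say.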
While the above model is a decent fit, we need to take it one step further and standardize the coefficients.
Why? Because variables like offensive rush yards per attempt and offensive turnover percentage aren't measured on the same scale. To get standardized values, Excel's STANDARDIZE function is used to convert the data into z-scores (how many standard deviations above or below average a data point is). The chart below contains those values and reveals the true relative importance of the statistics.
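As a rough illustration of the z-score step, here's a Python equivalent of feeding each value through Excel's STANDARDIZE along with the column's mean and standard deviation. The rush-yards-per-attempt values are made up, and the use of the sample standard deviation is an assumption of the sketch, since either version could be passed to STANDARDIZE.

```python
import statistics


def standardize(values):
    """Convert raw values to z-scores: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption of this sketch
    if sd == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - mean) / sd for v in values]


# Made-up offensive rush yards per attempt for five teams
rush_ypa = [3.2, 4.0, 4.6, 5.1, 5.6]
z_scores = standardize(rush_ypa)
for raw, z in zip(rush_ypa, z_scores):
    print(f"{raw:.1f} yards per rush -> z = {z:+.2f}")
```

Once every predictor is on this scale, a coefficient reads as the change in winning percentage for a one-standard-deviation move in that stat, so the coefficients can be compared with each other directly.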
From this model, once all things are held constant, we can conclude that efficient passing on offense is the most meaningful variable when measuring against winning percentage.
If a team finds itself one standard deviation above average in Off. True PYPA while being average in all other categories, it would be expected to increase its winning percentage by .081 points. The same technique is used to evaluate the other coefficients as well. For example, if a team rushed the ball one standard deviation better than average while being average in all other categories, it would be expected to increase its winning percentage by .031 points. Knowing that passing offense and pass defense are the most important factors when it comes to winning percentage, it should benefit teams to favor the pass more in play calling and recruiting.
This doesn't mean teams should abandon their current strategies. After all, some teams are very successful with turnovers and rushing. But according to this model, the passing game carries more relative importance.
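To put those standardized coefficients in more familiar terms, here's a quick back-of-the-envelope sketch that converts them into wins. The 12-game season is an assumption for illustration, and the two effect sizes are the .081 and .031 quoted above.

```python
OFF_PASS_EFFECT = 0.081  # win% change per +1 SD of Off. True PYPA (quoted above)
OFF_RUSH_EFFECT = 0.031  # win% change per +1 SD of rushing (quoted above)
GAMES = 12  # assumed season length, for illustration only


def extra_wins(effect_per_sd, sds_above_average=1.0, games=GAMES):
    """Expected change in wins, with every other stat held at average."""
    return effect_per_sd * sds_above_average * games


pass_wins = extra_wins(OFF_PASS_EFFECT)
rush_wins = extra_wins(OFF_RUSH_EFFECT)
ratio = OFF_PASS_EFFECT / OFF_RUSH_EFFECT
print(f"+1 SD passing: {pass_wins:.2f} wins, +1 SD rushing: {rush_wins:.2f} wins")
print(f"passing effect is about {ratio:.1f}x the rushing effect")
```

Under that assumption, a one-standard-deviation edge in passing efficiency is worth roughly one extra win per season, against a bit over a third of a win for the same edge in rushing.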
What does this mean for Kentucky going forward? As I'm sure you're aware by now, Mark Stoops has brought in one of the nation's top offensive minds to manage the offense. In 2012, Neal Brown's Texas Tech offense was a model of efficiency, ranked 13th in total offense. They averaged 7.3 true yards per passing attempt (28th nationally), 4.56 yards per rush (48th), and lost the ball on 2.4% of plays (68th). While their ability to rush and hold onto the ball wasn't elite, they still had one of the nation's best offensive attacks because of their passing efficiency. In the future, when talent has increased, Mark Stoops should have the Cats toward the upper half of the FBS, as his Seminole defense was ranked 2nd nationally in total defense. This was accomplished by allowing 4.1 true yards per passing attempt (1st nationally), 2.74 yards per rush (4th), and forcing turnovers on 2.4% of plays (68th). Once talent is up to par, Wildcat fans should feel confident going forward, as Stoops and Brown do the most important things well on both sides of the ball.