SB Nation college football editor Jason Kirk asked me to write about strength of schedule, and I’ll be honest: it pissed me off. Nothing personal, of course. We just all have our triggers.
I hate strength of schedule arguments more than I hate red zone fade routes, five-hour football games, and that this dumbass sport I love continues to get away with a bunch of nonsense because of a cynical, decades-old definition of “student athlete.”
I hate these arguments partly because of the contradiction. Let me explain.
In my work with advanced stats, the S&P+ ratings, and so forth, I field certain categories of insult pretty frequently. Here are two of the most common:
- “Watch the games, nerd!” The insinuation: if you’d get your nose out of the spreadsheets, dork, you’d see that Team A is clearly better than Team B.
- “Team A ain’t played nobody!” The insinuation: How could Team A possibly be any good when they haven’t played a team that clears someone’s arbitrary bar of goodness?
“You saw what happened in the games I watched, right?” the first one says. The second says to watch one particular game and ignore the others.
Football’s going to have a sample size issue no matter what, so we should milk whatever meaning we can out of every game we’ve got.
Arguing about strength of schedule the wrong way means limiting the sample even further and acting like we can’t get meaning out of every play of every game. We can. Honest.
Strength of schedule is at the heart of virtually every college football argument between October and January each year. Hell, it’s a point of debate for every college sport. The schedules are too varied and not comprehensive enough. That the power conferences are moving toward three-game non-conference schedules in football makes the lack of national connectivity even worse.
At the pro level, there are strong and weak divisions, sure. But the schedules are still infinitely more balanced. In MLB, everybody plays everybody in their league at least a few times, with some cross-league games. In the NBA and NHL, everybody plays everybody at least once. There is connectivity.
With 130 FBS teams and 12 games, that simply isn’t an option for college football. So we play who we play, and we yell about who our rivals ain’t played.
How strength of schedule determines the national champion, sort of
Arguments are one thing, but college football’s national title is more directly affected by strength of schedule than that of any other major sport. It’s baked right into the College Football Playoff selection process.
When circumstances at the margins indicate that teams are comparable, then the following criteria must be considered:
* Championships won
* Strength of schedule
* Head-to-head competition (if it occurred)
* Comparative outcomes of common opponents (without incenting margin of victory)
We believe that a committee of experts properly instructed (based on beliefs that the regular season is unique and must be preserved; and that championships won on the field and strength of schedule are important values that must be incorporated into the selection process) has very strong support throughout the college football community.
(A digression, but they should clarify that they’re looking at conference championships. A fan could read that and assume Alabama’s going to get permanent preferential treatment for its 114 claimed national titles.)
It is decreed that the committee takes strength of schedule into account, but it intentionally doesn’t say how. It frowns on advanced analytics — from the same rules document: “Nuanced mathematical formulas ignore some teams who ‘deserve’ to be selected;” God forbid — and gives no alternative guidance. So the committee ends up going with things like “wins over top-25 teams” or “assuring there’s no way in hell a team from a Group of 5 conference will ever get in.”
By now, though, some are figuring out pretty clearly how strength of schedule is taken into account.
ESPN’s stats team has created both forward- and backward-looking measures to assess the difference between the “best” teams and those typically determined most deserving of a title shot. As it turns out, the backwards view is pretty effective at mirroring committee action. The Strength of Record measure has proven quite adept at figuring out how the committee will rank teams.
Everything that goes into FPI’s SOS rankings — opponent strength, game site, distance traveled and rest — is used to capture the difficulty of a team’s schedule. Thus, amassing a number of “good wins,” no matter how the game was won, will boost a team’s Strength of Record.
Despite the committee’s mantra of selecting the “four best teams in the country,” it appears that in the first two years of playoff selection, the committee favored team accomplishment over team strength. So if you are trying to predict what the committee will do, take a look at strength of record, because seven of eight teams to make the playoff ranked in the top four of that metric before playoff selection. Then FPI can be used to predict which teams will ultimately come out on top.
The committee insists it is looking for the “best” team. It is not. Kirby Hocutt, chairman of the CFP committee as of 2016, has conflated “best” and “most deserving” on a number of occasions. An example:
Q: Are you looking for the four best teams or the four most-deserving teams? Is there a difference?
A: You have to take into account the entire season. The season doesn’t start in October. Everybody has 12 regular-season opportunities, and the committee is watching. At the end of the year, we want to make sure we have the four very best teams over the course of the entire season.
They do not. And that’s fine, I guess.
Two major problems, however: a one-point win is not a 30-point win, and you don’t have to wait till someone plays a good team to start learning.
Take these two pieces as an example.
“Are they tested?” is just a box you check. While you can certainly find examples of teams that look great against awful teams, then stumble when punched in the mouth for the first time (Louisville), you can usually glean just as much from how a team dominates bad competition as from how it plays against really good teams. Picking Team A simply because it is more tested than Team B is usually a recipe for making bad picks.
The early-season stats suggested that, despite not playing a top team yet, Florida State was pretty incredible. The Seminoles went out and left no doubt on Saturday night in Clemson.
The best teams, the most likely championship teams, are the ones that handle their business early and put games out of reach before luck, chance, fumbles, and vital offensive pass interference calls can impact the outcome.
According to the F/+ rankings, the Seminoles have been just barely good enough to survive No. 9 Clemson at home (without Winston), No. 15 Louisville on the road, and No. 19 Notre Dame at home. They survived No. 44 Oklahoma State on a neutral field, and they pulled away from No. 53 NC State in the fourth quarter. They pummeled No. 76 Syracuse and eventually got around to doing the same to No. 89 Wake Forest.
They have, in other words, solidified that they should be ranked around seventh to 12th in these ratings. They were 11th heading into this week. […] Without sustained improvement, and without the ability to play a full 60 minutes at a high level, they will in no way be a favorite to beat two top-four teams in the College Football Playoff and win the national title.
The 2013 team that hadn’t played nobody, but that was destroying its opponents, went on to win the national title.
The 2014 team that was winning, but not impressing the numbers, eked out a Playoff bid with seven one-possession wins (five against teams that went 8-5 or worse) but got embarrassed once it got there.
To its credit, the CFP committee did dock FSU a bit for its lackluster performance. The unbeaten Noles were third in the rankings behind two one-loss teams. But these two FSU teams are ultimate examples for this simple truism:
You can learn something from every game, if you try.
That’s the point of using advanced stats in the first place, be it S&P+ or any other flavor. You set the baseline depending on the opponent(s) at hand, and you compare the actual output to that adjusted expectation. It fills in what your eyes are missing. (And with 800-plus college football games in a season, your eyes are always missing something.)
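As a rough illustration of "compare the actual output to that adjusted expectation," here is a minimal sketch. The function name, the games, and every number in it are made up for illustration; real opponent adjustments in S&P+ or any other system are far more involved than a single subtraction.

```python
def adjusted_performance(actual_margin, baseline_margin):
    """Points above (+) or below (-) the opponent-adjusted expectation."""
    return actual_margin - baseline_margin


# Hypothetical games: (opponent, actual scoring margin, baseline margin).
# The baseline is what we would expect against that opponent; these values
# are invented for the example, not taken from any rating system.
games = [
    ("weak opponent", 45, 24),
    ("strong opponent", 3, -4),
]

for opponent, actual, baseline in games:
    delta = adjusted_performance(actual, baseline)
    print(f"{opponent}: {delta:+d} points vs. expectation")
```

In this toy case, the blowout of the weak opponent says more (21 points above expectation) than the close win over the strong one (7 points above). Neither game is thrown away.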
Your record does matter. Even as an advanced stats loyalist, I’m not exactly going to call for a three-loss team to get a CFP spot, even if said team was unlucky in every loss and ranks first in S&P+. Wins and losses aren’t particularly predictive in and of themselves, but they still have to mean something. Even the best team shouldn’t get in if it’s not high on the most-deserving list.
So what if we actually tried to combine the two worlds? What if we used a “best” measure to begin approximating what “most deserving” truly is?
What if we took the Strength of Record idea and added an extra level of team quality to it?
Introducing Résumé S&P+.
I’m going to introduce my takes on two pretty familiar concepts.
There are countless ways to measure one’s strength of schedule, but I’m going to choose the one most directly tied to the national title race. Makes sense, since “strength of schedule” is right there in the mission statement.
Below are each FBS team’s rankings in three categories:
- S&P+, an overall team efficiency rating system you can read more about here. It can be used to predict wins and losses going forward.
- Strength of Schedule (SOS), which amounts to how well the average top-five team (according to S&P+) would fare, in terms of win percentage, against your schedule. The lower the number, the harder the schedule.
- Résumé S&P+, which looks at a team’s season scoring margin and compares it to what the average top-five team’s scoring margin would likely be against the schedule at hand. If the number is positive (and for most, it won’t be), that means said team is faring better than the typical top-five team. Instead of any advanced stats or win probability info, I’m adhering strictly to actual margins.
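For the curious, the SOS and Résumé S&P+ definitions above can be sketched in a few lines. This is not the actual S&P+ machinery. It assumes ratings are in points, that an expected margin is just the difference between two ratings, that win probability follows a normal curve, and that margins are averaged per game. The 17-point spread constant and all function names are hypothetical.

```python
from math import erf, sqrt
from statistics import mean

# Assumption: how widely actual margins scatter around the expected margin.
MARGIN_STD_DEV = 17.0


def expected_margin(team_rating, opp_rating):
    """Assumed model: expected scoring margin is the rating difference."""
    return team_rating - opp_rating


def win_probability(exp_margin, std_dev=MARGIN_STD_DEV):
    """Chance of winning under a normal curve centered on exp_margin."""
    return 0.5 * (1.0 + erf(exp_margin / (std_dev * sqrt(2.0))))


def strength_of_schedule(top5_rating, opp_ratings):
    """Average top-five team's expected win percentage against a schedule.

    Lower means harder.
    """
    return mean([win_probability(expected_margin(top5_rating, r))
                 for r in opp_ratings])


def resume_rating(actual_margins, top5_rating, opp_ratings):
    """Actual per-game margin minus the average top-five team's expected one."""
    baseline = mean([expected_margin(top5_rating, r) for r in opp_ratings])
    return mean(actual_margins) - baseline


# Made-up numbers, purely for illustration.
top5 = 20.0
opponents = [5.0, -10.0, 0.0]
margins = [21, 35, 10]
print(round(strength_of_schedule(top5, opponents), 3))
print(round(resume_rating(margins, top5, opponents), 2))
```

A positive result from `resume_rating` means the team outscored its schedule by more than the average top-five team would be expected to. A negative result means it fell short of that bar.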
To use the current top Résumé S&P+ team as an example, let’s look at Alabama.
The Crimson Tide are second in overall S&P+ right now. Their schedule strength ranks just 82nd; if the average top-five team played Alabama’s previous opponents countless times, it would win about 86 percent of those games.
Compare that to the schedule of someone like Maryland. The average top-five team would have a win percentage of only about 74 percent against that slate.
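To put those two percentages in more familiar terms, here is a back-of-the-envelope conversion to expected wins. The SOS figures cover only games played so far, so stretching them across a full 12-game regular season is an assumption, and the function name is hypothetical.

```python
REGULAR_SEASON_GAMES = 12  # assumption: apply the rate to a full 12-game slate


def expected_wins(sos_win_pct, games=REGULAR_SEASON_GAMES):
    """Wins the average top-five team would expect at this win rate."""
    return sos_win_pct * games


alabama_slate = expected_wins(0.86)   # about 86 percent, per the text
maryland_slate = expected_wins(0.74)  # about 74 percent, per the text

print(round(alabama_slate, 2), round(maryland_slate, 2))
print(round(alabama_slate - maryland_slate, 2))
```

Under that assumption, the gap between the two schedules works out to nearly a game and a half over a season for the same hypothetical top-five team.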
Résumé S&P+, however, shows us that when it comes to actual performance — actual points scored and allowed — the Crimson Tide have still laid waste to this schedule at a clip far greater than the average top team would have. That earns them a place atop this list for now. (This is a wide table, because it has to be; if it’s not showing well on your phone, consider taking a look on another device later.)
Teams won’t always be in the order we’d expect. Notre Dame has played well against a harder overall schedule than Georgia has, and that’s reflected in the SOS and Résumé S&P+ rankings. However, S&P+ would pick UGA to beat ND if they were to play again right now, as reflected in the S&P+ rankings.
|Team|Record|S&P+|S&P+ rank|SOS|SOS rank|Résumé S&P+|Résumé S&P+ rank|
|---|---|---|---|---|---|---|---|
|San Diego State|7-2|1.8|59|85.3%|78|-14.2|53|
|New Mexico State|3-5|-1.5|78|87.5%|103|-24.9|98|
|San Jose State|1-8|-15.6|126|85.9%|85|-44.7|130|
As of now, 13 teams have positive Résumé S&P+ ratings. That’s more than one would expect if you’re comparing performances to that of an average top-five team. But here’s where I remind you that the average top-five team at the moment is not nearly as good as it has been in previous years. If the sport’s top tier begins distancing itself from the field, a few positive averages will turn negative.
By the way, to those who want to moan about incentivizing running up the score in this measure (since we’re using scoring margins), suck it up.
Margin of victory is infinitely more informative than “did they win?” It just is. Besides, it’s odd to suddenly care about hurt feelings when I’m pretty sure telling half of FBS they don’t have a shot at the national title no matter how well they play is more hurtful to those feelings than winning 59-0 instead of 49-0.
Best versus most deserving. How you’ve played versus who you’ve played. Maybe there’s a way to tie together these worlds after all.
I’m still mad at Jason, though.