It's the first week of January, which means it's Super Bowl prediction time! Last year, I unveiled the second version of a mathematical model (model V2.0) for predicting which two of the 12 playoff teams would represent their conferences in the Super Bowl. While it was only half-right when predicting Super Bowl LI participants (it was correct on the Patriots, but forecast the wrong NFC birds of prey -- the Seahawks, not the Falcons), a slight modification of the model (model V2.0m) was right on target for predicting the Patriots' (record-setting) six-point victory.
Including the correct Super Bowl LI prediction, model V2.0m has correctly predicted the winner in 69 percent (38 of 55) of the postseason matchups since the 2012 season, which ended in the Harbaugh Bowl (Super Bowl XLVII). Model V2.0m was developed by comparing the "Ratings" generated by model V2.0 with actual game scores and making an adjustment for home-field advantage. So if we know the two teams' Ratings and we know where the game is being played, we can estimate the score differential and the likelihood of the favored team winning. To keep things straight, let me clarify: I'm discussing two related but distinct models. Model V2.0 predicts the teams that will advance to Super Bowl Sunday. Model V2.0m predicts the outcomes of individual postseason games* and was NOT developed for predicting Super Bowl participants.
(*If the game is played at a neutral stadium, home-field advantage is ignored. Stat geek side note: According to V2.0m, home-field advantage is worth about 2.6 points per game, which is similar to other estimates of average home-field advantage.)
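For readers who want to see the mechanics, here is a minimal sketch of how a rating gap plus a home-field adjustment could be turned into a score differential and a win probability. Only the 2.6-point home-field figure comes from V2.0m. The rating-to-points slope (`POINTS_PER_RATING`) and the spread of game margins (`MARGIN_SD`) are placeholder assumptions, and the normal-distribution step is a common convention for this kind of estimate, not necessarily what V2.0m does.

```python
import math

HOME_FIELD_POINTS = 2.6   # home-field advantage reported for model V2.0m
POINTS_PER_RATING = 1.0   # ASSUMPTION: hypothetical slope, not from the article
MARGIN_SD = 13.5          # ASSUMPTION: rough spread of NFL game margins, in points


def predicted_margin(rating_a, rating_b, site="neutral"):
    """Estimated points by which team A beats team B.

    site is "a" (A hosts), "b" (B hosts) or "neutral" (no adjustment).
    """
    margin = POINTS_PER_RATING * (rating_a - rating_b)
    if site == "a":
        margin += HOME_FIELD_POINTS
    elif site == "b":
        margin -= HOME_FIELD_POINTS
    elif site != "neutral":
        raise ValueError("site must be 'a', 'b' or 'neutral'")
    return margin


def win_probability(margin):
    """Chance that team A wins, treating the real margin as normally
    distributed around the predicted margin (an assumed error model)."""
    return 0.5 * (1.0 + math.erf(margin / (MARGIN_SD * math.sqrt(2.0))))


# Two evenly rated teams: a coin flip on a neutral field,
# a modest edge for whichever team is at home.
neutral = win_probability(predicted_margin(5.0, 5.0))
at_home = win_probability(predicted_margin(5.0, 5.0, site="a"))
```

With equal Ratings, the neutral-site probability is exactly 50 percent, and hosting the game nudges it above that. How far above depends entirely on the assumed spread of margins.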
Let's first take a look at V2.0 predictions for Super Bowl LII in Minnesota before getting into individual game predictions:
Unlike my predictions from the last two years, which differed from popular opinion (in one or both conferences), this year's primary Super Bowl predictions are a bit less surprising. For the second year in a row, the Patriots are in a tier of their own and will be tough to beat. The Vikings have the highest rank among NFC teams, but the Saints are a very close second. Perhaps the most surprising result is the rating for the Philadelphia Eagles, who have the top NFC seed but lost Carson Wentz late in the season, and are rated slightly below three other teams in their conference. (For those keeping track, the combined score of model V2.0 through Super Bowl LI, as defined in detail last year, is now nine out of a possible 10, which means that of the 10 teams that participated in the last five Super Bowls, the model identified nine teams correctly with either the primary or secondary picks for each conference.)
There is a large second tier featuring many teams of similar strength. There should be plenty of exciting games between these squads, which mostly hail from the NFC. Unfortunately for those teams finally ending long playoff droughts, chances are quite high that the Bills (who last made the playoffs in 1999), Jaguars (2007) and Titans (2008) will not reach the Super Bowl. And for those Falcons fans looking for redemption from last year's crushing loss in Super Bowl LI, unfortunately, chances are slim that Atlanta will beat the odds in three consecutive games to get through the NFC bracket.
As mentioned before, model V2.0 incorporates metrics made available by Pro Football Reference, like expected points contributed by the offense, simple rating system (or SRS) and offense simple rating system (OSRS). It also incorporates playoff seeding and several statistics related to defense, such as defensive turnovers, passing and rushing touchdowns allowed, and rushing yards allowed.
Additionally, the same caveats apply to both V2.0 and V2.0m: Because they rely on cumulative statistics from the full season, neither model takes recent injuries into account. This year's major late-season personnel setbacks among top contenders include the ACL injury to Wentz, though the Eagles have managed to continue winning behind backup Nick Foles. The Bills and the Titans might not fare as well without their respective starting running backs, LeSean McCoy and DeMarco Murray -- McCoy's availability is up in the air as he deals with an ankle injury, while Murray has been ruled out with a knee injury. Injuries to key receivers Antonio Brown (Steelers), Marqise Lee (Jaguars) and Chris Hogan (Patriots) have kept them out of recent games. Those who are able to play will likely not be at full capacity. And these are just a few of the injuries that could disrupt these postseason predictions.
Despite several high-profile injuries, model V2.0 might perform better than it did last year, given the smaller number of injuries to the top-rated teams heading into the playoffs. The model predicts an NFL first that would surely be a memorable storyline: The Vikings would be the first team to play in a Super Bowl in its home stadium. That home-field advantage would help even the matchup against the AFC favorites (and reigning Super Bowl champions) from New England.
However, don't discard the secondary picks, who are the next-most likely to make an appearance in Minnesota. Both the Saints (led by veteran QB Drew Brees and the dynamic running back duo of Alvin Kamara and Mark Ingram) and the Chiefs (who are back to their winning ways after a rough patch in the middle of the season) have had some players returning from injury and are poised to make a deep run.
Nasir Bhanpuri, PhD, is a clinical informatics data scientist at Virta Health based in San Francisco. He has been applying analytics and modeling techniques to address challenges in a wide range of fields, including sports, healthcare, fitness, nutrition, music, education, neuroscience, and robotics.