The Power of Next Gen Stats: Introduction to Completion Probability (Part I)
Leveraging Tracking Data to Contextualize the Difficulty of a Throw
Not all passes are created equal. A quarterback receives the same credit for a completion whether the pass traveled 60 yards downfield to a receiver in double-coverage, or whether the pass traveled 2 yards behind the line of scrimmage to an open running back in the flat. Next Gen Stats player tracking data can be leveraged to add context to each passing play.
Next Gen Stats' new metric for the 2018 season, Completion Probability, seeks to address the limitations of raw box score statistics. It is intended to contextualize passing and receiving performance on a per-play basis by accounting for the level of difficulty of each throw.
The Most Improbable Catches of Week 1
Armed with Next Gen Stats data available only through player tracking technology, the model estimates the probability that a pass is completed based on the factors of the play. We take a look inside the three most improbable completions from Week 1 of the 2018 season.
In Week 1 of the 2018 season, Aaron Rodgers connected with Geronimo Allison in the back of the end zone for a 39-yard touchdown in the fourth quarter of the Packers' win against the Bears. The pass had just a 14.7% Completion Probability, the most improbable completion of the week, based on the following in-play factors: The pass traveled 60.3 yards in the air from the location of Rodgers at the time of the throw to Allison at the time of the catch. Allison had 0.9 yards of separation from Kyle Fuller at the moment of the catch, and Rodgers had 2.1 yards of separation from Jonathan Bullard when he released the ball. All of those factors, among several others, contributed to the low Completion Probability estimate.
Case Keenum found Demaryius Thomas late in the fourth quarter for a game-winning 4-yard touchdown to beat the Seahawks. The pass had just a 15.6% Completion Probability based on the following in-play factors: Keenum was traveling at 15.52 MPH at the time of the throw, Thomas had 1.8 yards of separation from Shaquill Griffin, and at the time of the catch, Thomas was nearly a full yard out of bounds (relative to the sideline) while keeping his feet in bounds for the game-winning score.
In the first quarter of their Week 1 win against the Texans, Tom Brady connected with Rob Gronkowski for a 21-yard touchdown to give the Patriots an early lead. The pass had just an 18.8% Completion Probability based on the following in-play factors: Brady's pass to Gronkowski traveled 34.1 yards in the air, Gronkowski had two defenders within a single yard when the pass arrived, and Gronkowski was just 1.3 yards from the sideline at the time of the catch.
How The Completion Probability Model Works
With the help of machine learning models built on Amazon Web Services' SageMaker platform, Completion Probability is estimated using more than 10 different in-play factors collected by Next Gen Stats player-tracking devices. Those inputs include pass air distance (from quarterback to receiver), air yards, the distance between the receiver and the nearest defender, the distance between the quarterback and the nearest pass rusher, and the speed of the quarterback at the time of the throw, among several other metrics.
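To make a few of these inputs concrete, here is a minimal Python sketch of how they could be derived from tracking coordinates. The coordinate convention (x along the field, y across it, both in yards), the `Position` type, and every function name are assumptions for illustration only; the article does not describe the actual Next Gen Stats pipeline.

```python
import math
from dataclasses import dataclass

# An NFL field is 160 feet (53 1/3 yards) wide.
FIELD_WIDTH_YARDS = 160 / 3


@dataclass(frozen=True)
class Position:
    """Hypothetical tracking sample: x runs along the field, y across it (yards)."""
    x: float
    y: float


def distance(a: Position, b: Position) -> float:
    """Straight-line distance in yards between two tracked positions."""
    return math.hypot(a.x - b.x, a.y - b.y)


def nearest_distance(player: Position, opponents: list[Position]) -> float:
    """Distance from a player to the closest opponent."""
    return min(distance(player, opponent) for opponent in opponents)


def sideline_separation(player: Position) -> float:
    """Distance to the nearer sideline; negative if the player is out of bounds."""
    return min(player.y, FIELD_WIDTH_YARDS - player.y)


def throw_features(qb_at_throw, receiver_at_arrival, coverage, rushers):
    """Collect a few of the geometric inputs named in the text (assumed names)."""
    return {
        "air_distance": distance(qb_at_throw, receiver_at_arrival),
        "target_separation": nearest_distance(receiver_at_arrival, coverage),
        "sideline_separation": sideline_separation(receiver_at_arrival),
        "pass_rush_separation": nearest_distance(qb_at_throw, rushers),
    }
```

In this sketch, a receiver whose tracked position lies beyond the sideline gets a negative sideline separation, which is one simple way to represent a catch made with the body out of bounds and the feet in.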
Each in-play factor in the Completion Probability model has a measurable relationship with the likelihood that a pass is complete or incomplete. We can evaluate these relationships by plotting each in-play factor against Actual Completion Percentage to better understand each factor's effect on the outcome of a play:
- Air Distance: As the distance from the quarterback at the time of the throw to the receiver at the time of the catch increases, the likelihood of a completion decreases. Passes traveling more than 40 yards of air distance have roughly a 20% chance of completion, while passes traveling 10 yards of air distance have roughly an 80% chance of completion.
- Target Separation: As the distance between the receiver and nearest defender increases, the likelihood of a completion also increases. The thickness of each data point in the plot shows the density of passes at each level of target separation, which suggests that the majority of passes come with less than 4 yards of target separation.
- Sideline Separation: As the distance between the receiver and the sideline decreases, the likelihood of a completion also decreases. The probability of a completed pass decreases rapidly at 5 yards of sideline separation. Controlling for all other factors, passes to the sideline just inside the white paint have a roughly 30% chance of completion.
- Pass Rush Separation: As the distance between the quarterback and nearest pass rusher at the time of the throw decreases, the likelihood of a completion also decreases. A quarterback throwing with no defenders around has a higher probability of a completed pass than a quarterback with a pass rusher within a few yards at the time of the throw.
- Passer Speed: As the speed of the quarterback at the time of the throw increases, the likelihood of a completed pass decreases. Speeds below 8 MPH have little effect on the probability of a completion; as the speed of the quarterback increases above 8 MPH, however, the chance of completion decreases dramatically.
- Time to Throw: As the time from snap to throw increases, the likelihood of a completed pass decreases. Most passes occur between 2 and 3 seconds after the snap, and the probability of a completion declines significantly after 3 seconds.
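The relationships in the list above come from comparing a single factor against Actual Completion Percentage. One simple way to build that kind of comparison is to group passes into bins of the factor and compute the completion rate and pass count in each bin. The Python sketch below does this; the function name and the sample values are hypothetical and are not taken from Next Gen Stats data.

```python
import math
from collections import defaultdict


def completion_rate_by_bin(values, completed, bin_width):
    """Group passes by a factor and return {bin_start: (completion_rate, attempts)}.

    values     -- one factor value per pass (for example, air distance in yards)
    completed  -- one boolean per pass, True if the pass was caught
    bin_width  -- width of each bin, in the same units as values
    """
    totals = defaultdict(lambda: [0, 0])  # bin_start -> [completions, attempts]
    for value, caught in zip(values, completed):
        bin_start = math.floor(value / bin_width) * bin_width
        totals[bin_start][0] += int(caught)
        totals[bin_start][1] += 1
    return {
        bin_start: (completions / attempts, attempts)
        for bin_start, (completions, attempts) in sorted(totals.items())
    }


# Toy values for illustration only, not real tracking data.
toy_air_distance = [5, 8, 12, 18, 25, 45]
toy_completed = [True, True, True, False, False, False]
toy_summary = completion_rate_by_bin(toy_air_distance, toy_completed, bin_width=10)
```

The attempt count returned for each bin plays the same role as the thickness of the data points described under Target Separation: it shows how many passes support each estimate.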
The model uses in-play data points collected by Next Gen Stats player tracking technology on over 36,000 pass attempts dating back to the 2016 season and was validated against a random sample of 10% of the pass attempts. Using only held-out data that was not used to fit the model, we find the relationship between the Actual Completion Percentage and Completion Probability is strong, with an r-squared value of 0.98. For reference, "0" represents no correlation while a value close to "1" represents a near-perfect correlation.
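The validation described above can be sketched in Python under some stated assumptions. The article does not say how the 10% sample was drawn or how the r-squared value was computed; this sketch assumes a simple random holdout, groups held-out passes into bins of predicted probability, and takes the squared Pearson correlation between the mean predicted probability and the actual completion rate in each bin. All function names are hypothetical.

```python
import random


def holdout_split(items, holdout_fraction=0.10, seed=0):
    """Randomly split items into (training, holdout) lists."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]


def calibration_points(probabilities, outcomes, n_bins=10):
    """Return (mean predicted probability, actual completion rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for probability, outcome in zip(probabilities, outcomes):
        index = min(int(probability * n_bins), n_bins - 1)
        bins[index].append((probability, outcome))
    points = []
    for members in bins:
        if members:
            mean_probability = sum(p for p, _ in members) / len(members)
            completion_rate = sum(o for _, o in members) / len(members)
            points.append((mean_probability, completion_rate))
    return points


def r_squared(points):
    """Squared Pearson correlation between the x and y values of the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    if sxx == 0 or syy == 0:
        raise ValueError("r-squared is undefined when either variable is constant")
    return (sxy * sxy) / (sxx * syy)
```

Binning before correlating matters here: each individual pass is either caught or not, so the comparison with a probability only becomes meaningful once passes with similar predictions are averaged together.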
Completion Probability adds context to a passing play that was not previously available. In a follow-up article, we will describe how Completion Probability can be leveraged to improve upon the traditional box score statistic, Completion Percentage, as a measure of quarterback performance.