Wednesday, November 16, 2016

Why Cy Young, why?

The recent public outcry over the election of Rick Porcello as the American League Cy Young award winner has brought the unfairness of the system that balances ranked votes to my attention.

The ranks of all the voting sportswriters can be seen on the BBWAA website, but here's a visualization of the voting breakdown for the top 5 pitchers.


The human brain is quite adept at pattern recognition, and my eye is immediately drawn to the mode, or peak, for each pitcher:

Verlander - 1st (14)
Porcello - 2nd (18)
Kluber - 3rd (12)
Britton - 4th (9)
Sale - 5th (10)

However, this ranking does not match the ranking as determined by the Cy Young formula:

Points = 7 * (# 1st) + 4 * (# 2nd) + 3 * (# 3rd) + 2 * (# 4th) + 1 * (# 5th)

Porcello = 137 points
Verlander = 132 points
Kluber = 98 points
Britton = 72 points
Sale = 40 points

Admittedly, I have no idea how these weights were set, but one should be wary of any (weighted) average of ranks.  The most glaring reason is that the separation between first and second is likely to be larger than that between second and third, and so on down the line.

However, while this formula captures this in the weighting of first and second, it still considers 2nd, 3rd, 4th, and 5th place to be equidistant from each other, which seems like a major failing.

In fact, if one considers that a 4th place rank is worth twice as much as a 5th place rank, and suggests that first and second be weighted in the same proportion (8 and 4 respectively), we would have had a Verlander victory (146 to 145 points). Just a little food for thought about the mathematical underpinnings of taking a linear combination of ordinal variables.
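The sensitivity to the first-place weight can be checked in a few lines of Python. This is a minimal sketch: the per-place ballot counts for Porcello and Verlander are not listed in this post and are assumed here from the published BBWAA tallies. They do reproduce the modes and point totals quoted above.

```python
# Ballot counts by place (1st..5th). Assumed from the BBWAA results, not
# stated in the post; they reproduce the 137 and 132 point totals.
VOTES = {
    "Porcello": (8, 18, 2, 1, 1),
    "Verlander": (14, 2, 5, 4, 3),
}

OFFICIAL_WEIGHTS = (7, 4, 3, 2, 1)  # the Cy Young formula
DOUBLED_WEIGHTS = (8, 4, 3, 2, 1)   # first place worth twice second place


def points(votes, weights):
    """Weighted sum of ballot counts."""
    return sum(v * w for v, w in zip(votes, weights))


def standings(weights):
    """Pitchers sorted by total points, highest first."""
    totals = {name: points(v, weights) for name, v in VOTES.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


print(standings(OFFICIAL_WEIGHTS))
print(standings(DOUBLED_WEIGHTS))
```

A one-point change in a single weight is enough to flip the winner, which is the fragility the post is pointing at.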

Wednesday, September 7, 2016

Preseason NFL kicking trends


With the recent changes to the touchback rules taking effect this 2016 NFL preseason, one can begin to probe the data for trends in both returner and kicker strategy.  To this end, a Python script was written to scrape the ESPN play-by-play data from the 2015 and 2016 preseasons for every instance of the word “kicks” or “kickoff”.  The result was 604 and 556 kickoffs from the 2015 and 2016 preseason games respectively.  After excluding onside kicks (9 and 6 respectively), kicks that were called back due to penalty (3 and 4 respectively), and kicks from locations other than the 35 yard line (18 and 19 respectively), we’re left with 574 and 527 standard kickoffs in 2015 and 2016 respectively:

Year      G   KOs  Onside KOs  From non-35  No Play  Standard KOs  Returns from EZ  Returns from field  Touchbacks  OOB
2015 Pre  64  604  9           18           3        574           220              87                  261         6
2016 Pre  64  556  6           19           4        527           156              142                 229         0

We can now compare the average starting field position (FP) following pre-season NFL kicks between 2015 and 2016:

Starting FP of "standard" KOs
Year  Kicks  Avg. FP  SD FP  p-value
2015  574    20.8     7.3    < 0.0001
2016  527    23.1     7.5
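The post does not say which test produced the p-value in this table. As a rough check, here is a minimal sketch that assumes a large-sample two-sample z-test on the summary statistics. The function name is illustrative only.

```python
import math


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z-test for a difference in means from summary stats."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    z = (mean2 - mean1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail area
    return z, p_value


# 2015 vs. 2016 preseason starting field position
z, p = two_sample_z(20.8, 7.3, 574, 23.1, 7.5, 527)
print(f"z = {z:.2f}, p = {p:.1e}")
```

With more than 500 kicks in each group, the normal approximation should be close to whatever t-test was actually run.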


It’s no surprise that the average starting field position is significantly different after increasing the touchback spot by a full five yards, but it is interesting that the difference is only 2.3 yards.  Furthermore, we can compare the field position in 2016 of kicks into and out of the end zone (EZ):

Starting FP by EZ kick location in 2016
Kick location  N    Avg FP  SD   p-value
Kick to EZ     382  23.3    7.3  0.364
Out of EZ      145  22.6    7.8


Though the difference is not statistically significant, a kick short of the end zone yielded 0.67 fewer yards (2 feet) this pre-season.  In preseason games, players are vying for a roster spot and are likely being instructed to return kicks at a higher percentage than we should expect in the regular season.  To this end, let’s examine the return and kick rates:


Returner Decision
Year  Stat   Kneel  Return  p-value
2015  freq   261    220     0.124
2015  rel %  54.3%  45.7%
2016  freq   229    156
2016  rel %  59.5%  40.5%

Kicker Decision
Year  Stat   Out of EZ  Kick to EZ  p-value
2015  freq   87         481         < 0.0001
2015  rel %  15.3%      84.7%
2016  freq   142        385
2016  rel %  26.9%      73.1%


Indeed, the 5.2-point bump in the touchback rate (from 54.3% to 59.5%) may be nothing more than random noise (p = 0.124).  However, the 11.6-point increase in kicks short of the end zone (from 15.3% to 26.9%) is large enough to indicate a systematic change in kicker behavior between 2015 and 2016 (p < 0.0001).  As the informed reader knows, preseason is about the process of player evaluation and may not be reflective of regular season trends, so kickers and returners may do drastically different things next week, but rest assured that this data will be evaluated in a prospective manner as the season unfolds.
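The two p-values in the decision tables are consistent with a Pearson chi-square test on each 2x2 table of counts. The sketch below assumes that test, without a continuity correction, and uses the fact that a chi-square statistic with one degree of freedom is a squared standard normal.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom: P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows are 2015 and 2016; columns follow the tables above.
returner = ((261, 220), (229, 156))  # kneel, return
kicker = ((87, 481), (142, 385))     # out of EZ, kick to EZ

returner_chi2, returner_p = chi_square_2x2(returner)
kicker_chi2, kicker_p = chi_square_2x2(kicker)
print(f"returner: chi2 = {returner_chi2:.2f}, p = {returner_p:.3f}")
print(f"kicker:   chi2 = {kicker_chi2:.2f}, p = {kicker_p:.1e}")
```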

Thursday, July 14, 2016

Confidence Interval Calculator


Instructions: Enter the frequency of the event of interest (numerator) and the total number of events (denominator).

Example 1: Taysom Hill completed 88 passes (numerator) of 121 attempts (denominator) in 2020, so his completion rate is 72.7% (95% CI: 64.2-79.9%). We are 95% confident that Hill's true completion rate (under similar conditions) is better than 60%.

Example 2: Jameis Winston completed 7 (numerator) of 11 pass attempts (denominator), so his completion rate is 64% (95% CI: 35%-85%). We are NOT 95% confident that Winston's true completion rate (under similar conditions) is better than 60%.
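The post does not name the interval formula behind the calculator, but both examples are consistent with the Wilson score interval. A minimal sketch under that assumption, with z = 1.96 for 95% confidence:

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    center = (p_hat + z ** 2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2)) / denom
    )
    return center - half_width, center + half_width


hill_low, hill_high = wilson_interval(88, 121)
winston_low, winston_high = wilson_interval(7, 11)
print(f"Hill:    {hill_low:.1%} to {hill_high:.1%}")
print(f"Winston: {winston_low:.1%} to {winston_high:.1%}")
```

Comparing each lower bound with 0.60 gives the "better than 60%" conclusions drawn in the examples.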

Numerator:
Denominator:



Probability:
Percentage:
Lower 95%: %
Upper 95%: %

More on the data source and modeling methods here

Wednesday, July 13, 2016

SAVVAGE score calculator


Instructions: Enter the significant combine measures to estimate the NFL pro-bowl potential of a tight-end, or SAVVAGE score.

Height (inches):
Weight (pounds):
40-yard Dash (s):
Vertical Leap (in):



High Point Potential:
Average Momentum:
Estimated Pro-bowl Odds: 1 in
Estimated Pro-bowl Probability: %

More on the data source and modeling methods here

Wednesday, June 29, 2016

Effect of Weight gain on SAVVAGE score

After much deliberation, I've decided to re-brand my Size Adjusted Probowl Prediction score with a catchier acronym.  The metric detailed previously on this blog will henceforth be known as the Size Adjusted Velocity and Vertical Athletic Greatness Estimate, or SAVVAGE for short.  While I had some hope of adding college production numbers to the metric, I've realized that the frequency of crossover athletes entering the NFL gives the simple version more value.  Please be aware that this model is still being validated using prospective data analysis, but it has already correctly predicted both Tyler Eifert and Travis Kelce in the validation phase.

To explore the change in SAVVAGE due to weight gains, let's consider a couple of intriguing wide-receivers who, just like Shannon Sharpe, converted to tight-end in their early NFL years.  Take these numbers with a grain of salt: applying this model to these players is an extrapolation, since the training set was all NFL combine participants in the tight-end position from 2000-2011.

Darren Waller Combine Values (2015):
Vertical: 37 inches
Height: 78 inches
High Point:  115 inches
Dash:  4.46 seconds
Weight: 245 pounds
Avg Momentum: 54.9
Probowl odds: 1 in 4.3
SAVVAGE score: 18.7%

Darren Waller in 2016:
Weight: 260 lbs
Avg Momentum: 58.3
Probowl odds: 1 in 0.78
SAVVAGE score: 58%

Under the rather generous assumption that Waller maintained his ridiculous 37 inch vertical and 4.46 speed, this would be the 2nd highest known SAVVAGE score to date, behind only Vernon Davis.  Despite the questionable assumptions, the change observed in his SAVVAGE score really illustrates the value of weight and speed for success as an NFL tight-end.  Still, as it recently came to light that Waller tested positive for performance enhancing drugs, it's possible that most of this weight is muscle and that he has retained most of this ridiculous athleticism.

Let's also consider Niles Paul, another converted wide-receiver, out of Nebraska, of slightly smaller stature and less ridiculous athleticism.  After gaining 26 pounds over his 5 years in the league, it's a bit absurd to think he's still as fast as he was at 224.  However, the goal is to explore how a hypothetical weight change can affect one's SAVVAGE score.

Niles Paul (2011 combine):
Vertical: 34.5 inches
Height: 73 inches
High Point:  107.5 inches
Dash:  4.51 seconds
Weight: 224 pounds
Avg Momentum: 49.7
Probowl odds: 1 in 479
SAVVAGE score: 0.2%

Niles Paul (2015):
Weight: 241 pounds
Avg Momentum: 53.4
Probowl odds: 1 in 64
SAVVAGE score: 1.5%

Niles Paul (2016):
Weight: 250 pounds
Avg Momentum: 55.4
Probowl odds: 1 in 22
SAVVAGE score: 4.3%

While Paul still doesn't quite top the 6% threshold that has historically done well at predicting Probowl appearances, it does go to show what a difference weight makes in this size-speed equation.  The reader is encouraged to plug the combine metrics of an NFL tight-end into this calculator and play with the numbers a bit to see which changes correspond to big changes in their SAVVAGE score.
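The derived inputs in the listings above follow directly from the combine measures. This sketch uses the two definitions given on this blog (High Point = Height + Vertical, Average Momentum = Weight / Dash). It also assumes that "1 in N" is read as odds of 1 to N, so the probability is 1 / (1 + N), which matches Paul's three rows. The fitted model that turns these inputs into odds is not reproduced here.

```python
def high_point(height_in, vertical_in):
    """High Point Potential in inches."""
    return height_in + vertical_in


def average_momentum(weight_lb, dash_s):
    """Average Momentum: weight divided by 40-yard dash time."""
    return weight_lb / dash_s


def odds_to_percent(odds_against):
    """Convert '1 in N' odds (read as 1 to N) into a percentage."""
    return 100 / (1 + odds_against)


# Niles Paul: combine dash time held fixed while the weight changes.
PAUL_DASH = 4.51
paul_momentum = {w: round(average_momentum(w, PAUL_DASH), 1) for w in (224, 241, 250)}
paul_percent = [round(odds_to_percent(n), 1) for n in (479, 64, 22)]

print(high_point(73, 34.5))  # Paul's High Point
print(paul_momentum)
print(paul_percent)
```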

Friday, June 17, 2016

Bases Batted in 2015 MLB

As a follow-up to a previous post, I sat down to compute the number of bases generated by MLB players' at-bats as documented in the 2015 play-by-play data from Retrosheet.org.  Due to data limitations, the National League was neglected, though the error margins suggest that the American League sample is large enough to serve as a reasonable estimate for all of MLB.

Here are the aggregate stats detailing the frequency and probability of each base-runner configuration:

2015 - AL None R@1st R@2nd R@3rd R@1,2 R@1,3 R@2,3 R@1,2,3
Frequency 48498 15041 6234 2007 5403 2271 1509 1756
Probability 58.6% 18.2% 7.5% 2.4% 6.5% 2.7% 1.8% 2.1%
Margin of Error 0.4% 0.6% 0.7% 0.7% 0.7% 0.7% 0.7% 0.7%

Runner movements after each at-bat with runners on base were also tabulated, which are included merely for the purpose of making these results reproducible/verifiable:

FROM 1ST R@1st R@1,2 R@1,3 R@1,2,3
Runner lost 2560 765 372 232
No Advance 7623 2993 1142 893
Batted to 2nd 2704 924 421 367
Batted to 3rd 1269 392 199 157
Batted Home 835 312 122 94
Hm - unearned 50 17 13 13
Hm - unearned team 0 0 2 0
FROM 2ND R@2nd R@1,2 R@2,3 R@1,2,3
Runner lost 134 176 22 34
No Advance 3843 3182 976 1016
Batted to 3rd 1184 1011 243 327
Batted Home 992 944 238 341
Hm - unearned 79 89 29 36
Hm - unearned team 2 1 1 2
FROM 3RD R@3rd R@1,3 R@2,3 R@1,2,3
Runner lost 36 56 39 79
No Advance 1307 1383 938 1015
Batted Home 608 737 474 591
Hm - unearned 56 94 57 71
Hm - unearned team 0 1 1 0

After excluding unearned advancement home, all remaining base advancements were given a positive weight of 1, with the appropriate negative weight applied for runners lost from 1st, 2nd, and 3rd respectively (-1, -2, -3) due to a force-out, tag-out, or when the side was retired.  In this way, the total bases gained/lost after each of these at bats was tabulated under each runner condition:

2015 - AL R@1st R@2nd R@3rd R@1,2 R@1,3 R@2,3 R@1,2,3
Frequency 15041 6234 2007 5403 2271 1509 1756
Total Bases 5187 2900 500 4426 1382 1032 2026
E(bases) 0.34 0.47 0.25 0.82 0.61 0.68 1.15
Margin of Error 0.017 0.021 0.028 0.018 0.026 0.028 0.025
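The totals in this table can be rebuilt from the runner-movement counts. In the sketch below, the first count row of each block is taken as the runners lost (the row that carries the negative weight), each base gained counts +1, and the unearned rows are left out. The data layout and names are illustrative.

```python
# Per situation: (runners lost, advanced 1 base, advanced 2 bases, ...)
FROM_BASE = {
    1: {"R@1st": (2560, 2704, 1269, 835), "R@1,2": (765, 924, 392, 312),
        "R@1,3": (372, 421, 199, 122), "R@1,2,3": (232, 367, 157, 94)},
    2: {"R@2nd": (134, 1184, 992), "R@1,2": (176, 1011, 944),
        "R@2,3": (22, 243, 238), "R@1,2,3": (34, 327, 341)},
    3: {"R@3rd": (36, 608), "R@1,3": (56, 737),
        "R@2,3": (39, 474), "R@1,2,3": (79, 591)},
}

FREQUENCY = {"R@1st": 15041, "R@2nd": 6234, "R@3rd": 2007, "R@1,2": 5403,
             "R@1,3": 2271, "R@2,3": 1509, "R@1,2,3": 1756}


def net_bases(start_base, lost, *advanced):
    """Bases gained (+1 per base) minus bases forfeited by lost runners."""
    gained = sum(k * count for k, count in enumerate(advanced, start=1))
    return gained - start_base * lost


total_bases = {situation: 0 for situation in FREQUENCY}
for base, situations in FROM_BASE.items():
    for situation, counts in situations.items():
        total_bases[situation] += net_bases(base, *counts)

expected_bases = {s: round(total_bases[s] / FREQUENCY[s], 2) for s in FREQUENCY}
print(total_bases)
print(expected_bases)
```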

Using these averages as approximations for the expected bases from each condition, we might compare the bases batted by an individual player to the league average, but that seems like a matter for another post.


Friday, June 3, 2016

New Touchback Rule May Increase Kick-Returns

While the kick return is perhaps the most exciting play in football, it is also one of the most dangerous.  You don't have to be Sir Isaac Newton to know that players hurling themselves at each other while running full-speed in opposite directions is an accident waiting to happen.  The particularly heinous detail here is that special team players are often the very minimum salaried players that the NFL should work the hardest to protect.


As such, hoping to increase touchbacks and decrease injury, the NFL moved the kickoff location forward from the 30 to the 35 yard line in 2011. Indeed, as illustrated in the associated figure, this resulted in a dramatic decrease in the relative percentage of kicks returned.  As the margin of error bars illustrate, the percentage of kicks returned has been steadily declining ever since 2011, with even the 2015 season appearing statistically significantly lower than that of 2014. Note that the data was derived from aggregate kickoff info obtained on www.footballdb.com.


However, there are questions as to whether this improvement was enough.  This figure was generated in an attempt to visualize all kick returns from the 2015 season, sorted by the location where the kick was originally fielded.  This data was extracted from downloaded play-by-play information compiled by nflsavant.com, which has the added benefit of allowing us to identify when an injury occurred on a particular kick return or touchback.  Note that cases of falling on an onside kick and lateral situations were excluded due to their respective simplicity and complexity, though they pose perhaps greater risk.

In fact, in the three years of play-by-play data available on nflsavant.com, there were still 53 injuries on 2974 kick returns (1.75% injury rate) vs. 13 injuries on 3297 touchbacks (0.33% injury rate).  These respective percentages imply that the risk of an injury is 5 times greater on a return than on a touchback.  And as the above figure indicates, more than 50% of returns are coming out of the end-zone, so if we could give returners greater incentive to down the ball, we might expect the number of injuries to decrease by around 50%.
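The relative risk is simple arithmetic on the counts, sketched below. Recomputed from the counts exactly as quoted, the rates come out near 1.78% and 0.39%, a ratio closer to 4.5 than 5. The small gap from the percentages in the text may reflect rounding or slightly different underlying counts.

```python
def injury_rate(injuries, plays):
    """Injuries per play."""
    return injuries / plays


return_rate = injury_rate(53, 2974)     # kick returns
touchback_rate = injury_rate(13, 3297)  # touchbacks
relative_risk = return_rate / touchback_rate

print(f"return rate:    {return_rate:.2%}")
print(f"touchback rate: {touchback_rate:.2%}")
print(f"relative risk:  {relative_risk:.1f}x")
```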

Maybe cutting 53 injuries by a factor of 2 doesn't sound like a great saving to the average fan, but let's remember how terrible each of these injuries can be.  In a week 4 kick return last season, the Cowboys' Lance Dunbar tore his ACL, MCL, and patellar tendon on a return to begin the second half.  It is important to remember that behind each of these statistics is an individual suffering a life-altering and potentially career-ending injury.


In an effort to further protect players, the NFL announced this off-season that it would move the touchback spot after a kickoff to the 25 yard-line for the 2016 season.  The hope is that this probationary rule provides an additional 5 yard incentive for returners to take a lower risk touchback.  If touchbacks increase, injuries to special teams players should, in theory, decrease.



In fact, if we stratify the returns from the past three years based on those started from the end-zone and those from outside, we see some very different distributions in play. Clearly returners have incentive to down a kickoff caught in the end zone.  However, with the median of the "out of EZ" distribution lying at 25 yards exactly, it's NOT clear that kickers would have incentive to keep booming their kicks to the end zone and beyond.



Still, not all kicks in the above distributions are created equal.  The difference between deep kicks and shallow ones should have a dramatic effect on the resulting starting field position.  If we further stratify our data by the position where the kick was fielded, we can bin how the return fared relative to the new 25 yard touchback location.  Again, the error bars are 95% margins of error that allow us to tell whether, in spite of the variation due to randomness, the starting field position after the return is at or beyond the 25 yard line more than 50% of the time.



This breakdown suggests that the average kicker may minimize his opponent's starting field position by attempting to kick exclusively between the 0 and 5 yard line.  Still, as with all observational data, we may be dealing with a bias: NFL kickers may only be attempting to pin it in the 0-5 range against below-average returners.  If this is attempted against the more skilled returners in the league, kickers may find this strategy is actually less than optimal.  In the end, some Bayesian methods based on prior return yardage may help kickers decide in real time which returners will get a booming kick and which a mortar.  However, at the outset, many returners are unknown NFL commodities, so a prior of league average may suffice.  Thankfully the league is using the next year to evaluate the rule in a probationary period, so it can easily abandon the rule if it has the opposite effect from the one intended.  Still, this may be after we have already put some special teams players at additional risk.


Tuesday, May 24, 2016

Visualization of all NFL Kick Returns from 2015


In the interest of seeing how recent changes to the NFL kickoff rule may change the return game, I acquired play-by-play data of all NFL kickoffs from 2015.  After excluding touchbacks, returns with penalties, onside kicks with no attempted return, and end-of-game laterals, we were left with 1040 kick returns, which I sorted by starting position and then by yardage to plot each return.  Each line above corresponds to a single return from the starting point (catch) to finish (end of return).

I'll analyze this data in greater depth in the near future, but some observations from the figure:
-Approximately half of the kicks returned started in the end-zone
-Kicks 9 yards deep in the end-zone never made it past the 25 yard line last year.
-Returns for touchdowns are shaded toward better starting position (-1 or better), but only slightly.


Wednesday, May 11, 2016

Adjusting the Dalton Scale

Andy Dalton has long been identified by NFL analysts as the dividing line between passable and non-serviceable quarterbacks in the league.  However, some stellar play from Dalton in the past year may suggest that he has surpassed his own standard as the definition of QB mediocrity (though it should be noted that NFL mediocrity is something to which many QBs aspire and have never attained).

To this end, let's break down Dalton's numbers from the most recent season and compare them to his past statistics.  Using data from Foxsports.com and a little mathematical manipulation (addition and subtraction to ensure mutually exclusive events), we arrive at the following table:

Years    Incomplete, INT  Incomplete, no INT  Complete, < 20 yds  Complete, 20-39 yds  Complete, 40+ yds
'11-'14  66               788                 1127                130                  44
2015     7                135                 203                 41                   11

While the counts of events are useful for transparency and for ensuring these results can be reproduced, they are not very helpful for an apples-to-apples comparison, so we'll reduce these numbers to relative percentages before displaying them:



The associated chi-square statistic (chi-sq = 12.73, df = 4, p-value = 0.013) that tests whether these two distributions are statistically different was significant (p < 0.05), meaning that it is unlikely that random variation is driving the observed differences here.  The most notable difference is the 4.3 percentage point improvement in completions in the 20-39 yards category (from 6.0% to 10.3% of attempts), which contributes the majority (72.5%) of the significant chi-square statistic of 12.73.  So perhaps we can conclude that Andy Dalton's improved deep ball has moved him past the status of a merely passable signal-caller.
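The statistic and the share attributed to the 20-39 yard category can be reproduced from the table of counts. This sketch assumes a standard Pearson chi-square test of homogeneity. The p-value line uses a closed form that is only valid for 4 degrees of freedom.

```python
import math

# Columns: incomplete INT, incomplete no INT, complete <20, complete 20-39, complete 40+
COUNTS = {
    "'11-'14": (66, 788, 1127, 130, 44),
    "2015": (7, 135, 203, 41, 11),
}


def chi_square(rows):
    """Pearson chi-square statistic plus each column's contribution."""
    rows = list(rows)
    n = sum(sum(row) for row in rows)
    col_totals = [sum(col) for col in zip(*rows)]
    contributions = [0.0] * len(col_totals)
    for row in rows:
        row_total = sum(row)
        for j, observed in enumerate(row):
            expected = row_total * col_totals[j] / n
            contributions[j] += (observed - expected) ** 2 / expected
    return sum(contributions), contributions


chi2, by_category = chi_square(COUNTS.values())
# Survival function of a chi-square with df = 4: exp(-x/2) * (1 + x/2)
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)
share_20_39 = by_category[3] / chi2

print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, 20-39 yd share = {share_20_39:.1%}")
```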

Still, I'm wary that this comparison sets full 16-game seasons from 2011-2014 against only the first 13 games of 2015.  As such, let's explore whether this may be a confounding factor in the previous analysis.  Just to vary our methods a little, let's compare Dalton's passer rating over the first 12 games of every season, as he was actually injured early in the 13th game:

Seasons  Games Played  Avg Passer Rating  SD of Passer Rating  p-value of t-test
2015     12            111.6              26.8                 0.007
'11-'14  48            86.2               28.2

So the observed difference of 25.4 (±18.4 for 95% Confidence interval) in Passer Rating in 2015 pushes Dalton well past the mediocre, even after adjusting for potential differences in early season performance.  So we'd still conclude that Andy Dalton played significantly better in 2015 than in 2011-2014 using Passer Rating.  While this 2015 performance may still turn out to be an anomaly, perhaps we should start looking for a new standard of merely passable quarterback in the NFL.
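The ±18.4 margin is reproduced by the unequal-variance (Welch) form of the t-test applied to the summary statistics. The sketch below assumes that form and evaluates the t distribution numerically so that only the standard library is needed. The quoted p-value may come from a slightly different variant of the test, so only the interval is checked here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on the density from 0 to t."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(df, alpha=0.05):
    """Two-sided critical value found by bisection on the upper tail."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def welch(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic, degrees of freedom, two-sided p, and 95% margin."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    t = (mean1 - mean2) / se
    return t, df, 2 * t_upper_tail(abs(t), df), t_critical(df) * se


t_stat, df, p_value, margin = welch(111.6, 26.8, 12, 86.2, 28.2, 48)
print(f"t = {t_stat:.2f}, df = {df:.1f}, p = {p_value:.4f}, margin = ±{margin:.1f}")
```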

Tuesday, May 3, 2016

Weapons of Mass Destruction: tight-end depth charts

While the media was buzzing in a pre-NFL draft frenzy, Ozzie Newsome was calmly and quietly stockpiling weapons of mass destruction by adding a couple tight-ends to the roster. The most notable addition was wooing the 35 year old Ben Watson away from the Saints, with the second most intriguing prospect being Darren Waller, a converted wide-out from Georgia Tech, who gained 15 pounds in the off-season.

A retrospective analysis of combine participant tight-ends suggested that four key variables interact to affect size adjusted athleticism in terms of the High Point Potential (Height + Vertical) and Average Momentum in the 40 yard dash (Weight/Dash).  These two factors were found to be strongly correlated with pro-bowl selection in the data.  While this model is still being validated to confirm its predictive value, I'm optimistically christening it with a convenient and appropriate acronym, BEAST, for Better Estimate of Athleticism by Size Transformation.  Here is the current state of the Ravens' already deep TE rotation (depth chart courtesy of Rotoworld).  Athleticism is presented in terms of 40 yard dash and vertical leap alongside height and weight.

Name Age Height Weight Dash Vert High-Pt Avg_P BEAST
Ben Watson 35 76 258 4.57 35.5 111.5 56.5 17.6%
Crockett Gillmore 24 78 270 4.89 33.5 111.5 55.2 9.9%
Maxx Williams 22 76 250 4.78 34.5 110.5 52.3 1.8%
Dennis Pitta 30 76 245 4.72 34 110 51.9 1.3%
Darren Waller 23 78 260 4.46 37 115 58.3 58.0%

The aforementioned retrospective data analysis suggested that 6% is the optimal cut-point differentiating eventual pro-bowl caliber athletes in the tight-end position from their more work-a-day counterparts.  By this criterion, it looks like the Ravens roster has 3 potential pro-bowlers!  However, some caution should be taken with Gillmore, as the 10 pounds he's gained since joining the league is the difference between exceeding 6% or not, and we don't have updated data on his vertical leap or dash time since gaining this weight.  A similar caveat may apply to Darren Waller, though his score far exceeded the 6% cutoff (~18%) even before the 15 pounds he added recently.
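The derived columns and the 6% screen can be recomputed from the roster table. In this sketch the BEAST percentages are copied from the table rather than recomputed, since the fitted model itself is not given here. The tuple layout is illustrative.

```python
# (name, height in, weight lb, dash s, vertical in, BEAST % as listed)
RAVENS = [
    ("Ben Watson", 76, 258, 4.57, 35.5, 17.6),
    ("Crockett Gillmore", 78, 270, 4.89, 33.5, 9.9),
    ("Maxx Williams", 76, 250, 4.78, 34.5, 1.8),
    ("Dennis Pitta", 76, 245, 4.72, 34.0, 1.3),
    ("Darren Waller", 78, 260, 4.46, 37.0, 58.0),
]

THRESHOLD = 6.0  # percent, the cut-point quoted above

# High Point = height + vertical; Average Momentum = weight / dash
derived = {
    name: (height + vertical, round(weight / dash, 1))
    for name, height, weight, dash, vertical, _ in RAVENS
}

above_threshold = [name for name, *_, beast in RAVENS if beast > THRESHOLD]

for name, (high_pt, momentum) in derived.items():
    print(f"{name}: High-Pt {high_pt}, Avg_P {momentum}")
print(above_threshold)
```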

Baltimore wasn't the only city stockpiling weapons of mass destruction.  The Patriots acquired Martellus Bennett from the Bears and signed Clay Harbor, the free agent out of Jacksonville.  The year before (2015) they made a similar acquisition of converted offensive tackle, Michael Williams.  Based on these recent moves, here's the current state of the Patriots depth at TE:

Player Name Age Height Weight Dash Vert High-Pt Avg_P ~Prob(PB)
Rob Gronkowski 26 78 264 4.68 33.5 111.5 56.4 17.2%
Martellus Bennett 29 79 273 4.68 34 113 58.3 45.9%
AJ Derby 24 77 255 4.72 ? ? 54.0 ?
Michael Williams 25 78 304 5.19 25.5 103.5 58.6 8.0%
Clay Harbor 28 75 250 4.69 40 115 53.3 8.8%

It looks like the Patriots may roster as many as 4 potential pro-bowl tight-ends based on the previously established 6% rule.  Again, these numbers may be somewhat misleading, as the computations driving the estimated pro-bowl probability are built upon combine numbers, and Bennett and Williams have both put on significant weight since joining the league.  In both cases, this weight made the difference in exceeding the 6% threshold, again based upon the sketchy assumption that weight gain didn't affect speed and jumping ability.  In Bennett's case, his estimate is so high that some sensitivity analysis suggests that he's likely over the 6% threshold even if both his speed and his vertical have suffered as much as a 5% reduction (Update: Michael Williams suffered a torn ACL in OTAs in June and is still waiting for swelling to reduce before attempting corrective surgery).

In conclusion, the Ravens and Patriots appear to be stacked in the tight-end position.  However, due to non-synchronicity in the timing of available data, it may be only the top and the bottom of the depth chart that exceed the 6% threshold for both teams.  This is a less exciting prospect, as holding a potential top 10 tight end with an athletic project tight-end waiting in the wings is pretty much the goal of every organization.