Sportistician: 2017

Friday, July 21, 2017

Phil Dawson 2016 Kickoffs

This is a bit of a data dump, but it's an interesting study in Phil Dawson's incredible 2016 season, in which he placed 41.5% of his kickoffs between the 0 and 5. His touchbacks are highlighted. I pulled these plays from NFL.com play-by-play using a Python script that searches for all instances of "kick" or "kicks":

(15:00 - 1st) P.Dawson kicks 63 yards from SF 35 to MIA 2. J.Grant to MIA 23 for 21 yards (A.Burbridge; M.Wilhoite). PENALTY on MIA-N.Hewitt
(9:46 - 1st) P.Dawson kicks 65 yards from SF 35 to MIA 0. K.Drake to MIA 21 for 21 yards (R.Streater; B.Bell).
(7:34 - 3rd) P.Dawson kicks 67 yards from SF 35 to MIA -2. K.Drake pushed ob at SF 24 for 78 yards (J.Tartt).
(7:42 - 4th) P.Dawson kicks 45 yards from SF 35 to MIA 20. K.Drake MUFFS catch
(2:15 - 4th) P.Dawson kicks 65 yards from SF 35 to MIA 0. J.Grant pushed ob at MIA 14 for 14 yards (B.Bell).
(1:32 - 2nd) P.Dawson kicks 64 yards from SF 35 to ARZ 1. B.Golden to ARZ 18 for 17 yards (J.Tartt).
(8:40 - 3rd) P.Dawson kicks 60 yards from SF 35 to ARZ 5. B.Golden to ARZ 23 for 18 yards (J.Tartt).
(1:55 - 4th) P.Dawson kicks 63 yards from SF 35 to ARZ 2. A.Ellington to ARZ 15 for 13 yards (M.Wilhoite).
(15:00 - 3rd) P.Dawson kicks 63 yards from SF 35 to LA 2. P.Cooper to LA 22 for 20 yards (C.Bradford). PENALTY on LA-B.Hager
(5:06 - 4th) P.Dawson kicks 69 yards from SF 35 to LA -4. P.Cooper to LA 30 for 34 yards (R.Streater).
(0:31 - 4th) P.Dawson kicks 62 yards from SF 20 to LA 18. P.Cooper to LA 42 for 24 yards (V.Sunseri; S.Draughn).
(14:52 - 1st) P.Dawson kicks 62 yards from SF 35 to DAL 3. L.Whitehead to DAL 31 for 28 yards (G.Hodges
(15:00 - 1st) P.Dawson kicks 61 yards from SF 35 to ARZ 4. J.Nelson to ARZ 43 for 39 yards (A.Burbridge).
(4:12 - 2nd) P.Dawson kicks 55 yards from SF 35 to ARZ 10. J.Nelson pushed ob at ARZ 24 for 14 yards (J.Hamm).
(1:54 - 3rd) P.Dawson kicks 65 yards from SF 35 to end zone
(0:03 - 1st) P.Dawson kicks 50 yards from 50 to TB 0. R.Smith to TB 6 for 6 yards (S.Draughn
(8:03 - 4th) P.Dawson kicks 59 yards from SF 35 to TB 6. A.Humphries to TB 23 for 17 yards (S.Skov).
(2:40 - 1st) P.Dawson kicks 55 yards from SF 35 to NE 10. Cy.Jones to NE 23 for 13 yards (M.Wilhoite).
(4:56 - 2nd) P.Dawson kicks 54 yards from SF 35 to NE 11. J.Develin to NE 25 for 14 yards (J.Shepherd; B.Bell). PENALTY on SF-R.Streater
(14:46 - 3rd) P.Dawson kicks 63 yards from SF 35 to NE 2. Cy.Jones to NE 22 for 20 yards (M.Wilhoite).
(14:50 - 1st) P.Dawson kicks 62 yards from SF 35 to NYJ 3. N.Marshall pushed ob at SF 44 for 53 yards (J.Ward). PENALTY on NYJ-A.Brown
(13:57 - 1st) P.Dawson kicks 25 yards from SF 35 to NYJ 40
(10:38 - 1st) P.Dawson kicks 58 yards from SF 35 to NYJ 7. N.Marshall to NYJ 24 for 17 yards (J.Tartt).
(5:39 - 2nd) P.Dawson kicks 64 yards from SF 35 to NYJ 1. N.Marshall to NYJ 25 for 24 yards (R.Streater).
(14:55 - 3rd) P.Dawson kicks 64 yards from SF 35 to SEA 1. J.McKissic to SEA 23 for 22 yards (J.Tartt).
(5:37 - 4th) P.Dawson kicks 46 yards from 50 to SEA 4. P.Richardson to SEA 14 for 10 yards (C.Bradford). PENALTY on SF-J.Tartt
(15:00 - 1st) P.Dawson kicks 65 yards from SF 35 to end zone
(9:41 - 2nd) P.Dawson kicks 57 yards from SF 35 to SEA 8. P.Richardson to SEA 27 for 19 yards (J.Tartt). SF-J.Ward was injured during the play. He is Out.
(7:50 - 4th) P.Dawson kicks 65 yards from SF 35 to end zone
(0:56 - 4th) P.Dawson kicks onside 26 yards from SF 35 to SEA 39. L.Willson MUFFS catch
(6:49 - 1st) P.Dawson kicks 60 yards from SF 35 to SD 5. C.Mager to SD 26 for 21 yards (M.Wilhoite).
(6:55 - 2nd) P.Dawson kicks 56 yards from SF 35 to CHI 9. D.Thompson to CHI 37 for 28 yards (J.Ward). FUMBLES (J.Ward)
(1:56 - 2nd) P.Dawson kicks 51 yards from SF 35 to CHI 14. P.Lasike to CHI 19 for 5 yards (A.Burbridge).
(15:00 - 3rd) P.Dawson kicks 65 yards from SF 35 to end zone
(15:00 - 1st) P.Dawson kicks 60 yards from SF 35 to CAR 5. T.Ginn pushed ob at CAR 28 for 23 yards (A.Burbridge). PENALTY on CAR-S.Thompson
(11:21 - 1st) P.Dawson kicks 62 yards from SF 35 to CAR 3. T.Ginn to CAR 20 for 17 yards (K.Reaser).
(10:02 - 2nd) P.Dawson kicks 60 yards from SF 35 to CAR 5. T.Ginn pushed ob at CAR 22 for 17 yards (J.Tartt).
(12:38 - 4th) P.Dawson kicks 62 yards from SF 35 to CAR 3. T.Ginn MUFFS catch
(12:31 - 4th) P.Dawson kicks 62 yards from SF 35 to CAR 3. T.Ginn to SF 38 for 59 yards (A.Burbridge).
(7:51 - 4th) P.Dawson kicks 60 yards from SF 35 to CAR 5. T.Ginn pushed ob at CAR 29 for 24 yards (K.Reaser).
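
For anyone who wants to reproduce the tally, here is a minimal sketch of how lines like the ones above could be parsed. It is not the original script. The regular expression, the function names parse_kick and share_between, and the choice to keep every parsed kick (touchbacks and onside kicks included) in the denominator are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for lines such as
# "P.Dawson kicks 63 yards from SF 35 to MIA 2." The landing spot is either
# "<TEAM> <yard line>" (negative means into the end zone) or "end zone".
KICK_RE = re.compile(
    r"kicks (?P<onside>onside )?(?P<dist>\d+) yards from "
    r"(?:[A-Z]+ \d+|50) to "
    r"(?:[A-Z]+ (?P<line>-?\d+)|end zone)"
)


def parse_kick(play):
    """Return distance, onside flag and landing yard line, or None if no kick."""
    match = KICK_RE.search(play)
    if match is None:
        return None
    line = match.group("line")
    return {
        "distance": int(match.group("dist")),
        "onside": match.group("onside") is not None,
        # None marks a kick recorded only as "to end zone".
        "landing": int(line) if line is not None else None,
    }


def share_between(plays, low=0, high=5):
    """Fraction of parsed kicks that landed between yard lines low and high."""
    kicks = [k for k in (parse_kick(p) for p in plays) if k is not None]
    if not kicks:
        return 0.0
    hits = [k for k in kicks
            if k["landing"] is not None and low <= k["landing"] <= high]
    return len(hits) / len(kicks)


sample = [
    "(15:00 - 1st) P.Dawson kicks 63 yards from SF 35 to MIA 2. J.Grant to MIA 23 for 21 yards",
    "(7:34 - 3rd) P.Dawson kicks 67 yards from SF 35 to MIA -2. K.Drake pushed ob at SF 24",
    "(1:54 - 3rd) P.Dawson kicks 65 yards from SF 35 to end zone",
    "(0:56 - 4th) P.Dawson kicks onside 26 yards from SF 35 to SEA 39. L.Willson MUFFS catch",
]
print(share_between(sample))  # 0.25: only the first kick lands between the 0 and 5
```

The percentage depends on which kicks go in the denominator, so a different choice there (for example, dropping onside kicks) would give a different share.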

This post was inspired by an article from USA Today.

Saturday, March 4, 2017

SAVVAGE scores for 2017

Using a logistic regression model fit to historical combine results and subsequent Pro Bowl selections, the probability that an incoming tight end eventually makes a Pro Bowl can be estimated from his combine height, weight, vertical leap, and forty-yard dash time. The resulting score (0-100%) is known as the Size Adjusted Vertical and Velocity Greatness Estimate (SAVVAGE). Though it is still in the validation phase, the predetermined 6% threshold correctly flagged both Travis Kelce and Tyler Eifert as future Pro Bowlers.

So here are the numbers for the 2017 draft class:

Name	Hgt	Wgt	Dash	Vert	High-Pt	Avg_P	SAVVAGE
Engram, Evan	75.0	234	4.42	36	111.0	52.94	2.8%
Howard, O.J.	78.0	251	4.51	30	108	55.65	5.4%
Kittle, George	76.0	247	4.52	35	111	54.65	6.7%
Daniels, Darrell	75.0	247	4.55	32	107	54.29	2.1%
Hodges, Bucky	78.0	257	4.57	39	117	56.24	43.4%
Everett, Gerald	75.0	239	4.62	37.5	112.5	51.73	2.2%
Smith, Jonnu	75.0	248	4.62	38	113	53.68	6.6%
Njoku, David	76.0	246	4.64	37.5	113.5	53.02	5.4%
Carter, Cethan	75.0	241	4.68	?	?	51.50	[ < 6%]
Sprinkle, Jeremy	77.0	252	4.69	29	106	53.73	1.2%
Shaheen, Adam	78.0	278	4.79	32.5	110.5	58.04	27.7%
Orndoff, Scott	77.0	253	4.84	27	104	52.27	0.3%
Roberts, Michael	76.0	270	4.86	30	106	55.56	3.2%
Plinke, Hayden	76.0	264	4.97	28	104	53.12	0.5%

Compared to last year's prospects, this TE class looks much more athletic, with four of the fourteen participants projecting as future Pro Bowlers. It's fairly safe to say that Cethan Carter does not project as one, because he would need a 42.5 inch vertical to crack the 6% threshold. If any notable tight-end hopefuls are missing, or if you'd like to speculate on missing measurements or added weight, you can evaluate their Pro Bowl potential with the calculator on this site. Since these results are adjusted for size, a little added weight can make a big difference.
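
The threshold check behind that count of four is simple enough to sketch. The scores below are copied from the table. The names SAVVAGE_2017 and projected_pro_bowlers are just illustrative, and treating a score exactly equal to the threshold as a pass is an assumption. Cethan Carter is left out because his score cannot be computed without a vertical.

```python
# SAVVAGE scores (percent) for the 2017 combine tight ends, from the table.
SAVVAGE_2017 = {
    "Engram, Evan": 2.8,
    "Howard, O.J.": 5.4,
    "Kittle, George": 6.7,
    "Daniels, Darrell": 2.1,
    "Hodges, Bucky": 43.4,
    "Everett, Gerald": 2.2,
    "Smith, Jonnu": 6.6,
    "Njoku, David": 5.4,
    "Sprinkle, Jeremy": 1.2,
    "Shaheen, Adam": 27.7,
    "Orndoff, Scott": 0.3,
    "Roberts, Michael": 3.2,
    "Plinke, Hayden": 0.5,
}

THRESHOLD = 6.0  # the predetermined 6% cut-off


def projected_pro_bowlers(scores, threshold=THRESHOLD):
    """Names at or above the threshold, highest score first."""
    passing = [name for name, score in scores.items() if score >= threshold]
    return sorted(passing, key=scores.get, reverse=True)


print(projected_pro_bowlers(SAVVAGE_2017))
# ['Hodges, Bucky', 'Shaheen, Adam', 'Kittle, George', 'Smith, Jonnu']
```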

Sunday, February 26, 2017

Measurables of Starting NFL tight ends

After Rob Gronkowski's season-ending injury in 2016, Martellus Bennett put up arguably the best season of his career with the Patriots. While this may just be the Tom Brady effect, the discerning fan begins to wonder whether Bill Belichick has some crystal ball that predicts which NFL tight ends have the potential to be starters and which do not.

While we don't have a crystal ball, we do have a mountain of data at our disposal thanks to metrics guru Jim Cobern. With his collaboration, a list of 230 potential NFL tight ends was compiled using combine data and college production statistics from 1999 through 2011 and classified into long-term starters (64 or more games started) and non-starters (fewer than 64 games started). Sixty-four games seems a reasonable definition of a long-term starter, as it corresponds to four uninjured 16-game seasons of starts.

Exploratory data analysis included the following factors:

-Year Drafted
-Height (feet)
-Weight (lbs)
-Arm length (inches)
-Hand size (inches)
-Reps of 225 lb bench press to failure
-Market Share of Yards in College*
-Strength of College Schedule*
-Player Age on Draft Day*
-Player Explosiveness Score*: formulated from vertical, broad jump and mass density
-Player Speed Score*: formulated from prospect's 40 yard dash
-Player Flexibility Score*: formulated from short shuttle, 3-Cone, and mass density
*normalized to all positional peers since 1998

And here is a quick summary of incomplete factors in these 230 observations:

Factor	Missing	Missing%
Hands	122	53.0%
Arms	119	51.7%
FLX	32	13.9%
Bench	28	12.2%
SOS	25	10.9%
EXP	16	7.0%
SPD	1	0.4%
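
The percentages in this table are just each missing count divided by the 230 observations. A quick sketch of that arithmetic follows, with the dictionary and function names being my own.

```python
TOTAL_PROSPECTS = 230

# Missing-value counts per factor, copied from the table above.
MISSING = {
    "Hands": 122,
    "Arms": 119,
    "FLX": 32,
    "Bench": 28,
    "SOS": 25,
    "EXP": 16,
    "SPD": 1,
}


def missing_pct(count, total=TOTAL_PROSPECTS):
    """Percentage of observations missing a factor, to one decimal place."""
    return round(100 * count / total, 1)


for factor, count in MISSING.items():
    print(factor, count, f"{missing_pct(count)}%")

# Completeness is the complement: 229 of 230 speed scores are present.
print(round(100 * (TOTAL_PROSPECTS - MISSING["SPD"]) / TOTAL_PROSPECTS, 1))  # 99.6
```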

Regrettably, the amount of missing data precludes meaningful inclusion of either arm length or hand size in a multivariate analysis. Also, upon finding that the Speed, Explosiveness and Flexibility scores were collinear, the choice was made to use Speed Score exclusively due to its 99.6% completeness. Furthermore, bench press was treated categorically (no bench press, 0-20 reps, or 21 or more reps) to allow for inclusion of the 28 cases of missing data for that factor. Strength of Schedule, on the other hand, was found to be statistically insignificant between starters and non-starters (p=0.365), so it was discarded as a potentially informative factor, which increased the effective sample size (n=229).

The data were split into a training set (years 1999-2006) for model building and a testing set (years 2007-2011) for validating the model. Of the 144 prospects in the training set, twenty-three became eventual long-term starters. Forward and backward selection yielded the following concordant maximum likelihood estimates for the coefficients of the log odds of being a starter in the training set:

Analysis of Maximum Likelihood Estimates
Parameter	Level	DF	Estimate	Standard Error	Wald Chi-Square	p-value
Intercept		1	-6.3632	1.1993	28.1529	< .0001
MSY		1	2.1063	1.0637	3.9213	0.048
Age		1	2.2931	1.0912	4.4156	0.036
SPD_Score		1	2.5358	1.1034	5.2816	0.022
benchtype	0	1	-0.2840	0.9623	0.0871	0.768
benchtype	2	1	1.1796	0.5967	3.9087	0.048

For any prospect, the implied log odds of this model can be converted to a probability of being a long-term starter. A receiver-operating characteristic (ROC) curve analysis was employed to determine the optimal cut-point for this probability to balance the sensitivity (ruling in) and specificity (ruling out) of such a dichotomous rule. A cut-point of 15% turned out to correctly classify 72.2% of the sample as a potential starter or not. The rule's false positive rate of 66% (the share of players passing the rule who did not become starters) left a bit to be desired, but its false negative rate of 5.5% (the share of players failing the rule who did become starters) implies we can be relatively certain when it classifies a player as a non-starter.
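
The conversion from log odds to probability can be sketched with the coefficients from the table. Two things here are my reading rather than anything stated outright: the MSY, Age and Speed Score inputs enter as percentile fractions (0.7297 for 72.97%), and the bench categories are coded 0 = no bench press, 1 = 0-20 reps (the reference level), 2 = 21 or more reps. Under those assumptions the sketch reproduces the P(starter) values listed in the validation tables, such as 0.67 for Greg Olsen. The function names are illustrative.

```python
import math

# Maximum likelihood estimates from the table above.
INTERCEPT = -6.3632
B_MSY = 2.1063
B_AGE = 2.2931
B_SPD = 2.5358
B_BENCH_NONE = -0.2840     # benchtype 0: no bench press recorded
B_BENCH_21_PLUS = 1.1796   # benchtype 2: 21 or more reps


def bench_term(reps):
    """Bench contribution; 0-20 reps is the reference level and adds nothing."""
    if reps is None:
        return B_BENCH_NONE
    if reps >= 21:
        return B_BENCH_21_PLUS
    return 0.0


def p_starter(msy, age, spd, reps):
    """Probability of being a long-term starter from percentile fractions."""
    log_odds = (INTERCEPT + B_MSY * msy + B_AGE * age + B_SPD * spd
                + bench_term(reps))
    return 1.0 / (1.0 + math.exp(-log_odds))


def passes_rule_of_15(probability):
    return probability >= 0.15


olsen = p_starter(0.7297, 0.8386, 0.9544, 23)   # Greg Olsen, 2007
casey = p_starter(0.9016, 0.0013, 0.6027, None)  # James Casey, 2009, no bench
print(round(olsen, 2), passes_rule_of_15(olsen))  # 0.67 True
print(round(casey, 2), passes_rule_of_15(casey))  # 0.04 False
```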

Be warned, this ROC curve analysis is really just a fancy type of data snooping: we visualize the balance of sensitivity and specificity, essentially looking at every cut-point from 0% to 100% and picking the one that performs best! So this method doesn't mean much unless the model has predictive power in an independent data set. To this end, we applied the same model to the 2007-2011 data to estimate each prospect's probability of being a starter and applied this "rule of 15%":
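
To make the snooping concrete, here is a rough sketch of a cut-point scan followed by a check on held-out data. The numbers are made up purely for illustration and are not the combine data. Scoring each cut-point by sensitivity + specificity - 1 (Youden's J) is my assumption about what "balance" means, since the exact criterion isn't spelled out above.

```python
def confusion(probs, labels, cut):
    """Count (tp, fp, tn, fn) when 'probability >= cut' predicts a starter."""
    tp = fp = tn = fn = 0
    for prob, is_starter in zip(probs, labels):
        if prob >= cut:
            if is_starter:
                tp += 1
            else:
                fp += 1
        elif is_starter:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def youden_j(probs, labels, cut):
    """Sensitivity + specificity - 1; needs both classes present in labels."""
    tp, fp, tn, fn = confusion(probs, labels, cut)
    return tp / (tp + fn) + tn / (tn + fp) - 1


def best_cut(probs, labels):
    """Try every cut-point from 0% to 100% and keep the first best one."""
    cuts = [c / 100 for c in range(101)]
    return max(cuts, key=lambda c: youden_j(probs, labels, c))


def accuracy(probs, labels, cut):
    tp, fp, tn, fn = confusion(probs, labels, cut)
    return (tp + tn) / len(labels)


# Hypothetical model outputs and outcomes (1 = long-term starter).
train_probs = [0.02, 0.08, 0.12, 0.18, 0.35, 0.60]
train_labels = [0, 0, 0, 1, 0, 1]
test_probs = [0.05, 0.14, 0.20, 0.40]
test_labels = [0, 1, 0, 1]

cut = best_cut(train_probs, train_labels)  # chosen on training data only
print(cut)
print(accuracy(train_probs, train_labels, cut))  # optimistic: cut was tuned here
print(accuracy(test_probs, test_labels, cut))    # the number that matters
```

The training accuracy is flattering by construction, because the cut-point was picked to look good on exactly that data. Only the held-out figure says anything about predictive power.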

[Contingency table: passed the rule of 15% (yes/no) by long-term starter status, with totals]

Computing the odds ratio associated with this contingency table suggests that the odds of passing the rule of 15% are 11.6 times higher (95% CI 3.4-38.9) for a starter than for a non-starter. However, we'd be prudent to probe how much better our rule does than a random guess at identifying starters and non-starters.

Starters = 14% (12/85)
True Positive = 30% (11/37)
False Positive = 70% (26/37)

So the rule of 15% roughly doubled the chances of finding starters in the test set (14% to 30%), and 30% really isn't far from the 34% true positive rate that we observed in the training data.

Non-starters = 86% (73/85)
True Negative = 98% (47/48)
False Negative = 2% (1/48)

The rule of 15% also improved our chance of ruling out non-starters by 12 percentage points (86% to 98%)! As it happened, Brandon Pettigrew was the only eventual starter that the model failed to identify (goodbye to all the Oklahoma State/Lions homers who are now dismissing the model entirely). For the unbiased readers who press onward, here's a list of all the other players in the validation set, classified by their model selection and starter status.

True Positives (Passed rule of 15% with GS >= 64 at end of 2016 season):

Year	Name	Bench	MSY%	Age%	SPDScore%	P(starter)
2007	Brent Celek	19	77.30%	79.79%	58.37%	0.19
2007	Zach Miller	16	76.77%	96.72%	31.75%	0.15
2007	Greg Olsen	23	72.97%	83.86%	95.44%	0.67
2008	Martellus Bennett*	18	90.42%	98.03%	80.42%	0.46
2010	Rob Gronkowski	23	93.18%	99.08%	78.90%	0.74
2010	Ed Dickson	23	88.85%	65.62%	89.92%	0.62
2010	Jermaine Gresham*	20	78.87%	90.42%	87.83%	0.40
2010	Jimmy Graham	24	26.90%	24.67%	96.39%	0.17
2011	Charles Clay	18	89.90%	81.63%	71.29%	0.31
2011	Lance Kendricks	25	89.37%	46.59%	78.33%	0.44
2011	Kyle Rudolph	19	82.81%	96.19%	40.87%	0.20

*Projected 2017 Free Agent as of 2/27/2017

False Positives (Passed rule of 15%, but GS < 64 at end of the 2016 season):

Year	Name	Bench	MSY%	Age%	SPDScore%	P(starter)
2007	Scott Chandler	16	84.78%	92.13%	67.30%	0.32
2007	Ben Patrick	22	83.33%	67.85%	70.72%	0.48
2007	Clark Harris	21	83.07%	63.91%	54.56%	0.36
2008	Dustin Keller	26	81.63%	17.98%	93.92%	0.34
2008	Kellen Davis II*	22	77.56%	72.83%	91.83%	0.61
2008	Darrell Strong	17	68.90%	88.45%	80.04%	0.30
2008	Gary Barnidge	22	63.91%	71.39%	73.57%	0.42
2009	Travis Beckum	28	96.06%	79.92%	79.28%	0.66
2009	Jared Cook*	23	80.31%	85.43%	94.30%	0.70
2009	Cameron Morrah	24	56.04%	83.99%	75.10%	0.46
2010	Aaron Hernandez	30	94.36%	99.74%	82.89%	0.77
2010	Dennis Pitta	27	87.80%	1.84%	62.93%	0.15
2010	Dorin Dickerson	24	78.22%	85.04%	96.58%	0.70
2010	Andrew Quarless	23	72.05%	95.67%	78.14%	0.62
2010	Anthony Miller	19	66.67%	78.61%	78.71%	0.24
2010	Dedrick Epps	19	50.52%	90.68%	60.08%	0.15
2010	Jeff Cumberland*	24	45.01%	56.43%	98.86%	0.39
2010	Michael Hoomanawanui	25	42.39%	91.47%	75.86%	0.43
2011	Rob Housler	22	87.93%	51.71%	97.53%	0.58
2011	D.J. Williams	20	85.70%	69.95%	91.63%	0.35
2011	Julius Thomas	16	79.27%	62.34%	75.48%	0.21
2011	Zack Pianalto	22	77.03%	88.85%	66.35%	0.54
2011	Virgil Green	23	68.64%	65.88%	95.25%	0.55
2011	Luke Stocker	27	55.51%	64.44%	83.46%	0.40
2011	David Ausberry	23	39.11%	17.59%	96.77%	0.18
2011	Jordan Cameron*	23	21.65%	66.14%	95.82%	0.31

*Projected 2017 Free Agent as of 2/27/2017

If Thomas, Cameron, Cook, and Barnidge can stay healthy for another season or two, we'll likely add another 4 players to the true positives, bumping the true positive rate to a healthier 40.5% in the validation set. It's also notable that these lists include 5 players who spent some time with the Patriots (Gronkowski, Hernandez, Hoomanawanui, Chandler, Bennett). It will be interesting to see if that list grows at all in the coming free-agency period, to confirm my suspicions that the Patriots are systematically better at selecting tight ends than some other franchises.
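
The 40.5% figure follows directly from the counts: 37 players passed the rule and 11 of them are long-term starters so far. A small sketch of the arithmetic, where the variable names are mine and the four extra starters are hypothetical until they actually reach 64 starts:

```python
def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


passed_rule = 37        # players with P(starter) of at least 0.15, 2007-2011
true_positives = 11     # of those, long-term starters through 2016
possible_additions = 4  # the four players named above, if they get there

print(share(true_positives, passed_rule))                       # 29.7
print(share(true_positives + possible_additions, passed_rule))  # 40.5
```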

True Negatives (Failed rule of 15% and GS < 64 at end of the 2016 season):

Year	Name	Bench	MSY%	Age%	SPDScore%	P(starter)
2007	Jonny Harline	15	81.89%	3.02%	31.18%	0.02
2007	Chad Upshaw	16	79.00%	61.29%	19.01%	0.06
2007	Martrez Milner	19	74.67%	66.27%	52.85%	0.13
2007	Joe Newton	20	70.47%	20.34%	18.44%	0.02
2007	Kevin Boss	19	65.88%	43.70%	37.26%	0.05
2007	Daniel Coats	34	63.25%	54.86%	26.43%	0.13
2007	Dante Rosario	20	56.69%	73.62%	46.39%	0.09
2007	Anthony Pudewell	15	41.34%	7.35%	2.85%	0.01
2007	Michael Allan	19	5.25%	15.88%	66.92%	0.01
2008	Evan Moore	16	82.55%	42.13%	21.67%	0.04
2008	Tom Santi	14	74.15%	75.72%	45.44%	0.13
2008	Jermichael Finley	20	71.92%	98.43%	23.57%	0.12
2008	John Carlson	20	67.98%	9.06%	21.29%	0.02
2008	Jacob Tamme	18	64.04%	51.57%	79.85%	0.14
2008	Derek Fine	24	61.42%	2.23%	44.68%	0.06
2008	Joey Haynos	17	58.79%	14.57%	20.72%	0.01
2008	Craig Stevens	27	33.20%	14.96%	93.73%	0.14
2008	Adam Bishop	21	28.74%	25.72%	4.75%	0.02
2008	Brad Cottam	24	17.19%	24.54%	90.87%	0.12
2008	Kolo Kapanui	23	5.38%	3.28%	15.21%	0.01
2009	James Casey	.	90.16%	0.13%	60.27%	0.04
2009	Davon Drew	17	88.45%	25.59%	61.22%	0.09
2009	Bear Pascoe	.	76.90%	49.74%	8.75%	0.02
2009	John Phillips	.	74.80%	89.63%	40.68%	0.12
2009	Shawn Nelson	19	69.69%	18.77%	89.54%	0.10
2009	Kory Sperry	20	63.65%	7.74%	46.77%	0.03
2009	Cornelius Ingram	21	58.14%	10.10%	69.20%	0.12
2009	Jared Bronson	.	50.26%	3.94%	76.43%	0.03
2009	Dan Gronkowski	26	43.83%	4.72%	45.06%	0.05
2009	Anthony Hill	21	36.61%	4.20%	50.38%	0.05
2009	Richard Quinn	24	19.55%	69.42%	48.29%	0.12
2010	Garrett Graham	20	81.76%	13.52%	62.55%	0.06
2010	Tony Moeaki	18	62.47%	60.24%	71.10%	0.13
2010	Colin Peek	19	48.29%	9.97%	14.45%	0.01
2010	Nate Byham	.	46.72%	91.08%	31.37%	0.06
2010	Jim Dray	17	24.80%	42.39%	27.19%	0.02
2010	Mandel Dixon	.	18.64%	20.21%	96.96%	0.03
2010	Fendi Onobun	.	9.45%	23.62%	98.67%	0.03
2010	Brody Eldridge	.	9.32%	53.81%	79.09%	0.04
2011	Lee Smith	25	75.85%	23.36%	24.33%	0.08
2011	Cameron Graham	18	75.46%	80.97%	4.37%	0.06
2011	Schuylar Oordt	18	73.49%	9.45%	89.35%	0.09
2011	Daniel Hardy	18	69.16%	15.49%	53.23%	0.04
2011	Allen Reisner	14	60.89%	71.65%	27.00%	0.06
2011	Weslye Saunders	19	50.13%	79.27%	68.44%	0.14
2011	Charlie Gantt	27	49.48%	35.43%	34.98%	0.08
2011	Larry Donnell	.	35.56%	74.02%	34.60%	0.03

In summary, "all models are wrong, some models are useful", and it appears we've got a fairly useful model that can help separate the wheat from the chaff when it comes to singling out potential long-term starters. NFL franchises, and fantasy football players, should be wary of acquiring prospects that fail our rule of 15%.