After Rob Gronkowski's season-ending injury in 2016, Martellus Bennett put up arguably the best season of his career with the Patriots. While this may just be the Tom Brady effect, the discerning fan begins to wonder if Bill Belichick doesn't have some crystal ball that predicts which NFL tight ends have the potential to be starters and which do not.
While we don't have a crystal ball, we do have a mountain of data at our disposal thanks to metrics guru Jim Cobern. With his collaboration, a list of 230 potential NFL tight ends was compiled using combine data and college production statistics from 1999 through 2011 and classified into long-term starters (64 or more games started) and non-starters (fewer than 64 games started). Sixty-four games seems a reasonable definition of a long-term starter, as it corresponds to four uninjured 16-game seasons of starts.
Exploratory data analysis included the following factors:
-Year Drafted
-Height (feet)
-Weight (lbs)
-Arm length (inches)
-Hand size (inches)
-Reps of 225 lb bench press to failure
-Market Share of Yards in College*
-Strength of College Schedule*
-Player Age on Draft Day*
-Player Explosiveness Score*: formulated from vertical, broad jump and mass density
-Player Speed Score*: formulated from prospect's 40 yard dash
-Player Flexibility Score*: formulated from short shuttle, 3-Cone, and mass density
*normalized to all positional peers since 1998
And here is a quick summary of incomplete factors in these 230 observations:
  | Factor | Missing | Missing% | 
  | Hands | 122 | 53.0% | 
  | Arms | 119 | 51.7% | 
  | FLX | 32 | 13.9% | 
  | Bench | 28 | 12.2% | 
  | SOS | 25 | 10.9% | 
  | EXP | 16 | 7.0% | 
  | SPD | 1 | 0.4% | 
Regrettably, the amount of missing data precludes meaningful inclusion of either arm length or hand size in a multivariate analysis. Also, upon finding that the Speed, Explosiveness and Flexibility scores were collinear, the choice was made to use the speed score exclusively due to its 99.6% completeness. Furthermore, bench press was treated categorically (no bench press, 0-20 reps, or 21 or more reps) to allow inclusion of the 28 cases missing that factor. Strength of Schedule, on the other hand, did not differ significantly between starters and non-starters (p=0.365), so it was discarded as a potentially informative factor, which increased the effective sample size (n=229).
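The three-level bench press coding described above can be sketched in a few lines. This is only an illustration: the function name and the level labels are hypothetical, and a missing measurement is represented here as `None`.

```python
from typing import Optional


def bench_category(reps: Optional[int]) -> str:
    """Map bench press reps to the three levels described in the text.

    A missing measurement (None) becomes its own level, so the prospect
    stays in the model instead of being dropped for incomplete data.
    """
    if reps is None:
        return "no bench press"
    if reps <= 20:
        return "0-20 reps"
    return "21 or more reps"


# Hypothetical inputs: a missing value, then one value in each rep band
for reps in (None, 19, 23):
    print(reps, "->", bench_category(reps))
```

Coding the missing values as a level of their own is what lets those 28 prospects contribute to the fit.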
The data was split into a training set (years 1999-2006) for model building and a testing set (years 2007-2011) for validating the model. Of the 144 prospects in the training set, twenty-three eventually became long-term starters. Forward and backward selection yielded the following concordant maximum likelihood estimates for the coefficients of the linear model for the log odds of being a starter in the training set:
| DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq | 
| 1 | -6.3632 | 1.1993 | 28.1529 | < .0001 | 
| 1 | 2.1063 | 1.0637 | 3.9213 | 0.048 | 
| 1 | 2.2931 | 1.0912 | 4.4156 | 0.036 | 
| 1 | 2.5358 | 1.1034 | 5.2816 | 0.022 | 
| 1 | -0.2840 | 0.9623 | 0.0871 | 0.768 | 
| 1 | 1.1796 | 0.5967 | 3.9087 | 0.048 | 
For any prospect, the log odds implied by this model can be converted to a probability of being a long-term starter. A receiver operating characteristic (ROC) curve analysis was used to find the cut-point on this probability that best balances the sensitivity (ruling in) and specificity (ruling out) of such a dichotomous rule. Classifying any prospect with a probability of at least 15% as a potential starter correctly classified 72.2% of the training sample. The rule's false positive rate of 66% leaves a bit to be desired, but its false negative rate of 5.5% implies we can be relatively certain when it classifies a player as a non-starter.
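The conversion from log odds to a probability is the standard logistic function, p = 1 / (1 + exp(-log odds)). The sketch below shows it together with the 15% cut-point. The function names are hypothetical and the example log odds value is made up for illustration, not taken from any prospect.

```python
import math

CUT_POINT = 0.15  # the "rule of 15%" from the text


def log_odds_to_probability(log_odds: float) -> float:
    """Logistic (inverse logit) transform of a log odds value."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def passes_rule(probability: float, cut_point: float = CUT_POINT) -> bool:
    """True when the probability is at least the cut-point."""
    return probability >= cut_point


# Hypothetical log odds value, chosen only to show the arithmetic
example_log_odds = -1.0
p = log_odds_to_probability(example_log_odds)
print(round(p, 3), passes_rule(p))  # prints: 0.269 True
```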
Be warned, this ROC curve analysis is really just a fancy type of data snooping: we visualize the balance of sensitivity and specificity, essentially looking at every cut-point from 0% to 100% and picking the one that performs best! So this method doesn't mean much unless the model has predictive power in an independent data set. To this end, we applied the same model to the 2007-2011 data to estimate each prospect's probability of being a starter and then applied this "rule of 15%".
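To make the "look at every cut-point" idea concrete, here is a minimal sketch of such a search. It rests on two assumptions that are not from the original analysis: the criterion is Youden's J (sensitivity + specificity - 1), which is one common way to "balance" the two, and the probabilities and labels are a made-up toy sample.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    probs: Sequence[float], labels: Sequence[int], cut: float
) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule 'probability >= cut'."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_starter = p >= cut
        if y == 1 and predicted_starter:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_starter:
            fp += 1
        else:
            tn += 1
    # Assumes the sample holds at least one starter and one non-starter
    return tp / (tp + fn), tn / (tn + fp)


def best_cut_point(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Scan cut-points 0.00, 0.01, ..., 1.00 and keep the best Youden's J."""
    cuts = [i / 100 for i in range(101)]

    def youden_j(cut: float) -> float:
        sens, spec = sensitivity_specificity(probs, labels, cut)
        return sens + spec - 1.0

    return max(cuts, key=youden_j)  # ties go to the lowest cut-point


# Toy data, invented for illustration (1 = starter, 0 = non-starter)
toy_probs = [0.05, 0.10, 0.20, 0.30, 0.60, 0.80]
toy_labels = [0, 0, 0, 1, 0, 1]

best_cut = best_cut_point(toy_probs, toy_labels)
print(best_cut, sensitivity_specificity(toy_probs, toy_labels, best_cut))
```

Because the cut-point is tuned on the same data it is scored on, the performance it reports is optimistic, which is exactly why a held-out set is needed.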
Computing the odds ratio for the resulting contingency table (rule outcome by starter status) suggests that a starter has 11.6 times the odds (95% CI 3.4-38.9) of belonging to the list of individuals passing the rule of 15%. However, it would be prudent to probe how much better our rule does than a random guess at identifying starters and non-starters.
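For readers who want to see how an odds ratio and its interval come out of a 2x2 table, here is a minimal sketch. It assumes the usual log-based (Woolf) interval, which may not be the method used for the figure above, and the function name is hypothetical. The counts are tallied from the player lists later in this post, with Pettigrew as the lone false negative. Since the table behind the quoted figure is not shown, the output should not be expected to reproduce it.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a = passed rule, starter        b = passed rule, non-starter
    c = failed rule, starter        d = failed rule, non-starter
    All four cells must be non-zero for this formula to work.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Cell counts tallied from the lists in this post (an assumption about
# which table to use, not a reconstruction of the original calculation)
estimate, lower, upper = odds_ratio_ci(a=11, b=26, c=1, d=47)
print(f"OR = {estimate:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

With a single prospect in one cell, the interval is very wide, so any point estimate from a table like this deserves caution.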
Starters = 29% (23/85)
True Positive = 30% (11/37)
False Positive = 70% (26/37)
So the rule of 15% improved the chances of finding starters by only 1 percentage point in the test set (29% to 30%), which is a little disappointing, but 30% really isn't far from the 33% true positive rate that we observed in the training data.
Non-starters = 74% (63/85)
True Negative = 98% (47/48)
False Negative = 2% (1/48)
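The true/false positive and negative percentages above are shares within each group the rule creates (those who passed and those who failed). A short sketch recomputes them from the four counts quoted in the text. The function and key names are hypothetical.

```python
from typing import Dict


def rule_shares(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Share of starters and non-starters within each rule outcome."""
    passed = tp + fp  # prospects who passed the rule of 15%
    failed = tn + fn  # prospects who failed the rule of 15%
    return {
        "true_positive": tp / passed,
        "false_positive": fp / passed,
        "true_negative": tn / failed,
        "false_negative": fn / failed,
    }


# Counts quoted above: 11 of 37 passed and started, 47 of 48 failed and did not
shares = rule_shares(tp=11, fp=26, tn=47, fn=1)
for name, value in shares.items():
    print(f"{name}: {value:.0%}")
```

Rounded to whole percentages, this gives 30%, 70%, 98% and 2%, matching the figures listed above.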
However, the rule of 15% improved our chance of ruling out non-starters by 24 percentage points (74% to 98%)! As it happened, Brandon Pettigrew was the only prospect that the model failed to identify as an eventual starter (goodbye to all the Oklahoma State/Lions homers who are now dismissing the model entirely). For the unbiased readers who press onward, here's a list of all the other players in the validation set, classified by their model selection and starter status.
True Positives (Passed rule of 15% with GS >= 64 at end of the 2016 season):
  | Year | Name | Bench | MSY% | Age% | SPDScore% | P(starter) | 
  | 2007 | Brent Celek | 19 | 77.30% | 79.79% | 58.37% | 0.19 | 
  | 2007 | Zach Miller | 16 | 76.77% | 96.72% | 31.75% | 0.15 | 
  | 2007 | Greg Olsen | 23 | 72.97% | 83.86% | 95.44% | 0.67 | 
  | 2008 | Martellus Bennett* | 18 | 90.42% | 98.03% | 80.42% | 0.46 | 
  | 2010 | Rob Gronkowski | 23 | 93.18% | 99.08% | 78.90% | 0.74 | 
  | 2010 | Ed Dickson | 23 | 88.85% | 65.62% | 89.92% | 0.62 | 
  | 2010 | Jermaine Gresham* | 20 | 78.87% | 90.42% | 87.83% | 0.40 | 
  | 2010 | Jimmy Graham | 24 | 26.90% | 24.67% | 96.39% | 0.17 | 
  | 2011 | Charles Clay | 18 | 89.90% | 81.63% | 71.29% | 0.31 | 
  | 2011 | Lance Kendricks | 25 | 89.37% | 46.59% | 78.33% | 0.44 | 
  | 2011 | Kyle Rudolph | 19 | 82.81% | 96.19% | 40.87% | 0.20 | 
*Projected 2017 Free Agent as of 2/27/2017
False Positives (Passed rule of 15%, but GS < 64 at end of the 2016 season):
  | Year | Name | Bench | MSY% | Age% | SPDScore% | P(starter) | 
  | 2007 | Scott Chandler | 16 | 84.78% | 92.13% | 67.30% | 0.32 | 
  | 2007 | Ben Patrick | 22 | 83.33% | 67.85% | 70.72% | 0.48 | 
  | 2007 | Clark Harris | 21 | 83.07% | 63.91% | 54.56% | 0.36 | 
  | 2008 | Dustin Keller | 26 | 81.63% | 17.98% | 93.92% | 0.34 | 
  | 2008 | Kellen Davis II* | 22 | 77.56% | 72.83% | 91.83% | 0.61 | 
  | 2008 | Darrell Strong | 17 | 68.90% | 88.45% | 80.04% | 0.30 | 
  | 2008 | Gary Barnidge | 22 | 63.91% | 71.39% | 73.57% | 0.42 | 
  | 2009 | Travis Beckum | 28 | 96.06% | 79.92% | 79.28% | 0.66 | 
  | 2009 | Jared Cook* | 23 | 80.31% | 85.43% | 94.30% | 0.70 | 
  | 2009 | Cameron Morrah | 24 | 56.04% | 83.99% | 75.10% | 0.46 | 
  | 2010 | Aaron Hernandez | 30 | 94.36% | 99.74% | 82.89% | 0.77 | 
  | 2010 | Dennis Pitta | 27 | 87.80% | 1.84% | 62.93% | 0.15 | 
  | 2010 | Dorin Dickerson | 24 | 78.22% | 85.04% | 96.58% | 0.70 | 
  | 2010 | Andrew Quarless | 23 | 72.05% | 95.67% | 78.14% | 0.62 | 
  | 2010 | Anthony Miller | 19 | 66.67% | 78.61% | 78.71% | 0.24 | 
  | 2010 | Dedrick Epps | 19 | 50.52% | 90.68% | 60.08% | 0.15 | 
  | 2010 | Jeff Cumberland* | 24 | 45.01% | 56.43% | 98.86% | 0.39 | 
  | 2010 | Michael Hoomanawanui | 25 | 42.39% | 91.47% | 75.86% | 0.43 | 
  | 2011 | Rob Housler | 22 | 87.93% | 51.71% | 97.53% | 0.58 | 
  | 2011 | D.J. Williams | 20 | 85.70% | 69.95% | 91.63% | 0.35 | 
  | 2011 | Julius Thomas | 16 | 79.27% | 62.34% | 75.48% | 0.21 | 
  | 2011 | Zach Pianalto | 22 | 77.03% | 88.85% | 66.35% | 0.54 | 
  | 2011 | Virgil Green | 23 | 68.64% | 65.88% | 95.25% | 0.55 | 
  | 2011 | Luke Stocker | 27 | 55.51% | 64.44% | 83.46% | 0.40 | 
  | 2011 | David Ausberry | 23 | 39.11% | 17.59% | 96.77% | 0.18 | 
  | 2011 | Jordan Cameron* | 23 | 21.65% | 66.14% | 95.82% | 0.31 | 
*Projected 2017 Free Agent as of 2/27/2017
If Thomas, Cameron, Cook, and Barnidge can stay healthy for another season or two, we'll likely add another 4 players to the true positives, bumping the true positive rate to a healthier 40.5% (15 of 37) in the validation set. It's also notable that the lists include 5 players who spent some time with the Patriots (Gronkowski, Hernandez, Hoomanawanui, Chandler, Bennett). It will be interesting to see if that number grows at all in the coming free-agency period, to confirm my suspicion that the Patriots are systematically better at selecting tight ends than some other franchises.
True Negatives (Failed rule of 15% and GS < 64 at end of the 2016 season):
  | Year | Name | Bench | MSY% | Age% | SPDScore% | P(starter) | 
  | 2007 | Jonny Harline | 15 | 81.89% | 3.02% | 31.18% | 0.02 | 
  | 2007 | Chad Upshaw | 16 | 79.00% | 61.29% | 19.01% | 0.06 | 
  | 2007 | Martrez Milner | 19 | 74.67% | 66.27% | 52.85% | 0.13 | 
  | 2007 | Joe Newton | 20 | 70.47% | 20.34% | 18.44% | 0.02 | 
  | 2007 | Kevin Boss | 19 | 65.88% | 43.70% | 37.26% | 0.05 | 
  | 2007 | Daniel Coats | 34 | 63.25% | 54.86% | 26.43% | 0.13 | 
  | 2007 | Dante Rosario | 20 | 56.69% | 73.62% | 46.39% | 0.09 | 
  | 2007 | Anthony Pudewell | 15 | 41.34% | 7.35% | 2.85% | 0.01 | 
  | 2007 | Michael Allan | 19 | 5.25% | 15.88% | 66.92% | 0.01 | 
  | 2008 | Evan Moore | 16 | 82.55% | 42.13% | 21.67% | 0.04 | 
  | 2008 | Tom Santi | 14 | 74.15% | 75.72% | 45.44% | 0.13 | 
  | 2008 | Jermichael Finley | 20 | 71.92% | 98.43% | 23.57% | 0.12 | 
  | 2008 | John Carlson | 20 | 67.98% | 9.06% | 21.29% | 0.02 | 
  | 2008 | Jacob Tamme | 18 | 64.04% | 51.57% | 79.85% | 0.14 | 
  | 2008 | Derek Fine | 24 | 61.42% | 2.23% | 44.68% | 0.06 | 
  | 2008 | Joey Haynos | 17 | 58.79% | 14.57% | 20.72% | 0.01 | 
  | 2008 | Craig Stevens | 27 | 33.20% | 14.96% | 93.73% | 0.14 | 
  | 2008 | Adam Bishop | 21 | 28.74% | 25.72% | 4.75% | 0.02 | 
  | 2008 | Brad Cottam | 24 | 17.19% | 24.54% | 90.87% | 0.12 | 
  | 2008 | Kolo Kapanui | 23 | 5.38% | 3.28% | 15.21% | 0.01 | 
  | 2009 | James Casey | . | 90.16% | 0.13% | 60.27% | 0.04 | 
  | 2009 | Davon Drew | 17 | 88.45% | 25.59% | 61.22% | 0.09 | 
  | 2009 | Bear Pascoe | . | 76.90% | 49.74% | 8.75% | 0.02 | 
  | 2009 | John Phillips | . | 74.80% | 89.63% | 40.68% | 0.12 | 
  | 2009 | Shawn Nelson | 19 | 69.69% | 18.77% | 89.54% | 0.10 | 
  | 2009 | Kory Sperry | 20 | 63.65% | 7.74% | 46.77% | 0.03 | 
  | 2009 | Cornelius Ingram | 21 | 58.14% | 10.10% | 69.20% | 0.12 | 
  | 2009 | Jared Bronson | . | 50.26% | 3.94% | 76.43% | 0.03 | 
  | 2009 | Dan Gronkowski | 26 | 43.83% | 4.72% | 45.06% | 0.05 | 
  | 2009 | Anthony Hill | 21 | 36.61% | 4.20% | 50.38% | 0.05 | 
  | 2009 | Richard Quinn | 24 | 19.55% | 69.42% | 48.29% | 0.12 | 
  | 2010 | Garrett Graham | 20 | 81.76% | 13.52% | 62.55% | 0.06 | 
  | 2010 | Tony Moeaki | 18 | 62.47% | 60.24% | 71.10% | 0.13 | 
  | 2010 | Colin Peek | 19 | 48.29% | 9.97% | 14.45% | 0.01 | 
  | 2010 | Nate Byham | . | 46.72% | 91.08% | 31.37% | 0.06 | 
  | 2010 | Jim Dray | 17 | 24.80% | 42.39% | 27.19% | 0.02 | 
  | 2010 | Mandel Dixon | . | 18.64% | 20.21% | 96.96% | 0.03 | 
  | 2010 | Fendi Onobun | . | 9.45% | 23.62% | 98.67% | 0.03 | 
  | 2010 | Brody Eldridge | . | 9.32% | 53.81% | 79.09% | 0.04 | 
  | 2011 | Lee Smith | 25 | 75.85% | 23.36% | 24.33% | 0.08 | 
  | 2011 | Cameron Graham | 18 | 75.46% | 80.97% | 4.37% | 0.06 | 
  | 2011 | Schuylar Oordt | 18 | 73.49% | 9.45% | 89.35% | 0.09 | 
  | 2011 | Daniel Hardy | 18 | 69.16% | 15.49% | 53.23% | 0.04 | 
  | 2011 | Allen Reisner | 14 | 60.89% | 71.65% | 27.00% | 0.06 | 
  | 2011 | Weslye Saunders | 19 | 50.13% | 79.27% | 68.44% | 0.14 | 
  | 2011 | Charlie Gantt | 27 | 49.48% | 35.43% | 34.98% | 0.08 | 
  | 2011 | Larry Donnell | . | 35.56% | 74.02% | 34.60% | 0.03 | 
 
In summary, "all models are wrong, some models are useful", but it appears we've got a fairly useful model that can help separate the wheat from the chaff when it comes to singling out potential long-term starters. NFL franchises, and fantasy football players, should be wary of acquiring prospects that fail our rule of 15%.