Monday, December 21, 2015

Why (N-1) degrees of freedom?

This post is a diversion from my previous work in sports statistics to delve into educational matters, which have been piquing my interest more and more these days.

[Image: standard deviation formula]

When I tell people I'm a statistician, most tell me that they had the (dis)pleasure of taking a required statistics course at some point in their academic career.  Since the standard intro course is anything but user-friendly, I usually get comments to the effect of "I did alright in the class, but I don't remember anything."  However, occasionally one gets a very good question like the following:  "Why is (n-1) the degrees of freedom in the standard deviation formula rather than n?"

This question shows some understanding of the concept of a degree of freedom:  if we have n observations (or dimensions) that may change, why aren't there n degrees of freedom?  The technical answer (which is of little use in conversation) is that the sample average (x-bar above) is mathematically related to these n observations (it is a linear combination of them).  Because the n deviations from x-bar must sum to zero, only n-1 of them are free to vary, so we lose one degree of freedom.  However, if we didn't need to estimate the population mean (perhaps it was known or we exhaustively sampled a small population to find it) then we would divide by n, as intuition dictates.

Some statisticians argue that using n-1 is a bit pedantic, as they usually deal in large samples and n-1 tends to n asymptotically.  However, it is instructive (and makes for better conversation) to consider what would happen with a small sample.  In particular, what if we sampled a single object?  Why, we would know nothing of the variation!  If we attempted to compute the standard deviation, we would arrive at a division by zero error using the above formula, consistent with the conclusion that the variation is unknowable when sampling only a single object.  However, if we instead had a population of one member, the population mean would be none other than the value of that single observation.  In such a (boring) population, there is no variation at all, so variance and standard deviation should be zero.  Indeed, if we compute the variance/SD using the divide by n version of the formula above, we divide 0 (difference between observation and the population mean) by 1 (the degree of freedom).
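The single-observation argument is easy to check for yourself.  Here's a minimal sketch using Python's standard `statistics` module; the data values are made up purely for illustration:

```python
import statistics

single = [7.0]  # hypothetical lone observation

# Population version (divide by n): the mean IS the observation, so spread is 0.
pop_sd = statistics.pstdev(single)
print("population SD:", pop_sd)

# Sample version (divide by n - 1): undefined when n = 1.
try:
    statistics.stdev(single)
except statistics.StatisticsError as err:
    print("sample SD undefined:", err)

# With a slightly larger (still hypothetical) sample, both versions exist
# and differ only in the divisor.
sample = [2.0, 4.0, 4.0, 6.0]
mean = statistics.mean(sample)
sum_sq = sum((x - mean) ** 2 for x in sample)
var_n = sum_sq / len(sample)          # divide by n
var_n1 = sum_sq / (len(sample) - 1)   # divide by n - 1
print("divide by n:", var_n, " divide by n-1:", var_n1)
```

The `statistics` module refuses to compute a sample standard deviation from one data point, which is the software version of the "division by zero" conclusion above.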

Thus a complex situation can be better understood by reducing it to an extreme, a technique that is common in both logic and mathematics.

Thursday, June 25, 2015

Retrospective Study of NFL Tight End Success

In the last post, we saw positive correlation between several size adjusted combine scores and NFL success, quantified in terms of the season average receiving yards and touchdowns over the course of a tight-end's career.  For this post, we're going to change our methods slightly in defining NFL success in terms of pro-bowl selection(s) of a tight end.

I know that some of you may be thinking that pro-bowl selection is SUCH a popularity contest.  Yes, that may be true, but pro-bowl selection of a tight end is usually conditioned upon impressive season lines (receptions, yardage, and touchdowns) as well, so it's an easy way to reduce three continuous variables to a binary outcome (Y/N to pro-bowl status).

Furthermore, the strength of using a binary outcome is that this fits nicely into the realm of a case-control study from epidemiology (my day job), so I can easily use the predicted log odds-ratios to estimate the probability (albeit badly as no model is perfect) of a player with a particular draft profile having played in at least one pro-bowl at some point in their career.

Using my factors (size-adjusted combine statistics) and outcome of interest (Pro-bowl appearance status), I began modeling.  Bivariate analysis suggested that size adjusted 40 time, vertical leap, broad jump, and agility were all significantly different between eventual pro-bowlers and all yet-to-be-pro-bowlers.  Correlation analysis suggested that size adjusted 40 time and broad jump were significantly correlated.  Since the 40 yard dash is completed by nearly all combine participants, I decided to exclude size adjusted broad jump from the modeling.  Finally, exploratory logistic regression methods (backward and forward selection) were applied to the 1999-2011 combine data to approximate the odds of a player with a particular combine profile being selected to a pro-bowl (NOT as a replacement) at some point in their career based on size adjusted 40 times, vertical leap, and agility times.  Two-way interaction terms were also included as potential covariates.

The resulting logistic regression model depended primarily upon 2 factors:
1.  Weight adjusted 40 yard Dash Time (Average Momentum)
2.  Height complemented Vertical Leap (High Point Potential)

An ROC-curve analysis suggested that the best cutoff for predicting a legitimate pro-bowl appearance was a probability of 0.06, which ruled out 142 of the 188 (75.5% Specificity) combine tight-end hopefuls who never made it to a Pro-Bowl and correctly identified 9 of the 12 hopefuls who made it to a Pro-bowl (75% Sensitivity).  All 12 are listed below:

Player Name Prob(PB)
Vernon Davis 66.2%
Jimmy Graham 55.6%
Jordan Cameron 27.4%
Greg Olsen 24.6%
Marcedes Lewis 18.3%
Dallas Clark 16.1%
Jason Witten 12.0%
Rob Gronkowski 8.4%
Alge Crumpler 7.8%
Julius Thomas 4.2%
Todd Heap 2.7%
Chris Cooley 2.7%
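As a quick check on those two figures, here's a small sketch that applies the 0.06 cutoff to the 12 probabilities listed above and recomputes sensitivity and specificity.  The counts for the non-pro-bowlers (142 of 188) are taken straight from the text, since their individual probabilities aren't listed:

```python
# Predicted probabilities for the 12 eventual pro-bowlers, copied from the list above.
pro_bowlers = {
    "Vernon Davis": 0.662, "Jimmy Graham": 0.556, "Jordan Cameron": 0.274,
    "Greg Olsen": 0.246, "Marcedes Lewis": 0.183, "Dallas Clark": 0.161,
    "Jason Witten": 0.120, "Rob Gronkowski": 0.084, "Alge Crumpler": 0.078,
    "Julius Thomas": 0.042, "Todd Heap": 0.027, "Chris Cooley": 0.027,
}
CUTOFF = 0.06

# Sensitivity: share of actual pro-bowlers the model flags at the cutoff.
flagged = [name for name, p in pro_bowlers.items() if p > CUTOFF]
sensitivity = len(flagged) / len(pro_bowlers)

# Specificity: share of non-pro-bowlers the model rules out (counts from the text).
ruled_out, never_pro_bowl = 142, 188
specificity = ruled_out / never_pro_bowl

print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
```

The three misses at this cutoff are the last three names on the list (Thomas, Heap, and Cooley).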

The list of remaining hopefuls is a bit too long to include, but it still contains some notable names with their pro-bowl hopes alive (Virgil Green, Martellus Bennett, Owen Daniels), though many are either free agents or retired already.  As a reference, I'll provide a nifty table of the estimated probability of a pro-bowl based on high-point potential (vertical + height) and average momentum (Weight/Dash), but you can also access the interactive calculator to compute this Size-Adjusted Vertical and Velocity Athletic Greatness Estimate (SAVVAGE) for yourself:

Rows: Hgt+Vert; columns: Wgt/40-yd
Hgt+Vert 45.0 46.0 47.0 48.0 49.0 50.0 51.0 52.0 53.0 54.0 55.0 56.0 57.0 58.0 59.0 60.0
100 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.3% 0.5% 0.9% 1.5% 2.6% 4.3% 7.1%
101 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.4% 0.7% 1.2% 1.9% 3.3% 5.4% 8.9%
102 0.0% 0.0% 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.3% 0.5% 0.9% 1.5% 2.5% 4.2% 6.9% 11.2%
103 0.0% 0.0% 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.4% 0.7% 1.1% 1.9% 3.2% 5.3% 8.7% 14.0%
104 0.0% 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.3% 0.5% 0.9% 1.4% 2.4% 4.1% 6.7% 11.0% 17.3%
105 0.0% 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.4% 0.6% 1.1% 1.9% 3.1% 5.2% 8.5% 13.7% 21.3%
106 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.3% 0.5% 0.8% 1.4% 2.4% 4.0% 6.6% 10.7% 17.0% 25.8%
107 0.0% 0.0% 0.0% 0.1% 0.1% 0.2% 0.4% 0.6% 1.1% 1.8% 3.0% 5.1% 8.3% 13.4% 20.9% 31.0%
108 0.0% 0.0% 0.1% 0.1% 0.2% 0.3% 0.5% 0.8% 1.4% 2.3% 3.9% 6.4% 10.5% 16.6% 25.4% 36.6%
109 0.0% 0.0% 0.1% 0.1% 0.2% 0.4% 0.6% 1.0% 1.8% 3.0% 5.0% 8.2% 13.1% 20.5% 30.5% 42.7%
110 0.0% 0.1% 0.1% 0.2% 0.3% 0.5% 0.8% 1.3% 2.3% 3.8% 6.3% 10.3% 16.3% 24.9% 36.1% 49.0%
111 0.0% 0.1% 0.1% 0.2% 0.4% 0.6% 1.0% 1.7% 2.9% 4.8% 8.0% 12.8% 20.1% 29.9% 42.1% 55.3%
112 0.1% 0.1% 0.2% 0.3% 0.5% 0.8% 1.3% 2.2% 3.7% 6.2% 10.0% 16.0% 24.4% 35.5% 48.4% 61.5%
113 0.1% 0.1% 0.2% 0.3% 0.6% 1.0% 1.7% 2.8% 4.7% 7.8% 12.6% 19.7% 29.4% 41.5% 54.7% 67.3%
114 0.1% 0.2% 0.3% 0.4% 0.8% 1.3% 2.2% 3.6% 6.0% 9.8% 15.6% 24.0% 34.9% 47.8% 60.9% 72.6%
115 0.1% 0.2% 0.3% 0.6% 1.0% 1.6% 2.8% 4.6% 7.6% 12.3% 19.3% 28.9% 40.9% 54.1% 66.7% 77.4%
116 0.2% 0.3% 0.4% 0.7% 1.2% 2.1% 3.5% 5.9% 9.6% 15.3% 23.5% 34.4% 47.2% 60.3% 72.1% 81.5%
117 0.2% 0.3% 0.6% 0.9% 1.6% 2.7% 4.5% 7.4% 12.0% 18.9% 28.4% 40.3% 53.5% 66.2% 76.9% 85.0%
118 0.2% 0.4% 0.7% 1.2% 2.1% 3.5% 5.7% 9.4% 15.0% 23.1% 33.8% 46.5% 59.7% 71.6% 81.1% 88.0%
119 0.3% 0.5% 0.9% 1.6% 2.6% 4.4% 7.3% 11.8% 18.5% 27.9% 39.7% 52.9% 65.6% 76.5% 84.7% 90.4%
120 0.4% 0.7% 1.2% 2.0% 3.4% 5.6% 9.2% 14.7% 22.7% 33.3% 45.9% 59.1% 71.1% 80.7% 87.7% 92.4%

For the purpose of model validation and the focus of a future post, we will look at a list of tight ends with scores greater than 0.06 from the 2012-2013 draft classes and track them forward in time to see if they reach a pro-bowl.

With many talented tight ends of recent years having a basketball background, we cannot rule out the possibility that a selection bias is driving this model.  With the notable success of Tony Gonzalez and Antonio Gates paving the way for players like Jimmy Graham and Julius Thomas, it's possible that players with impressive athleticism for their size became more and more popular over the time frame sampled.  Opportunity breeds success, so this relationship may be correlation and not causation.  However, for a Pro-bowl selection, a tight end typically posts at least ten touchdowns and 1,000 receiving yards, so it's reasonable to think that impressive high-point potential and large average momentum over the 40 yard dash would give one an advantage in both of these football statistics.

Tuesday, June 23, 2015

Valuation of a Pro-bowl Tight End

Most NFL organizations would hope to roster a pro-bowl caliber tight end.  Still, balancing the cost and benefit of such a personnel choice is worth a look.  To this end, I compiled a list of the 24 NFL 1st team Associated Press (AP) tight ends from the last ~50 years:

Year Name Team
2013 Jimmy Graham NO
2011 Rob Gronkowski NE
2009 Dallas Clark IND
2007 Jason Witten DAL
2004 Antonio Gates SDG
1999 Tony Gonzalez KAN
1994 Ben Coates NE
1993 Shannon Sharpe DEN
1992 Jay Novacek DAL
1991 Marv Cook NE
1988 Keith Jackson PHI
1986 Mark Bavaro NYG
1984 Ozzie Newsome CLE
1983 Todd Christensen RAI
1980 Kellen Winslow SDG
1976 Dave Casper OAK
1974 Riley Odoms DEN
1973 Charle Young PHI
1972 Ted Kwalick SF
1969 Charlie Sanders DET
1966 John Mackey BAL
1965 Pete Retzlaff PHI
1963 Mike Ditka CHI
1962 Ron Kramer GNB

All but two of these players (Jay Novacek and Todd Christensen) attained their pro-bowl status with their drafting organization.  For the remaining 22, I collected the associated team data (wins and total points) of their drafting organization from their rookie season and the subsequent three years, choosing a 4-year span to mirror the current structure of the draftee contract in the NFL.  I also collected team data from the prior year to serve as a baseline reference for some paired testing procedures.

To illustrate the team contribution of an eventual pro-bowl tight end, I generated some 95% confidence intervals of the season win percentage.  In our sample it appears that an eventual pro-bowl tight end was often the difference between a losing and winning season:


Comparing the initial 4-year tenure of these eventual AP 1st Team tight ends to the baseline data, we see there's an initial bump of about 10% in win percentage during their rookie year, followed by another 9% increase in their sophomore season, with only a small drop-off of less than 3% in each of the 3rd and 4th years, which is still significantly better than the team baseline of the year prior to their NFL draft:


We could have conducted a similar procedure based on points per game, but the graphs look pretty much the same, so let's just summarize in tabular form:

Statistic      Prior Yr   Rookie   2nd Year   3rd Year   4th Year
Mean Pts/Gm    20.7       23.3     24.6       25.0       23.9
Lower 95%      17.7       20.1     20.7       21.8       21.0
Upper 95%      23.8       26.4     28.5       28.3       26.9
Mean change    -          2.5      3.9        4.3        3.2
Lower 95%      -          -0.3     1.3        2.3        0.4
Upper 95%      -          5.3      6.4        6.3        6.0

It's pretty clear from these figures that these AP 1st Team tight ends make a significant contribution to the team.  To summarize, they make a marginal contribution during their rookie year of an additional 2.5 points per game (95% CI: -0.3 to 5.3), translating to an additional 1.6 wins (per 16-game schedule).  These players then make the leap in subsequent years to contribute 3.9 (95% CI: 1.3-6.4), 4.3 (95% CI: 2.3-6.3), and 3.2 (95% CI: 0.4-6.0) additional points per game, translating to 3.0, 2.6, and 2.4 additional wins over the last three 16-game seasons of their rookie contracts.  All in all, this amounts to nearly 10 additional wins over the 4-year structure of a current-day rookie contract for an AP 1st Team caliber NFL tight end.  Estimating the 2015 dollar value of these players based on their overall draft pick numbers resulted in an average contract of a little less than $4 million over 4 years, averaging out to about $400K per win.
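The cost-per-win arithmetic can be sketched in a few lines.  The win figures are the four yearly estimates quoted above; the contract value is rounded up to an even $4 million (the text says "a little less"), so treat the result as a rough ceiling rather than an exact figure:

```python
# Estimated additional wins in years 1-4 of the rookie contract (from the text).
extra_wins = [1.6, 3.0, 2.6, 2.4]
total_wins = sum(extra_wins)

# Assumption: contract rounded up to $4M; the actual average was slightly lower.
contract_dollars = 4_000_000
cost_per_win = contract_dollars / total_wins

print(f"total additional wins: {total_wins:.1f}")
print(f"approximate cost per win: ${cost_per_win:,.0f}")
```

Nudging the contract value down a bit lands on the "about $400K per win" figure.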

Tuesday, May 26, 2015

A cross-sectional study of NFL tight ends

Building upon our previous investigation of what defines a premier NFL tight end, the next logical step is to look for trends in active tight ends similar to what we saw in the hall of fame tight ends.

To this end, I used the season statistics combined with the NFL depth charts from the 2014 NFL season to compile a list of starting NFL tight ends.  This list was approximately 40 players, as some teams field a 2 TE set while others had a mid-season injury at the position.  I struggled a little with what statistics would be most associated with "NFL success", but I ultimately chose to use the receptions, yardage and TD average per season across their career to capture usage, production, and scoring.

It seemed unfair to include rookies in this comparison, as it's widely acknowledged that the change from a college tight end to a professional is a difficult transition.  After excluding 3 rookies, we were left with a sample of 37 active tight ends.  I collected their career statistics along with combine numbers and filled in some of the gaps in the combine data using their college pro-day numbers (when available).  Here's a statistical summary:

Variable            N     Mean      Std Dev   Min     Max
Reception/Season    37    37.9      17.2      8.0     78.6
Yards/Season        37    437.8     215.1     103.6   950.4
TD/Season           37    3.7       2.3       0.3     10.8
DashTime (s)        37    4.69      0.15      4.38    4.91
Height (in)         37    76 3/4    1 1/2     73      80
Weight (lbs)        37    253.6     7.8       236.0   270.0
Bench (reps)        34    21.0      4.5       12.0    33.0
Vertical (in)       36    34 3/5    3         30      42
BroadJump (in)      32    115 2/3   4 1/4     110     128
ShuttleTime (s)     33    4.39      0.17      4.03    4.84
ConeTime (s)        33    7.12      0.20      6.82    7.67

The logical question at this juncture is "Do any of the combine measurements correlate with NFL success (average receptions, yards, or TDs per season)?"  The answer: "Yes, but only weakly":

Correlations   Dash     Hgt    Wgt    Bench   Vert   Broad   Shuttle   Cone
Receptions     -0.23    0.14   0.09   0.02    0.15   0.28    0.01      -0.22
Yards          -0.30*   0.13   0.08   0.10    0.21   0.33*   -0.02     -0.23
TDs            -0.21    0.17   0.17   0.00    0.22   0.23    -0.01     -0.22

It looks like NFL success may be weakly correlated with dash and cone times, as well as with the broad jump and vertical jump.  The asterisk (*) denotes a marginal (p < 0.10) statistical significance.  However, we observed in our case series report on hall of fame tight ends that those men were characterized by a remarkable athleticism for their size.  To this end, it might be reasonable to adjust for height and/or weight in these numbers.  As I had a short stint as a physics teacher, I know the laws that govern the collision of bodies, so I tried to adjust in a manner that would have some meaningful physical interpretation:

Adj_Dash = Weight/Dash ~ Mass * Velocity = Average Momentum over 40 yard dash
Adj_Vert = Vertical+Height = High Pointing Potential
Adj_Broad = Weight*Broad ~ Mass * Velocity^2 ~ Explosive Kinetic Energy
Adj_Cone = Height/Cone

In adjusted cone, it may be arbitrary whether we choose to adjust for height or weight.  However, I felt like taller individuals (at the same weight) would have a harder time changing directions due to their increased size.
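The four adjustments above are simple enough to write down as a function.  This is only a sketch: the function name and argument names are my own, and the example input is a hypothetical player sitting exactly at the sample means from the summary table (no real player is implied):

```python
def adjusted_scores(height_in, weight_lb, dash_s, vertical_in, broad_in, cone_s):
    """Size-adjusted combine scores, following the four definitions above."""
    scores = {
        "Adj_Dash": weight_lb / dash_s,         # ~ average momentum over the 40
        "Adj_Vert": vertical_in + height_in,    # high pointing potential
        "Adj_Broad": weight_lb * broad_in,      # ~ explosive kinetic energy
        "Adj_Cone": height_in / cone_s,         # height-adjusted agility
    }
    # Composite: the product of the four adjusted scales.
    composite = 1.0
    for value in scores.values():
        composite *= value
    scores["Composite"] = composite
    return scores

# Hypothetical "average" tight end built from the sample means in the summary table.
avg_te = adjusted_scores(
    height_in=76.75, weight_lb=253.6, dash_s=4.69,
    vertical_in=34.6, broad_in=115 + 2 / 3, cone_s=7.12,
)
for name, value in avg_te.items():
    print(f"{name}: {value:,.2f}")
```

Keep in mind that plugging in sample means is not the same as averaging each player's adjusted score, so these values only give a feel for the scale of each variable.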

              Inst_KE   Avg_p   High_Pt   Agility   Composite
SEASON_REC    0.27      0.29*   0.22      0.28      0.38
SEASON_YDS    0.31*     0.35    0.28      0.28      0.42
SEASON_TD     0.28      0.34    0.29*     0.30*     0.38

Again, the asterisk (*) denotes a p-value less than 0.10, and the dagger (†) denotes a p-value less than 0.05.  However, it's worth noting that for any correlation greater than 0.25, the p-value is less than 0.15.  The Composite variable is simply the product of the 4 adjusted individual scales, which may suggest that there is some interaction between these 4 variables.  For a final note, I'll leave the reader with the scatter plot matrix to illustrate the strength of linear association that was detailed in the chart above.  I threw in the standard scoring fantasy points and PPR scoring points just to make it 5 x 5.
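For readers who want to sanity-check the p-values that go with a given correlation, here's a rough sketch using only the standard library.  It converts a Pearson correlation r into a t statistic with n-2 degrees of freedom and integrates the t density numerically (Simpson's rule).  I'm assuming the full sample of n = 37 here; pairs with missing combine data have a smaller n and therefore a somewhat larger p-value:

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def correlation_p_value(r, n, steps=2000):
    """Two-sided p-value for H0: rho = 0, given a sample correlation r and size n."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the t density on [0, t]; steps must be even.
    h = t / steps
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return 1 - 2 * area

p = correlation_p_value(r=0.25, n=37)  # n = 37 is an assumption (no missing data)
print(f"r = 0.25, n = 37 -> two-sided p = {p:.3f}")
```

If SciPy is available, `scipy.stats.pearsonr` on the raw data reports the same kind of p-value without the hand-rolled integration.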




And so we can see a weak but significant linear relationship between the composite variable and each of Receptions, Yards, and Touchdowns.

Thursday, May 14, 2015

By the numbers: What justifies tight end induction to Hall of Fame?

Here's a brief foray into the career statistics of all 8 tight ends currently (as of 5/13/2015) inducted into the Football Hall of Fame:

First and foremost, the number of games and seasons played (ordered by % games missed):
1.  John Mackey played 139 games over 10 seasons (missed 1 game, 0.7%)
2.  Ozzie Newsome played 198 games over 13 seasons (missed 10 games, 4.8%)
3.  Mike Ditka played 158 games over 12 seasons (missed 10 games, 6.00%)
4.  Jackie Smith played 210 games over 16 seasons (missed 14 games, 6.25%)
5.  Charlie Sanders played 128 games over 10 seasons (missed 12 games, 8.6%)
6.  Shannon Sharpe played 204 games over 14 seasons (missed 20 games, 8.9%)
7.  Dave Casper played 147 games over 11 seasons (missed 21 games, 12.5%)
8.  Kellen Winslow played 109 games over 9 seasons (missed 35 games, 24.3%)
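The percentages above are just games missed divided by games possible (played plus missed).  A short sketch that recomputes and sorts them from the counts in the list:

```python
# (games played, games missed) for each inductee, copied from the list above.
careers = {
    "John Mackey": (139, 1),
    "Ozzie Newsome": (198, 10),
    "Mike Ditka": (158, 10),
    "Jackie Smith": (210, 14),
    "Charlie Sanders": (128, 12),
    "Shannon Sharpe": (204, 20),
    "Dave Casper": (147, 21),
    "Kellen Winslow": (109, 35),
}

# Share of possible games missed, as a percentage.
missed_pct = {
    name: 100 * missed / (played + missed)
    for name, (played, missed) in careers.items()
}

# Print from most to least durable.
for name, pct in sorted(missed_pct.items(), key=lambda item: item[1]):
    print(f"{name}: {pct:.1f}% of games missed")
```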

With the exception of Kellen Winslow, it looks like the higher percentage in the latter players might be attributed to the extended season length (extended from 14 to 16 games in 1978), which made a season-ending injury more costly.  Also, Mackey's low number might be attributed to the fact that the only game he missed in his 9th season was due to the knee injury that cut his career short.

Related to this, the failure time (consecutive initial games) may also be a useful statistic to examine:
1.  John Mackey missed the last game of his 10th (and final) season (139 games).
2.  Jackie Smith missed the 10th game of his 9th season (135 games)
3.  Mike Ditka missed the 9th game of his 7th season (92 games)
4.  Ozzie Newsome missed the 3rd game of his 5th season (66 games)
5.  Shannon Sharpe missed the 2nd game of his 5th season (65 games)
6.  Charlie Sanders missed game 13 of his 4th season (54 games)
7.  Dave Casper missed the 13th game of his 3rd season (40 games)
8.  Kellen Winslow missed the 9th game of his rookie year (8 games)

Mackey's reputation as a speedster and his ability to make tacklers miss may help explain his longer streak.  On the other end of the spectrum, Winslow battled through knee injuries his entire career, possibly because he was the largest of the group (6'5", 250 lbs), and may have tried to run through more (and possibly lower) tackles as a result.

And here are their career yardage stats (adjusted for catch number), courtesy of the hall of fame:
1.  Jackie Smith: 7918 yards/480 catches (16.5)
2.  John Mackey: 5236 yards/331 catches (15.8)
3.  Charlie Sanders:  4817 yards/336 catches (14.3)
4.  Dave Casper:  5216 yards/378 catches (13.8)
5.  Mike Ditka: 5812 yards/472 catches (13.6)
6.  Kellen Winslow: 6741 yards/541 catches (12.5)
7.  Shannon Sharpe:  10060 yards/815 catches (12.3)
8.  Ozzie Newsome:  7980 yards/662 catches (12.1)

Again, Mackey has an impressive average, but Jackie Smith actually put up better numbers over the course of the longest career of the bunch.

And here are the numbers of receiving touchdowns, adjusted for catches:
1.  Dave Casper 52 TDs/378 catches (13.8% TD)
2.  John Mackey 38 TDs/331 catches (11.5%)
3.  Kellen Winslow 52 TDs/541 catches (9.6%)
4.  Charlie Sanders 31 TDs/336 catches (9.2%)
5.  Mike Ditka 43 TDs/472 catches (9.1%)
6.  Jackie Smith 40 TDs/480 catches (8.3%)
7.  Shannon Sharpe 62 TDs/815 catches (7.6%)
8.  Ozzie Newsome  47 TDs/662 catches (7.1%)

A final statistic of note might be fumbles, adjusted for number of catches:
1.  Ozzie Newsome  3 fumbles/662 catches (0.5%)
2.  Shannon Sharpe 7 fumbles/815 catches (0.9%)
3.  Charlie Sanders 6 fumbles/336 catches (1.8%)
4.  Dave Casper 7 fumbles/378 catches (1.85%)
5.  Mike Ditka 9 fumbles/472 catches (1.91%)
6.  Kellen Winslow 11 fumbles/541 catches (2.0%)
7.  Jackie Smith 12 fumbles/480 catches (2.5%)
8.  John Mackey 11 fumbles/331 catches (3.3%)

To summarize,  these men tended to have long careers (at least 9 seasons), over which they averaged big gains (range: 12.1-16.5 yards/catch), and caught touchdowns (range: 7.1%-13.8% of catches converted to TD) and protected the ball (range: 0.5%-3.3% fumbles/catches).

It might also be interesting to see how Tony Gonzalez (surely the next tight end inductee) fares on similar statistics:

Tony Gonzalez:
-Missed 2/272 games over 17-year career (0.7%)
-Missed 1st game of 3rd season (32 straight), but had two 120+ consecutive starts thereafter.
-15,127 yards/1325 catches (11.4 yards/catch)
-111 TDs/1325 catches (8.4% TDs/catch)
-6 fumbles/1325 catches (0.5% fumbles/catch)
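Since these are all simple ratios, here's a small sketch that recomputes Gonzalez's line from the raw counts quoted above (variable names are mine):

```python
# Career counts for Tony Gonzalez, as quoted above.
games_missed, games_possible = 2, 272
catches = 1325
yards, touchdowns, fumbles = 15127, 111, 6

missed_pct = 100 * games_missed / games_possible   # % of possible games missed
yards_per_catch = yards / catches
td_pct = 100 * touchdowns / catches                # % of catches converted to TDs
fumble_pct = 100 * fumbles / catches               # fumbles per 100 catches

print(f"games missed: {missed_pct:.1f}%")
print(f"yards per catch: {yards_per_catch:.1f}")
print(f"TDs per catch: {td_pct:.1f}%")
print(f"fumbles per catch: {fumble_pct:.2f}%")
```

On durability and ball security he compares favorably with every Hall of Famer listed earlier, while his yards per catch falls just under the low end of their range.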

The interesting thing about his six fumbles is their distribution over time:  3 in 1998 (sophomore year), 2 in 1999, and 1 in 2006.  So after his 1st three years in the league, he only fumbled once!

Wednesday, May 13, 2015

Case Series of Hall of Fame Tight ends

With the purpose of trend-spotting, it may be an interesting exercise to look at commonalities of the greatest tight ends of all time.  Some soul-searching was required as to the definition of "all-time great", but I thought admission to the Hall of Fame would be a reasonable criterion.  This leaves out active players (Antonio Gates) and recently retired ones (Tony Gonzalez), but it's still a reasonable starting point.  There have been 8 tight ends from the "modern era" inducted.

Ordered by year of induction (career years):
1988:  Mike Ditka (1961-1972)
1992:  John Mackey (1963-1972)
1994:  Jackie Smith (1963-1978)
1995:  Kellen Winslow (1979-1987)
1999:  Ozzie Newsome (1978-1990)
2001:  Dave Casper (1974-1983)
2007:  Charlie Sanders (1968-1977)
2011:  Shannon Sharpe (1990-2001)

Immediately, we note that Hall of Fame players tend to have long careers.  If you're interested in the details regarding seasons, games, and career statistics to see what justifies admission to the Hall of Fame for a tight end, see this side post.  If you're comfortable taking their "greatness" for granted, read on.

Let's explore what attributes of these inductees may help explain their hall of fame status:

First, let's compare their frames (sorted by weight to height ratio):
1.  Winslow 77 in, 251 lbs (3.26)
2.  Casper 76 in, 240 lbs (3.16)
3.  Newsome 74 in, 232 lbs (3.13)
4.  Sharpe 74 in, 230 lbs (3.11)
5.  Smith  76 in, 235 lbs (3.09)
6.  Ditka 75 in, 228 lbs (3.04)
7.  Sanders 76 in, 230 lbs (3.03)
8.  Mackey 74 in, 224 lbs (3.02)

The critical reader may be thinking, "why weight to height ratio?"  It was chosen under the assumption that difficulty to tackle should be proportional to weight (mass) and inversely proportional to height (as shorter players have a lower center of gravity).  This may be incorrect, but it helps stratify the inductees into heavy, moderate, and light builds (relatively speaking).  Body Mass Index (BMI) may work equally well, but I worked in biostatistics long enough to have a natural aversion to that popular, but generally poor, metric.

To help put these weight to height values in perspective, let's compare them to the average TE combine participant between 1999 and 2015.  There was a weak downward trend in weight to height ratio over those years (see below), so extrapolating backward in time, these 8 men may have been even further below the average for their respective eras.  Regardless, it's safe to compare against the combine mean of 3.34 lbs/in (standard deviation 0.13 lbs/in) and conclude that these men were not as solidly built as the majority of men who play this position nowadays.

[Figure: weight to height ratio of combine tight-end participants (1999-2015)]

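To put a number on "not as solidly built", here's a quick sketch that converts each inductee's weight to height ratio into a z-score against the 1999-2015 combine mean (3.34 lbs/in) and standard deviation (0.13 lbs/in) quoted above.  It assumes those two summary figures are a fair yardstick for players from much earlier eras, which is debatable:

```python
COMBINE_MEAN = 3.34  # lbs/in, 1999-2015 combine tight ends (from the text)
COMBINE_SD = 0.13    # lbs/in

# (height in inches, weight in lbs) from the list above.
frames = {
    "Winslow": (77, 251), "Casper": (76, 240), "Newsome": (74, 232),
    "Sharpe": (74, 230), "Smith": (76, 235), "Ditka": (75, 228),
    "Sanders": (76, 230), "Mackey": (74, 224),
}

# z-score of each player's ratio relative to modern combine participants.
z_scores = {
    name: (weight / height - COMBINE_MEAN) / COMBINE_SD
    for name, (height, weight) in frames.items()
}

for name, z in sorted(z_scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: z = {z:+.2f}")
```

Every inductee lands below the modern mean, and the lighter half of the group sits more than two standard deviations under it.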
In the miscellaneous department, the majority of these men attempted some special teams duties at some point in their NFL careers:
1.  Mackey attempted 9 kick returns for 271 yards (avg. 30.1)
2.  Smith attempted 5 kick returns for 103 yards (avg. 20.6)
3.  Newsome attempted 2 punt returns for 29 yards (avg. 14.5)
4.  Ditka attempted 3 kick returns for 30 yards (avg. 10)
5.  Winslow attempted 2 kick returns for 11 yards (avg 5.5)

With the exception of Mackey and Newsome, the numbers aren't very impressive, but it does speak to the fact that these players must have had the speed and sure-handedness to be worth considering for KR/PR duty by their respective coaches.  Casper, while not on the above list, was reported to have consistently run a barefoot 4.6 second 40 yard dash for pro scouts while at Notre Dame, and a similar time has been attributed to John Mackey as well.

Related to other roles, the inductees often contributed in other positions/roles on the gridiron:
1.  Ditka also played defensive lineman and punter for the Pitt Panthers.
2.  Dave Casper played 5 different positions over his 3 year tenure at Notre Dame.
3.  Jackie Smith punted his first 3 years with the St. Louis (now Arizona) Cardinals (127 punts, averaged 39.1 yards).
4.  John Mackey played running back for two years at Syracuse, averaging 4.5 yards on 58 attempts.
5.  Charlie Sanders also converted from wide receiver to tight end as a Senior at Minnesota.
6.  Newsome was listed as a wide receiver during his time at Alabama.
7.  Shannon Sharpe played WR at Savannah State (Division II) and didn't convert to TE until NFL.  He also played quarterback, running back and linebacker in high school.
8.  Kellen Winslow didn't play multiple positions, but he did attempt 4 halfback passes in his NFL career, completing only the first attempted during the playoffs of 1980 for a gain of 28 yards.

As the tight end position is a hybrid position that requires both blocking and pass catching, it's not surprising that these men would be serviceable at multiple positions and prove to be two- or three-sport athletes in college or high school:

1.  Mike Ditka - Basketball (forward), Baseball (outfield), intramural wrestling at Pittsburgh.
2.  John Mackey - Basketball (forward), track (events unknown) at Syracuse.
3.  Jackie Smith - Track (hurdles) at Northwest Louisiana.
4.  Charlie Sanders - Basketball (forward) at the University of Minnesota.
5.  Shannon Sharpe - Basketball (39.5" vertical), track and field (jumping/throwing) at Savannah St.
6.  Dave Casper played football, basketball, and golf in high school.
7.  Newsome also played basketball and baseball in high school.
8.  Kellen Winslow played chess in high school until recruited to the football team his senior year.

To summarize, these 8 inductees were generally durable, sure handed, and surprisingly athletic for their size.  They also appeared to possess extraordinary speed/agility for their size (fast 40 times reported and/or ran track).  A couple of these men, namely Sanders and Sharpe, were known to have supreme jumping ability.  Some were late converts to football or the tight end position and could play a variety of positions, with special teams contributions giving them the time to learn a new position.

In the next post, I hope to explore these trends more quantitatively in a cross-section of starting NFL tight ends to see if we can spot any correlation/association between these factors and NFL success.

A case study of premier NFL tight end Antonio Gates

With Antonio Gates recently clearing the 10,000 yard mark and soon to catch his 100th touchdown pass, there's little doubt that he's worthy of the Pro Football Hall of Fame.  His transition from NCAA basketball to the NFL has drawn a lot of attention from the recruiting world, which may suggest that sheer athleticism is indicative of success as an NFL tight end.  Due to Gates' well-publicized success, nearly every NFL team rosters a "project" tight end: a sizable yet athletic individual who they hope may eventually contribute at the highest level.

While Gates built a reputation as an NCAA basketball player, he actually got started on a football scholarship.  After putting up a blazing 4.5 forty yard dash time and leaping a ridiculous 39 inches vertically at a high school football combine, Gates got an offer from Michigan State's Nick Saban with the understanding that he could also play basketball.  Between some classroom difficulties over his red-shirt year and Saban asking him to focus on football, Gates bounced around between a few institutions before landing at Kent State where he could focus on his first love of basketball.

Gates' combination of size, strength, and speed helped him lead the Kent State Golden Flashes to the Elite 8 his junior year (2002) before losing to the Hoosiers of Indiana University.  At 6 foot, 4 inches tall, the NBA scouts suggested Gates was too much of a "tweener" to make it in the NBA, so his agent suggested he re-consider football as a career.  Perhaps due to his notoriety from the recent NCAA tourney run, Gates drew interest from more than half the league (19 teams) at Kent State's Pro day, but was offered only a handful of tryouts.  The San Diego Chargers got the first chance, scheduling a workout with Gates that ultimately proved so impressive that they signed him immediately.

Gates' rookie year was overshadowed by the Chargers' losing season despite the best efforts of all-around back LaDainian Tomlinson, who had 2370 yards from scrimmage.  With Tomlinson catching 100 passes that year, he drew a lot of attention from defenders the subsequent year.  Gates was absolutely dominant in the resulting 1-on-1 coverage, and his receiving line of 81-964-13 earned him a Pro-Bowl nomination in his sophomore season.  Gates has put up similar numbers for the last 10 years, with minor dips in production when he was playing through some chronic foot injuries.

Jimmy Graham, a similarly late convert to football, goes so far as to say that Gates "paved the way" for him and others like him.  However, it's possible that recency bias may result in overlooking some of the great players from the past who also spent significant time on the hardwood.  The next post will focus on the 8 players in the football hall of fame at the tight end position to see if there is any trend toward collegiate basketball (or other sports).

Sunday, March 15, 2015

NCAA tourney odds - 2015

Selection Sunday is like Christmas Day for college basketball fans.  After a lot of speculating, we finally get to see where our favorite teams landed and do some statistical estimation regarding roads to the Final Four.  With the bracket set today, we can finally do some math!

Thanks to Ken Pomeroy we can compute a win probability for any match-up based on the teams' respective Pythagorean win percentage.  In turn, using the probabilities from previous rounds, we can estimate the likelihood of a team advancing to any round once the tourney schedule is set.  In theory this allows us to see the most likely winners in each slot of a potentially completed bracket.
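Here's a minimal sketch of that calculation for one corner of the bracket.  I'm assuming the head-to-head probability comes from the log5 formula, which does reproduce the first-round numbers in the tables below; the function and variable names are my own.  The Pythagorean ratings are the ones listed for Kentucky's four-team pod:

```python
def log5(a, b):
    """Chance that a team with Pythagorean rating a beats a team with rating b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))

# Pythagorean ratings copied from the first four rows of the table below.
pyth = {"Kentucky": 0.9787, "Manhattan": 0.5411, "Cincinnati": 0.8242, "Purdue": 0.7825}

# Round 1: probability that each team wins its opening game.
round1 = {
    "Kentucky": log5(pyth["Kentucky"], pyth["Manhattan"]),
    "Cincinnati": log5(pyth["Cincinnati"], pyth["Purdue"]),
}
round1["Manhattan"] = 1 - round1["Kentucky"]
round1["Purdue"] = 1 - round1["Cincinnati"]

def reach_sweet16(team, possible_opponents):
    """P(team wins round 1) times P(team beats whoever emerges from the other game)."""
    beat_next = sum(
        round1[opp] * log5(pyth[team], pyth[opp]) for opp in possible_opponents
    )
    return round1[team] * beat_next

kentucky_sweet16 = reach_sweet16("Kentucky", ["Cincinnati", "Purdue"])
print(f"Kentucky wins round 1: {round1['Kentucky']:.1%}")
print(f"Kentucky reaches the Sweet 16: {kentucky_sweet16:.1%}")
```

The same weighting step repeats for every later round, with the pool of possible opponents doubling each time.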

Instead of titling by the ridiculously misleading regions, I'll just describe them by where they appear on the standard ESPN bracket (top left, top right, bottom left, bottom right) and list the probability of the team reaching the various rounds of the tournament.  I'll also BOLD the team with the highest win probability of reaching each position so the reader can easily see the most likely outcome:

TOP RIGHT Pyth Round 1 Sweet 16 Elite 8 Final 4 Finals Champs
Kentucky 1 0.9787 97.5% 89.3% 81.1% 67.6% 45.7% 33.5%
Manhattan 16 0.5411 2.5% 0.6% 0.1% 0.0% 0.0% 0.0%
Cincinnati 8 0.8242 56.6% 6.2% 3.2% 1.1% 0.2% 0.1%
Purdue 9 0.7825 43.4% 3.9% 1.7% 0.5% 0.1% 0.0%
West Virginia 5 0.8539 62.8% 37.1% 6.1% 2.4% 0.6% 0.2%
Buffalo 12 0.7759 37.2% 17.2% 1.9% 0.5% 0.1% 0.0%
Maryland 4 0.8289 61.8% 31.0% 4.5% 1.6% 0.3% 0.1%
Valparaiso 13 0.7499 38.2% 14.7% 1.5% 0.4% 0.1% 0.0%
Butler 6 0.8624 47.8% 20.5% 8.7% 1.8% 0.4% 0.1%
Texas 11 0.8725 52.2% 23.4% 10.4% 2.3% 0.6% 0.2%
Notre Dame 3 0.9127 87.5% 53.8% 29.4% 8.3% 2.8% 1.1%
Northeastern 14 0.598 12.5% 2.3% 0.4% 0.0% 0.0% 0.0%
Wichita State 7 0.9044 73.2% 39.3% 21.0% 5.6% 1.8% 0.7%
Indiana 10 0.7762 26.8% 8.5% 2.6% 0.4% 0.1% 0.0%
Kansas 2 0.9111 83.6% 48.6% 26.9% 7.5% 2.5% 1.0%
New Mexico State 15 0.6674 16.4% 3.7% 0.8% 0.1% 0.0% 0.0%

BOTTOM RIGHT Pyth Round 1 Sweet 16 Elite 8 Final 4 Finals Champ
Wisconsin 1 0.9615 95.6% 81.9% 64.5% 36.3% 17.4% 10.5%
Coastal Carolina 16 0.5371 4.4% 1.0% 0.2% 0.0% 0.0% 0.0%
Oregon 8 0.7972 47.3% 7.8% 3.0% 0.6% 0.1% 0.0%
Oklahoma State 9 0.8141 52.7% 9.4% 3.9% 0.8% 0.1% 0.0%
Arkansas 5 0.8416 73.1% 31.7% 7.8% 1.9% 0.4% 0.1%
Wofford 12 0.6617 26.9% 6.2% 0.8% 0.1% 0.0% 0.0%
North Carolina 4 0.9019 79.3% 54.3% 18.8% 6.5% 1.9% 0.7%
Harvard 13 0.7059 20.7% 7.8% 1.1% 0.1% 0.0% 0.0%
Xavier 6 0.8518 52.4% 23.4% 5.2% 1.5% 0.3% 0.1%
BYU 11 0.8395 47.6% 20.3% 4.2% 1.1% 0.2% 0.1%
Baylor 3 0.9038 77.3% 48.7% 15.0% 5.8% 1.7% 0.6%
Georgia State 14 0.7344 22.7% 7.6% 1.0% 0.2% 0.0% 0.0%
VCU 7 0.7499 31.3% 3.4% 1.1% 0.2% 0.0% 0.0%
Ohio State 10 0.868 68.7% 13.6% 6.7% 2.1% 0.5% 0.1%
Arizona 2 0.9674 97.7% 82.7% 66.8% 42.7% 22.1% 14.2%
Texas Southern 15 0.4111 2.3% 0.3% 0.0% 0.0% 0.0% 0.0%

TOP LEFT Pyth Round 1 Sweet 16 Elite 8 Final 4 Finals Champ
Villanova 1 0.9571 96.7% 81.5% 60.2% 35.8% 23.0% 10.0%
Lafayette 16 0.4285 3.3% 0.5% 0.0% 0.0% 0.0% 0.0%
North Carolina State 8 0.8111 51.7% 9.5% 3.5% 0.8% 0.2% 0.0%
LSU 9 0.8006 48.3% 8.5% 3.0% 0.7% 0.2% 0.0%
Northern Iowa 5 0.908 84.9% 54.2% 20.6% 8.4% 3.8% 1.0%
Wyoming 12 0.6373 15.1% 3.9% 0.5% 0.1% 0.0% 0.0%
Louisville 4 0.8756 78.7% 37.3% 11.7% 3.9% 1.5% 0.3%
UC Irvine 13 0.6559 21.3% 4.6% 0.6% 0.1% 0.0% 0.0%
Providence 6 0.8458 56.9% 22.1% 5.8% 1.7% 0.5% 0.1%
Dayton 11 0.8057 43.1% 14.2% 3.1% 0.7% 0.2% 0.0%
Oklahoma 3 0.915 89.1% 61.3% 24.3% 10.4% 4.9% 1.4%
Albany 14 0.5686 10.9% 2.3% 0.2% 0.0% 0.0% 0.0%
Michigan State 7 0.8786 61.7% 16.5% 8.0% 2.7% 1.0% 0.2%
Georgia 10 0.8178 38.3% 7.4% 2.7% 0.7% 0.2% 0.0%
Virginia 2 0.9587 95.3% 75.4% 55.8% 33.8% 22.0% 9.7%
Belmont 15 0.5363 4.7% 0.8% 0.1% 0.0% 0.0% 0.0%


BOTTOM LEFT Pyth Round 1 Sweet 16 Elite 8 Final 4 Finals Champ
Duke 1 0.9395 93.3% 71.0% 45.2% 26.3% 12.6% 4.5%
North Florida 16 0.5258 6.7% 1.2% 0.2% 0.0% 0.0% 0.0%
San Diego State 8 0.8468 57.7% 17.3% 6.9% 2.4% 0.6% 0.1%
St. John's 9 0.8019 42.3% 10.4% 3.4% 0.9% 0.2% 0.0%
Utah 5 0.9275 74.0% 52.0% 27.7% 14.8% 6.4% 2.0%
Stephen F. Austin 12 0.8183 26.0% 12.2% 3.7% 1.1% 0.2% 0.0%
Georgetown 4 0.867 83.6% 33.7% 12.8% 4.8% 1.4% 0.3%
Eastern Washington 13 0.5618 16.4% 2.1% 0.2% 0.0% 0.0% 0.0%
SMU 6 0.8734 62.5% 29.7% 11.0% 4.3% 1.3% 0.3%
UCLA 11 0.8056 37.5% 13.6% 3.6% 1.0% 0.2% 0.0%
Iowa State 3 0.9047 86.5% 53.9% 23.8% 11.0% 4.0% 1.0%
UAB 14 0.5979 13.5% 2.8% 0.3% 0.0% 0.0% 0.0%
Iowa 7 0.8594 55.3% 16.7% 7.6% 2.7% 0.8% 0.1%
Davidson 10 0.8319 44.7% 11.8% 4.8% 1.5% 0.4% 0.1%
Gonzaga 2 0.944 93.8% 70.5% 48.7% 29.0% 14.5% 5.4%
North Dakota St. 15 0.5258 6.2% 1.0% 0.1% 0.0% 0.0% 0.0%

With so many games, we expect to see some "unlikely" winners advance into the round of 32, but as the tourney progresses, these unexpected winners will eventually fall out.  Looking at the later rounds, Kentucky, the overall 1 seed, has a gaudy 1 in 3 chance of winning the tourney.