Tuesday, November 11, 2014

Diet/Nutrition Data Analysis

Admittedly, this post will be a bit dry: just a link to a publication of a diet study I helped analyze, conducted by a nutritionist and fertility specialist and targeting overweight/obese women with Polycystic Ovary Syndrome.  It is meant to serve as an illustration of my developing time series skills.  Enjoy!

Saturday, October 4, 2014

Interview with David J. Hand

Thought I should share the interview that came out of the amazing opportunity I had to correspond with David J. Hand, recent author of The Improbability Principle.  The interview can be read here.

The limitations of space in the article resulted in a seemingly dry exchange, but I found correspondence with Dr. Hand to be engaging and enlightening.  Below are some of the more whimsical questions on the back-end of the interview that didn't make it into the article:

(PW) I am awed at the result of this complex genetic algorithm as the law of selection couples with the other laws, regardless of the origins.  On a related note, do you think these laws have any bearing on the possibility of intelligent life elsewhere in the Universe?          

(DH) I think your reaction to the power of evolution is a great illustration of my earlier point: the fact that we understand the mechanism in no way detracts from the sense of awe it inspires.

I’m sure the laws do have a bearing on the possibility of intelligent life elsewhere in the Universe. The law of truly large numbers in particular must apply. This says that, given enough opportunities, any event, no matter how improbable, must occur. Our galaxy has some 400 billion stars, but other galaxies contain many more - some even 100 trillion stars. And there are thought to be around 200 billion galaxies in the observable universe. Multiply these numbers together and you get a result which is so truly large that it seems simply inconceivable that there is not intelligent life elsewhere. Unfortunately, however, other truly large numbers, such as the fact that the radius of the observable universe is some 46.5 billion light years, means that we’ll probably never interact with such life. 
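As an aside, the arithmetic behind the law of truly large numbers can be sketched in a few lines. This is only an illustration: the star counts are the ones quoted above multiplied together, while the per-star probability `p_life` is a purely hypothetical number chosen to show the effect, and the trials are assumed to be independent.

```python
import math

def prob_at_least_once(p, n):
    """P(an event with per-trial probability p occurs at least once in n
    independent trials) = 1 - (1 - p)**n, for 0 <= p < 1."""
    # log1p/expm1 keep precision when p is tiny and n is huge
    return -math.expm1(n * math.log1p(-p))

stars = 400e9 * 200e9   # figures quoted in the interview, multiplied: 8e22
p_life = 1e-20          # hypothetical per-star chance, for illustration only

print(prob_at_least_once(p_life, stars))
```

Even with a per-trial chance this small, the number of opportunities is large enough that the result is indistinguishable from certainty, which is the point Dr. Hand is making.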

(PW) Hmm, assuming all alien life is as bound by space and time as we are, I’d hazard that you didn’t take out that insurance policy on alien abduction you mention in your book!  In all seriousness, did you come across any other interesting but improbable insurance policies while researching The Improbability Principle?

(DH) I have come across some unusual insurance policies, both while writing the book and subsequently. Examples include insurance against silent film star Ben Turpin’s crossed eyes becoming uncrossed, against being struck by a meteorite, against giving birth to the second incarnation of Christ, and Bette Davis insuring her waistline against growing too large. I wonder what actuarial calculations underlie some of those policies!

(PW) Yes, I believe Lloyd's of London was behind at least one of those policies.  Perhaps the fine folks at Significance can persuade some actuary there to divulge some of their trade secrets to put us in the know.  Speculation aside, I do have it on good authority that you own an extensive dice collection, so what are some of the crown jewels of your collection?

(DH) Ah! Well, I have some spherical dice (they have sliding weights inside, so they really do end up pointing in one of six directions), weighted cubic dice, dice with double sixes (on opposite faces), a die with a hundred faces, non-transitive dice, irregular dice (I call them my DiskWorld dice), dice made of plastic, wood, metal (including gold and silver), marble, granite, glass, and bone, and also some rather wonderful jewelled dice. Not to mention many others.

(PW) How fascinating!  My Father has a quote collection, An Encyclopaedia of Compelling Quotations, which he began with a single quote from his favourite book.  If you had a similar experience, which die/pair of dice inspired you to start your collection? 

(DH) That’s an interesting question! I don’t think I can recall the first ones. Certainly, the asymmetric ones were amongst the earliest - their sheer absurdity struck me - along with a rather beautiful set of carefully engineered casino dice. But I was also very taken by a credit card sized sheet of metal, with an unfolded die etched on it, which could be pushed out and folded into a cube - for those emergencies when you find yourself without a die.

(PW) How funny to face such a “die-er” situation!  I wonder if the folks behind www.random.org would be interested in making your collection available online so you (or anyone) can have access to any dice whenever the need should arise.  So do you carry this collapsible die around with you and how often does the need ever arise for a 6 sided randomization device while on the go?

(DH) I hadn’t seen the www.random.org website before. Thanks for drawing it to my attention. Much easier than bending the metal sheet into a cube. As to how often the need for such an emergency die arises - why I need it all the time! Doesn’t everyone?

(PW) How funny!  In spite of your possibly hyperbolical need for a die, I understand you don’t care for games of chance.  Any reason why?


(DH) The problem with games of chance is that I know the odds. Once, on a trip to Las Vegas for a conference, my wife insisted on playing one of the games, ignoring my sage warnings that the odds were against her, so she was bound to lose. I only managed to convince her to stop when she’d become bored, after multiplying her initial stake ten times.

(PW) Sure, though I suppose expectation isn’t everything, as the excitement of variance is what keeps the casinos in business (as much as it does the statisticians).  Speaking of business, I don’t want to take up too much of your valuable time, except to impose one final question:  Should we be expecting another book in the future?  If so, can you give us any teasers on the topic?

(DH) I always have several potential books buzzing around in my head. It’s just a question of finding the time to put them on paper. Unlike many people, I’m very lucky in that I enjoy writing. I do have another book coming out later this year - on measuring national wellbeing, co-authored with my colleague Paul Allin. Though it’s more technical than The Improbability Principle. Beyond that, there are various possibilities, and we shall have to see which crystallizes first.    

Thursday, September 4, 2014

The Home Court Advantage

As the title suggests, this is regarding basketball.  Specifically, men's Division I college basketball.  This effort was part of my master's report, which ultimately was an exercise in data mining rather than statistical analysis (though some straightforward statistical modeling was employed).  I essentially wrote a simple web-crawler to download all the play-by-play data from the website of the Big 12 Sports Conference.  The data was then parsed and analyzed using a combination of Excel and SAS.  The culmination of the paper (though you should read it in its entirety, for a nominal fee: http://ijr.cgpublisher.com/product/pub.191/prod.123.) rests in the following model:


Admittedly, the model is useless for prediction (sorry gamblers), as it depends on the post-hoc conference rankings at the end of the season.  Sure, it may be a bit naive to assume that these rankings follow a linear pattern, but the error terms did appear normally distributed for the model.  Still, I believe the coefficients of the model are useful for inferring the approximate point value of a rebound, assist, blocked shot, and steal.  Possession-gaining events like steals and defensive rebounds are worth approximately 1 additional point, while possession-ending events like turnovers cost about a point.  This seems reasonable: since the typical shooting percentage is around 50% and a made basket is usually worth two points, gaining/losing a possession can be expected to net one more/less point.
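The back-of-the-envelope reasoning about the value of a possession can be written out as a tiny calculation. This is a rough sketch, not the model from the paper: it assumes every possession ends in a single two-point attempt and ignores three-pointers, free throws, and offensive rebounds. The function name and parameters are made up for illustration.

```python
def expected_points_per_possession(shooting_pct, points_per_make=2.0):
    """Expected points from one possession that ends in a single shot.

    Simplifying assumption: one shot attempt per possession, worth
    points_per_make if it goes in and nothing otherwise.
    """
    if not 0.0 <= shooting_pct <= 1.0:
        raise ValueError("shooting_pct must be between 0 and 1")
    return shooting_pct * points_per_make

# Roughly 50% shooting on two-point attempts
value = expected_points_per_possession(0.5)
print(f"A possession is worth about {value:.1f} point(s)")
```

Under these assumptions a gained possession adds about one expected point and a lost possession removes about one, which lines up with the approximate values described above.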

In terms of application, I think this model would be useful for a Division I coach who wants to quantify, from the box score alone, how much each player's non-scoring events contribute to the total margin of victory.  Adjusting for total minutes played (i.e. tabulating the point differential per minute on the court) might allow the coach to compare players of the same position with respect to these non-scoring statistics.
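A per-minute comparison along these lines might look like the sketch below. The weights are only the approximate values mentioned in this post (about +1 for a steal or defensive rebound, about -1 for a turnover), not the fitted coefficients from the paper, and the players and box-score lines are invented for illustration.

```python
# Approximate point values quoted in the post; NOT the paper's fitted coefficients.
APPROX_POINT_VALUE = {"steals": 1.0, "def_rebounds": 1.0, "turnovers": -1.0}

def non_scoring_points_per_minute(box, minutes):
    """Estimated point contribution of non-scoring events per minute played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    # Missing stats are treated as zero
    total = sum(w * box.get(stat, 0) for stat, w in APPROX_POINT_VALUE.items())
    return total / minutes

# Hypothetical box-score lines for two players at the same position
players = {
    "Guard A": ({"steals": 3, "def_rebounds": 4, "turnovers": 2}, 30),
    "Guard B": ({"steals": 1, "def_rebounds": 2, "turnovers": 1}, 10),
}

for name, (box, minutes) in players.items():
    rate = non_scoring_points_per_minute(box, minutes)
    print(f"{name}: {rate:.3f} points per minute")
```

In this made-up example Guard A has the larger raw total, but Guard B contributes more per minute on the court, which is the kind of comparison the minutes adjustment is meant to expose.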

Tuesday, September 2, 2014

Carrots and Caveats

The Carrot:
To inspire you to read more of my posts, let me tell you a little about myself.  The power of prediction is what lured me into pursuing a Master of Science in Statistics after earning an undergraduate degree in Pure Mathematics.  If mathematics is the mother of all sciences, then surely statistics is her firstborn son.  Few people realize that statistics is an entire field of study, rather than some numbers tacked onto a report at work or some poll in the newspaper.  In fact, whenever I admit to being a statistician, the majority of people respond, "Oh, so you're a numbers guy."  While that may be true, Statistics is SO much more than numbers: it's the systematic study of random variation and the production of meaningful models to understand an academic, financial, or competitive endeavor.

The Caveat:
After completing my graduate work in 2012, I've been applying my science to clinical research.  Some days I doubt that statistics is powerful enough to meaningfully predict outcomes that depend so heavily on fickle human nature.  Still, as George Box said, "all models are wrong; some models are useful."  Since no model is perfect and good statisticians do "precision guesswork", please don't send me angry emails saying, "I applied the model you reported in such-and-such post and we lost a boatload of money."  There is always a chance that the model prediction was wrong (due to false assumptions in the data or the modeling) or that you got unlucky.  Also, as you read these posts, I'd encourage you to approach all analysis with an open mind tempered by common sense, as statistics should always be used for illumination, rather than support of a preconceived notion.