Tuesday, November 11, 2014
Diet/Nutrition Data Analysis
Admittedly, this post will be a bit dry: it is just a link to a publication of a diet study I helped analyze. The study was conducted by a nutritionist and fertility specialist and targeted overweight and obese women with polycystic ovary syndrome. It is meant to serve as an illustration of my developing time series skills. Enjoy!
Saturday, October 4, 2014
Interview with David J. Hand
I thought I should share the interview that came out of the amazing opportunity I had to correspond with David J. Hand, author of the recent book The Improbability Principle. The interview can be read here.
The limitations of space in the article resulted in a seemingly dry exchange, but I found my correspondence with Dr. Hand to be engaging and enlightening. Below are some of the more whimsical questions from the back end of the interview that didn't make it into the article:
(PW) I am awed at the result of this complex
genetic algorithm as the law of selection couples with the other laws,
regardless of the origins. On a related
note, do you think these laws have any bearing on the possibility of
intelligent life elsewhere in the Universe?
(DH) I think your
reaction to the power of evolution is a great illustration of my earlier point:
the fact that we understand the mechanism in no way detracts from the sense of
awe it inspires.
I’m sure the
laws do have a bearing on the possibility of intelligent life elsewhere in the
Universe. The law of truly large numbers
in particular must apply. This says that, given enough opportunities, any
event, no matter how improbable, must occur. Our galaxy has some 400 billion
stars, but other galaxies contain many more - some even 100 trillion stars. And
there are thought to be around 200 billion galaxies in the observable universe.
Multiply these numbers together and you get a result which is so truly large
that it seems simply inconceivable that there is not intelligent life
elsewhere. Unfortunately, however, other truly large numbers, such as the fact
that the radius of the observable universe is some 46.5 billion light years,
means that we’ll probably never interact with such life.
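As an aside, the multiplication Dr. Hand describes can be sketched in a few lines. This is only a rough illustration: it assumes every galaxy holds about as many stars as ours (a conservative reading, since he notes some hold far more), and the per-star probability `HYPOTHETICAL_P` is a number I made up purely to show how the law of truly large numbers behaves, not an estimate from the interview or the book.

```python
import math

# Figures quoted in the answer above.
STARS_PER_GALAXY = 400 * 10**9   # roughly our galaxy
GALAXIES = 200 * 10**9           # observable universe

# Assumption: treat every galaxy as if it were the size of ours.
total_stars = STARS_PER_GALAXY * GALAXIES


def prob_at_least_one(p, n):
    """P(at least one success in n independent trials) = 1 - (1 - p)^n.

    Computed with log1p/expm1 so that a tiny p and a huge n do not
    lose precision to floating-point rounding.
    """
    return -math.expm1(n * math.log1p(-p))


# Hypothetical, made-up chance of intelligent life per star.
HYPOTHETICAL_P = 1e-20

print(f"opportunities: {total_stars:.1e}")
print(f"P(at least one): {prob_at_least_one(HYPOTHETICAL_P, total_stars)}")
```

Even with a per-star chance that small, the number of opportunities is large enough to push the probability of at least one occurrence essentially to 1.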
(PW) Hmm, assuming all alien life is as
bound by space and time as we are, I’d hazard that you didn’t take out that
insurance policy on alien abduction you mention in your book! In all seriousness, did you come across any
other interesting but improbable insurance policies while researching The Improbability Principle?
(DH) I have come across some unusual
insurance policies, both while writing the book and subsequently. Examples
include insurance against silent film star Ben Turpin’s crossed eyes becoming
uncrossed, against being struck by a meteorite, against giving birth to the
second incarnation of Christ, and Bette Davis insuring her waistline against
growing too large. I wonder what actuarial calculations underlie some of those
policies!
(PW) Yes, I believe Lloyd's of London was
behind at least one of those policies. Perhaps the fine folks at
Significance can persuade some
actuary there to divulge some trade secrets to put us in the know. Speculation aside, I do have it on good
authority that you own an extensive dice collection, so what are some of the crown
jewels of your collection?
(DH) Ah! Well, I
have some spherical dice (they have sliding weights inside, so they really do
end up pointing in one of six directions), weighted cubic dice, dice with
double sixes (on opposite faces), a die with a hundred faces, non-transitive
dice, irregular dice (I call them my DiskWorld dice), dice made of plastic,
wood, metal (including gold and silver), marble, granite, glass, and bone, and
also some rather wonderful jewelled dice. Not to mention many others.
(PW) How
fascinating! My Father has a quote
collection, An Encyclopaedia of
Compelling Quotations, which he began with a single quote from his
favourite book. If you had a similar
experience, which die/pair of dice inspired you to start your collection?
(DH) That’s an
interesting question! I don’t think I can recall the first ones. Certainly, the
asymmetric ones were amongst the earliest - their sheer absurdity struck me - along
with a rather beautiful set of carefully engineered casino dice. But I was also
very taken by a credit card sized sheet of metal, with an unfolded die etched
on it, which could be pushed out and folded into a cube - for those emergencies
when you find yourself without a die.
(PW) How
funny to face such a “die-er” situation!
I wonder if the folks behind www.random.org would be interested in making your
collection available online so you (or anyone) can have access to any dice
whenever the need should arise. So do
you carry this collapsible die around with you and how often does the need ever
arise for a 6 sided randomization device while on the go?
(DH) I hadn’t
seen the www.random.org
website before. Thanks for drawing it to my attention. Much easier than bending
the metal sheet into a cube. As to how often the need for such an emergency die
arises - why I need it all the time! Doesn’t everyone?
(PW) How
funny! In spite of your possibly
hyperbolical need for a die, I understand you don’t care for games of chance. Any reason why?
(DH) The problem
with games of chance is that I know the odds. Once, on a trip to Las Vegas for
a conference, my wife insisted on playing one of the games, ignoring my sage
warnings that the odds were against her, so she was bound to lose. I only
managed to convince her to stop when she’d become bored, after multiplying her
initial stake ten times.
(PW) Sure, though
I suppose expectation isn’t everything, as the excitement of variance is what
keeps the casinos in business (as much as it does the statisticians). Speaking of business, I don’t want to take up
too much of your valuable time, except to impose one final question: Should we be expecting another book in the
future? If so, can you give us any
teasers on the topic?
(DH) I always have several potential books
buzzing around in my head. It’s just a question of finding the time to put them
on paper. Unlike many people, I’m very lucky in that I enjoy writing. I do have
another book coming out later this year - on measuring national wellbeing,
co-authored with my colleague Paul Allin. Though it’s more technical than The Improbability Principle. Beyond
that, there are various possibilities, and we shall have to see which
crystallizes first.
Thursday, September 4, 2014
The Home Court Advantage
As the title suggests, this post is about basketball. Specifically, men's Division I college basketball. This effort was part of my master's report, which ultimately was an exercise in data mining rather than statistical analysis (though some straightforward statistical modeling was employed). I essentially wrote a simple web crawler to download all the play-by-play data from the website of the Big 12 Sports Conference. The data was then parsed and analyzed using a combination of Excel and SAS. The culmination of the paper (though you should read it in its entirety, for a nominal fee: http://ijr.cgpublisher.com/product/pub.191/prod.123) rests in the following model:
Admittedly, the model is useless for prediction (sorry, gamblers), as it depends on the post-hoc conference rankings at the end of the season. Sure, it may be a bit naive to assume that these rankings follow a linear pattern, but the error terms did appear normally distributed for the model. Still, I believe the coefficients of the model are useful for inferring the approximate point value of a rebound, assist, blocked shot, and steal. Possession-gaining events like steals and defensive rebounds are worth approximately 1 additional point, while possession-ending events like turnovers cost about a point. This seems reasonable since the typical shooting percentage is around 50%, so gaining or losing a possession can be expected to net about one point more or less.
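The back-of-the-envelope reasoning behind that "about one point" figure can be written out as a quick check. This is a simplified sketch, not the model from the paper: it assumes each possession ends in a single two-point attempt and ignores three-pointers, free throws, and offensive rebounds. The function name and the `points_per_make` parameter are my own, for illustration only.

```python
def expected_points_per_possession(shooting_pct, points_per_make=2.0):
    """Expected points from one possession that ends in one shot attempt.

    Assumption: exactly one attempt per possession, worth
    `points_per_make` points if it goes in.
    """
    if not 0.0 <= shooting_pct <= 1.0:
        raise ValueError("shooting_pct must be between 0 and 1")
    return shooting_pct * points_per_make


# With the roughly 50% shooting mentioned above:
value = expected_points_per_possession(0.50)
print(f"expected points per possession: {value}")
```

Under those assumptions a steal or defensive rebound (one extra possession) is worth roughly `value` points, and a turnover (one lost possession) costs roughly the same.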
In terms of application, I think this model would be useful for a Division I coach who wants to quantify, from the box score alone, how much each player's non-scoring events contribute to the total margin of victory. Adjusting for total minutes played (i.e. tabulating the point differential per minute on the court) might allow the coach to compare players at the same position with respect to these non-scoring statistics.
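Here is a minimal sketch of how such a per-minute comparison might be tabulated. The weights are not the fitted coefficients from the paper; they are the rough "about one point" values mentioned above for steals, defensive rebounds, and turnovers. The `BoxLine` class, the 40-minute scaling, and the two players are all hypothetical and exist only to show the arithmetic.

```python
from dataclasses import dataclass

# Assumed weights: rough values from the text, not the paper's coefficients.
APPROX_POINT_VALUES = {
    "steals": 1.0,
    "def_rebounds": 1.0,
    "turnovers": -1.0,
}


@dataclass
class BoxLine:
    """One player's (hypothetical) box score line."""
    name: str
    minutes: float
    steals: int
    def_rebounds: int
    turnovers: int


def non_scoring_points(line):
    """Weighted sum of the non-scoring events in a box score line."""
    return sum(w * getattr(line, stat) for stat, w in APPROX_POINT_VALUES.items())


def per_40_minutes(line):
    """Scale the non-scoring contribution to a full 40-minute game."""
    if line.minutes <= 0:
        raise ValueError("minutes must be positive")
    return non_scoring_points(line) * 40.0 / line.minutes


# Made-up players at the same position.
players = [
    BoxLine("Guard A", minutes=32, steals=3, def_rebounds=4, turnovers=2),
    BoxLine("Guard B", minutes=18, steals=2, def_rebounds=3, turnovers=1),
]

for p in sorted(players, key=per_40_minutes, reverse=True):
    print(f"{p.name}: {non_scoring_points(p):+.1f} raw, {per_40_minutes(p):+.2f} per 40 min")
```

In this made-up example the bench player has the smaller raw total but the larger per-40-minute value, which is exactly the kind of difference the minutes adjustment is meant to surface.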
Tuesday, September 2, 2014
Carrots and Caveats
The Carrot:
To inspire you to read more of my posts, let me tell you a little about myself. The power of prediction is what lured me into pursuing a Master of Science in Statistics after earning an undergraduate degree in Pure Mathematics. If mathematics is the mother of all sciences, then surely statistics is her firstborn son. Few people realize that statistics is an entire field of study, rather than some numbers tacked onto a report at work or some poll in the newspaper. In fact, whenever I admit to being a statistician, the majority of people respond, "Oh, so you're a numbers guy." While that may be true, statistics is SO much more than numbers: it's the systematic study of random variation and the production of meaningful models to understand an academic, financial, or competitive endeavor.
The Caveat:
After completing my graduate work in 2012, I've been applying my science to clinical research. Some days I doubt that statistics is powerful enough to meaningfully predict outcomes that depend so heavily on fickle human nature. Still, as George Box said, "all models are wrong; some models are useful." Since no model is perfect and good statisticians do "precision guesswork", please don't send me angry emails saying, "I applied the model you reported in such-and-such post and we lost a boatload of money." There is always a chance that the model prediction was wrong (due to false assumptions in the data or the modeling) or that you got unlucky. Also, as you read these posts, I'd encourage you to approach all analysis with an open mind tempered by common sense, as statistics should always be used for illumination rather than in support of a preconceived notion.