06 August 2009

The rich lode of Web data, experts warn, has its perils. Its sheer volume can easily overwhelm statistical models. Statisticians also caution that strong correlations of data do not necessarily prove a cause-and-effect link.

For example, in the late 1940s, before there was a polio vaccine, public health experts in America noted that polio cases increased in step with the consumption of ice cream and soft drinks, according to David Alan Grier, a historian and statistician at George Washington University. Eliminating such treats was even recommended as part of an anti-polio diet. It turned out that polio outbreaks were most common in the hot months of summer, when people naturally ate more ice cream, showing only an association, Mr. Grier said.

If the data explosion magnifies longstanding issues in statistics, it also opens up new frontiers.

“The key is to let computers do what they are good at, which is trawling these massive data sets for something that is mathematically odd,” said Daniel Gruhl, an I.B.M. researcher whose recent work includes mining medical data to improve treatment. “And that makes it easier for humans to do what they are good at — explain those anomalies.”

via For Today’s Graduate, Just One Word - Statistics - NYTimes.com.

I really enjoy mathematics and statistics, and I have a talent for them. I sometimes wish I had more training in them and had pursued them beyond my two years of calculus, physics, and bioinformatics classes.

My history in the quantitative sciences has, instead, led me to work on information science (with a bent towards health and medicine), which may eventually return me, in some form, to more study of mathematics and statistics.
