When I learn a new theorem in statistics, I often try plugging in some numbers to see roughly what error rate is attained, or how large a sample size is needed to get good results. I think this is a good practice for understanding the limitations of the theory. The title of this post may seem inflammatory, but I should clarify that I don’t think “useless” things have no value. A theorem that is practically useless may still be interesting for the mathematical techniques involved. It may show the way to refinements that become increasingly useful. Or it may suggest that a procedure is not unreasonable, so that further empirical study can analyze the finite-sample performance of that procedure.
All that being said, I also think that because mathematics is more technically difficult than empirical study, it gets a bit too much prestige and respect. So I intend to highlight some examples of theoretical results that are considered high impact but are severely limited in their practical usefulness.
My first example comes from a series of recent papers appearing in what is considered the most prestigious journal of statistical theory, the Annals of Statistics. These papers (referenced below) apply Stein’s-method-type arguments to obtain explicit bounds on the approximation error caused by assuming non-Gaussian data are actually Gaussian. The methods would be practically useful if the error bounds were small enough…
The papers are so technical that the error bounds take very complicated forms, so I’m referencing a version from a tutorial by Larry Wasserman instead of the more complicated version in the original papers. If there are n (i.i.d.) data points in p-dimensional space, and we approximate the distance of their sample mean from the true mean (taking the maximum over all coordinates) by assuming the corresponding probability can be computed from a multivariate Gaussian, then the absolute error between these two probabilities is upper bounded by
Here’s where it gets useless. Suppose we want the error to be less than or equal to 1/10, and we are only in dimensions. How large does n have to be? Over .
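The plug-in-some-numbers check described above can be sketched in a few lines. This is only an illustration under stated assumptions: it supposes the bound has the form c (log p)^a / n^b, with n the sample size, p the dimension, and default values a = 7/8, b = 1/8, c = 1. Those exponents and the constant are assumptions of this sketch (an n^(-1/8) rate is the one usually associated with these results) and should be replaced with the actual expression from the tutorial. The function names and the example dimension are hypothetical.

```python
import math

def assumed_bound(n, p, a=7/8, b=1/8, c=1.0):
    """ASSUMED form of the error bound: c * (log p)^a / n^b.

    The exponents a, b and the constant c are illustrative assumptions,
    not values taken from the papers.
    """
    return c * math.log(p) ** a / n ** b

def min_sample_size(eps, p, a=7/8, b=1/8, c=1.0):
    """Sample size at which the assumed bound first drops to eps (fixed p >= 2).

    Solves c * (log p)^a / n^b <= eps for n in closed form, then rounds up.
    The result is exact only up to floating-point rounding.
    """
    return math.ceil((c * math.log(p) ** a / eps) ** (1 / b))

# Hypothetical example: target error 1/10 in p = 10 dimensions.
example_n = min_sample_size(0.1, 10)
print(f"n must be at least about {example_n:.3g}")
```

Because of the 1/b = 8th power, halving the target error multiplies the required sample size by 2^8 = 256 under this assumed form, which is why even modest accuracy targets lead to enormous values of n.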
The power of these theorems is that they give asymptotic results for when p is increasing with n. So, for example, suppose . Then the error bound is less than or equal to 1/10 if , and the corresponding value of .
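The same kind of check can be sketched when the dimension grows with the sample size. Everything specific here is an assumption made for illustration: the bound is again taken to be c (log p)^a / n^b with a = 7/8, b = 1/8, c = 1, and the growth rule p = n is a hypothetical stand-in, not the rule used in the example above. Since n now appears on both sides, the sketch searches numerically instead of solving in closed form.

```python
import math

def assumed_bound(n, p, a=7/8, b=1/8, c=1.0):
    """ASSUMED form of the error bound: c * (log p)^a / n^b (illustrative only)."""
    return c * math.log(p) ** a / n ** b

def min_n_growing_p(eps, p_of_n, n_start=10**4):
    """Smallest n >= n_start at which the assumed bound is <= eps when p = p_of_n(n).

    Assumes the bound is decreasing in n for n >= n_start, which holds for
    the hypothetical rule p = n used below (the bound peaks near n = e^7).
    """
    hi = n_start
    # Double n until the bound falls to eps or below.
    while assumed_bound(hi, p_of_n(hi)) > eps:
        hi *= 2
    if hi == n_start:
        return n_start
    lo = hi // 2  # the bound at lo is still above eps
    # Bisect between the last failing and first passing values.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if assumed_bound(mid, p_of_n(mid)) <= eps:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical growth rule: dimension equal to sample size.
n_needed = min_n_growing_p(0.1, lambda n: n)
print(f"n (and p) must be at least about {n_needed:.3g}")
```

The doubling-then-bisection search keeps the number of bound evaluations small even when the answer is astronomically large, and Python's arbitrary-precision integers avoid overflow along the way.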
Again, the theory in these papers is beautiful and is probably setting a good direction for further inquiry. But we desperately need to supplement impressive-looking math with more empirical study, and perhaps make it the norm for such empirical study to be included in the theoretical papers that make it into the top journals.
Larry Wasserman, Stein’s Method and The Bootstrap in Low and High Dimensions: A Tutorial, http://www.cs.cmu.edu/~aarti/SMLRG/Stein.pdf
Victor Chernozhukov, Denis Chetverikov, Kengo Kato (2012). Central Limit Theorems and Multiplier Bootstrap when p is much larger than n. http://arxiv.org/abs/1212.6906
Victor Chernozhukov, Denis Chetverikov, Kengo Kato (2013). Comparison and anticoncentration bounds for maxima of Gaussian random vectors. http://arxiv.org/abs/1301.4807