Can Confidence Intervals Indicate Uncertainty/Precision?

Many people who interpret confidence intervals (rather than credible intervals) use their span as an indication of precision and uncertainty. However, there are arguments that confidence intervals simply cannot do this: “There is no necessary connection between the precision of an estimate and the size of a confidence interval” (Morey et al. 2016).

I’d like to hear people’s thoughts. Do confidence intervals have any utility?

Do CIs indicate uncertainty? No. First, “uncertainty” has not been defined well enough. Second, the definition is parametric and model-dependent: all a 95% CI says is that, if we repeated the whole procedure many times under the assumed model, about 95% of the resulting intervals would contain the true parameter. It says nothing about the one interval in front of you.
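To make the procedural claim concrete, here is a minimal simulation of my own (all numbers hypothetical, not from any study mentioned in this thread): draw repeated samples from a known population and count how often a nominal 95% interval for the mean covers the true value.

```python
import random

# Hypothetical illustration: the frequentist guarantee is about the
# long-run behavior of the interval-constructing procedure, not about
# any single interval. Assumed setup: known sigma, normal population.
random.seed(0)
TRUE_MEAN, SIGMA, N, TRIALS = 10.0, 2.0, 50, 2000

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    m = sum(sample) / N
    se = SIGMA / N ** 0.5          # known-sigma case for simplicity
    lo, hi = m - 1.96 * se, m + 1.96 * se
    covered += lo <= TRUE_MEAN <= hi

print(covered / TRIALS)  # close to 0.95 over many repeats
```

The coverage rate hovers near 0.95 across repetitions, yet for any particular interval the true mean is either in it or not; the 95% never attaches to one realized interval.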

I believe there was an article in Nature some time ago showing historical CIs for a physical constant (the neutron’s mass, or something like that). You can see visually that earlier intervals often failed to contain the value that was later accepted as correct.

The case is much worse for survey data, where asking 1,000 online respondents in no way guarantees that you have captured a representative slice of 320,000,000 opinions. Unreported victimization poses the same problem ---- no assumption can prove something about reality; you can only try assuming different things to get a number at all.

In the case of financial markets, where I’m more comfortable, it’s clearer that a “100-sigma move” can and does happen. In an options model, a material change (say an RTO of a much larger entity) could likewise be labeled a change in volatility. A less convoluted, less stats-centric way of viewing the world is to say that the statistical model captured enough of the moments for a while to make markets successfully, but when the world changes you need a new model. This reinforces the quick-and-dirty view in which stats provide useful fast Excel answers, not philosophically justifiable “uncertainty quantification”.
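A back-of-the-envelope calculation (my own sketch, not from the post above) shows why an observed “100-sigma move” indicts the Gaussian model rather than describing astonishing luck: the upper-tail log-probability of a k-sigma move under a normal distribution can be approximated with the standard asymptotic formula P(Z > k) ≈ exp(-k²/2) / (k·√(2π)).

```python
import math

def normal_log_tail(k: float) -> float:
    """Natural log of P(Z > k) for a standard normal, using the
    large-k asymptotic approximation exp(-k^2/2) / (k * sqrt(2*pi))."""
    return -k * k / 2 - math.log(k * math.sqrt(2 * math.pi))

for k in (5, 20, 100):
    print(k, normal_log_tail(k))
# A "100-sigma" event has natural-log probability around -5005,
# i.e. roughly 1e-2174: observing one is evidence the Gaussian
# model is wrong, not that a miracle occurred.
```

If such moves happen in practice, the honest conclusion is that the tails were never Gaussian, which is exactly the “you need a new model” point.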

Do confidence intervals have any utility? Absolutely yes. In my opinion, every statistical number should be reported as a CI range (say 98%), since a point estimate in the middle gives a false sense of precision. For example, in Cathy Cohen / Gen Next’s survey cited above, the public is led to draw speculative conclusions from the mean which they would never draw from the range.

“According to preliminary estimates, young African Americans are pro-socialism somewhere between -8 and +28 percent more than pro-capitalism ---- although we didn’t ask a direct comparison” would be accurate, and reporting (-8, +28) versus (-29, +5) would not invite graph readers to compare means when, statistically, we cannot tell whether their difference is negative or positive.
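The point about not comparing subgroup means can be sketched numerically. The figures below are entirely hypothetical (not the actual survey numbers): two estimated proportions from modest subsamples, and a simple Wald interval for their difference.

```python
import math

# Hypothetical numbers, NOT the real survey figures: two subgroup
# proportions and a Wald confidence interval for their difference.
def diff_ci(p1: float, n1: int, p2: float, n2: int, z: float = 1.96):
    """Wald interval for p1 - p2 from two independent proportions."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    d = p1 - p2
    return d - z * se, d + z * se

lo, hi = diff_ci(0.45, 250, 0.40, 250)   # 45% vs 40%, 250 respondents each
print(lo, hi)  # the interval straddles zero: the sign of the gap is undecided
```

Each subgroup estimate can look clean on its own while the interval for their difference straddles zero, which is why plotting point estimates side by side invites comparisons the data cannot support.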

It would also make clear that very few Latin@ Americans or Asian Americans were surveyed, whereas the plot forwarded to the press makes it look as if Latin@, Asian, Black, and WNH respondents can be compared directly.

In the last study cited, the lack of a direct comparison question between socialism and capitalism is more damning than the failure to use intervals. It suggests that statisticians need to spend more time on experimental design, and that “pure stats” debates probably won’t capture the factors most relevant to truth and inference ---- but that’s another topic.


The bad news: it is not possible to make probabilistic statements about the population from a sample alone. If I see 200 black ravens, I cannot say how likely it is that all ravens are black.

The good news: in practice, “numerical differences between both approaches [frequentist versus Bayesian] are small, sometimes even smaller than those between two competing frequentist or two competing Bayesian approaches.”

Credible Confidence: A Pragmatic View on the Frequentist vs Bayesian Debate, by Casper J. Albers, Henk A. L. Kiers, and Don van Ravenzwaaij.