How to interpret “confidence intervals” in observational studies

I think this old post by @Sander is worth studying:

The frequentist domination of statistics has caused the emphasis on randomization as a methodology to obscure what it was supposed to do in the first place: to create exchangeable observations for comparison.

Bernardo, J. M. (1996). The concept of exchangeability and its applications. Far East Journal of Mathematical Sciences, 4, 111-122.

This confusion stems from the language of basic statistics textbooks not being formally powerful enough to represent uncertainty about the entire experimental model. That can be done with hierarchical models, as Bernardo notes in his abstract (quoted below).

The general concept of exchangeability allows the more flexible modelling of most experimental setups. The representation theorems for exchangeable sequences of random variables establish that any coherent analysis of the information thus modelled requires the specification of a joint probability distribution on all the parameters involved, hence forcing a Bayesian approach. The concept of partial exchangeability provides a further refinement, by permitting appropriate modelling of related experimental setups, leading to coherent information integration by means of so-called hierarchical models. Recent applications of hierarchical models for combining information from similar experiments in education, medicine and psychology have been produced under the name of meta-analysis.
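
The representation theorem Bernardo has in mind is, in its simplest (binary) form, de Finetti's theorem: for an infinitely exchangeable sequence of 0/1 random variables, the joint distribution must be a mixture of i.i.d. Bernoulli models over some distribution $F$ on the parameter:

$$
p(x_1, \ldots, x_n) \;=\; \int_0^1 \prod_{i=1}^{n} \theta^{x_i} (1-\theta)^{1-x_i} \, dF(\theta).
$$

Exchangeability alone forces the mixing distribution $F$ into the model, which is the sense in which a coherent analysis "requires the specification of a joint probability distribution on all the parameters involved."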

The concept of exchangeability is fundamental to the frequentist analysis of data using permutation tests.

Good, P. I. (2002). Extensions of the concept of exchangeability and their applications. Journal of Modern Applied Statistical Methods, 1(2), 34.
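
To make the connection concrete, here is a minimal sketch of a two-sample permutation test in Python (the difference in means is just an illustrative choice of test statistic). Under the null hypothesis the group labels are exchangeable, so shuffling them leaves the joint distribution unchanged, and the observed statistic can be referred to the permutation distribution:

```python
import numpy as np

def permutation_test(x, y, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Under the null, group labels are exchangeable, so we may
    repeatedly shuffle the pooled data and recompute the statistic.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    observed = np.mean(x) - np.mean(y)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the observations under the null
        stat = np.mean(pooled[:len(x)]) - np.mean(pooled[len(x):])
        if abs(stat) >= abs(observed):
            count += 1
    # Add-one correction keeps the p-value away from exactly zero
    return (count + 1) / (n_permutations + 1)

# Example usage with made-up data:
# x = np.array([5.1, 4.8, 6.2, 5.5])
# y = np.array([4.2, 4.9, 4.4, 5.0])
# print(permutation_test(x, y))
```

The key point is that the validity of the p-value rests entirely on the exchangeability of the labels, not on any parametric model for the data.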

I think Bernardo may be a bit extreme when he writes that coherence demands a Bayesian approach, given his own extensive work on reference Bayesian methods, which attempt to maximize the weight of the information in the observed data while still yielding a proper probability distribution.
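
As a concrete illustration (my example, not Bernardo's): for regular one-parameter models the reference prior reduces to the Jeffreys prior $\pi(\theta) \propto \sqrt{I(\theta)}$, where $I(\theta)$ is the Fisher information. For a Bernoulli likelihood this is the Beta(1/2, 1/2) distribution:

$$
\pi(\theta) \;\propto\; \theta^{-1/2}(1-\theta)^{-1/2}, \qquad \theta \in (0,1),
$$

which lets the data dominate the analysis while still integrating to a proper probability distribution.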

If we view any data collection effort as an attempt to minimize all sources of uncertainty (i.e., both aleatory and epistemic), we face a multi-objective decision problem, and such problems generally have a set of non-dominated (Pareto-optimal) solutions rather than a unique optimum.
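
A familiar special case of this tension (my illustration) is the bias-variance decomposition of mean squared error:

$$
\mathbb{E}\big[(\hat\theta - \theta)^2\big] \;=\; \big(\mathbb{E}[\hat\theta] - \theta\big)^2 \;+\; \operatorname{Var}(\hat\theta).
$$

A design or estimator that minimizes one term generally does not minimize the other, so the best we can do is identify the frontier of non-dominated choices.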
