Bayesian prediction when you have the whole population?

f2harrell · December 30, 2025, 3:11pm

@Pavlos_Msaouel that is beautifully stated, and captures my thinking better than I did. To me EPaM is a simpler concept that is far more appealing. I don’t think FaM really works without conceiving of a data generating mechanism that can be replicated, i.e., that transports. So FaM assumes all that EPaM assumes and makes an additional assumption about the existence and meaning of “populations”. “Populations” imply multi-time events, whereas most events are one-time. For example, a clinical trial cannot be perfectly replicated because of changing concomitant therapies, patients, and epidemics. This does have ramifications for the data model and experimental design, again applying equally to both schools of thought. For example, we need some kind of additivity assumption regarding the effects of concomitant therapies that were not assessed, such that there is merely a shift and not a negation of the treatment effect of interest.

Thanks for continuing to engage in this fascinating discussion.

Pavlos_Msaouel · December 30, 2025, 4:11pm

Yes, this is a good way to build intuition about all these concepts, including the differences between various versions of Bayes versus frequentism. Keep in mind that even in cases where the frequentist results numerically coincide with the Bayesian posterior, they are still different: the Bayesian model draws inferences from the posterior distribution in the parameter space, whereas the frequentist model makes sample-space inferences. This has downstream implications that cannot completely go away.

Just for fun, in addition to the uniform Bayes-Laplace Beta(1, 1) and the Jeffreys Beta(0.5, 0.5), you can add the improper Haldane prior Beta(0, 0), which yields a posterior mean exactly equal to the sample proportion.
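As a minimal sketch of why this holds, assume a binomial likelihood with x events in n patients. A Beta(a, b) prior then gives a Beta(a + x, b + n - x) posterior, whose mean is (a + x) / (a + b + n); setting a = b = 0 leaves x / n. The counts below are hypothetical and are not the COPD data discussed in this thread, and `posterior_mean` is just an illustrative helper name.

```python
from fractions import Fraction

def posterior_mean(a, b, x, n):
    """Mean of the Beta(a + x, b + n - x) posterior for x events in n trials."""
    return (Fraction(a) + x) / (Fraction(a) + Fraction(b) + n)

# Hypothetical counts, for illustration only (not the thread's COPD data).
x, n = 7, 20

priors = {
    "Bayes-Laplace Beta(1, 1)": (1, 1),
    "Jeffreys Beta(0.5, 0.5)": (Fraction(1, 2), Fraction(1, 2)),
    "Haldane Beta(0, 0)": (0, 0),
}

for name, (a, b) in priors.items():
    mean = posterior_mean(a, b, x, n)
    print(f"{name}: posterior mean = {float(mean):.4f}")

print(f"Sample proportion x/n = {x / n:.4f}")
```

The two proper priors pull the posterior mean slightly toward 0.5, while the Haldane prior adds no pseudo-observations at all.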

But whereas Beta(1, 1) assumes that extreme outcomes are as likely as any other outcome, Beta(0.5, 0.5) and Beta(0, 0) assume that extreme outcomes are much more likely than anything else by concentrating their weight near 0 and 1. The posteriors from Beta(0, 0) here are proper provided there is at least one COPD and one non-COPD case in each group. Otherwise they are improper: they integrate to infinity rather than to one.
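To put rough numbers on this comparison, here is a small sketch. It uses the closed-form CDF of Beta(0.5, 0.5), F(p) = (2/π) arcsin(√p), to compare the prior probability within 0.05 of either extreme against the uniform prior. The 0.05 cutoff and the helper names are arbitrary choices for illustration. The Haldane prior cannot be included in that comparison because its density, proportional to 1 / (p(1 - p)), is not integrable, so it assigns no normalised probabilities. The last function encodes the properness condition: the Haldane posterior is Beta(x, n - x), which is proper only when both parameters are positive.

```python
import math

def jeffreys_cdf(p):
    """CDF of Beta(0.5, 0.5), also known as the arcsine distribution."""
    return (2 / math.pi) * math.asin(math.sqrt(p))

def extreme_mass(cdf, eps):
    """Prior probability that p lies in [0, eps] or [1 - eps, 1]."""
    return cdf(eps) + (1 - cdf(1 - eps))

def haldane_posterior_is_proper(x, n):
    """Beta(x, n - x) is proper only with at least one event and one non-event."""
    return 0 < x < n

eps = 0.05  # arbitrary cutoff for "extreme", chosen for illustration
print(f"Uniform prior mass near extremes:  {extreme_mass(lambda p: p, eps):.3f}")
print(f"Jeffreys prior mass near extremes: {extreme_mass(jeffreys_cdf, eps):.3f}")

# Hypothetical counts: 7 of 20 gives a proper posterior, 0 of 20 does not.
print(haldane_posterior_is_proper(7, 20))
print(haldane_posterior_is_proper(0, 20))
```

Under these assumptions the Jeffreys prior puts close to three times as much probability on the outer tenth of the unit interval as the uniform prior does.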