Are there situations in which a frequentist approach is inherently better than Bayesian?

Hi,

I have a bit of an impractical question, but it still bugs me. I heard a casual remark about frequentist approaches being better in some settings, and it got me thinking.

Apart from a local culture in a domain, or there not being a Bayesian alternative, are there situations in which a frequentist approach shines, despite there being a reasonable Bayesian alternative?

Cheers,

Sanne

> Apart from a local culture in a domain, or there not being a Bayesian alternative, are there situations in which a frequentist approach shines, despite there being a reasonable Bayesian alternative?

Much of this “Bayes vs. frequentist” dispute is philosophical, and I’ve come to see it as counterproductive. I would study a few of Sander Greenland’s posts in this forum (search for posts by @Sander or @Sander_Greenland); he points out that the set of admissible frequentist procedures can also be considered Bayesian procedures.

This complete class theorem (which Abraham Wald proved only for parametric problems) was recently extended (using tools from mathematical logic, one of my other interests) to arbitrary (i.e., infinite-dimensional) problems.

The TL;DR summary is that historically (i.e., over the past 100 years), frequentist methods led to computable techniques that could be implemented with very limited computing technology and, when properly used, can get you very close to a Bayesian result without the risk of basing the analysis on erroneous prior information. This remains true in nonparametric situations, where the Bayesian solution is often extremely complex to compute (though that is changing).
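As a standard illustration of that closeness (my example, not part of the theorem itself): for a normal mean with known variance, the flat-prior Bayesian credible interval and the frequentist confidence interval coincide exactly. A minimal Python sketch:

```python
# Minimal sketch: flat-prior Bayesian interval vs. frequentist CI for a
# normal mean with known sigma; the two are numerically identical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sigma, n = 2.0, 25
x = rng.normal(loc=1.0, scale=sigma, size=n)
xbar, se = x.mean(), sigma / np.sqrt(n)

# Frequentist 95% confidence interval: xbar +/- 1.96 * sigma / sqrt(n)
z = stats.norm.ppf(0.975)
print((xbar - z * se, xbar + z * se))

# Bayesian 95% credible interval under a flat prior:
# mu | data ~ Normal(xbar, sigma^2 / n), so the endpoints match exactly
print(stats.norm.ppf([0.025, 0.975], loc=xbar, scale=se))
```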

Related Readings:
B. Efron (1986). Why Isn’t Everyone a Bayesian? The American Statistician, 40(1), 1–5. DOI: 10.1080/00031305.1986.10475342

I continue to be influenced by Herman Chernoff’s comment on the paper:

> With the help of theory, I have developed insights and intuitions that prevent me from giving weight to data dredging and other forms of statistical heresy. This feeling of freedom and ease does not exist until I have a decision theoretic, Bayesian view of the problem … I am a Bayesian decision theorist in spite of my use of Fisherian tools.

Cobb, G. W. (2007). The Introductory Statistics Course: A Ptolemaic Curriculum? Technology Innovations in Statistics Education.

See the discussion of this paper in this thread:

The first 20 minutes of this talk by Michael I. Jordan on the relationship between Bayesian and frequentist methods are instructive:


The only time I can think of a frequentist analysis having an advantage over a Bayesian analysis is when a proper frequentist analysis can be done quickly and you just don’t have time to do a Bayesian one. And that’s assuming that the frequentist analysis is accurate. Many commonly used frequentist procedures (logistic regression, random effects models, …) use approximations that are not guaranteed to be accurate enough in a given situation.
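For instance, here is a minimal simulation sketch of that accuracy problem (my own example, with arbitrary hypothetical values for `p_true` and `n`): the Wald interval for a binomial proportion rests on the same Gaussian approximation that underlies routine logistic-regression output, and with a small sample and a probability near the boundary its actual coverage falls well short of the nominal 95%.

```python
# Minimal sketch: coverage of the Wald (Gaussian-approximation) interval
# for a binomial proportion, with small n and p near the boundary.
import numpy as np

rng = np.random.default_rng(0)
p_true, n, z = 0.05, 30, 1.96  # hypothetical values for illustration
sims = 100_000

x = rng.binomial(n, p_true, size=sims)
phat = x / n
se = np.sqrt(phat * (1 - phat) / n)  # the usual Wald standard error
covered = (phat - z * se <= p_true) & (p_true <= phat + z * se)
print(covered.mean())  # noticeably below the nominal 0.95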

The Bayesian approach just opens you up to so many more possibilities because of a completely different way of thinking. Take a look at this for example.

Thank you, I think this is a useful frame. Frequentist methods can be applied quickly, but the devil is in the details. Those details can be worked out, but they need verification. In the Bayesian approach the details are up front, and the approach allows for a lot of model extensions. This is also Richard McElreath’s take, I think. I pasted some of the advantages from your RMS site below (with a small numerical sketch after the list).

====
There are many advantages to fitting models with a Bayesian approach when compared to the frequentist / maximum likelihood approach that receives more coverage in this text. These advantages include

  • the ability to use outside information, e.g. the direction or magnitude of an effect, magnitude of interaction effect, degree of nonlinearity
  • getting exact inference (to within simulation error) regarding model parameters without using any Gaussian approximations as used so heavily in frequentist inference
  • getting exact inference about derived parameters that are nonlinear transformations of the original model parameters, without using the delta method approximation so often required in frequentist procedures
  • devoting less than one degree of freedom to a parameter by borrowing information
  • getting exact inference using ordinary Bayesian procedures when penalization/shrinkage/regularization is used to limit overfitting
  • obtaining automatic inference about any interval of possible parameter values instead of just using values to bring evidence against a null value
  • obtaining exact inference about unions and intersections of various assertions about parameters, e.g., the probability that a treatment reduces mortality by any amount or reduces blood pressure by ≥ 5 mmHg
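The last two bullets lend themselves to a quick numerical illustration. A minimal sketch with made-up two-arm trial counts (my numbers, not an RMS example): given posterior draws, inference about a derived parameter such as a risk ratio, or about compound assertions, reduces to counting draws.

```python
# Minimal sketch: "exact" (to within simulation error) inference about
# derived parameters and compound assertions from posterior draws.
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical two-arm trial: deaths / total in control and treatment arms.
deaths_c, n_c = 30, 100
deaths_t, n_t = 20, 100

# Conjugate Beta(1, 1) priors give Beta posteriors for each arm's risk.
p_c = rng.beta(1 + deaths_c, 1 + n_c - deaths_c, size=1_000_000)
p_t = rng.beta(1 + deaths_t, 1 + n_t - deaths_t, size=1_000_000)

rr = p_t / p_c  # derived parameter: risk ratio, no delta method needed
print((p_t < p_c).mean())            # P(treatment reduces mortality at all)
print((p_c - p_t >= 0.05).mean())    # P(absolute risk reduction >= 5 points)
print(np.quantile(rr, [0.025, 0.975]))  # credible interval for the ratio
```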

Thank you for your elaborate reply. I grasp what you are saying, I think, and will dive into some of the links you provided.

This thread seems like a good place to remind everyone about Fisher’s comments to his son-in-law George Box about the Bayes-Laplace approach. :wink:


This presentation by Sandy Zabell on the relationship between Fisher and Bayes is also instructive. A modern take on this old dispute explores situations where Neyman–Pearson (NP), fiducial, and Bayesian procedures are in close agreement. This can be done in an estimation context; getting agreement on testing is more challenging. This goes under the acronym BFF – Bayes, Frequentist, Fiducial Inference – Best Friends Forever?


Nice! It really comes down to how one defines the nature of probability. A common Bayesian view is that probabilities are not inherent properties of objects or events and do not exist outside of the mind. This pushes the discussion of what we do not know toward some kind of prior probability. Not meant to flame, just to understand better.

I think you’re right. I. J. Good explained why probabilities must be partially in the mind: a probability is an information measure and is a function of the information available to the person quoting the probability. Good has an example where a card player observes that one card is scuffed and thus will stick to the card below it. That player will assign different probabilities than players who are not aware of this.
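A toy version of that point (my construction, not Good’s actual scuffed-card setup): the same physical event, the tenth card dealt being a heart, gets different probabilities from observers with different information about the first nine cards.

```python
# Minimal sketch: probability as a function of the observer's information.
import numpy as np

rng = np.random.default_rng(7)
deck = np.array([0] * 13 + [1] * 39)  # 0 = heart, 1 = any other suit
sims = 100_000

tenth_is_heart = np.empty(sims, dtype=bool)
no_hearts_seen = np.empty(sims, dtype=bool)
for i in range(sims):
    d = rng.permutation(deck)
    tenth_is_heart[i] = d[9] == 0
    no_hearts_seen[i] = d[:9].sum() == 9  # all of the first 9 are non-hearts

print(tenth_is_heart.mean())                  # ~0.25: uninformed observer
print(tenth_is_heart[no_hearts_seen].mean())  # ~13/43 ~ 0.30: informed one
```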


I. J. Good is new to me; I will keep his work in mind.

If you are interested and like podcasts, I can recommend this one with David Spiegelhalter, former president of the Royal Statistical Society: #50 Ta(l)king Risks & Embracing Uncertainty, with David Spiegelhalter – Learning Bayesian Statistics

Another podcast in this series, with Bernoulli’s Fallacy author Aubrey Clayton, points to the work of E. T. Jaynes: #51 Bernoulli’s Fallacy & the Crisis of Modern Science, with Aubrey Clayton – Learning Bayesian Statistics

Clayton made a series of videos on Jaynes’ book: https://www.youtube.com/watch?v=rfKS69cIwHc&list=PL9v9IXDsJkktefQzX39wC2YG07vw7DsQ_
