Who should define reasonable priors and how?

It is hard for me to imagine doing any experiment without considering the value of the information. Methodology scholars have long complained that experiments have been "underpowered" and that much clinical research is not useful [1].

I. J. Good considered information to be something he called a "quasi-utility" [2, pp. 40–41]. The same theme appears in Lindley [3], Bernardo [4], DeGroot [5], and others throughout the decades after Shannon wrote his classic on information theory.

In terms of what you had written here [6], I was thinking that reflection on the cost of information might help in setting "skeptical" priors. We want the kind of "severe test" that Deborah Mayo would advocate, but we would also want a skeptic who could shift his beliefs within the information budget we have available, and whose prior is consistent with background information.

It may very well be that our (informational) budget does not permit us to persuade the skeptic, in light of what is currently believed.

Objective Bayesians like E. T. Jaynes suggested maximum entropy priors; deriving one reduces to a constrained optimization problem [7].
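
To make that concrete, here is a minimal sketch of the optimization, assuming a discretized effect-size grid and illustrative mean and second-moment constraints standing in for whatever background information we actually have (the grid and constraint values are mine, not from [7]):

    import numpy as np
    from scipy.optimize import minimize

    # Maximum-entropy prior on a discrete grid of effect sizes, subject to
    # (a) probabilities summing to 1, (b) a stated prior mean, and
    # (c) a stated second moment. All constraint values are illustrative.
    theta = np.linspace(-2.0, 2.0, 41)
    target_mean, target_second_moment = 0.0, 0.5

    def neg_entropy(p):
        p = np.clip(p, 1e-12, None)
        return np.sum(p * np.log(p))          # minimizing -H(p) maximizes entropy

    constraints = [
        {"type": "eq", "fun": lambda p: np.sum(p) - 1.0},
        {"type": "eq", "fun": lambda p: np.sum(p * theta) - target_mean},
        {"type": "eq", "fun": lambda p: np.sum(p * theta**2) - target_second_moment},
    ]
    start = np.full(theta.size, 1.0 / theta.size)   # begin at the uniform prior
    res = minimize(neg_entropy, start, bounds=[(0.0, 1.0)] * theta.size,
                   constraints=constraints, method="SLSQP")
    maxent_prior = res.x   # with mean/variance constraints, a discretized Gaussian shape

With only moment constraints the answer takes the familiar exponential-family form, but the same machinery accepts any constraint that can be written as an expectation.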

I was thinking it might also be possible to work Bayes' Theorem in reverse and derive a prior using Good's method of Imaginary Results, with only:

  1. The Skeptic's and Advocate's point estimates,
  2. The Schwarz/Bayesian Information criterion discussed in [8], which they give as
    S = \log \operatorname{pr}(D \mid \hat{\theta}_1, H_1) - \log \operatorname{pr}(D \mid \hat{\theta}_2, H_2) - \tfrac{1}{2}(d_1 - d_2)\log(n)
    (a small computational sketch follows this list),
  3. A maximum sample size,
  4. The amount of shift after seeing the data that would be considered important; I'd take some fraction greater than 0.5 of the distance between the point estimates of the hypothetical Skeptic and Advocate.
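
For item 2, the Schwarz criterion needs nothing beyond the maximized log-likelihoods under each party's model, the parameter counts, and the sample size. A minimal sketch, with hypothetical numbers standing in for the Skeptic's and Advocate's fitted models:

    import numpy as np

    def schwarz_criterion(loglik1, loglik2, d1, d2, n):
        # Kass & Raftery's S: a BIC-style approximation to the log Bayes
        # factor for H1 vs H2, built from the maximized log-likelihoods,
        # the parameter counts d1/d2, and the sample size n.
        return loglik1 - loglik2 - 0.5 * (d1 - d2) * np.log(n)

    # Hypothetical inputs: the Advocate's model (H1) fits slightly better
    # but spends one more parameter; n is the maximum sample size from item 3.
    S = schwarz_criterion(loglik1=-480.2, loglik2=-484.9, d1=2, d2=1, n=200)
    approx_bayes_factor = np.exp(S)   # exp(S) approximates B12 for large n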

There are regions where one or the other party would want to stop the experiment, having been so surprised by the data that they no longer wish to spend more of their budget on observations; these regions would get less weight as a percentage of the total sample size. The regions where the two models intersect would get more weight (more observations). Smooth out the resulting histogram, center it at 0, and there is the skeptic's prior for the experiment.
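
Here is one way I might make that concrete, offered purely as a sketch of the idea rather than a worked-out method: represent each party's expectations as a density around their point estimate, give weight where the densities overlap, down-weight where either would have stopped, then smooth and re-center. All of the numbers below are hypothetical.

    import numpy as np
    from scipy.stats import norm

    # Treat the Skeptic's and Advocate's beliefs as normal densities around
    # their point estimates (locations and scales are illustrative).
    theta = np.linspace(-1.0, 2.0, 301)
    skeptic = norm.pdf(theta, loc=0.0, scale=0.4)    # centered on "no effect"
    advocate = norm.pdf(theta, loc=1.0, scale=0.4)   # centered on the hoped-for effect

    # Overlap = regions where neither party would be very surprised; the tails
    # where one of them would stop the experiment get almost no weight.
    weights = np.minimum(skeptic, advocate)

    # Smooth the raw weights, re-center the support so the mass sits at 0,
    # and normalize to obtain the skeptic's prior density.
    kernel = norm.pdf(np.linspace(-3, 3, 31))
    smoothed = np.convolve(weights, kernel / kernel.sum(), mode="same")
    centered = theta - np.sum(theta * smoothed) / np.sum(smoothed)
    step = centered[1] - centered[0]
    prior = smoothed / (smoothed.sum() * step)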

(I have a bit more thinking to do about this, but it strikes me as plausible.)

This seems to be related to a Bayesian power analysis [9]. It is obvious to me that areas where power is either extremely low or extremely high should get less weight, while regions where neither Skeptic nor Advocate would be terribly surprised would get more weight, implying that more observations would be needed there to convince either party that the other was correct.
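
A rough sketch of what I mean, following the usual "unconditional power" construction of averaging the frequentist power curve over a prior on the effect size; the test, sample size, and prior below are placeholders, and in the scheme above the prior weights would come from the smoothed histogram:

    import numpy as np
    from scipy.stats import norm

    def conditional_power(delta, n, sigma=1.0, alpha=0.05):
        # Power of a two-sided z-test comparing two arms of size n each,
        # at true difference delta with known standard deviation sigma.
        z = norm.ppf(1 - alpha / 2)
        se = sigma * np.sqrt(2.0 / n)
        return norm.sf(z - delta / se) + norm.cdf(-z - delta / se)

    # Unconditional (Bayesian) power: average conditional power over the prior.
    delta = np.linspace(-1.0, 2.0, 301)
    prior = norm.pdf(delta, loc=0.5, scale=0.5)
    prior /= prior.sum()
    unconditional_power = np.sum(prior * conditional_power(delta, n=64))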

Economists and other scholars dispute the axioms that define the subjective expected utility model all the time. I think it is clear that everyday clinical research is far from how "rational" actors would behave.

But ultimately it doesn't matter if we think of changes in utility as information (in the mathematical sense). Bayesian decision-theoretic methods remain valid in a world with state-dependent utilities and "irrational" actors, so long as there is some penalty for excess optimism (or skepticism) that will induce agents to change their beliefs, given finite sample sizes.
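
The simplest penalty of that kind I can point to is a proper scoring rule such as the log score, which charges an agent more, the less probability they placed on what was actually observed. A toy illustration with made-up numbers:

    from scipy.stats import norm

    observed = 0.2                                  # the realized effect estimate
    skeptic = norm(loc=0.0, scale=0.5)              # honest uncertainty
    optimist = norm(loc=1.5, scale=0.1)             # excess optimism, little spread

    penalty_skeptic = -skeptic.logpdf(observed)     # modest log-score penalty
    penalty_optimist = -optimist.logpdf(observed)   # very large penalty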

  1. Ioannidis, J. (2016). Why Most Clinical Research Is Not Useful. (link)

  2. Good, I. J. (1983). Good Thinking: The Foundations of Probability and Its Applications. (link)

  3. Lindley, D. V. (1956). On a Measure of the Information Provided by an Experiment. (link)

  4. Bernardo, J. M. (1979). Expected Information as Expected Utility. (link)

  5. DeGroot, M. H. (1984). Changes in Utility as Information. (link)

  6. Harrell, F. (2019). Bayesian Power: No Unobservables.

  7. Sivia, D. S.; Skilling, J. (2006). Data Analysis: A Bayesian Tutorial. (link)

  8. Kass, R. E.; Raftery, A. E. (1995). Bayes Factors. Journal of the American Statistical Association, 90(430), 773–795. (link)

  9. Ubersax, J. (2007). Bayesian Unconditional Power.
