If there were ever a shift to subjective Bayesian methods for observational research, what types of priors would be most defensible? Specifically, do most scientists consider the “null bias” to be justified, or not? Do research consumers (i.e., other scientists, clinicians), by default, tend to discount the possibility that certain cause/effect relationships might exist, simply because accepting the possibility that they do exist would be inconvenient? And is this approach “unscientific”?
In previous posts, Chris asserted that observational researchers who habitually overinterpret their findings are effectively short-circuiting the process of scientific discovery. They are essentially trying to do an end-run around the effortful evidence triangulation that’s needed to inform scientific decision-making. Conversely, those who decry “nullism” seem to be accusing research consumers with clinical/scientific backgrounds of short-circuiting scientific discovery in the opposite direction (by completely disregarding the potential value of any effects with confidence intervals that cross the null).
This 2013 publication by Greenland and Poole discusses the issue:
This 2019 post from Andrew Gelman’s blog is also relevant, as are the comments that follow it:
https://statmodeling.stat.columbia.edu/2019/08/12/here-are-some-examples-of-real-world-statistical-analyses-that-dont-use-p-values-and-significance-testing/
These passages from the 2013 Greenland/Poole article stood out (bolding is mine):
“Our stand against spikes directly contradicts a good portion of the Bayesian literature, where null spikes are used too freely to represent the belief that a parameter “differs negligibly” from the null. In many settings we see, even a tightly concentrated probability near the null has no basis in genuine evidence. Many scientists and statisticians exhibit quite a bit of irrational prejudice in favor of the null based on faith in oversimplified physical models; Shermer…is a vivid example involving cell phones and cancer (see the Greenland…chapter for a discussion). This null prejudice also arises more subtly from confusion of decision rules with inference rules, and from adoption of simplicity or parsimony as a metaphysical principle rather than as an effective heuristic…
Cultural norms vary among research areas on this question. In psychological research, for instance, many hold that the null hypothesis is almost never true…We may be highly certain that any effect present is small enough so that it would make sense to behave as if the null were true until presented with sufficient evidence otherwise (a practice both Fisher and Neyman recommended); this is a heuristic use of parsimony. But a prior that a hypothesis is (for now) a useful approximation to the truth can lead to results quite different from using a spiked prior (which presumes there is evidence that the tested hypothesis is exactly true)…When there is no such evidence, a spike represents an unscientific faith in, or commitment to, the null, with no empirical foundation in most health and social-science applications…”
First consider this phrase:
“This null prejudice also arises more subtly from confusion of decision rules with inference rules.”
We should be clear about who’s responsible for this confusion. As long as observational research is trumpeted in clinical journals, any inappropriate conflation of “inference rules” with “decision rules” is not going to be the fault of clinicians. Researchers who publish observational research in a clinical journal know that their audience will be clinicians and that clinicians are in the business of making clinical decisions. Therefore, they are sending a very strong signal that they expect clinicians to use their findings for this purpose.
Next, let’s consider the following claim:
“…a prior that a hypothesis is (for now) a useful approximation to the truth can lead to results quite different from using a spiked prior (which presumes there is evidence that the tested hypothesis is exactly true)…When there is no such evidence, a spike represents an unscientific faith in, or commitment to, the null, with no empirical foundation in most health and social-science applications.”
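To make the distinction in that passage concrete, here is a minimal Python sketch. It is my own illustration, not anything taken from the Greenland/Poole article. It assumes a normal likelihood for a log hazard ratio and compares two priors: a "spiked" prior that puts a lump of probability on exactly zero, with the remainder spread over a normal "slab", and a continuous normal prior that is merely concentrated near zero. The function names, the spike mass, the slab width, and the "negligible effect" half-width `delta` are all hypothetical choices.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd)."""
    return 0.5 * math.erfc(-(x - mean) / (sd * math.sqrt(2)))

def spike_posterior_prob_null(est, se, spike_mass, slab_sd):
    """Posterior probability that the effect is exactly zero under a
    spike-and-slab prior: mass `spike_mass` at 0, the rest on Normal(0, slab_sd)."""
    m0 = normal_pdf(est, 0.0, se)                                # marginal likelihood, spike
    m1 = normal_pdf(est, 0.0, math.sqrt(se**2 + slab_sd**2))     # marginal likelihood, slab
    return spike_mass * m0 / (spike_mass * m0 + (1 - spike_mass) * m1)

def continuous_posterior_prob_near_null(est, se, prior_sd, delta):
    """Posterior probability that |effect| < delta under a continuous
    Normal(0, prior_sd) prior (conjugate normal-normal update)."""
    post_var = 1.0 / (1.0 / prior_sd**2 + 1.0 / se**2)
    post_mean = post_var * est / se**2
    post_sd = math.sqrt(post_var)
    return normal_cdf(delta, post_mean, post_sd) - normal_cdf(-delta, post_mean, post_sd)

# Hypothetical inputs: estimated log hazard ratio and its standard error.
est, se = math.log(1.30), 0.16

print("spiked prior, P(effect == 0 | data):",
      round(spike_posterior_prob_null(est, se, spike_mass=0.5, slab_sd=0.5), 3))
print("continuous prior, P(|effect| < 0.05 | data):",
      round(continuous_posterior_prob_near_null(est, se, prior_sd=0.2, delta=0.05), 3))
```

Under the continuous prior, the posterior probability that the effect is exactly zero is always zero, so the only meaningful question is how much probability sits in a "negligible" range. Under the spiked prior, the exact null can retain substantial posterior probability. That is one way to see why the two kinds of prior need not lead to the same conclusions.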
I don’t understand all the possible ways to define priors using subjective Bayesian methods, in order to reflect various degrees of belief that an important effect is present. But, after reading many observational studies over many years, I certainly don’t believe that a researcher’s hypothesis should be considered credible by default, simply because it exists. Most observational researchers cite “prior evidence” to justify their studies. They are usually trying to build a case that a certain exposure “causes” an outcome (though they rarely admit their causal aims). However, the prior evidence in question very often seems, scientifically speaking, either very flimsy or utterly uncompelling. For example, given the abysmal approval rate for drugs that show promise in a test tube, there’s usually little reason to believe that a repurposed drug will, strictly by virtue of its mechanism of action, prove efficacious for a new indication that’s completely different from the one for which it’s approved. An investigator’s prior might be optimistic, while other scientists’ priors might be skeptical.
The surreptitious yet widely acknowledged practice known as HARKing (hypothesizing after the results are known) is another reason why researchers’ hypotheses should not necessarily be afforded much credibility. HARKing reflects hopelessly perverted academic incentives and disdain for the labour required for important scientific discovery. For readers with any relevant subject matter knowledge, it’s dead easy to detect and really aggravating when it’s detected. Several years ago, I encountered a publication in which every article in every issue involved painfully obvious data dredging followed by HARKing; clearly the editors were fine with this approach!
For the reasons noted above, I don’t agree that the mere existence of a researcher’s hypothesis renders a null-based prior unscientific. Similarly, I wouldn’t agree that an unusually fortunate occurrence represents a “miracle” just because a religious person tells me that it does. Going forward, observational researchers will need to come to grips with the fact that scientifically compelling evidence triangulation takes a LOT more effort than many (?most) believe.
In short, clinicians do indeed harbour a “null prejudice.” And this prejudice does represent a “heuristic use of parsimony.” But, rather than viewing this particular heuristic in a derogatory light (i.e., conflating it with intellectual laziness), we should, arguably, recognize it as both justifiable and necessary in the current observational research ecosystem. We have been inundated by such a huge volume of egregiously overinterpreted, (largely) poor-quality research for so many years that we would be completely irresponsible NOT to adopt a highly conservative approach to dealing with it. The daily tsunami of groundbreaking observational research “discoveries” would keep us running in a thousand different directions if we were to take every weak observational effect seriously. Clinicians simply wouldn’t be able to function without the “null prejudice”! Unwillingness to grapple for three hours with the substantive shortcomings of every published observational study isn’t a sign of intellectual laziness, but rather a justifiable and absolutely essential survival mechanism. Finally, and most importantly, accusations of unscientific prejudice also distract from the much bigger actual problem: the poor quality of much modern observational research. What’s needed now, to restore lost credibility, is adoption of a much more cautious and conservative approach to the presentation of observational research findings.
If anyone knows of any published observational studies in medicine that used subjective Bayesian designs and generated a spectrum of posterior distributions corresponding to a spectrum of priors, I’d be keen to see them. It would be interesting to see how much a “null prejudice” could actually affect a study’s results.
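In the meantime, here is a rough sketch of what such a sensitivity analysis might look like in Python. It is a toy illustration only, and every specific in it is an assumption: the study result (hazard ratio 1.30, 95% CI 0.95 to 1.78) is made up, the likelihood is approximated as normal on the log scale, and the priors are all normal and centred on the null, differing only in how tightly they are concentrated there.

```python
import math

def posterior_normal(prior_mean, prior_sd, est, se):
    """Conjugate normal-normal update; returns (posterior mean, posterior SD)."""
    w_prior = 1.0 / prior_sd**2      # precision of the prior
    w_data = 1.0 / se**2             # precision of the study estimate
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * est)
    return post_mean, math.sqrt(post_var)

def prob_greater_than(threshold, mean, sd):
    """P(theta > threshold) for theta ~ Normal(mean, sd)."""
    z = (threshold - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

def sensitivity(prior_sds, est, se):
    """Posterior summaries for a range of null-centred priors."""
    rows = []
    for prior_sd in prior_sds:
        post_mean, post_sd = posterior_normal(0.0, prior_sd, est, se)
        rows.append({
            "prior_sd": prior_sd,
            "post_mean": post_mean,
            "post_hr": math.exp(post_mean),
            "prob_hr_above_1": prob_greater_than(0.0, post_mean, post_sd),
        })
    return rows

# Hypothetical study result: HR 1.30, 95% CI 0.95 to 1.78 (invented numbers).
est = math.log(1.30)
se = (math.log(1.78) - math.log(0.95)) / (2 * 1.96)

# Smaller prior SD = stronger "null prejudice"; larger = weaker.
for row in sensitivity([0.05, 0.1, 0.25, 0.5, 2.0], est, se):
    print(f"prior SD {row['prior_sd']:<5} posterior HR {row['post_hr']:.2f} "
          f"P(HR > 1) = {row['prob_hr_above_1']:.2f}")
```

The tighter the prior sits around the null, the more the posterior hazard ratio is pulled back toward 1, while a very diffuse prior leaves the posterior close to the study estimate. A real analysis would need priors justified by subject matter knowledge, and possibly non-normal or spiked forms, but even this simple version shows how a spectrum of priors could be reported side by side.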