Should the Neyman-Pearson lemma remain dominant in the health sciences?

If we accept the premise that the Neyman-Pearson lemma could be useful in certain areas of science such as psychology and industrial quality control (where replications are possible, easy, and numerous), then we might be able to argue that this form of behavioral guidance, which is meant to control errors in the long run, can have some utility.

However, this is also the dominant methodology used in most of the health sciences, and as some of you may know (since many of us work in the health sciences) replications aren’t easy, and trials tend to be incredibly expensive.

If you believe that the NP lemma has a place in health sciences, could you argue why, and if you don’t believe it does, could you give the strongest reasons why? And if you wished to test statistical hypotheses (which might be of interest at some point), how would you do so?


I see no reason why not. The NP lemma is a mathematical truth, and it applies to the health sciences as well.

I know that there are some statisticians (e.g. Andrew Gelman) who call for abandoning hypothesis testing altogether (which is practically abandoning the NP lemma), not only in the health sciences but everywhere. They propose to focus only on the magnitude of the effect. I disagree with that approach.
Although I do recommend paying very close attention to the effect and determining whether it is meaningful, it cannot replace hypothesis testing. The reason is that this approach will lead to the generation of rules of thumb (e.g. Hedges’ g > 2. Why 2?).

I don’t think that Gelman looks only at the magnitude of effect. He looks at the overall evidence, including the entire posterior probability distribution. But to the general point, in decades of work on hundreds of medical studies I can safely say there was not a single study where hypothesis testing was useful. Hypothesis testing is useful in “existence studies” such as “does ESP exist?” but I don’t encounter those. I would trade all the p-values and confidence limits I’ve ever computed for subjective posterior probabilities of a series of conditions, e.g., P(effect in the right direction), P(effect > a), P(effect > b), P(effect > c), … I also look at raw data and point estimate summaries but point estimates really need to be thought of as fuzz around the unknown truth.
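
To make that list of posterior probabilities concrete, here is a minimal sketch in Python. It assumes a normal prior on the effect and a normally distributed effect estimate with known standard error, a conjugate model picked only for simplicity and not something proposed in this thread. The function names, the thresholds, and all the numbers are hypothetical, and "effect in the right direction" is taken to mean effect > 0.

```python
from math import sqrt
from statistics import NormalDist


def normal_posterior(prior_mean, prior_sd, estimate, std_error):
    """Conjugate normal-normal update for a single treatment effect.

    Assumes the estimate is normally distributed around the true effect
    with a known standard error (an illustrative simplification).
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = 1.0 / std_error ** 2
    post_precision = prior_precision + data_precision
    post_mean = (
        prior_mean * prior_precision + estimate * data_precision
    ) / post_precision
    return NormalDist(post_mean, sqrt(1.0 / post_precision))


def prob_effect_exceeds(posterior, thresholds):
    """Return P(effect > c) under the posterior for each threshold c."""
    return {c: 1.0 - posterior.cdf(c) for c in thresholds}


# Hypothetical inputs, chosen only for illustration.
posterior = normal_posterior(
    prior_mean=0.0, prior_sd=1.0, estimate=0.5, std_error=0.25
)
probs = prob_effect_exceeds(posterior, thresholds=[0.0, 0.1, 0.25, 0.5])

for c, p in probs.items():
    print(f"P(effect > {c}) = {p:.3f}")
```

The thresholds play the role of a, b, c above. In a real analysis they would come from clinical judgment about which effect sizes matter, and the posterior would usually come from a fitted model rather than a closed-form update.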


Hypothesis testing is useful in “existence studies” such as “does ESP exist?”

Even then, hypothesis testing does not tell us whether ESP exists.


Formally true. It would provide evidence against the supposition that it doesn’t exist, if all other explanations have been ruled out by experimental design. But it’s the most relevant place for point hypothesis testing, because evidence in favor of existence would not need to convince me of the magnitude of the ESP effect.


The fact that some theorem is proven doesn’t mean it can be applied successfully to a real-world domain. Rasch modelling would be my go-to example: it has equations, and theorems! But they don’t make any sense.