Paper in European Radiology Experimental on statistical significance

Hello everyone,
As a radiologist with an interest in statistics, I came across this article by radiologists arguing that we should keep using p-values, citing for example that “clinical research needs dichotomic answers to guide decision-making, in particular in the case of diagnostic imaging and interventional radiology”. There is a lot in the paper I don’t agree with, but perhaps I’m just becoming a little pedantic and prone to exaggeration after my adventures with Bayesian statistics, so I would like to hear some of your opinions as well :slight_smile:


I feel they should have referenced Deborah Mayo. Have you read her book? And they didn’t reference the statement in the Annals of Applied Statistics by the ASA task force …


The authors implied that a threshold is a good thing. I hope we can get beyond that idea. And they missed the big picture when discussing the high dimensional setting. The use of false discovery and other p-value cutoff methods has always been misleading. It doesn’t care about false negatives and doesn’t expose the difficulty of the task. We would be much better off by computing bootstrap confidence intervals for the importance of each of the many candidate features, with importance measured any way you choose (regression coefficient, z-statistic, c-index (AUROC), …). You’ll find that the determination of winners and losers is futile for most of the datasets being analyzed (confidence intervals are too wide). P-values just get in the way.
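As a rough illustration of the bootstrap idea above, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for the example: the helper names `bootstrap_importance_ci` and `abs_correlation`, the choice of absolute correlation as the importance measure, the simulated data, and the use of simple percentile intervals. Any other importance measure (regression coefficient, z-statistic, c-index) could be plugged in the same way.

```python
import numpy as np

def abs_correlation(X, y):
    """Illustrative importance measure: |Pearson correlation| of each column with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc / denom)

def bootstrap_importance_ci(X, y, importance, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval for the importance of every feature."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    stats = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        stats[b] = importance(X[idx], y[idx])
    tail = (1 - level) / 2
    lower, upper = np.quantile(stats, [tail, 1 - tail], axis=0)
    return lower, upper

# Simulated data (an assumption): 20 candidate features, only the first
# one is truly related to the outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 20))
y = 0.3 * X[:, 0] + rng.normal(size=100)

lower, upper = bootstrap_importance_ci(X, y, abs_correlation)
for j in range(X.shape[1]):
    print(f"feature {j:2d}: [{lower[j]:.2f}, {upper[j]:.2f}]")
```

Printing the intervals side by side shows how much they overlap, which is the point: with a modest sample, sorting the features into winners and losers is rarely supported by the data. Bootstrapping the ranks of the importance measures rather than the raw values is a natural variation on the same loop.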


I did a year ago. Quite a good book and one the authors surely did not read :slight_smile:

Completely agree with you. I also find the idea that we need binary decision making in order to do good science completely off-putting. And I am not sold on the idea that a lower threshold would destroy independent research because of the bigger samples needed; it would probably lead to better study design, better studies and more worthwhile research.

:new: I accidentally deleted this (post 4) trying to copy the link for another thread; apologies for repost and bump.

In this article, we discuss the value of p value and explain why it should not be abandoned nor should the conventional threshold of 0.05 be modified.

It seems to me they continue the confusion between \alpha and p. To continue the use of a 0.05 threshold is analogous to saying the same prior should be used in every problem, regardless of the precision of the study, or the plausibility of the alternative in question. The experimenter’s local \alpha has very little bearing on my personal \alpha, much like his/her prior has minimal relevance to my own.

The most objectionable feature of these interpretations of statistical methods is that they manipulate the reader into either shutting down the critical faculty entirely, or offering unhelpful and nonsensical critiques that do not aid future researchers in designing protocols that provide the relevant information to eventually settle the question.

The entire debate about the “right” p value threshold is misplaced. Professor Harry Crane wrote a persuasive paper, one that deserves attention, criticizing the idea that lowering the cutoff would improve research.