Suhail- I think you’re trying to create a disagreement where there is none. I agree with everything you’ve written here! But “NNT” = 1/ARR, and ARR will change from trial to trial, so I’m just saying that the notion that there’s “one true NNT to rule them all” for a given therapy is misguided. Therefore, we shouldn’t be using “NNT” to decide whether it’s worthwhile to apply a therapy to an individual patient…
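To make the arithmetic behind that point concrete, here is a minimal sketch. The relative risk of 0.8 and the three baseline risks are made-up illustrative numbers, not taken from any trial, and the function name `nnt` is just a label for this example. It assumes the relative effect is the same in each population, which is itself only an assumption.

```python
def nnt(control_risk, treated_risk):
    """Number needed to treat, defined as 1 / ARR (absolute risk reduction)."""
    arr = control_risk - treated_risk
    if arr <= 0:
        raise ValueError("ARR must be positive for an NNT (benefit) to be defined")
    return 1.0 / arr


# Hypothetical scenario: identical relative risk, different baseline risks.
relative_risk = 0.8  # assumed constant across populations, for illustration only
for baseline in (0.05, 0.10, 0.40):
    treated = baseline * relative_risk
    print(f"baseline risk {baseline:.2f}: "
          f"ARR = {baseline - treated:.3f}, NNT = {nnt(baseline, treated):.1f}")
```

Under these assumed inputs the NNT comes out at roughly 100, 50, and 12.5, even though the relative effect never changes. The NNT follows the baseline risk of whoever was enrolled.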
Okay, now I get your point: the RD (aka ARR) and its transforms are point-of-care measures of applicability. They are not considered trial-specific measures even when reported in trials, and are therefore not usually meta-analyzed except in error.
In a recent three-part series of blog posts, statistician Maggie Qian discusses the series of articles by Robert Matthews that you have cited:
https://evidenceinthewild.com/randomisation-origin-myth/
Her writing style is refreshingly clear and incisive.
Nice summary indeed, but it oversimplifies, since many statisticians (including in the DataMethods forum) would reasonably counterargue in favor of using parametric approaches in practice while keeping the randomization framework in mind (see also discussion here).
The crystal clear telltale sign that Fisher’s framework is indeed too often misunderstood in medical RCTs, also noted by Robert Matthews but not in the blog posts, is the emergence of randomized non-comparative trials (RNCTs) and their advocacy by a surprisingly large fraction of biostatisticians. While we can debate the merits of pure randomization-based inference, there is no convincing excuse in favor of RNCTs.
Outstanding @Stephen paper on the topic of randomization versus parametric inference coincidentally just pointed out by @R_cubed in this thread.
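For readers less familiar with what randomization-based inference looks like in practice, here is a minimal sketch of a two-arm randomization (permutation) test on a difference in means. It assumes simple complete randomization with fixed arm sizes. The function name, the Monte Carlo approximation with `n_perm` reshuffles, and the add-one p-value correction are choices made for this illustration, not anything prescribed in the posts or papers above.

```python
import random
from statistics import mean


def randomization_test(treated, control, n_perm=10000, seed=0):
    """Monte Carlo randomization test for a difference in means.

    Re-randomizes the arm labels n_perm times and counts how often the
    re-randomized difference is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_t = len(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # one hypothetical re-allocation of patients to arms
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if abs(diff) >= abs(observed) - 1e-12:  # tolerance for float ties
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

Note that the reference distribution is built entirely by swapping labels between two arms. With a single arm, or with arms that are never compared, there is nothing to re-randomize, which is the RNCT problem in miniature.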
Thank you, Pavlos. The oversimplification point is fair, and I take it on board. The blog format does push toward cleaner narratives than the debate warrants, and I probably underplayed how reasonable the parametric counterargument is in practice.
The RNCT observation is sharp, and I wish I’d included it. You’re right that it’s noted by Matthews but I missed its value as a diagnostic. It cuts through a lot of the abstract framework debate by posing a concrete question: if randomization’s inferential purpose requires a comparison group, what exactly is a randomized non-comparative trial doing? The fact that this design exists, is advocated, and gets published reveals something about how the justification for randomization is being understood, or misunderstood.
My one hesitation about calling it a crystal clear sign is that the motivation behind RNCTs isn’t always straightforwardly inferential confusion. Your linked thread suggests they are often recommended at oncology workshops when there aren’t enough resources to power a comparative trial — randomization retained as a kind of procedural legitimacy device even when comparison is abandoned. That might be a different kind of error: conflating the procedural and inferential roles of randomization, rather than a direct misreading of Fisher. Though perhaps that distinction doesn’t rescue the practice much. If practitioners can’t articulate why they are randomizing when there is no comparison, that itself reflects exactly the misunderstanding of Fisher’s framework you’re describing, and may even strengthen rather than soften your point.
I’ll look more carefully at the RNCT thread; it seems like a natural extension of the series.
Correct. See also related presentation here to the International Society for Biopharmaceutical Statistics (ISBS) with practical examples of how understanding Fisher’s inferential machinery can lead to practice-changing RCTs. Connection with information theory here (more extended discussion here). We have also shown here that the quasirandomness fallacy occurs in more than half of oncology phase 3 RCTs and that it is increasing over time. Concurrently, the incidence of RNCTs is also increasing.
I find it strange, even suspicious, that balanced/minimized designs are not recommended by biostatisticians in these particular cases of scarce resources.
Fisher and Gosset debated the merits of balanced vs randomized designs going on 100 years ago. They were more recently discussed in the CONSORT guidelines in 2010. The first minimization algorithms are over 50 years old, with improvements being made more recently. Even if one has a preference for large randomized trials, why don’t controlled allocation designs such as minimization get used more in these resource-constrained scenarios?
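For anyone who has not seen minimization written out, here is a rough sketch of a simplified Pocock-Simon-style allocation rule. It is a toy version under stated assumptions: equal factor weights, the range of counts as the imbalance measure, and a biased-coin probability `p_best` of 0.8. The function name and the example factors are hypothetical, and real trial software adds more safeguards.

```python
import random


def minimization_assign(patient, enrolled, arms=("A", "B"), p_best=0.8, rng=random):
    """Allocate one patient by a simplified minimization rule.

    patient:  dict of factor -> level, e.g. {"sex": "F", "stage": "III"}
    enrolled: list of (patient_dict, arm) pairs already allocated
    p_best:   probability of choosing the imbalance-minimizing arm (assumed)
    """
    scores = {}
    for candidate in arms:
        total = 0
        for factor, level in patient.items():
            # Count earlier patients with the same level of this factor per arm,
            # as if the new patient had been placed in `candidate`.
            counts = {arm: 0 for arm in arms}
            for other, arm in enrolled:
                if other.get(factor) == level:
                    counts[arm] += 1
            counts[candidate] += 1
            total += max(counts.values()) - min(counts.values())
        scores[candidate] = total

    best_score = min(scores.values())
    best = [arm for arm in arms if scores[arm] == best_score]
    if len(best) == len(arms):
        return rng.choice(list(arms))  # all arms tie: plain random allocation
    if rng.random() < p_best:
        return rng.choice(best)
    return rng.choice([arm for arm in arms if arm not in best])


# Usage with made-up patients; the seed is only there for reproducibility.
rng = random.Random(1)
enrolled = []
for sex, stage in [("F", "II"), ("M", "III"), ("F", "III"), ("M", "II")]:
    patient = {"sex": sex, "stage": stage}
    enrolled.append((patient, minimization_assign(patient, enrolled, rng=rng)))
```

Keeping `p_best` below 1 retains a random element in every allocation, which is the usual compromise between strict balance and unpredictability.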
Related thread