Randomized non-comparative trials: an oxymoron?

They even used a placebo in that trial! The lack of understanding of what they are doing is really remarkable.

As an aside, I’ve been surprised by how many oncology trials contain similar errors. A favourite is single-arm trials that aim to estimate “the response rate” or “survival rate” or something - which, as you say in the Lancet response, they can’t easily do as they aren’t a random sample. Usually the resulting numbers get compared with another non-random sample from other single-arm trials, or with numbers that are just in the clinicians’ heads.

2 Likes

Yup. There is a huge surge of oncology RNCTs currently being published in prominent clinical journals (two weeks ago JCO included another one, and just a few weeks earlier another one - the list keeps growing). Hopefully raising awareness will help temper this.

2 Likes

The second of those trials (the Italian one) suggests a possible reason for using this design: they can compare each group to a single point value (and choose a nice low one), so they are more likely to get a “statistically significant” result.
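A quick simulation sketch of why that works - all response rates and sample sizes below are made up, not taken from the trial: testing each arm against a conveniently low fixed benchmark is far easier to “win” than an honest between-arm comparison of the same data.

```python
# Hypothetical simulation: each arm tested against a fixed low benchmark p0,
# versus a direct between-arm comparison of the same simulated data.
import numpy as np
from scipy.stats import binomtest, fisher_exact

rng = np.random.default_rng(1)
n_per_arm = 40                 # hypothetical patients per arm
p_exp, p_ctrl = 0.35, 0.30     # hypothetical true response rates (modest true benefit)
p0 = 0.20                      # the conveniently low single-point benchmark
n_sim = 5000

win_benchmark = win_between = 0
for _ in range(n_sim):
    x_exp = rng.binomial(n_per_arm, p_exp)
    x_ctrl = rng.binomial(n_per_arm, p_ctrl)
    # "non-comparative" analysis: experimental arm vs the fixed benchmark
    if binomtest(x_exp, n_per_arm, p0, alternative="greater").pvalue < 0.05:
        win_benchmark += 1
    # comparative analysis: experimental arm vs the concurrent control arm
    _, p = fisher_exact([[x_exp, n_per_arm - x_exp],
                         [x_ctrl, n_per_arm - x_ctrl]], alternative="greater")
    if p < 0.05:
        win_between += 1

print(f"'Significant' vs fixed benchmark p0={p0}: {win_benchmark / n_sim:.2f}")
print(f"'Significant' vs the control arm:        {win_between / n_sim:.2f}")
```

With numbers like these the benchmark test comes out “significant” most of the time, while the between-arm comparison rarely does - which is exactly the incentive.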

If I’m understanding correctly, their outcome looks dodgy too: it excluded progressions during treatment-free intervals, but the comparison was between continuous and intermittent treatment, so presumably progressions were omitted from only one arm, which ended up with a much longer time to event. But median overall survival times were similar.
I might be misunderstanding something here - I’ll read it again.

2 Likes

Have any of these published trials posted their consent forms as Supplementary Materials? How is the rationale for randomization presented to trial participants?

3 Likes

Good question! There’s definitely an ethical dimension here - if we’re asking patients to join an experiment and accept a novel treatment that could harm them, we have an obligation to make sure that it’s soundly designed and will yield useful information.

4 Likes

Pavlos

I read the paper corresponding to your second link, but don’t understand why they assert that it’s “non-comparative” when their stated goal is definitely comparative in nature (?) Specifically, it seems like they are trying to show “non-inferiority” of an intermittent versus continuous treatment strategy, but without using a non-inferiority design (?) Tellingly (with regard to presence/absence of input from an experienced statistician), they present “95% CI” for within-arm statistics, which seems…odd…
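If non-inferiority really were the goal, the relevant quantity would be a confidence interval for the between-arm difference, judged against a prespecified margin - not a separate 95% CI within each arm. A toy sketch with entirely hypothetical counts (nothing here is from the paper):

```python
# Toy non-inferiority check - all counts and the margin are hypothetical.
# A non-inferiority claim needs a CI for the *between-arm* difference versus a margin.
import numpy as np
from scipy.stats import norm

x_int, n_int = 18, 30          # hypothetical successes / patients, intermittent arm
x_con, n_con = 20, 30          # hypothetical successes / patients, continuous arm
p_int, p_con = x_int / n_int, x_con / n_con

diff = p_int - p_con
se = np.sqrt(p_int * (1 - p_int) / n_int + p_con * (1 - p_con) / n_con)
lo, hi = diff - norm.ppf(0.975) * se, diff + norm.ppf(0.975) * se

margin = -0.10                 # hypothetical non-inferiority margin
print(f"Difference (intermittent - continuous): {diff:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
print("non-inferior" if lo > margin else "non-inferiority NOT demonstrated")
```

With arms this small the interval is far too wide to exclude any sensible margin, which may be precisely why no formal comparison was attempted.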

2 Likes

As an update, Significance just published this article to raise awareness of randomized non-comparative trials in the statistical community.

4 Likes

Pavlos

Great new article. Thanks.

Is it possible that researchers who design RNCTs might be using the following “logic” (perhaps an extension of the “Right to Try” movement ethos?):

  • Since this cancer is very rare, we will have trouble recruiting enough patients to adequately power a trial to show superiority of this new treatment in a statistically rigorous/traditional sense (i.e., two trials with p<0.05);
  • Our patients are clamouring for a treatment that might buy them a bit more time. Previous single-arm trials provided a possible “signal” of efficacy for this drug, but we still aren’t convinced that it has meaningful clinical efficacy or that its benefits outweigh its risks;
  • Our patients tell us that they would be willing to take a treatment whose efficacy hasn’t been established as rigorously as is “ideal” (i.e., two trials with p<0.05), provided that they could get some reassurance that the treatment isn’t likely to worsen their prognosis;
  • By randomly assigning some of our small number of patients with this rare cancer to the new treatment and others to placebo and comparing outcomes between the two arms, we might at least be able to reassure ourselves that, in offering this treatment to future patients, we would not likely be harming them (even if we haven’t proven convincingly that we are helping them);
  • If we were to see a signal of worse outcomes in the drug-treated arm as compared with the placebo arm, we would have a good reason to stop offering the drug to patients with this rare cancer;
  • At the end of this trial, we know that we won’t be able to claim superiority of the new treatment over placebo with any sort of rigour, since our trial is hopelessly underpowered. To make this point clear, we will call our design “non-comparative.” But by keeping the randomization feature of an RCT, we might at least be able to glean some reassurance (by using a “non-statistical” between-arm comparison to look mainly for a harm signal) that we will not be doing net harm by letting future patients try the therapy.

I guess this is my attempt at “steel-manning” (?) Maybe this angle still doesn’t make any sense - you’re in a better position than I am to say… But if it does have any redeeming feature(s), the RNCT would need to be clearly labelled as a (distant) “second best” alternative to the traditional RCT for establishing efficacy in oncology, reserved only for the very rare disease context (?) It would be disastrous for it to become widespread and to supplant traditional RCT design and interpretation.
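To get a rough sense of how weak the informal “harm check” in my bullets would actually be, here is a toy simulation (all rates and sample sizes invented):

```python
# Toy simulation of the informal "harm signal" check described above - all numbers invented.
# Suppose the drug is truly better than placebo: how often would a tiny randomized
# comparison still make it *look* no better (or worse)?
import numpy as np

rng = np.random.default_rng(2)
n = 15                              # hypothetical patients per arm (rare disease)
p_drug, p_placebo = 0.30, 0.20      # hypothetical true response rates (drug truly better)

sims = rng.binomial(n, [p_drug, p_placebo], size=(100_000, 2))
apparent_no_benefit = np.mean(sims[:, 0] <= sims[:, 1])
print(f"P(drug arm looks no better than placebo, despite a real benefit): "
      f"{apparent_no_benefit:.2f}")
```

A disturbingly large fraction of the time the eyeball check points the wrong way, so even the modest reassurance I described is shakier than it sounds.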

1 Like

I definitely support and highly encourage steelmanning of RNCTs with thoughtful dissections like yours. The problem is that what you describe is based on comparisons and thus is not an RNCT. The schizophrenic aspect of RNCTs is that they are non-comparative.

For your comparative approach we could, e.g., change our type I & II error rate trade-offs (if using frequentist decision making), or our Bayesian probability thresholds for go/no-go decisions, or even forgo decision thresholds altogether and simply state that we will estimate the comparative probability distribution as well as is realistically/logistically feasible. All of this is absolutely OK. But forgoing comparisons is just weird.
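For instance, here is a minimal sketch of a comparative Bayesian go/no-go rule for a small randomized phase 2 trial - all counts, priors, and thresholds are hypothetical illustrations, not recommendations:

```python
# Minimal sketch of a comparative Bayesian go/no-go rule for a small randomized phase 2.
# All counts, priors, and thresholds are hypothetical illustrations.
import numpy as np

rng = np.random.default_rng(3)
x_trt, n_trt = 9, 25            # hypothetical responses / patients, treatment arm
x_ctl, n_ctl = 5, 25            # hypothetical responses / patients, control arm

# Beta(0.5, 0.5) priors on each arm's response rate (an arbitrary choice here)
post_trt = rng.beta(0.5 + x_trt, 0.5 + n_trt - x_trt, size=200_000)
post_ctl = rng.beta(0.5 + x_ctl, 0.5 + n_ctl - x_ctl, size=200_000)

pr_better = np.mean(post_trt > post_ctl)    # P(treatment rate > control rate | data)
go_threshold = 0.80                         # deliberately lenient phase 2 screening bar
print(f"P(treatment response rate > control response rate | data) = {pr_better:.2f}")
print("GO to a confirmatory trial" if pr_better > go_threshold else "NO-GO")
```

The decision quantity stays comparative; only the evidential bar is relaxed relative to a confirmatory phase 3.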

1 Like

Uggh…Yes, you’re right of course. I don’t have access to the full study in the first link in your original post, so I’m trying to make sense of the design based only on the abstract. But every time I read it I just get more confused by the phrasing. It’s so tempting to infer that the authors are comparing outcomes between arms, but they’re not actually doing this at all (?) I propose we coin a new term for studies with this design: “statistical gaslighting” - the more you read, the crazier you feel :slight_smile:

1 Like

@Pavlos_Msaouel I just read your article which is wonderfully written. I would have had a great deal of trouble writing such an article because I’d be peppering the discussion with 4-letter words :slightly_smiling_face:

Do we know the first researcher who used or proposed this method? I would love to know who to blame.

3 Likes

Since the article was published I have certainly received a fair number of messages and phone calls from statisticians in shock and unable to contain their use of 4-letter words :wink:

We could not find any methodology paper specifically advocating for RNCTs, but some have told me that they are a distorted version of old phase 2 concepts (e.g., here and here) used to prioritize and plan subsequent randomized phase 3 comparative trials.

In clinical practice, the earliest RNCTs we found in our review include this one from 2002 and another in 2005. It is hard to detect them, though, because they don’t directly say in the title or abstract that they are actually RNCTs. Their frequency has definitely skyrocketed in the past few years.

4 Likes

Brimming with such clearly expressed insights, Pavlos! I am especially struck by your observations on the cultural aspects of the problem, as revealed in words like talismanic, performative, aura, ritual, halo, and sacred.

This whole business invites comparison with the esotericism I identify at the core of Project Optimus in this PubPeer post. And of course, the whole doctrine that [1]

The [dose-optimization] trial should be sized to allow for sufficient assessment of safety and antitumor activity for each dosage. The trial does not need to be powered to demonstrate statistical superiority of a dosage or statistical non-inferiority among the dosages using Type I error rates which would be used in registrational trials.

sounds like this hypothetical ‘internal monologue’ from your Significance article:

“We don’t have the money, resources or sample size to power a direct comparison, yet we want the aura of an RCT. So let’s randomise into two arms – say, an experimental therapy and a placebo arm – but we won’t do a formal comparison.”

Compare again a quote from Anomal Pharm:

“And so the purpose of randomization is not necessarily to establish that one dose of the drug is superior to another dose in a statistical way, or establish statistical non-inferiority between the two doses, but really to have trials that are sufficiently sized so that we can interpret these dose-efficacy and dose-toxicity relationships, and use that information to guide overall decision-making.”

So one dimension left mostly unexplored in your piece (although I do note you gesture toward “bureaucratic or regulatory insurance”) is the imprimatur some regulatory authorities may have granted RNCTs.


  1. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/optimizing-dosage-human-prescription-drugs-and-biological-products-treatment-oncologic-diseases

  2. Re esotericism, cf. The Myth of Disenchantment - Wikipedia: https://en.wikipedia.org/wiki/The_Myth_of_Disenchantment

3 Likes

Adding to the list of RNCT issues (pardon if already noted): the design is likely to delay trial accrual, since randomization is a known barrier to accrual in RCTs.

1 Like

Based on my reading of @Pavlos_Msaouel’s paper, I think the simplest way to review a paper that used an RNCT is to say something along these lines:

Unfortunately, the authors made a critical error in choosing a design that is at odds with their stated goals. Half of the patients were wasted by randomizing them into two groups; a design appropriate to the stated aim would not randomize patients at all.
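To put a rough number on the “wasted half” point (hypothetical numbers only): with the same total accrual and the same observed response rate, halving the patients available for the single-arm estimate inflates its standard error by a factor of about 1.4.

```python
# Rough illustration of the "wasted half" point - hypothetical numbers only.
import numpy as np

p_hat = 0.40                      # hypothetical observed response rate
for n in (60, 30):                # all patients on the experimental arm vs. half of them
    se = np.sqrt(p_hat * (1 - p_hat) / n)
    print(f"n={n}: standard error of the estimated rate = {se:.3f}, "
          f"approx. 95% CI half-width = {1.96 * se:.3f}")
```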

As an aside, I wonder if there have been any RNCTs where one treatment arm is a placebo. That design would be helpful in checking bias in historical control data.

2 Likes

This RNCT used a placebo in one treatment arm. They had assumed a 12-month survival probability of 0.2 in the placebo arm, and the actual result was 0.19 (90% CI 0.11 to 0.31).
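For anyone who wants to run this kind of check themselves, here is a minimal sketch. The counts are hypothetical, and a real trial would estimate 12-month survival from time-to-event data (Kaplan-Meier), so the exact binomial interval below is only a crude stand-in:

```python
# Rough check of a pre-trial control-arm assumption against observed data.
# Counts are hypothetical; a real trial would estimate 12-month survival via Kaplan-Meier,
# so this exact binomial interval is only a crude stand-in.
from scipy.stats import binomtest

assumed = 0.20          # design assumption for 12-month survival in the placebo arm
alive, n = 6, 31        # hypothetical: patients alive at 12 months / placebo-arm patients

result = binomtest(alive, n, p=assumed)
ci = result.proportion_ci(confidence_level=0.90, method="exact")
print(f"Observed 12-month survival: {alive / n:.2f}, 90% CI ({ci.low:.2f}, {ci.high:.2f})")
print(f"Design assumption of {assumed} inside the interval: {ci.low <= assumed <= ci.high}")
```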

Stay tuned for an upcoming paper on control arm performance compared to pre-trial expectations in oncology.

2 Likes