Can treatment B be superior after the RCT if it was thought that treatment A is superior before the RCT?

a_j · October 11, 2020, 10:48am

Dear all,

I am interested in your opinions about the result of the ISAR-REACT 5 trial, which was published in 2019 (N Engl J Med. 2019 Oct 17;381(16):1524-1534).

ISAR-REACT 5 randomized more than 4,000 patients to receive either prasugrel or ticagrelor after an acute coronary syndrome. The primary endpoint was the composite of death, myocardial infarction, or stroke at 1 year.

The sample size calculation was based on the following assumptions:

the incidence of the primary endpoint would be 10.0% in the ticagrelor group
the incidence of the primary endpoint would be 12.9% in the prasugrel group
80% power to detect a 22.5% relative reduction in the rate of the primary endpoint in the ticagrelor group as compared with the prasugrel group
two-sided alpha level of 0.05
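
As a rough plausibility check of these assumptions, the required sample size can be sketched with the standard normal-approximation formula for comparing two proportions. This is an assumption for illustration only: the trial's own calculation may well have used a different (e.g., time-to-event) method, and the function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile corresponding to the power
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # sum of the two binomial variances
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


p_ticagrelor, p_prasugrel = 0.100, 0.129   # assumed event rates from the list above
rrr = 1 - p_ticagrelor / p_prasugrel       # relative risk reduction implied by those rates
n = n_per_group(p_ticagrelor, p_prasugrel)

print(f"relative risk reduction: {rrr:.1%}")
print(f"approx. patients per group: {n}, total: {2 * n}")
```

Under these simplifying assumptions the total comes out somewhat below 4,000, which seems compatible with the actual enrollment once some allowance for loss to follow-up is added.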

The result of the trial was a hazard ratio of 1.36 with a 95% confidence interval ranging from 1.09 to 1.70 in favor of prasugrel. This is the opposite direction from the one assumed in the sample size calculation.

The authors concluded “Among patients who presented with acute coronary syndromes with or without ST-segment elevation, the incidence of death, myocardial infarction, or stroke was significantly lower among those who received prasugrel than among those who received ticagrelor […]”.

Bittl et al. recently published an editorial on a substudy of the ISAR-REACT 5 trial (JACC Cardiovasc Interv. 2020 Oct 12;13(19):2248-2250). The authors of the editorial proposed that “[…] the findings in ISAR-REACT 5 […] were statistical false positives” because “[…] the ISAR-REACT 5 investigators believed that ticagrelor would have been superior. If they thought that there was only a 10% prior probability (i.e., 1:9 odds) that prasugrel could outperform ticagrelor, the “significant” outcome in favor of prasugrel has only a 64% chance of being a true positive, […]”. This statement is followed by a calculation leading to the 64% figure.
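
For readers without access to the editorial, one standard calculation that yields 64% is the "positive predictive value" of a significant result, combining the prior probability with the design power and alpha. This is a sketch under the assumption that power = 0.80 and alpha = 0.05 (the trial's design values) were used; it is not confirmed here that this is exactly the editorial's calculation, and the function name is hypothetical.

```python
def prob_true_positive(prior, power=0.80, alpha=0.05):
    """Probability that a 'significant' result reflects a true effect.

    prior: prior probability that the effect is real
    power: probability of a significant result if the effect is real
    alpha: probability of a significant result if there is no effect
    """
    true_pos = power * prior          # real effect and significant
    false_pos = alpha * (1 - prior)   # no effect but significant anyway
    return true_pos / (true_pos + false_pos)


ppv = prob_true_positive(prior=0.10)  # 10% prior probability, i.e. 1:9 odds
print(f"{ppv:.0%}")  # prints 64%
```

Note that this treats the trial result only as "significant or not" and ignores the actual size of the estimate and its confidence interval.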

What do you think about this? From a statistical point of view, is it okay to conclude from the ISAR-REACT 5 trial that prasugrel is superior to ticagrelor?

I would be happy to read your thoughts!

f2harrell · October 11, 2020, 11:11am

This is an excellent case study in RCT interpretation and why the indirect evidence provided by p-values and confidence intervals is not enough. I think there are two ways to go: (1) either believe that a single skeptical prior distribution should be used or (2) show the readers how to get posterior probabilities of efficacy given their own prior. Sticking with (1) for now, how the skeptical prior is formulated is key. What was the prior distribution that they selected for the re-analysis? If it was a prior with a discontinuity in it, e.g., it contains an “absorbing state” at a zero effect, then the authors have represented the state of knowledge in a very strange way and tilted the game board in their favor. If the prior is one that specifies equal chances of benefit and harm and restricts the probability of a large magnitude of effect in either direction, and if this prior is reasonable to the majority of readers, then the reversal of the hazard ratio is more reasonably seen as noise. Please provide more information.
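
To make option (2) concrete, here is a minimal sketch of a posterior probability under a continuous prior, using a normal approximation on the log hazard ratio scale. Everything about the priors is an assumption chosen for illustration (a skeptical prior centered on HR = 1 with 95% prior probability that the HR lies between 0.5 and 2, and an "enthusiastic for ticagrelor" prior centered on the design assumption of 10.0/12.9), as is reading the reported HR of 1.36 as ticagrelor versus prasugrel. It is not a re-analysis of the trial data.

```python
from math import log, sqrt
from statistics import NormalDist

Z975 = NormalDist().inv_cdf(0.975)  # about 1.96


def posterior_log_hr(hr, lo, hi, prior_mean, prior_sd):
    """Normal-normal conjugate update for the log hazard ratio."""
    est = log(hr)
    se = (log(hi) - log(lo)) / (2 * Z975)      # SE recovered from the 95% CI
    w_data, w_prior = 1 / se**2, 1 / prior_sd**2
    post_var = 1 / (w_data + w_prior)          # precisions add
    post_mean = post_var * (w_data * est + w_prior * prior_mean)
    return NormalDist(post_mean, sqrt(post_var))


prior_sd = log(2) / Z975  # 95% prior mass between HR 0.5 and 2 when centered on 1

# Skeptical prior: equal chance of benefit and harm
skeptical = posterior_log_hr(1.36, 1.09, 1.70, prior_mean=0.0, prior_sd=prior_sd)
# Prior centered on the design assumption that ticagrelor is better
pro_tica = posterior_log_hr(1.36, 1.09, 1.70, prior_mean=log(10.0 / 12.9), prior_sd=prior_sd)

p_skeptical = 1 - skeptical.cdf(0.0)  # P(HR > 1), i.e. prasugrel better
p_pro_tica = 1 - pro_tica.cdf(0.0)

print(f"P(prasugrel better | skeptical prior):      {p_skeptical:.3f}")
print(f"P(prasugrel better | pro-ticagrelor prior): {p_pro_tica:.3f}")
```

Readers can substitute their own `prior_mean` and `prior_sd` to see how much prior belief in ticagrelor would be needed to materially change the conclusion.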

MSchwartz · October 11, 2020, 2:28pm

Hi,

Before worrying about the apparent reversal of the study results and whether the statistical analysis was reasonable, I would go back to basics and question whether or not the design and/or the integrity of the study conduct may have contributed to the unexpected findings.

I would start with an evaluation of the study inclusion/exclusion criteria to be reasonably comfortable that there is nothing there that could bias the subjects enrolled in the study, leading to issues with the prevalence of the endpoints.

From a review of the study protocol, page 23, on randomization:

https://www.nejm.org/doi/suppl/10.1056/NEJMoa1908973/suppl_file/nejmoa1908973_protocol.pdf

it would seem that, besides study site, they only stratified the randomization on the nature of the MI presentation (STEMI vs. NSTE-ACS), and not on other potentially relevant clinical factors. I would wonder whether or not there was an imbalance of one or more relevant subject characteristics at baseline that could have led to confounding. Keep in mind that randomization only ensures that any inter-arm imbalance at baseline occurred by chance, but it does not ensure inter-arm balance of what may end up being clinically important characteristics.

Also, endpoint definitions matter, and I might wonder whether or not the definitions of any of the study endpoints could have influenced the findings. We know, for example, that definitions may have contributed to meaningful bias in the EXCEL trial findings, which compared PCI to CABG in left main stenosis settings:

Lastly, were there study compliance and treatment adherence issues that may have confounded the results? This article:

suggests that there are reasons to be at least concerned about those issues, and their possible influence on the study results.

simongates · October 12, 2020, 7:40am

Is that editorial available somewhere? - I can’t access it (although my institution allegedly has access, can’t get to it).

MSchwartz · October 12, 2020, 3:11pm

I don’t have access to the full editorial, but just for ease of access for those that do, the main URL is:

The sub-study publication is available here:

In the course of searching for additional information, I came across this TCTMD article from a few days ago, which discusses the Bittl editorial and includes some responses to it, including challenges to the 1:9 odds used by the editorial authors:

The TCTMD article is worth a read in light of some of the concerns that are raised.

I have also been wrestling with an additional thought here.

There were no formal interim analyses planned in this open-label study, other than a possible sample size re-estimation after the first 1,000 subjects completed the 12-month follow-up. By then, it was expected that 2,000 subjects would have been enrolled.

That being said, I have to wonder what interim data the DSMB had access to, what their closed session deliberations were like once it became evident that the original hypothesis was not going to be achieved, and that the direction of the difference was actually reversed.

What discussions, if any, around stopping the study for futility might have been undertaken? If not, why not?

jay_brophy · October 13, 2020, 6:27pm

The editorialists provide a very skeptical prior, and without reading the editorial I am unsure how this can be justified. It is incorrect to say the trialists shared this skeptical view, as it would be impossible for them to have the required equipoise for randomization if they truly believed one treatment had a 90% chance of being superior. In fact, a network meta-analysis of all trials looking at these 2 drugs showed no mortality difference (HR 0.90, 95% CI 0.68-1.18, p=0.75) (Journal of the American College of Cardiology, Volume 75, Issue 11 Supplement 1, March 2020, https://www.onlinejacc.org/content/75/11_Supplement_1), supporting their view of equipoise. It seems unlikely that the original trial results will be reversed by this more reasonable prior.
I believe the original design and analysis were not specified to be Bayesian, in which case the authors have elected to interpret their results in isolation from prior beliefs. I fully endorse the Bayesian perspective as providing more meaningful insights, but the first rule of thumb is that priors must be fully transparent and justifiable.
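
To illustrate how strongly the quoted 64% depends on the chosen prior, here is a small sensitivity sketch in odds form. It assumes the figure comes from multiplying the prior odds by power/alpha with power = 0.80 and alpha = 0.05 (the trial's design values); that is my assumption rather than something taken from the editorial, and the function name is hypothetical.

```python
def posterior_prob(prior_prob, power=0.80, alpha=0.05):
    """Posterior probability of a true effect after a 'significant' result."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * (power / alpha)  # likelihood ratio of 'significant'
    return posterior_odds / (1 + posterior_odds)


# 10% is the editorial's prior; 50% corresponds to equipoise; 25% is an arbitrary midpoint
results = {prior: posterior_prob(prior) for prior in (0.10, 0.25, 0.50)}

for prior, post in results.items():
    print(f"prior {prior:.0%} -> posterior {post:.0%}")
```

With a 50:50 prior reflecting equipoise, the same arithmetic gives a posterior probability above 90% rather than 64%.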