A randomized trial of epinephrine in out-of-hospital cardiac arrest

G. D. Perkins et al. reported an extremely well-designed and well-conducted double-blind randomized trial of epinephrine vs. placebo, randomizing over 8,000 patients. The primary outcome was survival at 30 days. Secondary outcomes included survival until hospital discharge with a score of 3 or less on the modified Rankin scale (which ranges from 0 [no symptoms] to 6 [death]). The statistical analysis was state-of-the-art, including an ordinal analysis with the proportional odds ordinal logistic model, and a Bayesian analysis. The conclusion was that epinephrine increased the chance of survival, but among survivors there was a tendency toward worse neurological outcomes. Reaction on Twitter (see also here) has been interesting, with some clinicians emphasizing the Rankin score outcomes among survivors. It’s always tricky to interpret conditional analyses, and as one tweet said, there is a relationship between brain damage and risk of death.

In my view the original analysis has two shortcomings: (1) it’s unclear exactly how the proportional odds analysis was done, and (2) it is not clear that the authors ever performed the preferred ordinal analysis, the one that does not group any outcome levels. A key analysis in the paper combined Rankin levels 0 through 3 into a single category. Failing to distinguish these categories is not a great idea: grouping loses information and power. An excellent re-analysis by Matthew Shun-Shin considered full 7-level ordinal analyses with various re-arrangements of the Rankin categories, allowing for the possibility that some neurocognitive outcomes are worse than death. A gold-standard analysis would elicit utilities for all the outcome states from relevant persons and test whether epinephrine increases expected utility. Short of that, ordinal analyses are better than binary analyses, as demonstrated here and here.
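
To make the cost of grouping concrete, here is a minimal sketch of the dichotomized comparison (Rankin 0-3 vs. 4-6) using the per-level counts that appear in the R code later in this post. The variable names are mine, and the Wald interval is just the usual 2×2 large-sample formula, not necessarily what the paper used.

```r
# Per-level Rankin counts (levels 0..6) by arm; same frequencies as in the
# analysis code in this post.
placebo <- c(15, 10, 29, 20, 8, 8, 3904)
epi     <- c(12, 17, 23, 35, 12, 27, 3881)

# Dichotomize: favorable = Rankin 0-3 (first four levels), unfavorable = 4-6
fav   <- c(placebo = sum(placebo[1:4]), epi = sum(epi[1:4]))
unfav <- c(placebo = sum(placebo[5:7]), epi = sum(epi[5:7]))

# Odds ratio (epi:placebo) for a favorable outcome, with a Wald 0.95 interval
log_or <- log((fav["epi"] / unfav["epi"]) / (fav["placebo"] / unfav["placebo"]))
se     <- sqrt(sum(1 / c(fav, unfav)))
or_bin <- unname(exp(log_or))
ci_bin <- unname(exp(log_or + c(-1, 1) * qnorm(0.975) * se))

print(rbind(favorable = fav, unfavorable = unfav))
print(round(c(OR = or_bin, lower = ci_bin[1], upper = ci_bin[2]), 2))
```

Note the direction: this odds ratio is for a favorable outcome, so values above 1 favor epi. The point estimate does favor epi, but the interval spans 1, which is what one would expect after discarding the distinctions among levels.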

Here is an analysis that uses all 7 categories in their original order, using R.

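# Rankin scores (0-6), one element per patient, built from the per-level counts
a <- c(rep(0,15), rep(1,10), rep(2,29), rep(3,20), rep(4,8), rep(5,8), rep(6,3904))   # placebo
b <- c(rep(0,12), rep(1,17), rep(2,23), rep(3,35), rep(4,12), rep(5,27), rep(6,3881)) # epinephrine
x <- c(rep('placebo', length(a)), rep('epinephrine', length(b)))
y <- c(a, b)

library(rms)   # provides lrm()
# Proportional odds model on all 7 levels; higher y is a worse outcome
f <- lrm(y ~ x)
f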

Logistic Regression Model
 
 lrm(formula = y ~ x)
 
 Frequencies of Responses
 
    0    1    2    3    4    5    6 
   27   27   52   55   20   35 7785 
 
                      Model Likelihood     Discrimination    Rank Discrim.    
                         Ratio Test           Indexes           Indexes       
 Obs          8001    LR chi2      5.88    R2       0.003    C       0.541    
 max |deriv| 6e-12    d.f.            1    g        0.168    Dxy     0.082    
                      Pr(> chi2) 0.0153    gr       1.183    gamma   0.165    
                                           gp       0.002    tau-a   0.004    
                                           Brier    0.013                     
 
           Coef   S.E.   Wald Z Pr(>|Z|)
 y>=1      5.5342 0.2014 27.47  <0.0001 
 y>=2      4.8376 0.1485 32.57  <0.0001 
 y>=3      4.1565 0.1139 36.48  <0.0001 
 y>=4      3.7315 0.0988 37.77  <0.0001 
 y>=5      3.6118 0.0953 37.90  <0.0001 
 y>=6      3.4304 0.0905 37.90  <0.0001 
 x=placebo 0.3368 0.1398  2.41  0.0160  

summary(f, x='placebo')

             Effects              Response : y 

 Factor                  Low High Diff. Effect   S.E.    Lower 0.95 Upper 0.95
 x - epinephrine:placebo 2   1    NA    -0.33675 0.13985 -0.61085   -0.06266  
  Odds Ratio             2   1    NA     0.71408      NA  0.54289    0.93926  

The 2-sided (why?) p-value is 0.015 in favor of epi, and the epi:placebo odds ratio is 0.71 (0.95 confidence interval 0.54 to 0.94). Because higher Rankin scores are worse, an odds ratio below 1 means that patients getting epi tended to have better outcomes on the 7-point scale than those randomized to placebo.
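
For readers who want to see where these numbers come from, the following sketch rebuilds the odds ratio, its Wald interval and the Wald p-value from the `x=placebo` coefficient and standard error printed by `lrm` above. The variable names are mine; the only inputs are the two printed values.

```r
# Values copied from the lrm output above (coefficient for x=placebo and its S.E.)
beta_placebo <- 0.3368   # log odds of a worse score, placebo vs. epinephrine
se_beta      <- 0.1398

# Flip the sign to express the effect as epinephrine:placebo
log_or <- -beta_placebo
or_epi <- exp(log_or)
ci_epi <- exp(log_or + c(-1, 1) * qnorm(0.975) * se_beta)

# Two-sided Wald p-value
z      <- beta_placebo / se_beta
p_wald <- 2 * pnorm(-abs(z))

print(round(c(OR = or_epi, lower = ci_epi[1], upper = ci_epi[2], p = p_wald), 3))
```

The Wald p-value (about 0.016) differs slightly from the likelihood ratio p-value of 0.0153 reported at the top of the `lrm` output, as the two tests are not identical.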

Along the lines of Shun-Shin, let’s assume that modified Rankin scale level 5 is worse than death, and get a new 7-level ordinal analysis:

y2 <- ifelse(y == 5, 7, y)
f <- lrm(y2 ~ x)
f

Logistic Regression Model
 
 lrm(formula = y2 ~ x)
 
  Frequencies of Responses
 
    0    1    2    3    4    6    7 
   27   27   52   55   20 7785   35 
 
                      Model Likelihood     Discrimination    Rank Discrim.    
                         Ratio Test           Indexes           Indexes       
 Obs          8001    LR chi2      0.02    R2       0.000    C       0.502    
 max |deriv| 2e-07    d.f.            1    g        0.010    Dxy     0.005    
                      Pr(> chi2) 0.8843    gr       1.010    gamma   0.010    
                                           gp       0.000    tau-a   0.000    
                                           Brier    0.013                     
 
           Coef    S.E.   Wald Z Pr(>|Z|)
 y>=1       5.6982 0.2049  27.81 <0.0001 
 y>=2       5.0016 0.1532  32.64 <0.0001 
 y>=3       4.3206 0.1200  36.01 <0.0001 
 y>=4       3.8957 0.1057  36.85 <0.0001 
 y>=6       3.7760 0.1025  36.86 <0.0001 
 y>=7      -5.4176 0.1827 -29.66 <0.0001 
 x=placebo -0.0201 0.1379  -0.15 0.8843  

summary(f, x='placebo')

             Effects              Response : y2 

 Factor                  Low High Diff. Effect  S.E.    Lower 0.95 Upper 0.95
 x - epinephrine:placebo 2   1    NA    0.02007 0.13795 -0.25030   0.29044   
  Odds Ratio             2   1    NA    1.02030      NA  0.77857   1.33700   

Now we don’t have evidence for a benefit of epi (p=0.88), but we also do not have evidence for non-benefit, since the 0.95 confidence interval for the odds ratio (0.78 to 1.34) is wide.
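
Other orderings in the spirit of Shun-Shin's re-arrangements can be set up the same way. The sketch below uses a small helper, `rerank()`, which is my own hypothetical function (not part of `rms`), to turn any best-to-worst ordering of the Rankin levels into ranks. Which orderings are clinically defensible is a judgment I am not making here.

```r
# Per-level Rankin counts (levels 0..6) by arm, as in the code above
placebo <- c(15, 10, 29, 20, 8, 8, 3904)
epi     <- c(12, 17, 23, 35, 12, 27, 3881)
y <- c(rep(0:6, times = placebo), rep(0:6, times = epi))
x <- rep(c("placebo", "epinephrine"), times = c(sum(placebo), sum(epi)))

# Hypothetical helper: map each Rankin level to its rank (1 = best) under a
# caller-supplied ordering of the levels from best to worst.
rerank <- function(y, best_to_worst) {
  stopifnot(all(y %in% best_to_worst), !anyDuplicated(best_to_worst))
  match(y, best_to_worst)
}

y_orig <- rerank(y, c(0, 1, 2, 3, 4, 5, 6))   # original order, death worst
y_alt  <- rerank(y, c(0, 1, 2, 3, 4, 6, 5))   # level 5 ranked worse than death
y_alt2 <- rerank(y, c(0, 1, 2, 3, 6, 4, 5))   # levels 4 and 5 worse than death

print(table(x, y_alt))
```

Any of the re-ranked vectors can then be passed to `lrm()` in place of `y`, e.g. `lrm(y_alt ~ x)`. Only the ordering matters to the proportional odds model, so the numeric labels themselves are arbitrary.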

Interpretations of clinical trials with nontrivial outcomes are always nuanced!

Note that the above analyses were unadjusted for baseline covariates, due to non-availability of the raw data. Adjusted analyses are more appropriate.

Conclusions

With an ordinal outcome, frequentist statistical power and limiting effective sample size are largely determined by the total of the frequencies of the non-dominant outcome categories. Unless Rankin level 5 is counted as more favorable to patients than death, and absent a full patient utility analysis, the study’s sample size was insufficient for drawing firm conclusions. There were not enough survivors.
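
A rough way to quantify this is the large-sample approximation usually attributed to Whitehead, in which the information about the log odds ratio near the null is proportional to 1 − Σ p³, where the p are the pooled category proportions. The sketch below applies it to the pooled frequencies from this post. The formula is an approximation that I am assuming is adequate here, and the variable names are mine.

```r
# Pooled frequencies of Rankin levels 0..6 and arm sizes, from the output above
freq <- c(27, 27, 52, 55, 20, 35, 7785)
n1 <- 3994   # placebo
n2 <- 4007   # epinephrine
n  <- n1 + n2

p <- freq / n

# Information factor: near 1 for many evenly spread categories, near 0 when
# one category dominates
info_factor <- 1 - sum(p^3)

# Approximate information for the log odds ratio under the null, and the
# implied standard error
V         <- n1 * n2 * n / (3 * (n + 1)^2) * info_factor
se_approx <- 1 / sqrt(V)

# For contrast: seven equally likely categories
info_even <- 1 - 7 * (1 / 7)^3

print(round(c(info_factor = info_factor, info_even = info_even,
              se_approx = se_approx), 4))
```

The approximate standard error comes out near 0.138, in line with the standard errors in the two model fits above, and the information factor is below 0.1 because about 97% of patients fall in one category (death).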

What Would a Bayesian Design Do Differently?

Frequentist designs invite fixed sample sizes, and sample size computation requires knowledge that is not available during study planning. With a Bayesian approach, sampling can continue until a target (efficacy, harm, or futility) is reached, with no penalty for multiple looks. Studies that ended equivocally in the frequentist paradigm can readily be extended in the Bayesian paradigm, subject to resource limitations. Bayesian analysis can also provide some advantages. For example, one can compute the posterior probability that epinephrine reduces mortality by some small, but nonzero, amount.
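
As a very rough illustration of such posterior probabilities, the sketch below uses a normal approximation to the likelihood of the epi:placebo log odds ratio from the first 7-level model in this post and combines it with a normal prior. This is not the trial's Bayesian analysis. It applies to the ordinal odds ratio rather than to mortality alone, and the skeptical prior standard deviation of 0.35 is purely an assumption of mine for illustration.

```r
# Estimate and S.E. of the epinephrine:placebo log odds ratio, copied from the
# summary() output of the first model in this post
est <- -0.33675
se  <-  0.13985

# Conjugate normal update: prior N(prior_mean, prior_sd^2) on the log odds
# ratio; prior_sd = Inf corresponds to a flat prior
posterior <- function(est, se, prior_mean = 0, prior_sd = Inf) {
  w_data    <- 1 / se^2
  w_prior   <- 1 / prior_sd^2
  post_var  <- 1 / (w_data + w_prior)
  post_mean <- post_var * (w_data * est + w_prior * prior_mean)
  list(mean = post_mean, sd = sqrt(post_var))
}

# P(OR < threshold) = P(log OR < log(threshold))
prob_or_below <- function(post, threshold = 1) {
  pnorm(log(threshold), mean = post$mean, sd = post$sd)
}

flat      <- posterior(est, se)                   # flat prior
skeptical <- posterior(est, se, prior_sd = 0.35)  # assumed skeptical prior

p_flat      <- prob_or_below(flat)
p_skeptical <- prob_or_below(skeptical)
p_skep_095  <- prob_or_below(skeptical, 0.95)     # benefit beyond OR = 0.95

print(round(c(flat = p_flat, skeptical = p_skeptical,
              skeptical_OR_below_0.95 = p_skep_095), 3))
```

The same two functions can be re-run at every interim look with the current estimate and standard error, which is the sense in which Bayesian monitoring carries no penalty for multiple looks.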


Other Analyses

G Howard et al provide an exact randomization Wilcoxon test for analyzing all levels of the Rankin score. This is more computationally involved than the proportional odds model and does not allow for covariate adjustment.
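
To give a feel for the randomization idea, here is a Monte Carlo sketch of a randomization Wilcoxon test on the counts used in this post. It is not the exact procedure of Howard et al: it samples re-randomizations instead of enumerating them, and the seed, number of re-randomizations and variable names are my own choices.

```r
set.seed(1)  # arbitrary seed so the resampling is reproducible

placebo <- c(15, 10, 29, 20, 8, 8, 3904)
epi     <- c(12, 17, 23, 35, 12, 27, 3881)
y   <- c(rep(0:6, times = placebo), rep(0:6, times = epi))
grp <- factor(rep(c("placebo", "epinephrine"),
                  times = c(sum(placebo), sum(epi))))

# Midranks are fixed under relabelling, so compute them once
r     <- rank(y)
n_epi <- sum(grp == "epinephrine")
obs   <- sum(r[grp == "epinephrine"])    # observed rank sum in the epi arm
expct <- n_epi * (length(y) + 1) / 2     # its expectation under the null

# Re-randomize treatment labels and recompute the rank sum each time
n_perm <- 2000
perm <- replicate(n_perm, sum(r[sample.int(length(y), n_epi)]))

# Two-sided Monte Carlo p-value (with the usual +1 correction)
p_perm <- (1 + sum(abs(perm - expct) >= abs(obs - expct))) / (n_perm + 1)

# Large-sample Wilcoxon test for comparison
p_asym <- wilcox.test(y ~ grp)$p.value

print(c(p_perm = p_perm, p_asym = p_asym))
```

Because the Wilcoxon test is closely related to the score test from the proportional odds model, the two p-values should be in the same neighborhood as the p-value from the first model above.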

Anupam Singh provides an assessment of the proportional odds assumption for this study. The proportional odds assumption is always violated, so one needs to ask whether the weighted average odds ratio arising from the PO model is worse than other overall treatment summaries that may be computed. To quote Stephen Senn:

Clearly, the dependence of the proportional odds model on the assumption of proportionality can be overstressed. Suppose that two different statisticians would cut the same three-point scale at different cut points. It is hard to see how anybody who could accept either dichotomy could object to the compromise answer produced by the proportional odds model.
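
One informal way to look at the assumption with the counts in this post is to compute the epi:placebo odds ratio separately for every possible dichotomy of the scale and compare these with the single PO odds ratio of 0.71. This is only a descriptive sketch with names of my own choosing, not a formal test and not Singh's analysis.

```r
placebo <- c(15, 10, 29, 20, 8, 8, 3904)
epi     <- c(12, 17, 23, 35, 12, 27, 3881)

# Odds of Y >= j within one arm, for j = 1..6
odds_ge <- function(counts) {
  n_ge <- rev(cumsum(rev(counts)))[-1]   # number with Y >= 1, ..., Y >= 6
  n_ge / (sum(counts) - n_ge)
}

or_by_cut <- odds_ge(epi) / odds_ge(placebo)   # epinephrine:placebo
names(or_by_cut) <- paste0("y>=", 1:6)
print(round(or_by_cut, 2))
```

The cut at death (`y>=6`) gives an odds ratio of about 0.71, essentially the PO estimate, because deaths dominate the data. The cuts among survivors rest on far fewer patients, so their odds ratios vary more and are much noisier.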

I quite like this open access paper that compared methods for joint analysis of survival and functional decline in ALS.

Nice. I think that is especially useful when follow-up for mortality is long-term.

Thanks for this - really interesting and useful. The comment about alternative Bayesian designs is interesting, as it is something we are working on at the moment, so hopefully there will be some interesting results from that soon.

[declaration of interest: I’m one of the study investigators and co-authors]

I was planning on getting a student to conduct a joint analysis of survival and repeated-measures rating scale data in ALS. I cannot believe it has already been done! Back to the drawing board. Thanks for raising it.
