I’m looking for advice on the design of a comparative effectiveness RCT. Importantly, the RCT is in extremely premature infants receiving intensive care, where early mortality is high and some relevant clinical outcomes cannot be measured until later in development. For example, diagnosing retinopathy of prematurity requires the infant to survive for multiple months (one must wait until the eye vessels develop to see if there is a problem).
The trial studies an intervention that might impact death (safety/harm outcome) and, for the sake of simplicity, a single morbidity (efficacy outcome) at this later fixed time point.
Because of competing risks (death precludes the morbidity), it is standard in our field to use a composite outcome (“death or morbidity” or its complement) as the primary outcome. This is the way that nearly every trial in this population published in NEJM or JAMA in the past 2 decades has been done.
As I see it, accounting for both death and morbidity could be done in 2 ways:
A) pick “survival without morbidity” as a binary outcome (this is the cell of the 2x2 table with neither morbidity nor death, and is the most desirable outcome) and do a standard frequentist or Bayesian analysis
B) incorporate the information in the other 3 cells of the 2x2 table into the outcome - this increases information richness and may be valuable both clinically and statistically
Three options to pursue option B:
#1) use DOOR (desirability of outcome ranking) with an ordinal outcome (survival without morbidity / survival with morbidity / death; because the morbidity does not appear until later, the 4th cell, “death with morbidity,” is a bit tricky to conceptualize)
#2) use a win ratio (first compare on survival; then, among survivors, compare on morbidity)
#3) use ordinal logistic regression as commonly advocated for here, ordered as survival without morbidity / survival with morbidity / death
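To make options #1 and #2 concrete, here is a minimal Python sketch of the pairwise comparisons behind them. It assumes a three-level outcome measured at one fixed time point with no censoring; under that assumption, the hierarchical win-ratio comparison (survival first, then morbidity among survivors) produces the same wins, losses, and ties as comparing the ordinal levels directly. The function names and the counts in the usage lines are hypothetical, chosen only for illustration.

```python
from itertools import product

# Ordinal levels, ordered best to worst (an assumption of this sketch):
#   0 = survival without morbidity, 1 = survival with morbidity, 2 = death


def pairwise_counts(treat, control):
    """Compare every treated infant with every control infant.

    `treat` and `control` are counts per level, ordered best to worst.
    Returns (wins, losses, ties) from the treatment arm's point of view.
    """
    wins = losses = ties = 0
    for (i, n_t), (j, n_c) in product(enumerate(treat), enumerate(control)):
        pairs = n_t * n_c
        if i < j:        # treated infant has the better outcome
            wins += pairs
        elif i > j:      # control infant has the better outcome
            losses += pairs
        else:            # same level, including both died
            ties += pairs
    return wins, losses, ties


def door_probability(treat, control):
    """P(treated infant has a more desirable outcome) + 0.5 * P(tie)."""
    wins, losses, ties = pairwise_counts(treat, control)
    return (wins + 0.5 * ties) / (wins + losses + ties)


def win_ratio(treat, control):
    """Number of winning pairs divided by number of losing pairs (ties dropped)."""
    wins, losses, _ = pairwise_counts(treat, control)
    return wins / losses


# Hypothetical counts per arm, ordered best to worst (not real trial data).
treat = [50, 30, 20]
control = [40, 30, 30]
print(door_probability(treat, control))  # 0.565
print(win_ratio(treat, control))         # 1.5
```

In this setting the two summaries use the same pairwise information and differ mainly in how ties are handled: the DOOR probability splits them evenly, while the win ratio drops them.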
I think most people would agree that option B is superior to option A. Why discard the additional information from the other cells? But I’m particularly interested in which approach to incorporating that information I should pick and why. I see pros/cons to each related to interpretation (all models are “false,” but each brings different value). Specifically, if we’re optimizing for statistical efficiency in selecting a primary outcome and analysis, which is best? Has anyone compared statistical efficiency across these 3 similar methods? (And does it vary in different scenarios?)
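One way to explore the efficiency question empirically is by simulation. The sketch below is a rough starting point under stated assumptions, not a definitive comparison: it estimates the power of option A (a two-proportion z-test on “survival without morbidity”) and of a Wilcoxon rank-sum test on the three-level outcome, which I am using here as a stand-in for the ordinal approaches because it is closely related to the two-group proportional odds model. All function names, sample sizes, and outcome probabilities are hypothetical inputs to be replaced with plausible values for a given trial.

```python
import math
import random
from statistics import NormalDist


def sample_counts(n, probs, rng):
    """Draw n infants; return counts per ordinal level (ordered best to worst)."""
    counts = [0] * len(probs)
    for level in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[level] += 1
    return counts


def p_binary(treat, control):
    """Two-sided two-proportion z-test on level 0 (survival without morbidity)."""
    n_t, n_c = sum(treat), sum(control)
    pooled = (treat[0] + control[0]) / (n_t + n_c)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_t + 1 / n_c))
    if se == 0:
        return 1.0
    z = (treat[0] / n_t - control[0] / n_c) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def p_wilcoxon(treat, control):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected)."""
    n_t, n_c = sum(treat), sum(control)
    n = n_t + n_c
    # U statistic: pairs won by treatment plus half of the tied pairs.
    u = 0.0
    for i, a in enumerate(treat):
        for j, b in enumerate(control):
            if i < j:
                u += a * b
            elif i == j:
                u += 0.5 * a * b
    # Tie correction uses the pooled count at each ordinal level.
    tie_term = sum(t**3 - t for t in (a + b for a, b in zip(treat, control)))
    var = n_t * n_c / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var <= 0:
        return 1.0
    z = (u - n_t * n_c / 2) / math.sqrt(var)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def estimate_power(p_treat, p_control, n_per_arm, n_sims=2000, alpha=0.05, seed=1):
    """Return (power of binary test, power of ordinal test) by simulation."""
    rng = random.Random(seed)
    hits_binary = hits_ordinal = 0
    for _ in range(n_sims):
        t = sample_counts(n_per_arm, p_treat, rng)
        c = sample_counts(n_per_arm, p_control, rng)
        hits_binary += p_binary(t, c) < alpha
        hits_ordinal += p_wilcoxon(t, c) < alpha
    return hits_binary / n_sims, hits_ordinal / n_sims


# Hypothetical scenario: probabilities per level, ordered best to worst.
print(estimate_power([0.50, 0.30, 0.20], [0.40, 0.30, 0.30], n_per_arm=300))
```

Varying the two probability vectors (for example, an effect on death only, on morbidity only, or in opposite directions) is one way to see whether the ranking of the methods changes across scenarios.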
Thanks for any input this thoughtful group can offer.
Matt
p.s. Of course, the above could be done using additional gradations (such as survival w/o morbidity → survival w/ 1 morbidity → survival w/ 2 morbidities → survival w/ 3 morbidities → death). I’ve simplified here because I found it easier to conceptualize using 2x2 tables.