Handling missing continuous data due to death

Hi everyone,

I work in neonatal research, where it’s common to have interventions with large benefits for the in-hospital course but heterogeneous developmental outcomes. For example, we can reduce the lung disease associated with prematurity with postnatal steroids, but doing so has deleterious effects on brain development that show up at 18 months and beyond. Because of this there is always an interest in assessing later development in clinical trials. I’m curious, though, about thoughts on the current approach to handling developmental outcomes that are missing because the infant died while in hospital. When development is summarized in terms of disability status (2 SDs below the mean), the usual composite approach is used (Death or Disability), which has its own issues. But I’m even more interested in the case with continuous scores (mean difference in total score), where the usual practice is to give patients who died the lowest possible score. This seems a little muddy to me, since now I’m not even sure what sort of summary this is. Some alternatives I’ve thought of:

  1. Ordinal outcome with death as the worst category - a more efficient approach than Death or Disability?
  2. Some sort of imputation of the missing development data? While mortality can be high in some populations (>20%), there may be enough overlap that this is feasible. But then it’s as if we are estimating the difference in developmental scores we would see if we were able to stop everyone from dying on both therapies. Does that make sense when the treatments are often intended to have an effect on mortality itself?

I’ve also considered things like joint models, but those all seem to rely on repeated measures of the continuous component (which isn’t feasible here), and I believe they end up giving an interpretation like #2?
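For what it’s worth, here is a minimal sketch of option 1 (all numbers are made up, not from any real trial): code death as a value below the lowest attainable score, so the only assumption is that death is worse than any disability level, and summarize the arms with the concordance probability P(Y_trt > Y_ctl) + ½·P(tie), which is the rank-based quantity a proportional-odds comparison targets:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # per arm, hypothetical trial size

# assumed scenario: treatment lowers mortality; survivors'
# developmental scores are unaffected by treatment
died_c = rng.random(n) < 0.25
died_t = rng.random(n) < 0.15
score_c = np.clip(rng.normal(100, 15, n), 40, 160)  # 40 = assumed score floor
score_t = np.clip(rng.normal(100, 15, n), 40, 160)

# ordinal coding: death gets a value below every attainable score,
# asserting only that death is worse than any disability level
y_c = np.where(died_c, 0.0, score_c)
y_t = np.where(died_t, 0.0, score_t)

# concordance probability P(Y_trt > Y_ctl) + 0.5 * P(tie)
diff = y_t[:, None] - y_c[None, :]
c_index = np.mean(diff > 0) + 0.5 * np.mean(diff == 0)
print(round(c_index, 3))  # > 0.5 here: treated arm fares better overall
```

The point of the sketch is only the coding and the estimand; in practice one would fit a proportional-odds model with covariates rather than compute the raw concordance.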


I’ll do anything to avoid thinking about unobservables (which imputation is related to) or competing-risk analysis. I do not know of an interpretable method that doesn’t treat death as a bad outcome. With ordinal longitudinal models you fortunately don’t have to judge how much worse death is than disability, only that it is worse.


Thanks @f2harrell. What are your thoughts on the treatment-effect summary? I guess one would be the probability of surviving with a score above any threshold of interest (thinking of the ordinal models for continuous Y paper)? If this were a valid summary I would be tempted to think of the difference in scores among survivors by predicting for trt = 0 and trt = 1, but I am hesitant to go there, given it seems to still involve conditioning on a post-randomization outcome even if the model itself doesn’t?

Transition probabilities are a bit similar to hazard ratios. They may be seen as a great way to model longitudinal dependencies, but then previous states are deconditioned on to arrive at state occupancy probabilities, which when computed using only baseline covariates are unconditional (except on initial state and baseline covariates). So they are intent-to-treat causal-type quantities.

The estimand you mentioned is one of the main ones. Another is the mean time in that state, and the difference in mean time in states between treatments.
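To make the deconditioning step concrete, here is a toy discrete-time sketch (three states, made-up transition probabilities) showing how a transition matrix yields state occupancy probabilities and mean time in state:

```python
import numpy as np

# toy 3-state discrete-time model: 0 = dead (absorbing),
# 1 = alive with disability, 2 = alive without disability
P = np.array([[1.00, 0.00, 0.00],   # hypothetical per-visit
              [0.10, 0.70, 0.20],   # transition probabilities
              [0.05, 0.15, 0.80]])

init = np.array([0.0, 0.0, 1.0])    # everyone starts alive, no disability
horizon = 10                        # number of assessment visits

occ = init.copy()
occupancy = []
for _ in range(horizon):
    occ = occ @ P                   # decondition on the previous state
    occupancy.append(occ)
occupancy = np.array(occupancy)

# state occupancy probabilities at the final visit (sum to 1)
print(np.round(occupancy[-1], 3))
# expected number of visits spent in each state over the horizon
print(np.round(occupancy.sum(axis=0), 3))
```

Differencing either printed quantity between treatment arms (with treatment entering the transition model) gives the two estimands mentioned above.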

It sounds to me as if you are taking the need for “summary” for granted, without embedding your summative expressions into a larger scientific context. Who are the consumers of the summaries? Are they subsequent modeling steps, or the clinicians and parents? If the latter, then might a graphical treatment (Sankey diagrams, e.g.) help?

If your aim in summarizing is to yield inputs to subsequent modeling efforts, then maybe the appetites of your models are insufficiently refined? How sharply defined a scientific question can be posed or answered in terms of any summary that regards death on the same terms as child development?

I think the main problem I have right now is that when the development outcomes are treated as continuous, the commonly accepted practice has been to replace those observations with the lowest possible score. That leads to a mean difference I don’t find very interpretable, and in the most extreme case it could hide effects on developmental outcomes that have policy/resource-allocation implications. I’m not convinced there is any population for whom that summary effect will lead to good decisions. The summaries I’m interested in are:

  1. Does the treatment improve utility at the population level? This can be achieved by converting outcomes to utilities (as is often done in health-economic models) either before or after analysis, by MCDA, or by ordinal models (and likely other approaches?).
  2. Does the treatment have an important effect on downstream supports required? For this I think we just need the disability outcome to model out the implications for early-intervention programs (e.g., do we need to increase resources if we introduce an effective intervention for mortality that results in an increase in the total number of children with disabilities?).
  3. Does the treatment have an effect on disability itself? For this I think the only solution is to accept that conditioning on survival creates collider bias and hope we can adjust for it. This is (I think?) equivalent to comparing the interventions with respect to disability if we could add a magical third intervention that prevented all deaths without affecting disability?
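A toy illustration of #1 and #2 together (all utilities and proportions are hypothetical): a treatment can raise mean utility while also increasing the share of disabled survivors who will need supports.

```python
# hypothetical utilities: death = 0, disability = 0.6, no disability = 1
utility = {"dead": 0.0, "disabled": 0.6, "none": 1.0}

# hypothetical outcome proportions by treatment arm
control = {"dead": 0.30, "disabled": 0.15, "none": 0.55}
treated = {"dead": 0.15, "disabled": 0.25, "none": 0.60}

def mean_utility(p):
    return sum(p[s] * utility[s] for s in p)

print(round(mean_utility(treated) - mean_utility(control), 2))
# mean utility improves, yet the proportion with disability rises,
# so early-intervention resourcing needs go up, not down
```

The same ordinal outcome thus answers #1 (via the utility weights) while its disability category feeds #2 directly.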

Unfortunately what I see more commonly is that we either hide some of the benefits of the therapy by treating death and disability as equal outcomes, or we hide some of the potential resourcing/developmental harms by imputing many more minimum scores in one group than the other. In the extreme, this could lead to claiming a developmental benefit when the treatment has no effect, or even a harmful effect, which I think has important implications both for resourcing and for understanding effects on survivors.
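To illustrate the extreme case with made-up numbers: suppose the treatment halves mortality and has no effect at all on survivors’ development. Minimum-score imputation still manufactures an apparent developmental benefit:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000                              # large n so sampling noise is tiny
floor = 40                               # assumed lowest attainable score

# assumed scenario: treatment halves mortality, and has NO effect
# on developmental scores among survivors
died_c = rng.random(n) < 0.30
died_t = rng.random(n) < 0.15
score_c = rng.normal(95, 15, n)          # survivors' true scores, both arms
score_t = rng.normal(95, 15, n)

# common practice: replace scores of infants who died with the floor
imp_c = np.where(died_c, floor, score_c)
imp_t = np.where(died_t, floor, score_t)

print(round(score_t[~died_t].mean() - score_c[~died_c].mean(), 2))  # ~0
print(round(imp_t.mean() - imp_c.mean(), 2))  # spurious "benefit" of several points
```

The second difference is driven entirely by mortality, not development, which is exactly the interpretability problem with the imputed mean difference.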


With a partial proportional odds longitudinal model you can estimate and test how differently a treatment affects death than how it causes or reduces disability.

How much natural variability is there in steroid administration practices? Do practices vary from one neonatologist to another, or one NICU to another? Might there be an opportunity to use instrumental variables here? What do you think are the best examples of published efforts to get at some of the questions you’ve posed? My understanding is that modern perspectives on causal inference in medicine found some of their first exponents in your field [1].

  1. Weinberg CR. Toward a clearer definition of confounding. Am J Epidemiol. 1993;137(1):1-8. doi:10.1093/oxfordjournals.aje.a116591