Disentangling Competing Risks with and without Endogeneity



I was talking to a buddy about a study he is planning. I recognized that the study has a significant competing risks problem, but to me it seems like a special case of competing risks where I’m not sure the usual approaches resolve the issue. I’m realizing I don’t know how to solve this problem, so I figured I’d pose it to the hive mind.

Suppose you want to develop a prediction model for death from disease X. Disease X is really terrible, with high incidence and mortality in the population of interest. Standard of care is to treat all patients in the population of interest with a drug known to be effective at lowering the risk of developing disease X and at controlling the severity of disease X if it develops. However, the drug has many side effects and can itself cause death. You have a new molecular marker that has been shown to explain multiple biological mechanisms and to have sufficient predictive discrimination for disease X incidence (let’s assume for now that the evidence showing this is valid). You want to use this marker, in conjunction with other clinical variables, to develop and validate a prediction model for disease X death, in hopes that you can better tailor treatment to risk (so that you can later test in an RCT whether this prediction model improves outcomes).

Here’s the problem: As I see it, there are 4 possible states a patient can be in during the study.
Y0 = Alive
Y1 = Death from Disease X
Y2 = Death from treatment for Disease X
Y3 = Death from other causes

Y3 is a regular old competing risk, and can be addressed accordingly. However, presumably the clinician has a prior about p(Y1), or for that matter p(Y2), and alters treatment accordingly, which makes treatment endogenous to the very outcomes being modeled.

So, 2 questions:

  1. Given this endogeneity, how do you (or can you?) disentangle probability of death from disease X from probability of death from the treatment for disease X? (My sense is no).

  2. Putting the endogeneity aside (let’s say this is a perfect experiment where every patient gets exactly the same dose and duration of the drug), do censoring-based approaches to competing risks still work? The conceptual study question is really: “is p(Y1) > p(Y2)?” So can you really get an accurate estimate of either p(Y1) or p(Y2) in this situation?
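To make question 2 concrete, here is a minimal simulation sketch, not a proposed analysis. It compares the censoring-based estimate (1 minus Kaplan-Meier, treating deaths from the other causes as censored) with the Aalen-Johansen cumulative incidence for each cause. Everything in it is an assumption made for illustration: the constant hazards, the uniform censoring, the 5-year horizon, and above all the independence of the latent times for Y1, Y2 and Y3, which is exactly what is in doubt in the real problem.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Hypothetical constant hazards per year: 1 = disease X, 2 = treatment, 3 = other.
hazards = {1: 0.10, 2: 0.05, 3: 0.03}
horizon = 5.0

# Independent latent death times per cause (an assumption), plus random censoring.
latent = np.column_stack([rng.exponential(1.0 / h, n) for h in hazards.values()])
censor = rng.uniform(2.0, 10.0, n)
death_time = latent.min(axis=1)
time = np.minimum(death_time, censor)
status = np.where(death_time <= censor, latent.argmin(axis=1) + 1, 0)  # 0 = censored


def one_minus_km(time, is_event, horizon):
    """1 - Kaplan-Meier at `horizon`; observations with is_event False are censored."""
    surv = 1.0
    for t in np.unique(time[is_event & (time <= horizon)]):
        at_risk = np.sum(time >= t)
        surv *= 1.0 - np.sum((time == t) & is_event) / at_risk
    return 1.0 - surv


def aalen_johansen(time, status, cause, horizon):
    """Cumulative incidence of `cause` at `horizon`, accounting for the other causes."""
    surv, cif = 1.0, 0.0  # surv = all-cause survival just before the current time
    for t in np.unique(time[(status > 0) & (time <= horizon)]):
        at_risk = np.sum(time >= t)
        cif += surv * np.sum((time == t) & (status == cause)) / at_risk
        surv *= 1.0 - np.sum((time == t) & (status > 0)) / at_risk
    return cif


total = sum(hazards.values())
# True cumulative incidence under the simulated constant hazards.
truth = {k: h / total * (1.0 - np.exp(-total * horizon)) for k, h in hazards.items()}
naive = {k: one_minus_km(time, status == k, horizon) for k in hazards}
aj = {k: aalen_johansen(time, status, k, horizon) for k in hazards}

for k in hazards:
    print(f"Y{k}: truth={truth[k]:.3f}  Aalen-Johansen={aj[k]:.3f}  1-KM={naive[k]:.3f}")
```

In this setup the two estimators answer different questions. The Aalen-Johansen estimate targets the probability of actually dying from each cause by the horizon, so comparing the Y1 and Y2 values addresses “is p(Y1) > p(Y2)?” as observed. The 1 minus Kaplan-Meier estimate targets the risk in a hypothetical world where the other causes are removed, and it is only interpretable that way if the causes are independent.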

Curious to see how others would approach this problem. Any reading suggestions much appreciated as well.


This is a wonderful question for which I hope you get several answers. I think disentangling related from non-related causes of death is a very difficult task. So difficult that sometimes I just want to predict all-cause death and have a separate polytomous logistic model for predicting the cause given death. Or to have an ordinal model with multiple ranked outcomes. But it’s hard to rank non-related death.
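One way to read the first suggestion, written out as a sketch rather than as the definitive formulation: the risk of dying from a specific cause factors into an all-cause piece and a cause-given-death piece. The notation is assumed here and does not come from the thread: T is time of death, D is the cause (1 = disease X, 2 = treatment, 3 = other), x is the covariate vector, t is the prediction horizon, and the coefficient vectors beta_k belong to a polytomous (multinomial) logistic model with cause 3 as the reference.

```latex
\begin{align*}
P(T \le t,\ D = k \mid x) &= P(T \le t \mid x)\; P(D = k \mid T \le t,\ x), \qquad k = 1, 2, 3, \\
P(D = k \mid T \le t,\ x) &= \frac{\exp(x^\top \beta_k)}{\sum_{j=1}^{3} \exp(x^\top \beta_j)}, \qquad \beta_3 = 0 .
\end{align*}
```

The first factor comes from an ordinary all-cause survival model, and the second is fitted only on patients who died. Whether the cause model should also condition on the time of death is a modeling choice this sketch leaves open.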