Counterfactual Prediction + Longitudinal Ordinal Models + Decision Making

Hi everyone, I want to share some thoughts and ask for some of your own.

I have a deep interest in three main dimensions of prediction: information loss (from time and outcome dichotomization), counterfactual prediction, and performance validation in the context of decision making.

My main intuition is that there is no good reason for not trying to overcome all of the above.

  • Information Loss: While binary outcomes are extremely popular, I see no reason to avoid time-to-event, competing-risks/multi-state, and longitudinal models, or to avoid ordinal/continuous outcomes.

  • Counterfactual Prediction: While it is fairly easy to create prediction models nowadays, I still find them very difficult to interpret in the “factual” context. I don’t think any clinician really thinks in terms of so-called factual prediction: we do not prioritize patients according to their so-called “absolute risk”; we do so by evaluating the implied predicted risk reduction and/or the implied predicted treatment harm.

  • Performance Validation in the context of Decision Making: Prediction performance metrics without context are useless and misleading. That’s why I’m a big fan of decision curves: they respect the narrative of the domain expert and they price prediction errors in terms of real-life consequences.
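To make the decision-curve point concrete, here is a minimal sketch of the standard net-benefit calculation behind decision curves (the data below are made up for illustration; the threshold probability encodes the harm:benefit trade-off the decision maker accepts):

```python
import numpy as np

def net_benefit(y, p, pt):
    """Net benefit of treating patients with predicted risk >= pt.

    y  : binary outcomes (1 = event), p : predicted risks,
    pt : threshold probability, encoding the acceptable
         harm:benefit trade-off of treatment.
    """
    y = np.asarray(y)
    p = np.asarray(p)
    treat = p >= pt
    tp = np.sum(treat & (y == 1)) / len(y)  # true positives per patient
    fp = np.sum(treat & (y == 0)) / len(y)  # false positives per patient
    return tp - fp * pt / (1 - pt)

# Toy data: compare the model to a "treat all" policy at two thresholds.
y = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])
p = np.array([.1, .2, .8, .3, .7, .9, .2, .1, .6, .4])
for pt in (0.2, 0.4):
    print(pt, net_benefit(y, p, pt), net_benefit(y, np.ones_like(p), pt))
```

A decision curve is just this quantity plotted over a range of plausible thresholds, alongside the "treat all" and "treat none" policies.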

Having stated all of the above: while I have partial solutions for the problems mentioned, I’m not familiar with a single solution that covers all of them.

I would like to make longitudinal counterfactual predictions and validate them in terms of decision making.

Counterfactual Prediction for the longitudinal ordinal setting would look like:

0 - Alive :face_with_raised_eyebrow:
1 - Sick :nauseated_face:
2 - Dead :skull:
treatment - :pill:

Under no-treatment:
112
:nauseated_face::nauseated_face::skull:

U(no-:pill:) = 2 * :nauseated_face:

Under treatment:
1110000002
:pill::pill::pill:
:nauseated_face::nauseated_face::nauseated_face::face_with_raised_eyebrow::face_with_raised_eyebrow::face_with_raised_eyebrow::face_with_raised_eyebrow::face_with_raised_eyebrow::face_with_raised_eyebrow::skull:

U(:pill:) = 3 * :nauseated_face: + 6 * :face_with_raised_eyebrow: + 3 * :pill:

We can assign different utility values to treatment, days of being sick, and days of being healthy. The expected difference is then straightforward, and there is no need for puzzling heuristics such as juggling calibration, discrimination, etc.
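As a sketch of the utility arithmetic above (the daily utility weights below are illustrative assumptions, not values from the example):

```python
# Hypothetical daily utility weights: healthy > sick > dead,
# plus a per-day disutility for taking the treatment (all assumed).
U_STATE = {0: 1.0, 1: 0.4, 2: 0.0}  # 0 = alive, 1 = sick, 2 = dead
U_TREAT = -0.1                      # daily cost of the pill

def trajectory_utility(states, days_on_treatment=0):
    """Sum daily state utilities over a predicted trajectory,
    charging the treatment cost for each day the pill is taken."""
    return sum(U_STATE[s] for s in states) + U_TREAT * days_on_treatment

no_treat = [1, 1, 2]                       # sick, sick, dead  ("112")
treat = [1, 1, 1, 0, 0, 0, 0, 0, 0, 2]     # sick x3, healthy x6, dead

benefit = (trajectory_utility(treat, days_on_treatment=3)
           - trajectory_utility(no_treat))
print(benefit)  # expected utility difference between the two arms
```

The expected counterfactual contrast is then just this difference averaged over the predictive distribution of trajectories.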

Are you familiar with related models? I wonder what your perspective on the subject is.


It’s worth pursuing. But an example of the difficulty of this approach as opposed to an ordinal longitudinal model is the complexity of ordering a late death vs. an early heart attack with the summed utility approach. Missing data are also problematic.

How would you handle late death vs. early heart attack in the context of ordinal longitudinal modeling?

I don’t think prediction models make any kind of utility interpretation directly, but they do so implicitly through performance metrics and the inclusion/exclusion criteria of the target population.

For example:

  • The c-index for time-to-event implies that the earliest events should be prioritized.
  • Lift implies that the patients at the highest risk will gain the highest benefit.
  • The Brier score implies that the absolute differences between predictions and outcomes are equally important for nonevents (p = 0.3, y = 0) and events (p = 0.7, y = 1).

And so on… Sometimes these assumptions are reasonable and sometimes they are not, but I think utility would be much easier to interpret and communicate in a counterfactual setting. For lift we can use the uplift setting, which is very natural in marketing profiling, but I believe it is relevant for healthcare as well.
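A tiny sketch of the lift-vs-uplift contrast (the patients and risks below are made up; the counterfactual risks under each arm are assumed to come from some model):

```python
# Each tuple: (patient id, predicted risk if untreated, if treated).
# Numbers are invented purely to illustrate the ordering difference.
patients = [
    ("A", 0.90, 0.85),  # very high risk, but little to gain from treatment
    ("B", 0.40, 0.10),  # moderate risk, large predicted benefit
    ("C", 0.20, 0.18),  # low risk, tiny benefit
]

# Lift-style prioritization: rank by absolute (untreated) risk.
by_risk = sorted(patients, key=lambda r: -r[1])

# Uplift-style prioritization: rank by predicted risk reduction.
by_uplift = sorted(patients, key=lambda r: -(r[1] - r[2]))

print([p[0] for p in by_risk])    # ordering by absolute risk
print([p[0] for p in by_uplift])  # ordering by predicted benefit
```

The two policies treat different patients first, which is exactly the clinical intuition about prioritizing by predicted risk reduction rather than absolute risk.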

My main takeaway is that we should strive for alignment between the narrative thinking of domain experts / decision-makers of all sorts and the underlying assumptions behind performance metrics. I used to think we should train clinicians to play poker in order to improve their probabilistic thinking, but the following thread changed my point of view:

Ordinal longitudinal analysis lets you compute any probabilities or expected durations the clinicians want, without making value judgements.

Maybe this freedom comes with a price.

As statisticians we want to be greedy in terms of information loss, but should we embrace the same approach for communication with domain experts?

I really liked this provocative article by @VickersBiostats and I think it’s kind of related:


Some thoughts I had regarding what we mean by “True/False Positives” in different settings; I wonder about yours, @f2harrell:

If I understand correctly, the equivalent of cumulative incidence is just the empirical proportion of transitions from the initial state to the state of interest as the next step (assuming no censoring). In many of the adjustments I’ve seen for competing risks and censoring, the CIF / 1 − KM estimates function as the “real positives/negatives”.

So if we have more levels of severity for the outcome (quality of life, say), we should consider a CIF for every possible transition, right? And if we allow patients to start from different states (why not?), the number of CIFs to explore grows even larger.

The term cumulative might be misleading because not all states are absorbing states, and it’s not clear whether we want to count “ever in state J” (will the patient be hospitalized at least once?) or “currently in state J” (is the patient hospitalized at the end of follow-up?).
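The “ever in state J” vs. “currently in state J” distinction can be made concrete with a toy calculation (the trajectories below are invented, and censoring is ignored as in the post above):

```python
# Each trajectory is a sequence of observed states at fixed visit times.
# Invented data: 0 = healthy, 1 = hospitalized (state of interest).
trajectories = [
    [0, 1, 1, 0, 0],  # hospitalized during follow-up, discharged by the end
    [0, 0, 1, 1, 1],  # hospitalized at the end of follow-up
    [0, 0, 0, 0, 0],  # never hospitalized
]
J = 1  # state of interest

# "Ever in state J": at least one visit in J during follow-up.
ever = sum(any(s == J for s in tr) for tr in trajectories) / len(trajectories)

# "Currently in state J": in J at the last visit.
currently = sum(tr[-1] == J for tr in trajectories) / len(trajectories)

print(ever, currently)
```

For an absorbing state the two summaries coincide, which is why the distinction only bites for transient states like hospitalization.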

Another conceptual gap I find quite difficult to grasp is the difference between interventions (= cause) and states (= result).

For me a state is just the outcome, and an intervention requires a counterfactual framework; the competing-risks setting is challenging in terms of terminology because of the “cause-specific death” jargon.

Try to keep it simple. In general with ordinal outcomes you model through state transitions and after the fit you can also convert these to state occupancy probabilities (SOP). This is quite general, allowing for any number (including zero) of absorbing states. Only for an absorbing state does a SOP equate to a cumulative incidence. The SOP calculation is the place where you consider every possible transition from the current state.
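As a sketch of that conversion in discrete time (the transition matrix below is invented; in practice the transition probabilities would come from the fitted ordinal model):

```python
import numpy as np

# Hypothetical one-step transition matrix for 0 = alive, 1 = sick, 2 = dead.
# Dead is absorbing: its row puts probability 1 on staying dead.
P = np.array([[0.90, 0.08, 0.02],
              [0.30, 0.60, 0.10],
              [0.00, 0.00, 1.00]])

def state_occupancy(P, start, t):
    """P(state at time t | start state), by repeated one-step transitions,
    marginalizing over every possible path through the states."""
    occ = np.zeros(P.shape[0])
    occ[start] = 1.0
    for _ in range(t):
        occ = occ @ P
    return occ

sop = state_occupancy(P, start=0, t=5)
print(sop)  # state occupancy probabilities after 5 time steps
```

Only the last component, the occupancy of the absorbing death state, can be read as a cumulative incidence; the others are genuine “currently in state” probabilities.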

It’s not useful to use ‘positive’ or ‘negative’ in this context.


You might want to use the flexibility of longitudinal ordinal models for comparison with simpler models, or just to be able to communicate complexity to domain experts. To do both you must play with thresholds; in fact, that is what we do with SOPs.

For a non-absorbing state I can estimate the prevalence, which might be just as important in some settings:

What is the prevalence of depression one year after starting antidepressants? These outputs are generic, just like the cumulative incidence estimates.

Let’s say 0–4 means depressed and 5–10 means not depressed. Maybe some grace period is needed for the medication to work.

I might define the following outcome as a success:
[ 5, 3, 2, 2, 4, 5, 7, 7, 7]
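One hypothetical way to operationalize that success definition (the cutoff and grace period below are my assumptions, chosen to match the example trajectory):

```python
# Assumed rule: ignore the first `grace` visits while the medication
# takes effect, then call the course a success only if every remaining
# score is at or above the non-depression cutoff.
def is_success(scores, cutoff=5, grace=5):
    after_grace = scores[grace:]
    return all(s >= cutoff for s in after_grace)

print(is_success([5, 3, 2, 2, 4, 5, 7, 7, 7]))  # the example trajectory
```

The early dip into the depressed range (3, 2, 2, 4) is forgiven by the grace period, so the trajectory above counts as a success under this rule.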
