RWE: use of the target trial emulation framework: why aim to emulate an RCT?

Dear All,

Hernán and Robins [1] introduced the target trial emulation framework in 2016 to define the question of interest in observational studies. The target trial framework asks the investigator to specify the key elements of the protocol of the RCT that would ideally be run to address the question of interest. But what if it is impossible to specify a target RCT to emulate? Does that mean the causal question cannot be formulated? And why do they aim to emulate an RCT specifically? Is it because an RCT requires fewer assumptions to define the causal question?

[1] Hernán, M. A., and Robins, J. M. (2016), “Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available,” American Journal of Epidemiology, 183, 758–764.

In my view, the target trial framework is a useful thought experiment that assists in both formalizing well-defined causal questions and properly answering them.

There is a plethora of biases that can creep in when trying to draw causal inferences from observational data. People often focus on confounding bias, but that is just one of many, alongside immortal-time bias or allocation bias[1]. Most of these other biases arise from how we handle the data: all the degrees of freedom involved in converting some database into a table ready for statistical analysis. Who is in the data? What are the interventions of interest? How do we decide which people enter which treatment group, and at which time? Analysts who are not careful enough can formulate nonsensical analysis setups. For instance:

- Taking all the people in the database, ages 0 to 100, and forgetting that the setting in which a 1-year-old gets an intervention (say, heart surgery) is quite different from that of a 70-year-old getting the same intervention.
- Coming up with ill-defined interventions, the famous example being weight loss: it is very easy to compare heart-attack rates for people with different amounts of weight loss in a dataset, but weight loss is a fuzzy concept that may be due to diet, exercise, or smoking, each of which will have a different effect on the rate of heart attacks[2].
- Mishandling when and how people are assigned to their treatment groups. For example, if you want to compare drug takers to non-takers, it can be mind-boggling for an analyst to decide when “not taking” starts in the data, and one can end up comparing any-takers to never-takers, which may be a very biased comparison if you care about the effect of drug initiation (a minimal sketch of this time-zero problem follows below).
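To make the time-zero point concrete, here is a minimal sketch in Python/pandas of how a new-user cohort might be assembled from a dispensing database, the usual target-trial-style fix for the any-taker-versus-never-taker trap. Everything in it is an assumption for illustration: the table and column names (`rx`, `patients`, `dispense_date`, `cohort_entry`), the drug label, the washout window, and the age limits are made up, not taken from any cited paper.

```python
import pandas as pd

# Hypothetical input tables (all names are illustrative assumptions):
#   rx:       one row per dispensing -> patient_id, drug, dispense_date
#   patients: one row per patient    -> patient_id, birth_date, cohort_entry
#             (cohort_entry = start of the patient's observable history)

def new_user_cohort(rx: pd.DataFrame, patients: pd.DataFrame,
                    drug: str = "drug_A",
                    washout_days: int = 365) -> pd.DataFrame:
    """Initiators of `drug`, with eligibility assessed at time zero.

    Time zero is the first dispensing of the drug. Because it is the
    *first* dispensing, all observable history before it is drug-free,
    so requiring `washout_days` of history gives a washout window and
    makes this a study of initiation rather than "any taker vs never
    taker". No post-time-zero information is used for eligibility.
    """
    first_rx = (
        rx.loc[rx["drug"] == drug]
          .groupby("patient_id", as_index=False)["dispense_date"]
          .min()
          .rename(columns={"dispense_date": "time_zero"})
    )
    cohort = first_rx.merge(patients, on="patient_id")

    # Eligibility, using only information available at time zero.
    history_days = (cohort["time_zero"] - cohort["cohort_entry"]).dt.days
    age_at_t0 = (cohort["time_zero"] - cohort["birth_date"]).dt.days / 365.25

    eligible = (history_days >= washout_days) & age_at_t0.between(40, 80)
    return cohort.loc[eligible]
```

A comparator arm of eligible non-initiators would need the same discipline, each patient assigned their own time zero (e.g. a matched visit date); the only point of the sketch is that eligibility and group assignment use nothing that happens after time zero.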

All these hurdles often do not exist in a clinical trial. Eligibility criteria are formulated around the patients (and their doctors) seeking care, for whom the research question is clinically relevant and relatively clear to the practicing physician (not conflating 1-year-olds with 70-year-olds). Interventions are clearly manipulable actions that health workers can do to patients or that patients can do to themselves (no one can randomize “weight loss”; you can prescribe an educational nutrition program or have people come exercise in a facility three times a week, and these are well-defined actions). It is also very clear when and how patients do or do not get treatment (they get the treatment at a clinic, or they get a high-five instead, or whatever, but they all get there at a certain point on their timeline [time zero], and it is clear when follow-up starts and what period is regarded as the patient’s history).
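The ill-defined-intervention problem is what the consistency assumption (see footnote 1 below) formalizes. Stated in standard potential-outcomes notation, as my own gloss rather than a quotation from the cited paper:

```latex
% Consistency: if patient i actually received treatment level a,
% then the outcome we observe is the potential outcome under a.
A_i = a \;\Rightarrow\; Y_i = Y_i^{a}
```

With an exposure like “weight loss”, many different versions of the intervention (diet, exercise, smoking) hide behind the same value of A, so Y^a is not one well-defined quantity and the implication above has no clear meaning.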

RCTs solve many of the problems of asking a proper research question and setting up the conditions to answer it, problems that observational studies may fail to solve properly. Therefore, a framework that takes a trial design one could have run, and then molds the data to fit that design, can be beneficial for analyzing observational data. It puts necessary restraints on an otherwise infinitely permissive data analysis that one could play fast and loose with.

So target trials may better ensure the clinical relevance and validity of research, and they offer a good way to communicate observational results to practicing clinicians (“here is the RCT we would have done; this is what we did instead”).


  1. I see all of these as violations of the consistency assumption. ↩︎

  2. “No causation without manipulation”. ↩︎


A casual reading of some of the papers in cardiology that use TTE causes great worry. TTE is serving as a diversion from consideration of confounding, and it is being applied to studies based on retrospective data collection. You can’t do good observational research without intentionality towards collecting well-thought-out potential confounder variables. And there is not even a mention of blinding in these studies.


I would bet such ‘uses’ of the TTE concept are made in a spirit opposite to Ehud’s “useful thought experiment.” That is, rather than using TTE as a way to challenge and criticize their thinking, the authors you’re reading use it as a ‘magic wand’ to wave off criticism.

FYI, this week’s NEJM has a Perspective on TTE. (I haven’t read it yet.)


It’s great to see that commentary. But it missed the single most important point: design trumps analysis. If you don’t have the power to design the observational study, you are at the mercy of uncollected confounders, measurement error, and missing data. It is almost impossible to infer treatment effects reliably in a retrospective observational study.
