This post began life as a tweetorial, or Twitter tutorial, for International Clinical Trials Day 2018. To read the original tweetorial, click here. The goal of this tweetorial was to explain the evidence for and against estimating adherence-adjusted effects in randomized trials, and to show that modern causal inference tools make adherence adjustment feasible.
We begin with a short trip back in time to 1980, when the Coronary Drug Project (CDP), a large 6-arm placebo-controlled randomized trial of lipid-lowering medications, published a comparison of adherers & non-adherers in its placebo arm. You can read the original CDP paper here: https://www.nejm.org/doi/full/10.1056/NEJM198010303031804/
Over 5 years, survival was nearly 10 percentage points higher among people who adhered to placebo most of the time compared to those who didn’t, even after adjusting for ~40 baseline covariates. A massive difference and bad news for proponents of the per-protocol effect. These results were interpreted as evidence that per-protocol effects could not be estimated without bias, because people shouldn’t be able to think themselves into living longer!
But that was the 80s and a lot has changed since then, including statistical techniques for time-varying exposures. Before we revisit the data, let’s talk a bit about time-varying exposures. The intention-to-treat (ITT) effect is the effect of being assigned to one treatment vs another. In most trials, assignment happens once and that’s it, so the ITT is almost always about a point exposure. But when we want to know the effect of treatment itself, suddenly we have to worry about time – when can people adhere to a treatment protocol, and when did they adhere to the treatment protocol?
Even if the intervention only happens once (e.g. surgery), we might still define adherers based on when (if ever) they get treated. For example, if non-adherers are people who were randomized to surgery but never got surgery by the end of follow-up, that could be a time-varying exposure because we don’t know who the non-adherers are at baseline. If our intervention is a medication, and adherers are people who always take their medication (or take, say, >80%), that’s time-varying too!
Why does this matter? Because if the exposure varies over time, then baseline covariates aren’t going to cut it! If we have a time-varying exposure, we need to deal with time-varying confounding. The CDP didn’t do this, because they couldn’t – the methods didn’t exist in 1980.
But it’s not the 80s anymore, and now we can deal with time-varying confounding. And that’s exactly what we did! We got the CDP data, and adjusted for post-randomization (time-varying) confounding.
First, we looked at cumulative incidence of mortality - just like the original paper. Using only baseline covariates, adherers to placebo did seem to have better survival than non-adherers. But once we included post-randomization covariates, the survival difference disappeared! You can read our paper here: http://journals.sagepub.com/doi/abs/10.1177/1740774516634335
Cumulative incidence isn’t a great way to compare survival, though, so next we used a survival analysis approach. And we even tried lots of different ways to model adherence, to see if we were just getting lucky. We weren’t! When we adjusted correctly for post-randomization confounding, there was no survival difference between adherers and non-adherers to placebo! Correct adjustment meant: (1) inverse probability weighting (IPW) to prevent induced bias, and (2) a flexible function for adherence (not just linear!). You can read the 2nd paper here: https://trialsjournal.biomedcentral.com/articles/10.1186/s13063-018-2519-5
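If you want to see what inverse probability weights for a time-varying exposure look like on paper, here is one common textbook form of the stabilized weights. This is a sketch in notation introduced here for illustration, not necessarily the exact specification used in the papers above: A_k is the adherence indicator in interval k, \bar{A}_{k-1} is adherence history up to the previous interval, V is the set of baseline covariates, and \bar{L}_k is the history of post-randomization (time-varying) covariates.

```latex
SW_i(t) = \prod_{k=0}^{t}
  \frac{\Pr\left[A_k = a_k \mid \bar{A}_{k-1} = \bar{a}_{k-1},\, V\right]}
       {\Pr\left[A_k = a_k \mid \bar{A}_{k-1} = \bar{a}_{k-1},\, V,\, \bar{L}_k\right]}
```

The denominator is where the post-randomization covariates come in: each person-interval is weighted by the inverse of the probability of the adherence actually observed, given everything measured up to that point.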
So we can adjust for adherence in randomized trials after all! And why is that good news? Because when we have non-adherence, the ITT is not an estimate of the effect of treatment. When the comparator is an active treatment, the ITT could be an over- or under-estimate of the effect of treatment!
But the per-protocol effect is the effect of treatment. That’s the definition. Whether or not we can estimate this effect in a given trial is a separate issue. So, what are the options? And when should or shouldn’t we use them?
Option 1: (naive) per-protocol analysis
• Approach: throw out information on anyone or any person-time that’s non-adherent. Then just do your regular ITT-style analysis (maybe adjusting for baseline covariates)
• Why it fails: same problem as the 1980 paper — doesn’t account for time-varying confounding!
Option 2: as-treated analysis
• Approach: same as above but allow people to cross over and pretend they’d been randomized to their cross-over group.
• Why it fails: Time-varying confounding strikes again!
Option 3: instrumental variables
• Approach: estimate the association between randomization and adherence and use it to “correct” the ITT estimate.
• Why it fails: it doesn’t (at least, not all the time). But the simple version doesn’t work if adherence is time-varying!
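To make the simple version of Option 3 concrete, here is a minimal Python sketch of the ratio (Wald-type) instrumental variable estimator: the ITT effect divided by the difference in adherence between arms. The function name and the toy data are made up for illustration. The sketch assumes all-or-nothing (point) adherence, that randomization affects the outcome only through treatment received, and that nobody does the opposite of their assignment. Under those assumptions the ratio is usually read as the effect among people who would adhere to whatever they were assigned.

```python
def wald_iv_estimate(assigned, adhered, outcome):
    """Ratio IV estimate: ITT effect divided by the between-arm adherence gap.

    assigned, adhered, outcome are equal-length lists of 0/1 values
    (hypothetical coding: 1 = assigned to treatment, took treatment, died).
    """
    def mean(values):
        return sum(values) / len(values)

    def in_arm(values, arm):
        # Keep only the values for people randomized to the given arm.
        return [v for v, z in zip(values, assigned) if z == arm]

    # Numerator: the usual ITT risk difference.
    itt = mean(in_arm(outcome, 1)) - mean(in_arm(outcome, 0))
    # Denominator: how much assignment changed treatment actually received.
    adherence_gap = mean(in_arm(adhered, 1)) - mean(in_arm(adhered, 0))
    if adherence_gap == 0:
        raise ValueError("assignment did not change adherence; ratio undefined")
    return itt / adherence_gap


# Toy data, invented for illustration only.
assigned = [1, 1, 1, 1, 0, 0, 0, 0]
adhered = [1, 1, 1, 0, 0, 0, 0, 0]
outcome = [0, 0, 0, 1, 1, 0, 1, 0]
estimate = wald_iv_estimate(assigned, adhered, outcome)
```

Notice that adherence enters as a single yes/no per person. That is exactly why this simple version breaks down when adherence changes over follow-up.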
Option 4: estimate the per-protocol effect
• Approach: same data as the naive per-protocol analysis PLUS adjust for time-varying confounding!
• Why it fails: it doesn’t! But it won’t work if you haven’t collected post-randomization confounder data!
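Here is a rough Python sketch of the weighting idea behind Option 4: stop using someone's person-time once they stop adhering, then up-weight the adherent person-time that remains by the inverse of its probability of being adherent, given a time-varying covariate. Everything in it is a simplifying assumption for illustration: the function name, the single categorical covariate, the toy data, and especially the stratified proportions standing in for a fitted adherence model (a real analysis would more likely use something like pooled logistic regression with many covariates, and would then fit a weighted outcome model).

```python
from collections import defaultdict


def per_protocol_weights(histories):
    """Unstabilized inverse probability weights for adherent person-time.

    histories maps a person id to a list of (covariate_level, adhered)
    tuples, one per follow-up interval (a hypothetical data layout).
    Returns {(person_id, interval_index): weight} for adherent intervals only.
    """
    at_risk = defaultdict(int)   # intervals entered while still adherent
    adherent = defaultdict(int)  # of those, intervals where adherence continued
    for intervals in histories.values():
        for level, adhered in intervals:
            at_risk[level] += 1
            if not adhered:
                break  # later person-time is censored for this person
            adherent[level] += 1

    weights = {}
    for person_id, intervals in histories.items():
        weight = 1.0
        for t, (level, adhered) in enumerate(intervals):
            if not adhered:
                break  # censor at first non-adherence
            # Multiply by 1 / P(adherent | covariate level), estimated by strata.
            weight *= at_risk[level] / adherent[level]
            weights[(person_id, t)] = weight
    return weights


# Toy data, invented for illustration: covariate level 1 predicts stopping.
histories = {
    "p1": [(0, 1), (0, 1)],
    "p2": [(0, 1), (1, 0)],
    "p3": [(1, 1), (1, 1)],
    "p4": [(1, 0), (0, 1)],
}
weights = per_protocol_weights(histories)
```

In the toy data, people with covariate level 1 stop adhering half the time, so the one person who keeps adhering through level 1 gets weighted up to stand in for those who dropped out. Without the post-randomization covariate there would be nothing to build those weights from.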
So that’s it! Let’s recap
(1) Adherence is a time-varying exposure.
(2) Simple methods don’t work because of time-varying confounding affected by prior treatment (treatment-confounder feedback).
(3) But there are methods that do! If you have enough post-randomization data.