Stats and clinical may not see eye to eye over whether to adjust for a post-baseline covariate. A typical example is the analysis of hypoglycemic episodes (hypos) in diabetes, where clinical folk are quick to suggest an adjustment for post-baseline hba1c. The reasoning is straightforward: hypos and hba1c are related (see figure), and there are fewer hypos if the patient doesn’t reach the hba1c target; thus we want to know “if the same hba1c is achieved, then what is the likelihood of hypos?”. Or the reasoning may be: “hypos after XX weeks are probably not dependent on baseline hba1c”, which leads to a pre-specified analysis that uses hba1c at baseline for hypo episodes before visit XX and hba1c at visit XX for hypos after visit XX. Bear in mind, this is often a primary analysis.

The problem is, this reasoning is intuitive and cogent, and countering statistical arguments feel too esoteric to persuade anyone on the other side (ref). However, the EMA guideline on adjustment for baseline covariates seems definitive: “When a covariate is affected by the treatment … the adjustment may hide or exaggerate the treatment effect. It therefore makes the treatment effect difficult to interpret” [ref]. Likewise, in Statistical Issues in Drug Development, Senn says “adjusting for such baselines… can be extremely misleading” [ref].

Thus: 1) is it reasonable to adjust for post-baseline covariates? 2) if not, how to persuade clinical colleagues? Does it suffice for the statistician to simply emphasise that “you are estimating something different” when adjusting for post-baseline hba1c?

I hope you get some replies to this excellent question. Interpretation of effects adjusted for time-dependent covariates is usually difficult. I often advocate landmark analysis where you have a second qualification period in which all the original baseline variables are measured again, then adjusting for dual baselines. For example, adjusting for original HbA1c and 2nd measured HbA1c also captures the effect of the change in HbA1c but is more general and better fitting than that.
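To spell out why the dual-baseline adjustment contains the change score as a special case, here is a small algebraic sketch. The symbols are assumptions introduced for illustration: X_0 is the original HbA1c, X_1 the second measurement, and beta_1, beta_2 their regression coefficients in a linear predictor.

```latex
\beta_1 X_0 + \beta_2 X_1 \;=\; (\beta_1 + \beta_2)\, X_0 \;+\; \beta_2\,(X_1 - X_0)
```

A model that adjusts only for the change X_1 - X_0 corresponds to the restriction beta_1 + beta_2 = 0, whereas the dual-baseline model leaves both coefficients free.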

Another approach is to use a joint modeling analysis in which the “covariate” is treated as an outcome, for example using joint models for longitudinal and survival data. In such models treatment enters both submodels as a covariate, and you can investigate how it affects the primary outcome as the time-varying/longitudinal covariate changes over time.

With regard to the EMA document, a covariate affected by treatment has potential to be a mediator and thus comes with all the issues of adjusting for mediators. Most clinicians should be able to digest the problems of adjusting for mediators.

With regard to time-dependent covariates, some analogies can be drawn between your example and the standard HIV examples from which Marginal Structural Models were developed. So it is possible to adjust for them appropriately.

With regard to your example of HbA1c, does it not depend on when the post-baseline test was done? Would a 5-day-post-treatment HbA1c be likely to differ from the pre-treatment value, given that HbA1c is meant to “reflect the preceding 120 days”?

Adjusting for post-baseline covariates can give you a bad estimate of the treatment effect, when those covariates are affected by prior treatment. This can be because you are blocking part of your treatment effect, and/or because you are opening a collider (i.e. creating selection bias) due to an unmeasured cause of the post-baseline variable that is also prognostic for the outcome. There are methods that can deal with this (g-methods like IPW and g-formula), but in the example you give the problem starts a bit earlier.
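A toy simulation may make the two mechanisms concrete. This is only a sketch under made-up assumptions: the linear structural equations, their coefficients, and the variable names (`t` for randomized treatment, `u` for an unmeasured cause, `m` for the post-baseline covariate, `y` for the outcome) are all hypothetical and not estimated from any diabetes data.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Simulate a randomized trial with a treatment-affected covariate.

    Hypothetical structure (all coefficients are illustrative assumptions):
      t -> m        treatment lowers the post-baseline covariate
      u -> m, u -> y  an unmeasured cause affects both covariate and outcome
      t -> y, m -> y  treatment acts on the outcome directly and through m
    """
    rng = np.random.default_rng(seed)
    t = rng.integers(0, 2, size=n).astype(float)  # randomized 1:1
    u = rng.normal(size=n)                        # unmeasured
    m = -1.0 * t + u + rng.normal(size=n)
    y = 0.5 * t - 0.5 * m + 2.0 * u + rng.normal(size=n)
    return t, m, y


def ols(y, *cols):
    """Ordinary least squares with an intercept; returns the coefficients."""
    X = np.column_stack([np.ones_like(y), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


t, m, y = simulate()
unadjusted = ols(y, t)[1]    # coefficient on treatment, no adjustment
adjusted = ols(y, t, m)[1]   # coefficient on treatment, adjusting for m

print(f"unadjusted treatment coefficient: {unadjusted:.2f}")
print(f"adjusted for post-baseline m:     {adjusted:.2f}")
```

Under these assumed equations the total effect of treatment is 0.5 + (-1.0)(-0.5) = 1.0 and the direct effect is 0.5. The unadjusted comparison should land near 1.0, while the coefficient adjusted for `m` should land near 1.5, which is neither the total nor the direct effect: conditioning on `m` blocks the mediated path and also opens the path through `u`.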

What is the actual causal question of interest? If you could design a trial to answer this question what would it look like?

From my read, there are two possible questions: what is the effect of a treatment on the number of hypoglycemic episodes among diabetics over some time period? Or, what is the joint effect of a combined intervention on treatment & on hba1c (fixing it to some value) on the number of hypoglycemic episodes?
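In potential-outcome notation the two questions could be written roughly as below. The notation is an assumption added for illustration: Y is the number of hypoglycemic episodes over the period, a indexes treatment, and m is a value at which hba1c would be fixed.

```latex
\text{(1)}\quad E\!\left[Y^{a=1}\right] - E\!\left[Y^{a=0}\right]
\qquad\qquad
\text{(2)}\quad E\!\left[Y^{a=1,\,m}\right] - E\!\left[Y^{a=0,\,m}\right]
```

Contrast (1) compares treatment arms and leaves hba1c to take whatever value it takes under each treatment; contrast (2) presupposes a joint intervention that sets both treatment and hba1c.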

The first question does not require adjusting for post-baseline hba1c. If you were doing a trial, you wouldn’t adjust for hba1c when comparing the two trial arms, and for the same reason you shouldn’t here. (Although, a properly conducted per-protocol effect could adjust for it if hba1c was associated with adherence – then you should be using g-methods).

The second question does require adjusting for hba1c but in a way that’s probably not possible: if you want to know the effect of treatment when hba1c is fixed at some value, you have to know something about what intervening on hba1c means, and you have to know which value is important. This question is not well-defined, unless interventions on hba1c directly are well-defined. Well-defined interventions are a requirement for causal inference, so you can’t get a causal answer to this question.

(The caveat to all of this is that I’m not a subject matter expert in diabetes so maybe there are ways to directly intervene on hba1c that I don’t know about – in that case, design the target trial, and emulate it as closely as possible from your data using g-methods to avoid inducing selection/collider bias).

thanks for your response. it prompted me to do some reading:

-it seems that, with modern insulins, the link between a1c and hypos has waned: [1, 2, 3]

-although the therapeutics initiative has said it’s difficult to interpret the data “because patients are often prescribed 2 or more glucose lowering medications at a time” [4]

-i think hypos should be considered an outcome alongside a1c; some are using a composite including a1c and hypos and weight gain [5]

-i feel a joint modelling approach is the way forward, a paper from this month models a1c and cv events [6]

-the ema guideline is focused on a1c as the primary outcome, and much of the remaining literature is devoted to safety studies of cv events in diabetes

To pick up on @bobmcpop’s last point - HbA1c is interpreted clinically as a proxy measure for average plasma glucose over the past 120 days. Thus - HbA1c is itself an integral over time. As @bobmcpop points out - HbA1c might therefore reflect a time period both before and after the treatment was initiated - meaning that adjusting for a post-baseline HbA1c taken less than 120 days after treatment began would be extremely hard to interpret. Going further - if you take an HbA1c a year after treatment began, it then represents a rolling average over a time period that began (365 - 120) = 245 days into treatment. Does it make sense to adjust for that? Perhaps, if the most recent 120 days are of interest for some reason.
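The window arithmetic can be sketched in a few lines. This assumes, purely as a simplification, that HbA1c weights each of the preceding 120 days equally; the function names are hypothetical.

```python
def pre_treatment_fraction(days_since_start, window=120):
    """Fraction of the look-back window that falls before treatment began.

    Assumes (as a simplification) that every day in the window
    contributes equally to the measured HbA1c.
    """
    if days_since_start < 0:
        raise ValueError("measurement must be taken after treatment start")
    pre_days = max(window - days_since_start, 0)
    return pre_days / window


def window_start_day(days_since_start, window=120):
    """Day (relative to treatment start) on which the look-back window opens.

    A negative value means the window reaches back before treatment began.
    """
    return days_since_start - window


for day in (5, 60, 120, 365):
    print(day, window_start_day(day), round(pre_treatment_fraction(day), 3))
```

On this simplified reading, a measurement 5 days after treatment start is almost entirely pre-treatment, and only from day 120 onwards does the window sit wholly within the treatment period.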

A hypo is defined as an abnormally low blood glucose level, and HbA1c is considered to represent average plasma glucose over time - so it makes some sense that lower HbA1c is associated with a higher risk of hypos - assuming that only the mean, and not the variability, of plasma glucose is changed by treatment. Since modern insulin regimes are designed to improve hour-by-hour control of plasma glucose - i.e. decrease variability - it also makes sense, as @PaulBrownPhD points out, that the link between HbA1c and hypos has weakened.

Therefore - if you have multiple longitudinal measurements over follow up time, I think the joint models are a sensible approach. However - would it not make sense to directly model plasma glucose as the longitudinal component rather than HbA1c, since HbA1c represents the integral of plasma glucose over the last 120 days? @drizopoulos I’d be curious as to your thoughts on that?

(Caveat: also not a subject matter expert in diabetes)

Indeed, with joint models you have the flexibility to define the functional form that relates the history of the HbA1c to the hazard of the event of interest. For example, you could specify that the (weighted) integral of the longitudinal HbA1c profile of each patient is related to the hazard of the event. Moreover, with joint models it turns out that the (overall) treatment effect is time-varying, because the longitudinal outcome (HbA1c here) varies over time.
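One way to write such a weighted-integral functional form is sketched below. The notation is an assumption for illustration: h_0 is the baseline hazard, w_i the baseline covariates (including treatment) with coefficients gamma, m_i(s) the subject-specific HbA1c trajectory from the longitudinal submodel, varpi a weight function that can place more weight on recent values, and alpha the association parameter.

```latex
h_i(t) \;=\; h_0(t)\,
\exp\!\left\{ \gamma^{\top} w_i \;+\; \alpha \int_0^{t} \varpi(t - s)\, m_i(s)\, \mathrm{d}s \right\}
```

Setting the weight function to 1 gives the unweighted cumulative (area under the trajectory) form; because treatment also enters the model for m_i(s), its overall effect on the hazard changes with t.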

@f2harrell I hope this is not a naïve question but wouldn’t multicollinearity be potentially an issue when adjusting for two measurements of HbA1c of the same person?

Not really. You’ll be able to learn about the relative importance of both measures, and are sure to find the earlier measurement should be downweighted. And overall prediction is never harmed by even extreme correlation among predictors.
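A quick simulation illustrates the point about prediction. It is only a sketch with invented numbers: two measurements correlated at 0.95, hypothetical true coefficients of 0.5 (earlier) and 1.0 (later), and a continuous outcome fitted by ordinary least squares.

```python
import numpy as np


def make_data(n, rng, rho=0.95):
    """Two highly correlated measurements and an outcome depending on both.

    All coefficients and the correlation are illustrative assumptions.
    """
    x0 = rng.normal(size=n)                                       # earlier
    x1 = rho * x0 + np.sqrt(1 - rho**2) * rng.normal(size=n)      # later
    y = 0.5 * x0 + 1.0 * x1 + rng.normal(scale=0.5, size=n)
    return x0, x1, y


def fit(y, *cols):
    """Ordinary least squares with an intercept."""
    X = np.column_stack([np.ones_like(y), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def predict(beta, *cols):
    """Predictions from coefficients returned by fit()."""
    X = np.column_stack([np.ones_like(cols[0]), *cols])
    return X @ beta


rng = np.random.default_rng(1)
x0_tr, x1_tr, y_tr = make_data(5000, rng)   # training sample
x0_te, x1_te, y_te = make_data(5000, rng)   # independent test sample

beta_both = fit(y_tr, x0_tr, x1_tr)         # both measurements
beta_later = fit(y_tr, x1_tr)               # later measurement only

mse_both = np.mean((y_te - predict(beta_both, x0_te, x1_te)) ** 2)
mse_later_only = np.mean((y_te - predict(beta_later, x1_te)) ** 2)

print("coefficients (intercept, earlier, later):", np.round(beta_both, 2))
print(f"test MSE, both measurements: {mse_both:.3f}")
print(f"test MSE, later only:        {mse_later_only:.3f}")
```

In this setup the model with both measurements predicts at least as well out of sample as the model with the later measurement alone, despite the strong correlation, and the fitted coefficients show the earlier measurement receiving the smaller weight. What the correlation does inflate is the uncertainty of the individual coefficients, not the accuracy of the overall prediction.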