RCT with missing follow-up outcomes: ANCOVA + MI vs Mixed-Effects Modelling?

I haven’t had much luck getting answers to my question on Stack Exchange, so I’m posting it here. To avoid cross-posting, I am willing to remove my question on Stack Exchange.

In a 2-group RCT with one baseline and 2 follow-up assessment time-points, assume that
(i) follow-up outcomes are missing at random
(ii) relatively strong auxiliary variables are available

My question is: Which is a better modelling approach?
(i) mixed-effects modelling (which treats baseline outcome as a covariate)
(ii) ANCOVA after multiple imputation at each follow-up timepoint

I have understood that method i (mixed-effects modelling) is generally sufficient without the need for imputation. But just how strong should the auxiliary variable(s) be before method ii becomes a competitive approach?

Any guidance on this would be greatly appreciated.


Welcome to datamethods, Yong Hao.

The literature has concluded that avoiding imputation is best, in the context of full likelihood models (mixed models, GLS, Markov models, others). I would never use (ii) when Y is missing at random.



In addition to Frank’s reply above, you might find a recent thread in this forum of interest:

Linear mixed model for 3 time points

There is a fair amount of discussion there regarding mixed effects models, specifically with two post-baseline timepoints.


Dear Prof Harrell,
Thank you for making me feel welcome and for generously sharing your insights, as always! I have understood that likelihood-based analysis is often recommended as the primary analysis in randomized studies. However, as MI uses information in auxiliary variables to reduce bias and improve precision, my lingering concern relates to possible scenarios in which MI may be preferred over mixed-effects modelling. To assuage my concerns, I did a quick literature search and found the paper by Kontopantelis et al., which shows that including a moderately correlated outcome in the imputation model only marginally improves the performance of MI.

Dear Marc,
Thank you for sharing the link and your code!

Just want to add my thanks to the growing number of people who have heaped praise on Datamethods. Datamethods is truly a treasure trove of information, with many helpful experts contributing their time and expertise.


You raise a good question. I think that if there is a surrogate outcome (or secondary outcome) that is not part of the main analysis and that is (1) correlated with the main outcome and (2) displays the same treatment effect then I could see MI gaining power over “use all available main outcome data” analysis.

Which approach is even an option depends on your estimand. A mixed model for repeated measures can only target a limited number of estimands (e.g. the hypothetical “as if everyone had completed the treatment assigned at randomisation”), while with MI you are a lot more flexible. In some cases you may wish to use a joint model for multiple outcomes (e.g. this paper I wrote with colleagues a while ago has a nice example where it’s very important to do that, or to do a joint MI: https://doi.org/10.1002/pst.1705)

One worry about the joint model is that it does not provide marginal treatment effects (the treatment effect on one endpoint ignoring the other endpoints). I think the treatment may even appear to be weak on the two endpoints jointly but strong on each one marginally.

Dear Marc,

Thank you once again for alerting me to this helpful post. If I have understood what you have written, your default lme model specification is

follow-up y ~ group*y0 + time*treat + (1 | id)

whilst Jorge’s M4 (which you seem to endorse) is

follow-up y ~ time*y0 + time*treat + (1 | id)

In M4, an interaction between baseline outcome and time is specified presumably because the associations between baseline and follow-up outcomes wane over time.

Could you kindly clarify on the rationale for including group*y0? Are we assuming that the between-group differences vary as a function of baseline outcome?

Thank you in advance for your guidance.

Hi puayonghao,

I am scratching my head a bit on going back to review that thread from several months ago, and the discussion on that particular point. Jorge had asked about a preference between M3 and M4 in his original post, and it is possible that I mis-read his M4 as being:

M4: y ~ treat * y0 + time * treat + (1 | id)

where I mis-read the first interaction term as using ‘treat’ (treat * y0) instead of ‘time’ (time * y0), where the former is consistent with the model that I use by default.

That being said, yes, the use of “Group * T1” in my formula, which expands to “Group + T1 + Group:T1”, is to allow for the presence of an interaction between the treatment group and the baseline measurement.

That is, the slope of the change over time for each treatment group is different, predicated upon the baseline value, rather than presuming a fixed marginal treatment effect over time, where the lines are parallel to each other. Those lines may even intersect over the range of baseline values, thus reversing the direction of the treatment effect at one end of the range versus the other.


Dear Marc,

Thank you for the detailed clarification! It seems sensible not to assume a fixed treatment effect across the different baseline values, and I am wondering if specifying this interaction is akin to assessing differential treatment effects? If so, does one run into statistical power problems?

Hi puayonghao,

Any time you add more covariate degrees of freedom to the model (via additional covariates, interaction terms, regression splines, more complicated random effects, etc.), there will be an impact on effective power and, therefore, sample size requirements, all else being equal.

If you are applying the model on a post hoc basis to an existing cohort, you may not have a sufficient sample size (power) to assess more complicated models, and there are various issues and limitations to consider there, since you may be effectively overfitting the model.

If you are designing a new, prospective study, and you want to conduct power/sample size assessments where you are perhaps using a mixed effects model as your primary outcome analysis method, and you are using lme4 based models in R, one possible option is the “simr” CRAN package, which provides for Monte Carlo simulations within the lme4 based model framework.
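A minimal sketch of a simr power simulation, under assumptions: a pilot or simulated long-format data frame `pilot_dat` with columns `y`, `time`, `treat`, `y0`, `id` (all hypothetical names), and a treatment coefficient name that depends on your factor coding:

```r
library(lme4)
library(simr)

# Fit the planned analysis model to pilot (or simulated) data.
fit <- lmer(y ~ time * treat + y0 + (1 | id), data = pilot_dat)

# Optionally set the treatment effect to the minimal clinically relevant
# difference before simulating; the coefficient name ("treatactive" here)
# depends on how your 'treat' factor is coded.
fixef(fit)["treatactive"] <- 0.5

# Monte Carlo power for the treatment fixed effect.
powerSim(fit, test = fixed("treat"), nsim = 200)

# extend() enlarges the design (here, to 300 subjects) to explore
# power at larger sample sizes.
fit_larger <- extend(fit, along = "id", n = 300)
powerSim(fit_larger, test = fixed("treat"), nsim = 200)
```

With `nsim` in the thousands the power estimates stabilize, at the cost of run time.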

In terms of differential treatment effects, the use of interaction terms can be part of the process to assess those. There are numerous papers and guideline documents on the subject, and I might point you to a relatively recent publication, where the first several references therein are also good additional resources to review.


Thank you for the very helpful clarification and for pointing me to useful resources!

Dear Marc, two questions, if possible.

  1. Is there any problem in using the code y ~ time*y0 + time + time:treat + (1 | id)
    instead of
    y ~ time*y0 + time*treat + (1 | id)
    for a more convenient (to me) interpretation of the coefficients?

In this case, y includes two time points (y1 and y2)

The model fit is the same, but can that model backfire somehow?

  2. In this message, did you mean “latter” instead of “former” as well?

That is, time*y0, not group*y0, is your default?

Thanks once again!

To me it’s a little better to use * so you ensure that lower-order effects are never omitted when there are higher-order effects. I would use y ~ time * (y0 + treat) + (1 | id) if using a random effects model.
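This specification could be fit with lme4 along these lines; a minimal sketch, assuming a hypothetical long-format data frame `dat` with one row per patient per post-baseline visit and columns `y`, `time`, `y0`, `treat`, `id`:

```r
library(lme4)

# y     : follow-up outcome at each post-baseline visit
# time  : visit (factor), y0 : baseline outcome, treat : randomized group
# id    : patient identifier for the random intercept
fit <- lmer(y ~ time * (y0 + treat) + (1 | id), data = dat)

# Fixed effects include time, y0, treat, time:y0, and time:treat.
summary(fit)
```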


Hi @JorgeTeixeira,

On your first query, the formula that you have above does not include a main effect for ‘treat’, that is a " … + treat + … " term, which may have been a typo.

To Frank’s point, under the principle of hierarchy, if you are including an interaction term in the model, you should also include the main effects for the terms involved in the interaction.

Thus, Frank’s formula, y ~ time * (y0 + treat) + (1 | id), expands to:

time + y0 + treat + time:y0 + time:treat

taking advantage of R’s simplified syntax for model formulae.

Note that Frank’s formula does not include an interaction between y0 and treat, which I would include.

If you wanted to include all main effects and their second order interactions, you could write the formula as:

y ~ (time + y0 + treat) ^ 2 + (1 | id)

which would expand to:

y ~ time + y0 + treat + time:y0 + time:treat + y0:treat + (1 | id)

I would honestly be less concerned with the easier interpretation of the coefficients in this setting, and more focused on having the correct model specification, given the study design, sufficiency of data, and any other considerations.

One can then generate predictions and relevant contrasts from the model to assess treatment and time effects, and other relevant parameters. In R, for example, I use Russ Lenth’s ‘emmeans’ package to generate various relevant contrasts and simultaneous confidence intervals, with corrections for multiple testing as apropos.

On your second question, given the time since the original discussion on your post last June, and some of the confusion since then, it may be easier to just re-post the initial formula that I use in this setting, with the proviso that this is used in the setting where you have a baseline measurement, and there are at least two post-baseline time points and measurements. I am modifying it here for consistency with the lme4/lmer syntax used above, as opposed to the earlier lme syntax that I originally posted:

y ~ treat + time + treat:time + y0 + treat:y0 + (1 | id)

So I do have the “group * y0” interaction term, but do not have a “time * y0” interaction term. However, as I noted above, you could include both, and assess as apropos, if that makes sense.

The above, using “*”, would simplify to:

y ~ treat * time + y0 * treat + (1 | id)

Marc, thanks once again!


The removal of “group” was not a typo. It was based on what Solomon Kurz calls the multilevel ANCOVA.

I like the fact that its summary() gives the estimates for all the time points.

But I do agree that might not be the best reason for model selection. Might it have further issues too?


So, if one is using

m1 <- lmer(y ~ treat * time + y0 * treat + (1 | id), data = dat)

and there is a baseline (“0month”) and two follow-up (“3month”, “4month”) measurements, would you use something like this with emmeans to obtain the MD at 3 and 4 months?

contrasts_m1 <- emmeans(m1, specs = ~ treat:time, at = list(time = c("3month", "4month")))

pairwise_contrasts_m1 <- pairs(contrasts_m1)


Thank you.

For both models, one could get an estimate for any time point by using predict().


Note that the incantation that you use above will give you all possible pairwise comparisons, both within and between groups, and will also by default, use the Tukey HSD method to adjust for the multiple comparisons.

One additional note, which is that, as per prior discussions, the baseline observation should not be part of the data set that you use for the initial model. So if you have two post-baseline follow-ups, as in your example, you should only have two records for each patient in the dataset for the model, not three.

In that case, you do not need to use the ‘at’ argument to emmeans(), since it will use each of the two post baseline time points by default.

For obtaining the model estimated means at each post baseline time point, just using emmeans() will get you that as in the first line of code you have above. So the object ‘contrasts_m1’ above will contain those along with their confidence intervals.
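Putting the pieces together, here is an illustrative sketch on simulated data (all variable names, effect sizes, and time labels are hypothetical), using only the two post-baseline records per patient and conditioning the contrasts on time so that only the within-time between-group comparisons are produced:

```r
library(lme4)
library(emmeans)

# Simulate long-format data: two post-baseline visits per patient.
set.seed(1)
n   <- 100
dat <- expand.grid(id = factor(1:n), time = factor(c("3month", "4month")))
dat$treat <- factor(rep(rbinom(n, 1, 0.5), times = 2),
                    labels = c("control", "active"))
dat$y0 <- rep(rnorm(n), times = 2)                 # baseline covariate
dat$y  <- 0.5 * dat$y0 + 0.3 * (dat$treat == "active") +
          rep(rnorm(n, sd = 0.5), times = 2) +     # patient random effect
          rnorm(2 * n, sd = 0.5)                   # residual error

fit <- lmer(y ~ treat * time + y0 * treat + (1 | id), data = dat)

# Model-estimated means for each group at each post-baseline time point;
# no 'at' argument needed, since only post-baseline times are in the data.
emm <- emmeans(fit, ~ treat | time)

# Between-group differences at 3 and 4 months; conditioning on time means
# there is only one comparison per time point, so no Tukey HSD is invoked.
pairs(emm)
```

Using `~ treat | time` rather than `~ treat:time` is one way to avoid the full set of pairwise comparisons (and the default Tukey adjustment) that Marc describes above.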


Thank you for sharing the interesting blog post by Kurz! I’m asking the following question for my own learning and understanding: In an RCT with one baseline and one follow-up assessment, omitting the main group effect rather cleverly forces both treatment groups to share the same mean baseline value. Will this method work for an RCT with 2 or more follow-up timepoints?

Thanks Marc.

In this case, baseline was only used as a covariate. So, all good, right?

It was not used as an outcome value… which is what I presume you mean by “the initial model”.

Sorry if I have missed an early post about this.

Best wishes