Power Calculations in Longitudinal Mixed Effects - from two measurements to three measurements

Hi everyone. In recent weeks, I’ve been going down the rabbit hole of power calculations for longitudinal mixed-effects models.

One: PASS software states that, for both the GEE and mixed-effects options, the power remains the same as we move from two measurements to three measurements.

Two: the same results are yielded by the software WebPower (Statistical Power Analysis and Sample Size Planning) for the linear mixed-effects model.

Three: it appears that many of these calculations are based on the book Sample Size Calculations for Clustered and Longitudinal Outcomes in Clinical Research. With help from LLMs, I was able to reproduce their Table 5.3.


Then I extended the code to two, three, and four measurements. In that simulation, the power did increase slightly, but the impact was almost nil.
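For what it’s worth, there is an analytic way to see why this can happen under a random-intercept (compound-symmetry) model when the extra measurements are squeezed into a fixed follow-up window: the variance of the within-subject slope estimator is proportional to sigma^2 (1 - rho) / sum_j (t_j - t_bar)^2, and for equally spaced times on a fixed [0, 1] interval the time spread sum_j (t_j - t_bar)^2 barely moves as the number of measurements grows. A minimal sketch (my own illustration, not the book’s code; it assumes the treatment effect is a difference in slopes):

```python
import numpy as np

def slope_var_factor(m):
    """Relative variance of the within-subject slope estimator under a
    random-intercept (compound-symmetry) model: proportional to
    1 / sum_j (t_j - t_bar)^2 for m equally spaced times on [0, 1]."""
    t = np.linspace(0.0, 1.0, m)
    return 1.0 / np.sum((t - t.mean()) ** 2)

for m in (2, 3, 4):
    print(f"m = {m}: variance factor = {slope_var_factor(m):.3f}")
```

Going from two to three equally spaced measurements leaves the factor at exactly 2.0, and four measurements only brings it down to 1.8. Under this parameterization, widening the follow-up window (increasing the spread of the measurement times) does far more for power than adding interior time points.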

So my questions are:

  1.  Are these references and this code correct? I find it extremely counterintuitive and surprising that adding a third measurement has little to no impact on power. I was expecting a 10 to 20% decrease in the sample size, depending on the circumstances.

  2.  On that note, does anyone have a verified R script that correctly calculates power for longitudinal designs in these scenarios?

I have attached some reports. Thank you

Rplot02.pdf (34.6 KB)

power_simulation_results_wide.pdf (236.3 KB)

Report6 gee.pdf (313.7 KB)

Side note: random effects are not the most natural way to model longitudinal data, and when you use random intercepts, adding more than around 7 repeats within subject adds no statistical information. That is partly because a random intercept implies a compound-symmetric correlation structure, which means that time is not treated as directional. I learned this from here. The bottom line is that random effects are unlikely to fit the true correlation structure.
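To make the compound-symmetry point concrete: a random intercept with between-subject variance sigma_b^2 on top of residual variance sigma_e^2 forces every pair of time points to share the same correlation, ICC = sigma_b^2 / (sigma_b^2 + sigma_e^2), regardless of how far apart they are. A quick sketch with made-up variance components:

```python
import numpy as np

sigma_b2, sigma_e2 = 4.0, 6.0           # illustrative variance components
icc = sigma_b2 / (sigma_b2 + sigma_e2)  # intraclass correlation = 0.4

m = 4                                   # number of repeated measurements
R = np.full((m, m), icc)                # same correlation for every pair
np.fill_diagonal(R, 1.0)
print(R)  # lag-1 and lag-3 correlations are identical: time is not directional
```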


@Frank,

An FYI: while the older lme() function in the nlme package by Doug et al. has long supported AR(1) correlation structures with a continuous response, that capability had been lacking in lme4 until just last month, which was a long-standing and frustrating issue. The default in lme4, I believe, has been an unstructured correlation matrix.

Now, with version 2.x of lme4, released early last month, Ben et al. have added new features that make it easier to specify alternative correlation structures. That brings lme4 in line with glmmTMB, which has been an alternative for many of the applications that required these and other features.

The combination of these mixed-effects model options with the emmeans package by Russ Lenth offers a great deal of flexibility in modeling longitudinal data and generating relevant contrasts.

More info from Ben here on the recent lme4 updates:

Regards,

Marc


That’s excellent, Marc - a nice addition to lme4. Adding random effects to AR(1) will help meet correlation structure assumptions, and having AR(1) in the model will make the random effects smaller, which adds stability. In a similar vein, I’ve seen one example where there was a lack of fit of a Markov-1 ordinal model until random intercepts were added.

When there are random effects, it is more natural to use Bayesian models, which handle them better than trying to approximate marginal sampling distributions.


Thank you, Frank, once again. Please allow me to offer some pushback and ask a few further questions.

First, please note that I am not referring to scenarios with over seven measurements, but rather specifically the transition from two to three.

Additionally, this does not apply exclusively to random effects but also to GEE.

The previously mentioned software also indicates that when moving from compound symmetry to AR(1), the power actually decreases.
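That direction agrees with a quick generalized-least-squares check: holding the lag-1 correlation fixed, AR(1) makes distant time points less correlated than compound symmetry, which inflates the variance of the estimated slope (and hence reduces power). A rough numerical illustration of my own, with three equally spaced times and an arbitrary rho = 0.5:

```python
import numpy as np

t = np.array([0.0, 1.0, 2.0])
X = np.column_stack([np.ones_like(t), t])      # design: intercept + time
rho = 0.5                                      # illustrative correlation

V_cs = np.where(np.eye(3) == 1.0, 1.0, rho)    # compound symmetry
V_ar1 = rho ** np.abs(np.subtract.outer(t, t)) # AR(1): rho^|ti - tj|

def slope_var(V):
    """GLS variance of the time slope: entry [1, 1] of (X' V^-1 X)^-1."""
    return np.linalg.inv(X.T @ np.linalg.solve(V, X))[1, 1]

print(slope_var(V_cs), slope_var(V_ar1))  # AR(1) gives the larger variance
```

With these numbers the slope variance is 0.25 under compound symmetry and 0.375 under AR(1): the same data layout carries less information about the slope when the correlation decays with the time gap.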

Regarding questions involving two, three, or four follow-up measurements: if I understood correctly, your preference would be for GLS or Markov models?

In those models, do you have to manually specify the correlation matrix during analysis, or do the models handle that automatically in the background?

Also, out of curiosity, do you believe GEE is better than linear mixed-effects models for parallel RCTs?

I think it is important to emphasize that in my simulation, I used time as a continuous variable with only a random intercept and no random slope. The calculations were based only on p

I have attached a PASS report showing an increase in power with more than seven measurements. To reiterate, the primary issue I’m highlighting is the transition from two to three measurements.

Thank you, Marc. Do you think any particular structure is superior for an RCT with two, three, or four follow-up measurements? This would be helpful for pre-registration in such cases.

I actually previously assumed that LME used compound symmetry in the background!

I don’t think that is possible without a significant lack of fit of AR(1).

Yes, and if you want to hedge your bets regarding goodness of fit of the correlation structure, use Markov models with random intercepts.

For Markov models you specify how the current observation depends on past observations from the same subject. For generalized least squares you specify the correlation structure and get maximum likelihood estimates of the parameters of that structure (one parameter for AR(1)).
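To illustrate the "one parameter" point: under AR(1) (or its continuous-time analogue for unequal spacing), the entire working correlation matrix is generated from a single rho via rho^|ti - tj|. A sketch with hypothetical, unevenly spaced visit times:

```python
import numpy as np

rho = 0.7                              # the single AR(1) parameter (illustrative)
t = np.array([0.0, 1.0, 3.0, 6.0])     # hypothetical visit times in months
R = rho ** np.abs(np.subtract.outer(t, t))  # rho^|ti - tj|
print(np.round(R, 3))                  # correlation decays with the time gap
```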

No. GEE can be inaccurate in estimating regression coefficients when it assumes a nonsensical working-independence model. And GEE requires that dropouts and missing data be missing completely at random; full modeling methods require only the weaker missing-at-random assumption.


@JorgeTeixeira when analyzing data with a baseline (pre-treatment) measurement and multiple post-treatment time points, I prefer to use an AR(1) structure as a starting point. It makes sense logically/clinically, and it is why the long-standing limitation in lme4 was such an issue.

If, for some reason, your actual data do not meet those assumptions, you can then explore alternatives, but if you need to pre-specify these details in an SAP, AR(1) is what I would use.

With respect to the older lme() function in the nlme package, which is part of Base R + Recommended packages, the description of the “correlation” argument in the documentation is as follows:

an optional corStruct object describing the within-group correlation structure. See the documentation of corClasses for a description of the available corStruct classes. Defaults to NULL, corresponding to no within-group correlations.

Apparently, over time, the last sentence regarding the default has led to confusion, with some interpretations being unstructured and others being independence.

Here is what would reasonably be considered the definitive reference, from 1998, by Jose and Doug:

https://www.stat.cmu.edu/~brian/720-2007-source/week07-08-ideas/pinheiro98mixedeffects-Sguide.pdf

on the bottom of page 19 is the following:

The optional argument correlation is used to specify a correlation structure and the optional argument weights is used for variance functions. By default, the within-group errors are assumed to be independent and homoscedastic.

So that might help to mitigate some of the confusion. It is also why I have never used the default and prefer to explicitly define these details.

1 Like