@f2harrell could you clarify what was not a good idea? Unless I’m mistaken, your comment was referring to a model which includes the baseline measurement as an outcome. The text shared by @FurlanLeo showed the baseline adjusted model for 2 post-treatment outcome measurements, which appears to be the method you prefer.
Yes, I always use baseline as baseline. So @FurlanLeo had 2 follow-ups, making the correlation assumptions less of a leap, but continuous time should still be used if the times vary much.
Regarding “I am finding that the baseline needs to be nonlinearly modeled (using e.g. a restricted cubic spline) for these types of outcomes because some patients can start off at extremely bad levels and get major benefits from treatments.”
Do you interpret the rcs(baseline) variable here, or no interpretation needed and it’s simply being adjusted for in the model nonlinearly because it’s better than the linear form?
You can choose either to interpret it or not interpret it. You can estimate anything you want from this model, including the expected change from baseline over a grid of possible baseline values.
RE: “You can choose either to interpret it or not interpret it. You can estimate anything you want from this model, including the expected change from baseline over a grid of possible baseline values.”
Below are some findings comparing rcs(baseline) vs. the raw baseline values using an ANCOVA model for our seizure dataset:
- The raw baseline values appear to be a much stronger predictor than rcs(baseline)
- raw baseline values: p<.00001, with a fairly small estimate
- rcs(baseline): none of the segments are statistically significant, though the estimates for each segment are about 100 times larger than that of the linear form
- visually inspecting the baseline values plotted against the outcome (follow-up values) adjusting for treatment group, it appears okay to use the raw baseline values
- The choice of either using the linear or nonlinear form of baseline didn’t seem to affect the estimates of the treatment groups compared to placebo.
- similar estimates for the treatment-to-placebo comparisons and p-values for the 2 models
- similar adjusted r squared for the 2 models
- Residual plots for both appear to be similar. However, both models showed some issues for the two tails in the residual plots. ANCOVA likely is not the best approach if one can explore ordinal longitudinal approaches.
In this case, would you just use the linear form of baseline if one has to use the ANCOVA model? Is the decision based on p-values and/or what makes sense visually? Thank you.
Fundamental problem: you can’t interpret individual spline terms. Instead, compute AIC for the linear model and AIC for the spline model, or just look at the chunk \chi^2 test that combines all of the baseline terms.
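To make the comparison concrete, here is a minimal base-R sketch with simulated (hypothetical) data; `splines::ns` stands in for `rms::rcs` so it runs without the rms package, and the nested-model `anova` plays the role of the chunk test of all nonlinear baseline terms:

```r
# Simulated data where the outcome is truly nonlinear in baseline
set.seed(1)
n    <- 200
base <- rexp(n, 1/10)                  # skewed baseline values
grp  <- rbinom(n, 1, 0.5)              # treatment indicator
y    <- 5 + 8*log1p(base) - 2*grp + rnorm(n, 0, 3)

library(splines)
m_lin <- lm(y ~ grp + base)            # linear baseline
m_spl <- lm(y ~ grp + ns(base, df = 4))# spline baseline (stand-in for rcs)

AIC(m_lin); AIC(m_spl)                 # lower AIC wins
anova(m_lin, m_spl)                    # joint ("chunk") test of the extra spline terms
```

The point is that the decision rests on the whole set of spline terms at once (AIC or a joint test), never on the p-values of the individual segments.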
Hello everyone,
I’m currently working on a longitudinal dataset (2 parallel groups RCT; baseline plus 2 follow-up assessments, at weeks 0, 4, and 16, respectively). Sample size is 70.
I fitted a Generalized Least Squares model using rms::Gls and obtained the following results:
m <- Gls(Response ~ Group*Time + Response_0 + Age, B = 500, data = data, correlation = corCompSymm(form = ~ Time | ID))
Tests of association
Group: \chi^2 = 6.06, df = 2, P = 0.048
Group x Time: \chi^2 = 2.68, df = 1, P = 0.10
Contrasts
Week 4: -4.6 (95% CI -8.8 to -0.5), P = 0.018
Week 16: -0.72 (95% CI -4.7 to 3.9), P = 0.76
Note
Time was modeled continuously.
Questions
The test of association for Group was barely significant, and the test for the Group x Time interaction was not significant. The group contrast was significant at week 4 but not at week 16.
If there is a main effect of Group but no Group x Time interaction, doesn’t this mean that the treatment effect is homogeneous across time? But this is not what the contrasts are showing.
- I suppose this is related to the fact that we should not give too much emphasis to the P-values (from anova and contrasts), but rather focus on the CIs, which btw are too wide to exclude even an almost irrelevant effect at week 4?
- Would the overall conclusion here be that the trial was largely underpowered and therefore inconclusive?
Thanks!
You pre-specified a reasonable model and should not use p-values to drive any changes in the model. The contrasts take the uncertainties about interactions into account, i.e., confidence intervals are properly wider since you didn’t know beforehand that interaction was absent.
Many thanks for the reply, @f2harrell!
But if there’s no interaction and there seems to be a treatment effect at week 4, shouldn’t the two CIs from the contrasts be similar? Indicating a homogeneous effect at weeks 4 and 16.
My understanding is that if there’s no interaction, then the estimated treatment effect should be (roughly) the same across time. But this understanding is incorrect apparently…
Without looking in detail I think you’re confusing “no impressive evidence of interaction” with “no interaction”.
Ok, I think I got it…
Let’s suppose we have a different scenario:
Model
m <- Gls(Response ~ Group*Time + Response_0 + Age, B = 500, data = data, correlation = corCompSymm(form = ~ Time | ID))
Tests of association
Group: \chi^2 = 17.4, df = 2, P = 0.0002
Group x Time: \chi^2 = 13.8, df = 1, P = 0.0002
Contrasts
Week 4: -7 (95% CI -10.5 to -3), P = 0.0003
Week 16: 2.3 (95% CI -2.2 to 6.8), P = 0.3
Questions
Now the anova gives a clearer picture, providing evidence for both a Group effect and a Group x Time interaction.
Accordingly, the contrasts show that the treatment effect is not homogeneous across time, being present at week 4 but waning at week 16.
My point is that in both scenarios (this one and the one from the post above), the contrasts are telling roughly the same story, i.e., that the Group effect varies with Time. However, the anova results differ considerably between the two scenarios.
At the end of the day, should I give more importance to the contrasts than to the anova when interpreting the results?
Thank you!
They are both valuable, although a Bayesian model would be of more direct value for the inference part.
I would emphasize the time x group interaction test, the overall group test (beautiful test of group difference at any time), and the two compatibility intervals, plus some minor emphasis on point estimates.
Perfect. Many thanks.
I suppose the order would be the following:
- Multiple df test of Group effect;
- Test of Interaction (Group x Time) effect;
- Compatibility intervals from the contrasts;
- Point estimates from the contrasts.
@f2harrell, I tried using the two time variables I described, so that the slope is allowed to change as a subject enters a different follow-up phase. However, the slope change is abrupt rather than smooth. I’m wondering whether it’s possible to encode this using, e.g., your gTrans function to make the fitted curve smooth?
That should work, whether smooth or abrupt.
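For illustration, here is a minimal base-R sketch on simulated (hypothetical) data contrasting the two encodings: a linear-spline term (an abrupt slope change at the phase boundary, as with rms::lsp) versus a smooth cubic-spline term (as one could obtain with rms::rcs or gTrans); `splines::ns` is used here so the example runs without rms:

```r
# Simulated data with a true slope change at week 8
set.seed(2)
time <- runif(300, 0, 20)
k    <- 8                                  # follow-up phase starts at week 8
y    <- 2 + 0.5*time + 1.5*pmax(time - k, 0) + rnorm(300)

# Abrupt: broken-line (linear spline) term; slope changes sharply at k
m_abrupt <- lm(y ~ time + pmax(time - k, 0))

# Smooth: natural cubic spline; slope changes gradually around the knots
library(splines)
m_smooth <- lm(y ~ ns(time, knots = c(5, 8, 12)))
```

The third coefficient of `m_abrupt` estimates the slope change at the boundary; the smooth model trades that directly interpretable jump for a gradual transition.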
The statistical analysis of the article below is intriguing to me, and I would greatly appreciate some guidance on this.
In brief:
- N=160;
- 12 weeks intervention;
- 2 Groups (Tai Chi vs Control);
- Assessments at T0(baseline), T1(week 1), T2(week 8), T3(week 12), and T4(week 16);
- Analysis method: GEE (Response ~ Group*Time);
- Time modeled as categorical, instead of as continuous (the latter being the preferred way, according to RMS, Chapter 7);
- Baseline assessment modeled as a response, instead of as a covariate (the latter being the preferred way, according to RMS, Chapter 7).
The authors assessed the intervention effect at each time point by looking at the \beta of the respective Group*Time interaction. This is what is intriguing to me. Shouldn’t we assess treatment effects in longitudinal studies by calculating contrasts?
After doing a few simulations, I realized that when the baseline assessment is modeled as a response to treatment (as in the study above), the coefficients of the Group*Time interaction terms are practically the same as the respective group contrasts. This is not the case, however, when the baseline assessment is modeled as a covariate. Also, this happens with both GEE and Generalized Least Squares.
Why is that?
Shouldn’t the authors of this study have calculated and reported the group contrasts?
Thank you!
Hi Leo
When time is modeled as categorical and T0 is the reference, the model will look like
y = a*treatment + b*time + c*treatment*time + intercept
Where a is the between-group difference at time zero, and b is the effect of time in the reference group. Of course, you’d have a different beta per time point, since time was modeled categorically.
We can reshape this and get
y = treatment*(a + c*time) + b*time + intercept
The contrast between treatment (1) and control (0), thus, will be
y1 - y0 = a + c*time
However, in a randomized trial a will likely approximate zero, so c (the interaction coefficient) will approximate the contrast.
Once you use the baseline variable as a covariate and not as an outcome, your time reference switches to time = T1 (not T0). The interactions, therefore, refer to the difference in treatment effects across time points, e.g., for T2, c would represent the difference between the effect when time = T2 and the effect when time = T1.
Now you have
y = a*treatment + b*time + c*treatment*time + d*baseline_value + intercept
However, a does not represent the baseline difference anymore, but the difference at time = T1.
Again,
y1 - y0 = a + c*time
When time = T1 (new reference), the contrast you are looking for will be a.
When time = T2, it will be a + c, and c would assume a different value for each time point.
Thus, the contrasts you are looking for would emerge from the sum of a and c at the moments of interest in both scenarios. However, when the baseline value is included in the outcome, a + c becomes very close to c.
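This can be checked with a small base-R simulation on hypothetical data: with categorical time, T0 as reference, and the baseline included in the response vector, the Group x Time coefficients land very close to the group contrasts, because the Group main effect (the T0 difference) is near zero under randomization:

```r
# Simulated RCT: baseline (T0) treated as a response, categorical time
set.seed(3)
n     <- 500
time  <- factor(rep(c("T0", "T1", "T2"), n), levels = c("T0", "T1", "T2"))
group <- rep(rbinom(n, 1, 0.5), each = 3)
eff   <- ifelse(time == "T1", -3, ifelse(time == "T2", -5, 0))  # true contrasts
y     <- 10 + 2*(time == "T1") + 3*(time == "T2") + group*eff + rnorm(3*n)

m <- lm(y ~ group*time)
coef(m)["group"]          # a: ~0 (baseline difference, near zero by randomization)
coef(m)["group:timeT1"]   # c1: close to the T1 contrast a + c1 (~ -3)
coef(m)["group:timeT2"]   # c2: close to the T2 contrast a + c2 (~ -5)
```

Since a is only approximately zero, the proper contrast remains a + c; the interaction coefficient alone merely happens to be a good approximation in this parameterization.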
Nicely done. This approach is more trouble than treating baseline as baseline, and it runs into @Stephen’s issue that baseline cannot be a response to treatment. You’ve shown that in linear models there are tricks to force the alternative approach to work, but in general, with nonlinear and ordinal models, I think it’s asking for more trouble.
Hi @martinspn,
thank you so much for your thorough reply, it was really helpful.
In that case, then, the authors should indeed have used the contrasts for the treatment effect at each time point, since a only approximates zero; the contrasts would thus take into account subtle differences between the groups at baseline. Correct?
On the other hand, it’s not a valid approach to model the baseline as a response. Correct @f2harrell?
Correct, that’s not valid in my view and just adds complexity to boot. Contrasts within a model that has baseline will automatically adjust for baseline differences (not much of an issue in randomized trials) and for strong baseline effects (helps power in all cases). Simultaneous confidence intervals would be nice, and usually emphasize the last time point contrast. With an ordinal Markov model you can get something else very valuable: the difference in mean times in certain states.