Hello all, it’s been quite some time since my last interaction with the datamethods community. I have come across the following puzzle and hope someone (especially @f2harrell) can illuminate the issue.
glimpse(data)
Rows: 201
Columns: 5
Impairment 81, 53, 66, 77, 64, 28, 34, 65, 55, 39, 41, 46, 76, 70, 64, 50, 6…
Change 31, 3, 24, 10, 52, 16, 19, 50, 21, 29, 21, 31, 0, 60, 11, 39, 44,…
MEP Positive, Negative, Positive, Positive, Positive, Positive, Posit…
Baseline 19, 47, 34, 23, 36, 72, 66, 35, 45, 61, 59, 54, 24, 30, 36, 50, 3…
FollowUp 50, 50, 58, 33, 88, 88, 85, 85, 66, 90, 80, 85, 24, 90, 47, 89, 7…
The variables Baseline and FollowUp were created as follows:
Baseline = 100 - Impairment
FollowUp = Baseline + Change
Why do the two models below lead to different conclusions?
m1 = lrm(Change ~ rcs(Impairment, 3)*MEP, data = data, x = TRUE, y = TRUE)
anova(m1, test = 'LR')
m2 = lrm(FollowUp ~ rcs(Baseline, 3)*MEP, data = data, x = TRUE, y = TRUE)
anova(m2, test = 'LR')
In m1, Change is associated with both Impairment and MEP, and the Impairment × MEP interaction is significant. In m2, however, only Baseline is associated with FollowUp: MEP is not, and the Baseline × MEP interaction is not significant. Additionally, the R² is 0.12 in m1 but 0.77 in m2.
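As a sanity check on the algebra, here is a minimal sketch on simulated stand-in data (not my actual data; the distributions, the seed, and the use of a plain `lm` without splines in place of `lrm` are all assumptions made purely for illustration). It builds Baseline and FollowUp exactly as described above and compares the two parameterizations under ordinary least squares.

```r
# Simulated stand-in data (NOT the data from this post); names mirror the post.
set.seed(1)
n <- 201
d <- data.frame(
  Impairment = sample(0:100, n, replace = TRUE),
  MEP        = factor(sample(c("Negative", "Positive"), n, replace = TRUE)),
  Change     = round(rnorm(n, mean = 25, sd = 15))
)

# Same construction as in the post
d$Baseline <- 100 - d$Impairment
d$FollowUp <- d$Baseline + d$Change

# Linear-model analogue of m1 and m2 (assumption: lm instead of lrm, no rcs),
# both written in terms of Baseline so the coefficients line up
f1 <- lm(Change   ~ Baseline * MEP, data = d)
f2 <- lm(FollowUp ~ Baseline * MEP, data = d)

# Because FollowUp = Baseline + Change, the least-squares fits can differ
# only in the Baseline slope, which shifts by exactly 1
delta <- coef(f2) - coef(f1)
print(round(delta, 10))
```

Under least squares, the MEP and interaction coefficients are therefore identical in the two fits. That identity is specific to a linear model for the mean. An ordinal model such as `lrm` depends on the rank ordering of the outcome, and the ordering of Change is not the ordering of FollowUp, so the same equivalence is not guaranteed for m1 and m2.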
Ever since I first came across Frank Harrell’s material on why change-score analyses are not appropriate, I have frowned upon them whenever I see one.
In the case above, would it be correct to conclude that the interaction found in m1 is artifactual?
Thanks!