Exemplary QOL analyses that avoid change-from-baseline blunders?

f2harrell · December 4, 2019, 11:28pm

The ISCHEMIA study QOL analyses led by John Spertus are role models for how to do this, avoiding change from baseline and using Bayesian models to get more interpretable and accurate results. We’ll have to wait for a paper or preprint to see analysis details.

I have a perfect example of how a proper analysis yields a much different (and more correct) result, but it uses proprietary data so I can’t show it. Here’s a description instead. The Hamilton-D depression scale is often used in antidepressant drug development. Change from baseline to 8w is a very common outcome measure even though (1) this disrespects the parallel-group design and (2) differences in Hamilton-D have never been shown to be valid patient outcome measures that can be interpreted independently of baseline.

I re-analyzed a pharmaceutical industry trial using the best available statistical approach: a distribution-free semiparametric proportional odds ordinal logistic model on the raw 8w Ham-D value, adjusted flexibly for baseline Ham-D using a restricted cubic spline function with 5 knots at default locations. Here are the results and ramifications:

Unlike the linearity required for valid use of a change score, the relationship between baseline Ham-D and 8w Ham-D was highly nonlinear. The shape was similar to a logarithmic function, i.e., very high baseline Ham-D can be “knocked down” and become a moderate or low 8w Ham-D. There was a flattening of the curve starting at Ham-D=22.
This function shows that an excellent therapeutic effect may be obtained in severely depressed patients.
It also shows that an average change score, which assumes not only linearity but a slope of 1.0, is highly misleading. The excellent potential for a high baseline Ham-D to get lower will be averaged in with the smaller amount of change among patients with lower baseline Ham-D. The resulting average change from baseline underestimates the treatment effect in severely depressed patients and overestimates it in less severely depressed patients. The average change from baseline may not apply to any actual patient.
The proper ANCOVA respects the goal of the RCT: if patient i and patient j both start with a Ham-D of x but are randomized to different treatments, what 8w Ham-D are they likely to experience?
ANCOVA using ordinal regression completely handles floor and ceiling effects in patient outcome scales.
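
Since the trial data are proprietary, here is a rough simulated sketch of the slope and averaging points above. Everything in it is an assumption made for illustration: the log-shaped relationship, the baseline range of 10 to 40, the treatment effect of 3 points, and the noise level are all invented, and an ordinary linear ANCOVA fitted by least squares stands in for the spline-adjusted proportional odds model described above.

```python
import numpy as np

rng = np.random.default_rng(2019)
n = 4000

# Simulated (not real) trial: baseline score and 1:1 randomization.
baseline = rng.integers(10, 41, size=n).astype(float)
treat = rng.integers(0, 2, size=n)

# Assumed truth: the 8w score is log-shaped in baseline and
# treatment lowers it by 3 points at every baseline value.
y8 = 4.0 * np.log(baseline) - 3.0 * treat + rng.normal(0.0, 3.0, size=n)
y8 = np.clip(y8, 0.0, None)  # scores cannot go below 0

# Linear ANCOVA by least squares: y8 ~ 1 + baseline + treat.
X = np.column_stack([np.ones(n), baseline, treat])
intercept, slope, treat_effect = np.linalg.lstsq(X, y8, rcond=None)[0]

# Change from baseline in the treated arm, by baseline severity.
change = y8 - baseline
treated = treat == 1
mean_change_low = change[treated & (baseline < 20)].mean()
mean_change_high = change[treated & (baseline >= 30)].mean()
mean_change_all = change[treated].mean()

print(f"slope of 8w score on baseline: {slope:.2f} (a change score assumes 1.0)")
print(f"ANCOVA treatment effect:       {treat_effect:.2f}")
print(f"mean change, baseline < 20:    {mean_change_low:.1f}")
print(f"mean change, baseline >= 30:   {mean_change_high:.1f}")
print(f"mean change, all treated:      {mean_change_all:.1f}")
```

In this simulation the fitted slope comes out far below 1.0, and the mean change in the treated arm differs greatly between the low and high baseline groups, so the overall mean change describes neither group. The ANCOVA treatment effect, by contrast, compares patients who start at the same baseline.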