Alpha spending/information fraction when controlling for a baseline covariate

I am helping with a statistical analysis plan for an RCT. We want to control for the outcome variable at baseline and also use sequential analysis with alpha spending. Sequential designs are new to me so I am trying to read up on the literature, but all the examples and discussions I can find just discuss simple models comparing two groups without other covariates.

If I understand things correctly, the theory requires a normally distributed test statistic and a value for the information fraction at each interim analysis. Mimicking the considerations for a simple comparison of two means, I could use the fitted treatment coefficient and its standard error to get the test statistic, and use the fraction of subjects enrolled as the information fraction, but it is far from obvious to me that this is a good approach. My specific questions are:

  1. Wouldn’t the distribution of the baseline values affect the correlation between effect estimates at the interim analyses and thus the information fraction?
  2. Doing a z-test on a regression coefficient seems suboptimal. Shouldn’t I at least use the corresponding quantile of a t-distribution? Or, even better, is it possible to do a likelihood-ratio test with alpha spending?
  3. Am I missing something basic? Is there a good reference for this sort of design?

Ah, I have the answer: Jennison & Turnbull (1997) show that one can apply the normal theory to regression models. When the test statistic is not exactly normal, they claim that it is often good enough to take the significance levels from the normal theory and apply them to the actual distribution of the test statistic. They refer to Jennison & Turnbull (1991) for more details.

So the answer to 2) above seems to be to use the corresponding t-distribution.
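A minimal sketch of how I read that significance-level approach: take each normal-theory boundary, convert it to its nominal one-sided level, and look up the t critical value with the same level. The boundary values and sample sizes below are illustrative assumptions (roughly what an O'Brien-Fleming-type spending function gives for two equally spaced looks), and the function names and the choice of n - 3 degrees of freedom (intercept, treatment, baseline) are mine, not taken from the papers. It uses only the Python standard library, so the t CDF is computed by numerical integration.

```python
import math
from statistics import NormalDist


def t_cdf(x, df):
    """CDF of Student's t, via Simpson's rule on the density over [0, |x|]."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    steps = 2 * math.ceil(a * 100) + 2      # even number of steps, width <= 0.005
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    half = total * h / 3                    # P(0 < T < |x|)
    return 0.5 + half if x >= 0 else 0.5 - half


def t_critical(z_bound, df):
    """t critical value with the same nominal one-sided level as z_bound (> 0)."""
    tail = 1.0 - NormalDist().cdf(z_bound)
    lo = hi = z_bound                       # t has heavier tails, so the root is >= z_bound
    while 1.0 - t_cdf(hi, df) > tail:
        hi *= 2
    for _ in range(60):                     # bisection on the upper tail probability
        mid = (lo + hi) / 2
        if 1.0 - t_cdf(mid, df) > tail:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative (assumed) normal-theory boundaries and cumulative sample sizes.
z_bounds = [2.963, 1.969]
n_at_look = [50, 100]
n_params = 3                                # intercept, treatment, baseline covariate

t_bounds = [t_critical(z, n - n_params) for z, n in zip(z_bounds, n_at_look)]
for z, n, t in zip(z_bounds, n_at_look, t_bounds):
    print(f"n={n}: z boundary {z:.3f} -> t boundary {t:.3f} (df={n - n_params})")
```

The t boundaries are always a little wider than the z boundaries, and the difference shrinks as the degrees of freedom grow, so the adjustment matters mostly at early looks with few subjects.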

Additionally, Gange & DeMets (1996) claim that using the standardized regression coefficient is justifiable for generalized estimating equations (and hence for simple linear models as well). They also show that the information fraction needs to be “guessed” to an extent, as it requires knowledge of $\text{Var}(\hat\beta)$ after all subjects have been enrolled. They use simulations to demonstrate that the fraction of subjects enrolled is a good surrogate for the actual information fraction.

So the answer to 1) above seems to be that using the number of subjects is likely OK.
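This is easy to check by simulation for a given design. The sketch below is my own, not the simulation from Gange & DeMets. It assumes a linear model with an intercept, a treatment indicator and one baseline covariate, simple 1:1 randomization, a skewed baseline distribution, and a residual variance that is the same at every look. Under those assumptions the information for the adjusted treatment effect is proportional to the residual sum of squares from regressing treatment on the intercept and baseline (the Frisch-Waugh result). All names, the sample sizes and the Lan-DeMets O'Brien-Fleming-type spending function are illustrative choices. Note that the final information is only known here in hindsight, which is exactly why it has to be guessed in a real trial.

```python
import math
import random
from statistics import NormalDist


def treatment_information(treat, baseline):
    """Information for the adjusted treatment effect, up to the factor 1/sigma^2.

    Equals the residual sum of squares of `treat` regressed on an intercept
    and `baseline`, since Var(beta_hat_trt) = sigma^2 / that quantity.
    """
    n = len(treat)
    mean_t = sum(treat) / n
    mean_x = sum(baseline) / n
    s_tt = sum((t - mean_t) ** 2 for t in treat)
    s_xx = sum((x - mean_x) ** 2 for x in baseline)
    s_tx = sum((t - mean_t) * (x - mean_x) for t, x in zip(treat, baseline))
    return s_tt - s_tx ** 2 / s_xx


def obf_type_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided alpha spent, Lan-DeMets O'Brien-Fleming-type function."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / math.sqrt(info_fraction)))


rng = random.Random(2024)
n_final = 400
treat = [rng.randint(0, 1) for _ in range(n_final)]          # simple randomization
baseline = [rng.expovariate(1.0) for _ in range(n_final)]    # skewed baseline values

final_info = treatment_information(treat, baseline)
looks = (100, 200, 300, 400)
info_fractions = [
    treatment_information(treat[:n], baseline[:n]) / final_info for n in looks
]

for n, frac in zip(looks, info_fractions):
    enrolled = n / n_final
    print(f"n={n}: enrolled fraction {enrolled:.2f}, information fraction {frac:.3f}, "
          f"alpha spent {obf_type_alpha_spent(enrolled):.4f} "
          f"vs {obf_type_alpha_spent(frac):.4f}")
```

The two fractions differ only through chance imbalance in treatment allocation and chance correlation between treatment and baseline, both of which shrink as subjects accrue. The last two columns show how little the cumulative alpha spent changes when the enrolled fraction is used in place of the information fraction.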


A side note: This is trivial with purely Bayesian designs, and such designs would probably put some statisticians out of a job because they don’t require one-off solutions and don’t need to deal with sampling distributions.