Use of conditional estimands in regulatory settings

Dear Colleagues,

Frank Harrell and Stephen Senn have made convincing arguments in favor of using conditional estimates of treatment effect in RCTs rather than unconditional (a.k.a. “marginal”) estimates. The latter will provide a distorted view to the extent that the trial’s sample composition differs from the target population’s composition (which it almost always does). The conditional estimate provides our best guess of the treatment effect for a patient with a given set of covariate values.

For non-collapsible measures of treatment effect, the conditional effect of treatment can differ depending upon the combination of covariate values used, and it will generally differ from the marginal effect. This raises the question of how best to prespecify an estimand that will be used to test a primary statistical hypothesis that the treatment “worked.” Regulators tend to want such an overall test before trying to interpret the magnitude of the effect. One might respond that it is naive to ask for a single test to “show significance” before examining conditional estimates of effect for different combinations of covariate values. But then we need a tractable approach for evaluating treatment effects without reverting to a fishing expedition.
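To make the non-collapsibility point concrete, here is a minimal numerical sketch; the two stratum control risks and the conditional odds ratio of 0.5 are made-up values. Within each stratum the conditional odds ratio is exactly 0.5, yet the stratum-specific absolute risk reductions differ a great deal, and the marginal odds ratio is not 0.5 but is pulled toward 1.

```python
# Hypothetical two-stratum example: same conditional odds ratio in both strata.

def odds(p):
    return p / (1 - p)

def inv_odds(o):
    return o / (1 + o)

p_control = {"low-risk stratum": 0.05, "high-risk stratum": 0.50}  # made-up control risks
OR_conditional = 0.5                                               # made-up treatment OR

p_treated = {k: inv_odds(OR_conditional * odds(p)) for k, p in p_control.items()}

for k in p_control:
    print(f"{k}: absolute risk reduction = {p_control[k] - p_treated[k]:.3f}")
# low-risk stratum ~ 0.024, high-risk stratum ~ 0.167: same OR, very different absolute effects

# Marginal (population-averaged) odds ratio, assuming equal-sized strata
m_control = sum(p_control.values()) / 2
m_treated = sum(p_treated.values()) / 2
print(f"marginal OR = {odds(m_treated) / odds(m_control):.2f}")    # ~0.58, not 0.5
```

So even with no confounding and no interaction, the single conditional odds ratio, the marginal odds ratio, and the covariate-specific absolute effects are all different quantities.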

What are your thoughts on the most principled way(s) to test for treatment effects using non-collapsible, conditional estimates of treatment effect?

Welcome to datamethods, Kevin.

There are some interesting issues at play.

  • The FDA guidance on covariate adjustment in RCTs invites sponsors to compute conditional estimates, and also mentions the utility of unconditional estimates.
  • Unconditional estimates are not concordant with the volunteerism-based convenience sampling in RCTs.
  • In linear models, conditional treatment effects and unconditional treatment effects are estimating the same thing, but not in nonlinear models (a small sketch of this contrast appears after this list).
  • We need to recognize that any one-number summary is inadequate, and multi-number summaries are quite informative.
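Here is the promised sketch of that contrast; the coefficients and the standard-normal covariate are invented purely for illustration. In the linear model the conditional (adjusted) treatment effect equals the marginal difference in means, while in the logistic model a constant conditional odds ratio does not equal the marginal odds ratio, even though randomization balances the covariate perfectly.

```python
import numpy as np
from scipy.stats import norm
from scipy.special import expit  # inverse logit

x = np.linspace(-4, 4, 2001)     # grid over a standard-normal prognostic covariate
w = norm.pdf(x)
w = w / w.sum()                  # integration weights

# Linear model: y = 0.5*treat + 1.5*x + error
# The conditional effect (0.5) equals the marginal difference in means.
marg_linear = np.sum(w * (0.5 + 1.5 * x)) - np.sum(w * (0.0 + 1.5 * x))
print(f"linear: conditional effect 0.50, marginal effect {marg_linear:.2f}")

# Logistic model: logit P(Y=1) = -1 + 1.0*treat + 1.5*x
# The conditional OR is exp(1) ~ 2.72 at every x, but the marginal OR is smaller.
p1 = np.sum(w * expit(-1 + 1.0 + 1.5 * x))   # average risk, treated
p0 = np.sum(w * expit(-1 + 0.0 + 1.5 * x))   # average risk, control
odds = lambda p: p / (1 - p)
print(f"logistic: conditional OR {np.exp(1):.2f}, marginal OR {odds(p1) / odds(p0):.2f}")
```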

The above link shows the advantage of providing n estimates when there are n patients in the trial, and of showing how absolute risk reduction relates to baseline risk. When treatment does not interact with any of the pre-specified baseline covariates, the p-value and the Bayesian posterior probability of treatment benefit are the same whether computed from odds ratios or from any covariate-specific absolute treatment difference. This makes things easier from a regulatory perspective. So the more interesting questions relate to quantifying evidence for more than “just positive” treatment effects, and they come down to:

  • What is the best way to do statistical inference on non-null effects? For example, the posterior probability that the treatment lowers the incidence of a bad outcome by more than \epsilon will decrease as \epsilon grows. How do we choose the covariate settings to use in the calculations, and how do we choose \epsilon? It’s probably best to make a graph with continuously-varying \epsilon on the x-axis (a rough sketch of such a curve follows this list).
  • How can we choose representative baseline covariate settings when we don’t want to show n absolute treatment benefit estimates? One naive approach is to compute quintiles of absolute risk reduction and find covariate settings that give rise to those values (see the second sketch below).
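On the first question, here is a minimal sketch of the \epsilon-curve idea. It assumes posterior draws of the logistic-regression coefficients are already available; the draws below are simulated placeholders standing in for output from a fitted Bayesian (or approximately Bayesian) model, and the single covariate setting is arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import expit

rng = np.random.default_rng(0)

# Placeholder posterior draws of (intercept, treatment, covariate) coefficients;
# in practice these come from the fitted covariate-adjusted model.
n_draws = 4000
beta_draws = rng.multivariate_normal(
    mean=[-1.0, -0.5, 0.8],
    cov=np.diag([0.05, 0.04, 0.03]),
    size=n_draws,
)

x_setting = 0.3   # one representative covariate value (hypothetical)

# Risk with and without treatment at this covariate setting, per posterior draw
risk_trt = expit(beta_draws[:, 0] + beta_draws[:, 1] + beta_draws[:, 2] * x_setting)
risk_ctl = expit(beta_draws[:, 0] + beta_draws[:, 2] * x_setting)
arr = risk_ctl - risk_trt          # absolute risk reduction, one value per draw

eps = np.linspace(0, 0.15, 151)
prob_benefit = [(arr > e).mean() for e in eps]   # Pr(ARR > epsilon) for each epsilon

plt.plot(eps, prob_benefit)
plt.xlabel("epsilon (required absolute risk reduction)")
plt.ylabel("posterior Pr(ARR > epsilon)")
plt.show()
```

Reading the curve at \epsilon = 0 gives the usual probability of any benefit; reading it at a clinically meaningful \epsilon gives the evidence for a more-than-trivial benefit at that covariate setting.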
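On the second question, here is a rough sketch of the quintile approach. The trial data and the per-patient risk-reduction estimates are hypothetical stand-ins; in practice the per-patient estimates would come from the fitted covariate-adjusted model, not from a formula like the one below.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical baseline data for n randomized patients
n = 500
patients = pd.DataFrame({
    "age": rng.normal(60, 10, n),
    "diabetes": rng.integers(0, 2, n),
})

# Stand-in for posterior-mean absolute risk reduction for each patient
arr_hat = 0.02 + 0.002 * (patients["age"] - 60) + 0.03 * patients["diabetes"]

# Quintile cut points of the estimated ARR distribution
quintiles = np.quantile(arr_hat, [0.2, 0.4, 0.6, 0.8])

# For each quintile, report the covariate values of the patient whose ARR is closest
rows = [(np.abs(arr_hat - q)).idxmin() for q in quintiles]
summary = patients.loc[rows].assign(estimated_ARR=arr_hat.loc[rows].values)
print(summary)
```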

Thanks, Frank. This is very helpful. And I apologize for missing it, but how might we demonstrate that the treatment effect is “non-null” before exploring the nature of the effect further? I think that’s the part where some of our regulatory colleagues might be concerned. If we work within a conditional framework, we are evaluating the null that there is no treatment effect for patients with a given set of covariate values. If we work within an unconditional framework, we are evaluating the null that there is no treatment effect overall for the patients included in the study. Both are limited, which gets us back to the point that you have made over and over again about the insufficiency of single-number summaries of effect.