Decades of underpowered RCTs

Not quite sure what your point is - I don’t have a reference for it. But this is very much the way trials are traditionally conducted and analysed. (I can read your comment as not agreeing with that, but I’m not sure that’s what you meant.)

Wouldn’t this also make sense if you just used the study to update your information on the standard error component of the power calculation? So it becomes: “given our updated estimate of the event rate/noise, do we still have a chance of detecting the effect we were originally interested in?” I thought the main sin of post-hoc power is that it uses both the estimate of the standard error and the estimated treatment effect, so it just becomes a re-calculation of the p-value?
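
To make the distinction concrete, here is a minimal sketch (Python/scipy, two-sided z test, made-up numbers) of the two calculations: post-hoc “observed power,” which plugs in both the observed effect and SE, versus re-running the power calculation for the originally hypothesized effect with only the SE updated:

```python
from scipy.stats import norm

alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)      # two-sided critical value, ~1.96

# Hypothetical end-of-study quantities
effect_obs = 0.15                     # observed treatment effect
se_obs = 0.10                         # observed (updated) standard error
effect_design = 0.25                  # effect we originally cared about

def power(effect, se):
    """Power of a two-sided z test when the true effect is `effect`."""
    z = effect / se
    return norm.cdf(z - z_crit) + norm.cdf(-z - z_crit)

# Post-hoc "observed power": plugs in BOTH the observed effect and SE,
# so it is a monotone function of the z statistic (hence of the p-value)
# and carries no information beyond the p-value itself.
print(f"post-hoc power:                {power(effect_obs, se_obs):.2f}")    # ~0.32

# The alternative described above: keep the a priori effect of interest
# and update only the noise component from the observed data.
print(f"design effect with updated SE: {power(effect_design, se_obs):.2f}")  # ~0.71
```

With these made-up numbers, the observed power of ~0.32 is just another way of writing p ≈ 0.13, while the ~0.71 answers a genuinely different question about the effect of interest.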


Taking your second point/question first: post hoc power, of course, retrospectively uses only the observed end-of-study data to re-estimate power. There are numerous papers and online discussions on the topic, including this one from Andrew Gelman from 2019:

Post Hoc Power - Gelman

a discussion in this forum just prior to the above:

“Observed Power” and other “Power” Issues

and a 2021 paper by Andrew Althouse, who is a participant in this forum and initiated the thread above:

Post Hoc Power: Not Empowering, Just Misleading

On your first point, there is a plethora of considerations vis-à-vis interim futility monitoring and possibly associated sample size re-estimation. At the top of the flow chart: are you using a frequentist or a Bayesian approach? In conjunction with that are considerations around the type of endpoint (e.g., binary, continuous, time-to-event), along with other study-specific considerations that will affect the methods used to support interim decision making.

I would point you to a very recent (June 2025) open access paper/tutorial as a starting point, if you want to go down the rabbit hole a bit:

Futility Monitoring in Clinical Trials

There are numerous references at the end that are also of value, along with entire books on the subject, as there is no “one size fits all” approach to interim monitoring and decision making.

One general point to keep in mind is that many (most?) clinical trials begin with overly optimistic estimates of the treatment effect size. In the absence of formalized mid-study monitoring and adjustment, that frequently leads to underpowered and inconclusive studies, which was the original basis of this thread.
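
To put a rough number on that (a minimal sketch, normal approximation, hypothetical effect sizes): a trial sized for 80% power at an assumed standardized effect of 0.25 has only about 29% power if the true effect is half that.

```python
from scipy.stats import norm

alpha, target_power = 0.05, 0.80
z_a = norm.ppf(1 - alpha / 2)             # ~1.96
z_b = norm.ppf(target_power)              # ~0.84

assumed_effect = 0.25                     # standardized effect used at design
n_per_arm = 2 * ((z_a + z_b) / assumed_effect) ** 2   # standard two-arm formula

true_effect = 0.125                       # the (smaller) truth
se = (2 / n_per_arm) ** 0.5               # SE of the mean difference (SD = 1)
realized = norm.cdf(true_effect / se - z_a) + norm.cdf(-true_effect / se - z_a)
print(f"n per arm ≈ {n_per_arm:.0f}, realized power ≈ {realized:.2f}")  # ~0.29
```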

Thus, in the context of formalized interim monitoring, the key question may not be “what is the probability that we will observe the original a priori hypothesized effect size?”, but rather “what is the probability of a non-null result that is still clinically relevant?”
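
One common frequentist way to frame that question at an interim look is conditional power evaluated under different assumed effects. A minimal sketch, using the standard Brownian-motion (B-value) approximation and hypothetical interim numbers:

```python
from scipy.stats import norm

def conditional_power(z_interim, t, theta, alpha=0.05):
    """Probability of crossing the upper final significance boundary
    (ignoring the negligible chance of crossing the opposite boundary).
    z_interim: interim z statistic; t: information fraction in (0, 1);
    theta: drift, i.e. the expected z statistic at full information."""
    z_crit = norm.ppf(1 - alpha / 2)
    b_t = z_interim * t ** 0.5            # Lan-Wittes B-value at the interim
    return norm.cdf((b_t + theta * (1 - t) - z_crit) / (1 - t) ** 0.5)

z_interim, t = 1.0, 0.5                   # hypothetical halfway look
theta_design = 2.80                       # drift implied by the original (optimistic) effect
theta_relevant = 2.00                     # smaller but still clinically relevant effect
theta_trend = z_interim / t ** 0.5        # "current trend" assumption

for label, theta in [("design effect  ", theta_design),
                     ("relevant effect", theta_relevant),
                     ("current trend  ", theta_trend)]:
    print(f"CP under {label}: {conditional_power(z_interim, t, theta):.2f}")
```

The answer depends heavily on which effect size you condition on, which is exactly why the “clinically relevant” framing above matters for futility decisions.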


Just making sure I’m not misunderstood: the main difference between post-hoc power and what I described above is that what I described would be called a “design analysis” by Gelman, which he supports as a way to contextualize whether the study could have reliably detected effect sizes of interest, given the amount of information it brought. I would also agree that this is an aside, and totally different from what you proposed re: conditional power, which can mix all of these ideas while also taking an interim/forward-looking view.
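
For anyone reading along, a minimal sketch of that kind of design analysis (in the spirit of Gelman & Carlin’s retrodesign; Python, normal approximation, hypothetical numbers):

```python
import numpy as np
from scipy.stats import norm

def retrodesign(true_effect, se, alpha=0.05, n_sim=1_000_000, seed=0):
    """Given a plausible true effect and the study's SE, estimate power,
    the Type S (wrong sign) rate, and the exaggeration ratio (Type M)."""
    z_crit = norm.ppf(1 - alpha / 2)
    lam = true_effect / se
    power = norm.cdf(lam - z_crit) + norm.cdf(-lam - z_crit)
    # Among significant results, how often is the estimate the wrong sign?
    type_s = norm.cdf(-lam - z_crit) / power
    # Exaggeration ratio: mean |estimate| among significant results,
    # relative to the true effect (approximated by simulation).
    rng = np.random.default_rng(seed)
    est = rng.normal(true_effect, se, n_sim)
    sig = np.abs(est / se) > z_crit
    exaggeration = np.abs(est[sig]).mean() / true_effect
    return power, type_s, exaggeration

# E.g., a study whose SE equals the plausible true effect:
power, type_s, exag = retrodesign(true_effect=0.1, se=0.1)
print(f"power ~ {power:.2f}, Type S ~ {type_s:.3f}, exaggeration ~ {exag:.1f}x")
```

With an SE equal to the plausible true effect, power comes out around 17%, and a statistically significant estimate will on average overstate the true effect by roughly 2.5x, which is the sense in which an underpowered-but-significant result can mislead.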
