FDA Draft Guidance: Use of Bayesian Methodology in Clinical Trials of Drug and Biological Products

New DG dropped, just putting the link here as I’m sure it will be of interest!
https://www.fda.gov/media/190505/download

7 Likes

This is a major step forward. I’m proud of FDA CDER+CBER for doing this. The draft guidance finally recognizes that there are non-frequentist operating characteristics and that it is not an absolute necessity to control the type I assertion probability α if you either strongly justify the analysis prior or provide sensitivity analyses to estimate Bayesian operating characteristics when the analysis prior disagrees with the sampling (design) prior. For example, you can simulate the probability that a purely Bayesian decision rule still results in the correct decision in the face of a prior mismatch.
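Here is a minimal sketch of what such a simulation might look like, under purely illustrative assumptions (a normal outcome with known SD, a moderately optimistic design prior, a skeptical analysis prior, and a posterior-probability-of-benefit threshold of 0.95); none of the numbers come from the guidance.

```python
# Illustrative sketch, not from the guidance: probability that a Bayesian
# decision rule reaches the correct decision when the analysis prior
# disagrees with the design (sampling) prior.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sim = 100_000
n_per_arm = 100
sigma = 1.0                            # assumed known outcome SD
se = sigma * np.sqrt(2 / n_per_arm)    # SE of the observed mean difference

# Design (sampling) prior: moderately optimistic about the true effect delta
delta = rng.normal(0.3, 0.3, n_sim)

# Simulated observed mean differences, one per trial
d_obs = rng.normal(delta, se)

# Analysis prior: skeptical, centered at no effect
mu0, tau = 0.0, 0.35
post_var = 1 / (1 / se**2 + 1 / tau**2)
post_mean = post_var * (d_obs / se**2 + mu0 / tau**2)
p_benefit = stats.norm.sf(0, loc=post_mean, scale=np.sqrt(post_var))

declare = p_benefit > 0.95                        # Bayesian decision rule
correct = np.where(delta > 0, declare, ~declare)  # correct-decision indicator
print(f"P(correct decision under prior mismatch) = {correct.mean():.3f}")
```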

To me (and not speaking for the FDA in any way) the most important features of the draft guidance are

  • Until now a Bayesian guidance existed only for the Center for Devices and Radiological Health, developed under the superb leadership of Greg Campbell. Drugs and biologics have needed a guidance for a long time, because the issues are so different from the device world.
  • The presence of the document by itself symbolizes legitimacy of Bayesian design and analysis for clinical trials.
  • The document notes that the use of Bayesian methods is possible even for simple designs where frequentist options also apply.
  • To my knowledge this is the first FDA document to state that controlling α may not be necessary when assessing operating characteristics.
  • This is the first FDA document stating that there are legitimate and unique Bayesian operating characteristics (e.g., the probability that the decisions made are correct) that have almost nothing to do with α. Details about Bayesian operating characteristics may be found here.

There are challenges that need to be addressed, for example:

  • More details are needed to operationalize Bayesian design and statistical analysis plans. This need is partially met by the FDA C3TI Bayesian Demonstration Project, to which more example sketches of Bayesian SAPs will be added.
  • Sponsors need to see that this does not represent a lowering of the evidence bar. Sometimes it represents raising the bar. A key principle of Bayes is optimizing decisions one trial at a time.
  • Clarity is needed about the sources of data that are likely to be approved for formulating the prior for the treatment effect, or for pooling past and current data (as discussed here). FDA reviewers are quite skilled at judging whether a data source is unbiased or whether bias is adequately accounted for in the data model formulation. The proportion of prior data sources likely to be deemed relevant for Bayesian analysis is expected to be small.
  • Clarity is needed to help sponsors know what will be expected of them when choosing priors. The majority of cases will not have unassailable data available, and for those the prior represents a restriction on the likely treatment effects. For example, we may only know that a treatment is not curative, so not only can an odds or hazard ratio not be zero, it is also very unlikely that the effect ratio will be below 0.25. A skeptical prior on the log effect ratio will encode that knowledge (see the first sketch after this list).
  • Sponsors need to realize the full power of fully sequential Bayesian designs when they are careful about the choice of prior, because no multiplicity adjustment is possible or sensible for Bayesian sequential analysis. Sponsors need to understand that reductions in sample size with Bayesian analysis come more from early stopping for efficacy, and especially for inefficacy, than from using prior data to boost Bayesian power. Side note: when judged by Bayesian operating characteristics, frequentist group sequential methods are very conservative, because Bayesian methods need no multiplicity adjustment; there is no α to spend. Group sequential methods may yield reliable results when a stopping boundary is crossed, but they take far too long to cross the boundary. A second sketch after this list illustrates a fully sequential Bayesian rule.
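First, a small sketch, with illustrative assumptions (a 0.25 cutoff and a 2.5% prior tail probability, neither taken from the guidance), of how a skeptical normal prior on the log effect ratio can be calibrated to encode “a ratio below 0.25 is very unlikely.”

```python
# Illustrative sketch: calibrate a skeptical normal prior on the log hazard
# (or odds) ratio so that ratios below 0.25 receive only 2.5% prior
# probability.  The cutoff and tail probability are assumptions, not values
# taken from the guidance.
import numpy as np
from scipy import stats

cutoff, tail_prob = 0.25, 0.025
prior_mean = 0.0                                    # centered at no effect
prior_sd = np.log(cutoff) / stats.norm.ppf(tail_prob)
print(f"prior SD on the log ratio = {prior_sd:.3f}")

# Check what the prior implies
prior = stats.norm(prior_mean, prior_sd)
print(f"P(ratio < 0.25) = {prior.cdf(np.log(0.25)):.3f}")
print(f"P(ratio < 1)    = {prior.cdf(0.0):.3f}")    # 0.5: agnostic about direction
```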
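Second, a minimal sketch of a fully sequential Bayesian design with repeated posterior looks and no multiplicity adjustment. The look schedule, effect sizes, skeptical prior, and decision thresholds are all illustrative assumptions; the point is only to show how early stopping for efficacy and inefficacy drives the sample size.

```python
# Illustrative sketch: fully sequential Bayesian monitoring of a two-arm
# trial with a normal outcome.  Stop for efficacy when P(delta > 0) > 0.95
# and for inefficacy when P(delta > 0.1) < 0.05; no alpha spending is used.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_sim, sigma = 5_000, 1.0
looks = np.arange(25, 401, 25)          # per-arm sample sizes at each look
mu0, tau = 0.0, 0.71                    # skeptical analysis prior on delta

def run_trial(delta):
    """Return (stopping n per arm, decision) for one simulated trial."""
    y_t = rng.normal(delta, sigma, looks[-1])   # treatment arm responses
    y_c = rng.normal(0.0, sigma, looks[-1])     # control arm responses
    for n in looks:
        d = y_t[:n].mean() - y_c[:n].mean()
        se = sigma * np.sqrt(2 / n)
        post_var = 1 / (1 / se**2 + 1 / tau**2)
        post_mean = post_var * (d / se**2 + mu0 / tau**2)
        post_sd = np.sqrt(post_var)
        if stats.norm.sf(0.0, loc=post_mean, scale=post_sd) > 0.95:
            return n, "efficacy"
        if stats.norm.sf(0.1, loc=post_mean, scale=post_sd) < 0.05:
            return n, "inefficacy"
    return looks[-1], "inconclusive"

for true_delta in (0.0, 0.3):
    results = [run_trial(true_delta) for _ in range(n_sim)]
    ns = np.array([n for n, _ in results])
    p_eff = np.mean([dec == "efficacy" for _, dec in results])
    print(f"delta={true_delta}: P(declare efficacy)={p_eff:.3f}, "
          f"mean n per arm={ns.mean():.0f}")
```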
10 Likes

Nice to learn also that in some domains the FDA is still functioning in the public interest.

2 Likes

Having had the opportunity to work closely with FDA reviewers in CDER for 7 years now, I can safely say that there is a huge dedication to the public interest. The main hindrance to advancement has been statisticians not getting Bayesian training in graduate school, and clinging to their first-learned statistical paradigm out of familiarity.

4 Likes

Same here during my 14-year tenure. Sorry to take the topic on a political tangent, but I read recently: Pazdur warns that politics, ‘chaos’ are damaging FDA. “Veteran drug regulator, who left the agency last month, says industry hasn’t fully realized extent of the issues.”

On a related note, I learned today about a parallel (and much shorter) concept paper from the EMA outlining their planned approach to developing guidance on Bayesian methods in clinical trials. The public comment period closes April 30, 2026, and those interested can submit via EUSurvey.

Given our ongoing discussion of the FDA draft guidance here, this provides a useful comparison point and an opportunity for the statistical community to provide input to the EMA. From my understanding (correct me if I am wrong), the EMA document is not meant to be a full guidance but rather a description of why the EMA thinks additional guidance is needed and of what topics/questions the future reflection paper should tackle.

In contrast to the FDA document, the EMA framing suggests Bayesian methods require special justification rather than being recognized as a coherent inferential framework. The emphasis on “error control” and “lack of control of type I error rate” throughout the document suggests the EMA is still viewing Bayesian approaches through a fundamentally frequentist lens. This contrasts sharply with the FDA draft guidance’s recognition that there are legitimate Bayesian operating characteristics (probability of correct decisions, Bayesian power, expected bias and MSE of estimates averaged under a prior) that do not reduce to α-control.

For example, the EMA paper asks: “How to assess error control for both primary and secondary endpoints in the absence of frequentist inference?” and “How to deal with lack of control of type I error rate?” The FDA guidance, by contrast, explicitly states that calibration to Type I error rate “may not be applicable or appropriate” in Bayesian settings and provides detailed discussion of alternative approaches to specifying success criteria.

The proposed timeline extends to June 2028 for the final reflection paper, representing a significant lag behind current FDA thinking and the broader methodological literature. Given that the EMA is seeking feedback, this may be an opportunity to encourage alignment with the more sophisticated treatment in the FDA guidance. Specifically:

  1. Recognition that Bayesian methods do not require special justification when the design and analysis are coherent.

  2. Acknowledgment of Bayesian-specific operating characteristics beyond Type I error.

  3. Guidance on prior distributions beyond just informative priors for borrowing.

Additional thoughts are welcome. Europe has exceptional statisticians and clinicians who have made foundational contributions to Bayesian methodology in clinical trials. Additionally, the existence of the FDA draft guidance may serve as a catalyst, providing both a template and external pressure for the EMA’s thinking to evolve, particularly since the concept paper is a starting point for consultation rather than a final position.

5 Likes

Pavlos, the points you raised are the key ones to express extreme concern about in the EMA document. Behind the scenes is the fact that while frequentists are always asking Bayesians to simulate frequentist operating characteristics, Bayesians never ask frequentists to simulate Bayesian operating characteristics. Had this been commonplace, we would have ample demonstrations of why it’s a bad idea to control a transposed-conditional probability. This reminds me of an episode of The Office where one team of office workers wins a trivia contest at a party and the other team, of which the boss is a member, demands that to get the prize the first team must also win a “shoe toss over the roof” contest.
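To make that concrete, here is a small sketch of one Bayesian operating characteristic of a standard frequentist rule: under an assumed prior for the true effects (the mixture below is purely an illustrative assumption), how often is the effect actually non-trivial among trials declared successful at two-sided p < 0.05?

```python
# Illustrative sketch: a Bayesian operating characteristic of a frequentist
# decision rule.  Among simulated trials that "win" with two-sided p < 0.05
# in the right direction, how often is the true effect non-trivial under an
# assumed prior for the effect?  The prior mixture is purely illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_sim, n_per_arm, sigma = 200_000, 64, 1.0
se = sigma * np.sqrt(2 / n_per_arm)

# Assumed prior: most candidate treatments do little, a minority work well
works = rng.random(n_sim) < 0.2
delta = np.where(works,
                 rng.normal(0.4, 0.1, n_sim),    # the 20% that work
                 rng.normal(0.0, 0.05, n_sim))   # the 80% that do little

d_obs = rng.normal(delta, se)
z = d_obs / se
win = (2 * stats.norm.sf(np.abs(z)) < 0.05) & (d_obs > 0)  # frequentist success

print(f"P(true delta > 0.2 | frequentist success) = {np.mean(delta[win] > 0.2):.3f}")
```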

Practically speaking, for many applications frequentist methods will be relatively OK when strong frequentist evidence for an effect is found and when one does not care about clinical significance. But the sample size needed by the frequentist method will be too large, or with sequential testing the frequentist approach will take too long to achieve a result. If one is interested in abandoning ineffective treatments early, frequentist methods will take way too long.

4 Likes

The process seems to have begun with a June 2025 workshop, mentioned in the Concept Paper. I note several familiar names on the agenda.

2 Likes