Is anyone aware of simulation studies that model the power required under various assumptions about an interaction effect in an RCT setting? People quote Gelman’s 16x sample size based on 2x2 factors, but I was curious about the probability of detecting true and false interactions in an RCT setting.

# Power for interaction effects

Gelman’s rule of thumb applies. But the “true and false” part is crying out for a Bayesian approach.
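To put rough numbers on the rule of thumb, here is a minimal sketch using a normal approximation. It assumes a 2x2 design with equal cell sizes and a common outcome SD `sigma`, so the main-effect SE is `2*sigma/sqrt(N)` and the interaction SE is `4*sigma/sqrt(N)`. The function names and the values of `N`, `MAIN_EFFECT`, and `INTERACTION` are hypothetical, chosen only so that the main effect has about 80% power.

```python
from statistics import NormalDist


def power_two_sided(effect: float, se: float, alpha: float = 0.05) -> float:
    """Normal-approximation power of a two-sided z-test for `effect` given its SE."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / se
    # Probability that the z statistic lands in either rejection region.
    return z.cdf(shift - crit) + z.cdf(-shift - crit)


def se_main(n_total: int, sigma: float = 1.0) -> float:
    """SE of a main effect (difference of two half-sample means), equal allocation."""
    return 2 * sigma / n_total ** 0.5


def se_interaction(n_total: int, sigma: float = 1.0) -> float:
    """SE of the 2x2 interaction (difference of differences), equal cell sizes."""
    return 4 * sigma / n_total ** 0.5


# Hypothetical inputs: the interaction is assumed to be half the main effect.
N = 126
MAIN_EFFECT = 0.5
INTERACTION = 0.25

power_main = power_two_sided(MAIN_EFFECT, se_main(N))
power_int_same_n = power_two_sided(INTERACTION, se_interaction(N))
power_int_16n = power_two_sided(INTERACTION, se_interaction(16 * N))
# With a truly null interaction the rejection rate is just alpha, whatever N is.
false_positive_rate = power_two_sided(0.0, se_interaction(N))

print(f"main effect, N:        {power_main:.3f}")
print(f"interaction, N:        {power_int_same_n:.3f}")
print(f"interaction, 16N:      {power_int_16n:.3f}")
print(f"null interaction rate: {false_positive_rate:.3f}")
```

Under these assumptions the interaction only reaches the main effect's power once the sample is 16 times larger, while the chance of flagging a "false" interaction stays at the nominal alpha regardless of sample size.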

I guess simulations aren’t needed, just equations. I haven’t seen Gelman’s derivation, but I guess it’s straightforward.
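For what it’s worth, here is a sketch of the usual equal-allocation argument, which may not match Gelman’s own presentation line for line. The symbols are assumptions: $N$ is the total sample size, $\sigma$ the common outcome SD, and the four cells of the 2x2 design each hold $N/4$ subjects.

$$
\operatorname{Var}(\hat\Delta_{\text{main}}) = \frac{\sigma^2}{N/2} + \frac{\sigma^2}{N/2} = \frac{4\sigma^2}{N},
\qquad
\operatorname{Var}(\hat\Delta_{\text{int}}) = 4 \cdot \frac{\sigma^2}{N/4} = \frac{16\sigma^2}{N}
$$

The interaction is a difference of differences over four cell means, so its standard error is twice that of the main effect, and matching the main effect's precision takes $4N$. If the true interaction is further assumed to be half the size of the main effect, the standard error has to be halved again to keep the same power, which costs another factor of 4:

$$
N_{\text{int}} = 4 \times 4 \times N_{\text{main}} = 16\, N_{\text{main}}
$$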

What if three terms interact? Would the sample have to be multiplied by 32? As for the suggestion to use a Bayesian model, the question is whether the low sample size to assess the interaction may cause some spurious change in the posterior probability of effect of the interaction.

Estimating a 3rd order interaction in the easiest case (2x2x2, all cell sizes equal) requires 16x the sample size needed to estimate a main effect with the same precision. For hypothesis testing I’m not sure.
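The factor of 16 can be checked numerically. The sketch below assumes two-level factors, equal cell sizes, a common SD of 1, and interactions defined as differences of differences; the function names and simulation settings are hypothetical. With `k` factors the highest-order interaction sums `2**k` cell means with signs of plus or minus one, which gives a variance `4**(k-1)` times that of a main effect.

```python
import itertools
import random
import statistics


def sample_size_multiplier(order: int) -> int:
    """Sample-size factor needed to estimate an interaction among `order`
    two-level factors with the same SE as a main effect (equal cell sizes)."""
    if order < 1:
        raise ValueError("order must be at least 1")
    return 4 ** (order - 1)


def simulate_se_ratio(n_factors: int = 3, n_per_cell: int = 20,
                      n_sims: int = 2000, seed: int = 1) -> float:
    """Monte Carlo ratio SE(highest-order interaction) / SE(main effect)."""
    rng = random.Random(seed)
    cells = list(itertools.product([0, 1], repeat=n_factors))
    half = len(cells) / 2
    main_estimates, interaction_estimates = [], []
    for _ in range(n_sims):
        # Pure-noise outcomes: every true effect is zero, sigma = 1.
        means = {
            c: sum(rng.gauss(0.0, 1.0) for _ in range(n_per_cell)) / n_per_cell
            for c in cells
        }
        # Main effect of the first factor: mean at level 1 minus mean at level 0.
        main = sum((1 if c[0] == 1 else -1) * m for c, m in means.items()) / half
        # Highest-order interaction: nested differences of cell means.
        inter = sum((-1) ** (n_factors - sum(c)) * m for c, m in means.items())
        main_estimates.append(main)
        interaction_estimates.append(inter)
    return statistics.stdev(interaction_estimates) / statistics.stdev(main_estimates)


if __name__ == "__main__":
    for k in (2, 3):
        ratio = simulate_se_ratio(n_factors=k)
        print(f"{k} factors: analytic N multiplier {sample_size_multiplier(k)}, "
              f"simulated SE ratio squared {ratio ** 2:.1f}")
```

The squared SE ratio is the sample-size multiplier for equal precision. This covers estimation only; any assumption that higher-order interactions are smaller than main effects would push the requirement up further.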

Under these conditions, is Bayesian evidence reliable? Roughly speaking, when can I say that data are sufficient to support posterior probabilities of effect arising from third-degree interactions?

The answer depends entirely on how much information comes from the past and how much information comes from the data. For practical purposes the best Bayes has to offer here is to partially disbelieve third-order interaction effects and to somewhat disbelieve second-order interaction effects so that effects of interest have the right variance-bias tradeoff.
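As a rough illustration of that partial disbelief, here is a minimal sketch using a zero-centered normal prior and a normal likelihood (the conjugate case). Everything in it is an assumption made for illustration: the function names, the estimate and its SE, and the prior SDs, with the tighter prior standing in for stronger skepticism about the third-order term.

```python
from statistics import NormalDist


def shrink(estimate: float, se: float, prior_sd: float) -> tuple[float, float]:
    """Posterior mean and SD for an effect with a Normal(0, prior_sd^2) prior
    and a Normal(estimate, se^2) likelihood."""
    weight = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)  # share of weight on the data
    post_mean = weight * estimate
    post_sd = (weight * se ** 2) ** 0.5
    return post_mean, post_sd


def prob_positive(post_mean: float, post_sd: float) -> float:
    """Posterior probability that the effect is greater than zero."""
    return 1 - NormalDist(post_mean, post_sd).cdf(0.0)


# Hypothetical noisy interaction estimate: 1.0 with standard error 0.5.
ESTIMATE, SE = 1.0, 0.5

# Somewhat skeptical prior (second-order) vs. more skeptical prior (third-order).
second_order = shrink(ESTIMATE, SE, prior_sd=0.5)
third_order = shrink(ESTIMATE, SE, prior_sd=0.25)

for label, (mean, sd) in [("second-order", second_order), ("third-order", third_order)]:
    print(f"{label}: posterior mean {mean:.2f}, SD {sd:.2f}, "
          f"P(effect > 0) = {prob_positive(mean, sd):.2f}")
```

The tighter the prior, the more a noisy estimate is pulled toward zero, so a small sample moves the posterior probability less than it would under a flat prior. How tight the prior should be is exactly the "information from the past" question and has to be argued on subject-matter grounds.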