Random effects analysis in clinical trials

alvespedros · November 3, 2025, 6:58pm

Hello,

I’m working on a SAP for a clinical trial with three levels of clustering: physicians, hospitals, and cities. My outcome is binary, and I’m particularly interested in the conditional interpretation of the odds ratio in a GLMM framework. For the analysis, I plan to use a maximal random-effects structure, as suggested by Barr (“Random effects structure for confirmatory hypothesis testing: Keep it maximal”), which, in this case, consists of including the three clustering levels as random intercepts in the model.

However, I understand that this may not be feasible due to convergence issues with GLMMs, especially in the binary case, so I propose two fallback strategies should this occur: (1) fit a GLMM with only the city level as a random effect, or (2) if convergence issues persist, which is unlikely, use a GEE (marginal interpretation) with clustering at the city level.

My questions are:

(1) If the maximal model does not converge, is it appropriate to remove the physician and hospital levels from the main analysis and consider them only in subgroup analyses, or should I instead include them as fixed effects for covariate adjustment in the main model?

(2) A colleague suggested conducting a poolability analysis to determine in advance whether the physician and hospital levels should be included in the model. However, I have not seen this approach in the papers and SAPs I have reviewed. Is this common? I would also appreciate references on this topic, as this is the first time I’ve heard of it and I have not found much material online.

(3) This is a more technical and personal question that won’t directly affect my SAP. I will use the lme4 and glmmTMB R packages for the analysis. If I fit a maximal model, I won’t be able to calculate the marginal coefficients from the regression, since neither of these packages supports it as far as I know. Is there any other R package that permits this kind of calculation with more than one random effect?
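
To make clear what I mean by marginal coefficients: with random intercepts only, independent normal intercepts for city, hospital, and physician add up to a single normal term whose variance is the sum of the three variances, so the population-averaged probability is a one-dimensional integral. The sketch below is only an illustration of that idea, written in Python with the standard library to keep it self-contained. It is not the method of any package, and the function names, intercept, odds ratio, and variances are all made-up values.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def marginal_prob(eta, variances, n_grid=2001, width=8.0):
    """Average expit(eta + b) over b ~ Normal(0, sum(variances)).

    Uses a trapezoid rule on the standardised scale z in [-width, width].
    Assumes independent, normally distributed random intercepts.
    """
    sd = math.sqrt(sum(variances))
    if sd == 0.0:
        return expit(eta)
    step = 2.0 * width / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        z = -width + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * expit(eta + sd * z) * density
    return total * step


def odds(p):
    return p / (1.0 - p)


def marginal_odds_ratio(intercept, log_or, variances):
    """Population-averaged odds ratio implied by a conditional log odds ratio."""
    p_control = marginal_prob(intercept, variances)
    p_treated = marginal_prob(intercept + log_or, variances)
    return odds(p_treated) / odds(p_control)


# Hypothetical values, for illustration only.
conditional_or = 2.0
variances = [0.3, 0.5, 0.8]  # city, hospital, physician intercept variances
marginal_or = marginal_odds_ratio(-1.0, math.log(conditional_or), variances)
print(f"conditional OR = {conditional_or:.2f}, marginal OR = {marginal_or:.2f}")
```

The marginal odds ratio comes out closer to 1 than the conditional one, which is the usual attenuation. The shortcut of summing variances only works for random intercepts; random slopes would need a multi-dimensional integral.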

f2harrell · November 3, 2025, 9:44pm

Welcome to datamethods Pedro!

Really excellent questions. A Bayesian hierarchical model is preferred here as it makes far fewer approximations than frequentist models. If you run R's brms or rstanarm, then when you look at diagnostics, the effective sample size of the posterior draws for the 3 random-effects variances will tell you about identifiability.

I was expecting you to say that when reducing this to one level of clustering, it would be the smallest level (physician).

When there are very few physicians that are found in more than one hospital or city, is it even possible to estimate random effects for hospital and city? I’ve been wondering about that for a while.

alvespedros · November 4, 2025, 11:36am

Thank you for the welcome and your response.

All my secondary analyses are described within a frequentist framework, so I would prefer not to change it only for the main outcome, as this could generate too many questions. Also, it is not feasible for me to rewrite the whole SAP to match this change. However, I will try to run some simulations to see whether this approach would affect my current sample size and keep it as a Plan C.

I agree that considering only the physician level is a more intuitive approach; however, my sample size calculation actually considered only the city level. This was due to the lack of information I had about how many physicians and centers would be enrolled in the trial, and there was a reference stating that using the highest level of clustering provides a conservative estimate of sample size.
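
To make the sample-size reasoning concrete: for a simple parallel cluster-randomised design, the usual inflation factor is the design effect 1 + (m - 1) * ICC, where m is the average cluster size. The sketch below only illustrates that textbook formula. A stepped wedge needs its own variance calculation, and the ICC, cluster sizes, and unadjusted sample size here are hypothetical numbers, not values from my trial.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Classical design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflated_sample_size(n_individual, mean_cluster_size, icc):
    """Individually randomised sample size multiplied by the design effect."""
    return math.ceil(n_individual * design_effect(mean_cluster_size, icc))


# Hypothetical inputs, for illustration only.
n_unadjusted = 400   # sample size ignoring clustering
icc = 0.02           # assumed equal at both levels, which is itself an assumption

city_n = inflated_sample_size(n_unadjusted, mean_cluster_size=50, icc=icc)
physician_n = inflated_sample_size(n_unadjusted, mean_cluster_size=10, icc=icc)
print(f"city-level clustering: {city_n}, physician-level clustering: {physician_n}")
```

Holding the ICC fixed, the larger city-sized clusters give the larger design effect, which is the sense in which the highest level is conservative. In practice the ICC usually differs between levels, so this comparison is only a rough guide.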

I don’t have an answer for that, so this is just a rant, but I think it will depend especially on whether physicians are nested within or crossed with the hospitals, and on whether you are trying to estimate only random intercepts or random intercepts and slopes. In my case, for example, I don’t expect the same physician to work at two different hospitals (nested). Some authors state that a minimum of 5-6 levels of a random effect is needed to get better estimates than a classical model (Data Analysis Using Regression and Multilevel/Hierarchical Models, Gelman and Hill (2007)), and for the design I’m working on, a stepped wedge, the authors ask for a minimum of 9 clusters (The batched stepped wedge design: A design robust to delays in cluster recruitment, Kasza (2022)).
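
The nested-versus-crossed point can be checked directly in the data: physicians are nested in hospitals if every physician ID appears under exactly one hospital. A minimal sketch, with hypothetical IDs and made-up helper names, assuming physician IDs are unique across the whole trial (not renumbered within each hospital):

```python
from collections import defaultdict


def parents_by_unit(pairs):
    """Map each lower-level ID to the set of upper-level IDs it appears under."""
    parents = defaultdict(set)
    for lower, upper in pairs:
        parents[lower].add(upper)
    return parents


def is_nested(pairs):
    """True if every lower-level unit belongs to exactly one upper-level unit."""
    return all(len(uppers) == 1 for uppers in parents_by_unit(pairs).values())


def crossed_units(pairs):
    """Lower-level units found under more than one upper-level unit."""
    return {
        lower: sorted(uppers)
        for lower, uppers in parents_by_unit(pairs).items()
        if len(uppers) > 1
    }


# Hypothetical (physician, hospital) pairs, one per patient record.
records = [
    ("dr_a", "hosp_1"),
    ("dr_b", "hosp_1"),
    ("dr_c", "hosp_2"),
    ("dr_a", "hosp_1"),
]
print("nested:", is_nested(records))
print("crossed physicians:", crossed_units(records))
```

The same check applies one level up, with (hospital, city) pairs.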