Stratification, block-randomization, etc



Say I’m planning a randomized trial (time-to-event outcome) with 4 sites and want to balance randomization for each site. Am I then committed to stratifying by site in analysis? Or including site as a fixed effect in regression models?

My understanding is:
stratified randomization -> stratified analysis
block-randomization for balance -> consider factors as fixed effects

Balance is obviously good, and stratification may be necessary if baseline hazards have a very different shape. But as you stratify by more and more factors, moving toward matched-pairs randomization, the large number of strata seems to decimate study power.
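To put rough numbers on how quickly the strata multiply, here is a minimal sketch. The total of 200 patients and the use of binary stratification factors are my own illustrative assumptions, not part of any actual design, and the per-stratum figure is just an average that assumes patients spread evenly.

```python
def strata_summary(n_patients, n_sites, n_binary_factors):
    """Return (number of strata, average patients per stratum).

    Assumes every extra factor is binary and fully crossed with site,
    so each added factor doubles the number of strata.
    """
    n_strata = n_sites * 2 ** n_binary_factors
    return n_strata, n_patients / n_strata


if __name__ == "__main__":
    # Hypothetical trial: 200 patients, 4 sites, 0 to 5 extra binary factors.
    for k in range(6):
        n_strata, per_stratum = strata_summary(200, 4, k)
        print(f"{k} extra factors: {n_strata:4d} strata, "
              f"{per_stratum:6.2f} patients per stratum on average")
```

Under these assumptions the average stratum is already in single digits with three extra factors, and strata that small can easily contribute little or nothing to a stratified analysis.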

I’m familiar with the literature pointing out drawbacks of matched case-control studies (Pepe et al, Clin Chemistry 2012). Are there any manuscripts or texts that lay out these issues for trial design and analysis?


Yes, according to ICH E9 section 5.7 [ICHE9]

Stephen Senn has written about randomisation and balance, e.g. when he was responding to that paper in Social Science & Medicine: "Perfect balance is not, contrary to what is often claimed a necessary requirement for causal inference, nor is it something that randomisation attempts to provide." [blog post]


To add to what Paul nicely stated, Senn has stressed that the analysis model should dictate the execution of the randomization and not vice versa. I also think that blocked randomization is overused. The only real reasons I can think of for blocking within center are

  • it’s hard to maintain blinding if you don’t
  • you don’t want to induce any kind of calendar time effect on outcomes if randomizations at one center end up being AAABBB

I’d love to hear more thoughts about that.
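
On the AAABBB point, here is a minimal sketch of permuted-block randomization within a single center, using only the Python standard library. The block size of 4, the two arm labels, and the helper names are illustrative assumptions on my part, not a recommendation or anyone's actual allocation system.

```python
import random


def permuted_block_schedule(n_patients, block_size=4, arms=("A", "B"), seed=0):
    """Allocation sequence for one center using permuted blocks.

    Each block holds every arm equally often, in random order.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_patients:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        schedule.extend(block)
    return schedule[:n_patients]


def longest_run(schedule):
    """Length of the longest stretch of consecutive identical allocations."""
    longest = current = 0
    previous = None
    for arm in schedule:
        current = current + 1 if arm == previous else 1
        longest = max(longest, current)
        previous = arm
    return longest


def max_imbalance(schedule, arm="A"):
    """Largest absolute difference between `arm` and all others at any point."""
    diff = worst = 0
    for allocated in schedule:
        diff += 1 if allocated == arm else -1
        worst = max(worst, abs(diff))
    return worst


if __name__ == "__main__":
    blocked = permuted_block_schedule(24, block_size=4, seed=1)
    rng = random.Random(1)
    simple = [rng.choice("AB") for _ in range(24)]  # unrestricted coin flips
    for name, sched in (("blocked", blocked), ("simple", simple)):
        print(name, "".join(sched),
              "longest run:", longest_run(sched),
              "max imbalance:", max_imbalance(sched))
```

With two arms and blocks of 4, a run can never exceed 4 in a row and the running imbalance never exceeds 2, which is what limits the calendar-time problem. Unrestricted randomization gives no such guarantee within a center.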