I recently came across the documentation for Gorilla, a piece of experiment-building software. Among its randomisation tools is an option called “Attrition Sensitive Nodes”. The aim of this option is to ensure that, after removing participants with missing outcome data, there are an equal number of participants in each arm.
It does so using block randomisation, except that people with missing outcome data are removed from their block. The next people to enter the study are then assigned to the vacated slots in that block.
Here is how they describe it:
- Imagine we have a simple between-subject experiment with two groups. The Randomiser leads to Group A and Group B via two checkpoint nodes. We want 12 participants overall and the randomiser is set to 2:2 Balanced.
- We launch the experiment.
- Participants come into our experiment and get assigned as follows: AABB, ABAB, BABA
- Our experiment is now full and participants can no longer join. Great!
- Participants start completing our experiment. Participants AABB, BB, BABA all complete. 2 remain live both on the A branch. It might be that these participants have got bored and wandered off.
- After the appropriate time, both these participants are automatically rejected.
- Our experiment is no longer full.
- New participants join.
- Gorilla checks to see if any branch assignments have been returned due to attrition. They have (AA).
- These assignments are handed out to new participants until they are consumed. The Allocator Node does exactly that: it will give us complete information on how many participants were recruited, how many completed and how much attrition there was in each condition, before reassigning new participants to a branch. On the face of it, it’s simple. But we also need to factor in Version Control across experiments and how to manage this.
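To make sure I have understood the mechanism, here is a minimal Python sketch of how I read the description above. It is not Gorilla's actual implementation: the class name `AttritionSensitiveRandomiser` and its methods are hypothetical, and I am assuming that returned assignments are always handed out before any fresh block is opened.

```python
import random
from collections import deque


def make_block(rng, arms=("A", "B"), per_arm=2):
    """Return one shuffled block, e.g. ['A', 'B', 'B', 'A'] for a 2:2 design."""
    block = [arm for arm in arms for _ in range(per_arm)]
    rng.shuffle(block)
    return block


class AttritionSensitiveRandomiser:
    """Hypothetical sketch: block randomisation with returned slots reused first."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.pending = deque()   # unused slots in the current block
        self.returned = deque()  # slots handed back by rejected participants

    def assign(self):
        # Assumption: returned slots take priority over fresh randomisation.
        if self.returned:
            return self.returned.popleft()
        if not self.pending:
            self.pending.extend(make_block(self.rng))
        return self.pending.popleft()

    def reject(self, arm):
        # A participant with missing outcome data gives their slot back.
        self.returned.append(arm)


randomiser = AttritionSensitiveRandomiser(seed=1)
first_wave = [randomiser.assign() for _ in range(12)]  # three blocks of four

# Two participants on the A branch time out and are rejected.
randomiser.reject("A")
randomiser.reject("A")

# The next two arrivals are sent to A with certainty; no randomness is involved.
replacements = [randomiser.assign(), randomiser.assign()]
```

Under these assumptions, `replacements` is always `["A", "A"]` whatever the seed, which is the behaviour I am worried about.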
This approach seems problematic to me. In the example given above, participants who fill the vacated slots in a block are essentially not being randomised: they are assigned with 100% probability to one group or the other. Moreover, the purpose of this approach is to account for differential attrition. But if there is differential attrition, this method guarantees that participants in the high-attrition group will, on average, be recruited later than those in the low-attrition group, so any change in the participant pool over time would be confounded with study arm.
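To illustrate the timing concern, here is a rough simulation sketch. It makes several simplifying assumptions of my own: recruitment happens in waves (the study fills, dropouts are identified, and their slots are refilled in the next wave), dropout depends only on arm, and the dropout probabilities 0.4 and 0.1 are made-up values. The function name `simulate_waves` is hypothetical.

```python
import random


def simulate_waves(n_per_arm=500, p_drop_a=0.4, p_drop_b=0.1, seed=0):
    """Return the mean recruitment wave of completers in each arm."""
    rng = random.Random(seed)
    p_drop = {"A": p_drop_a, "B": p_drop_b}

    # Wave 1 is balanced; within-wave order does not matter for this summary.
    open_slots = ["A"] * n_per_arm + ["B"] * n_per_arm
    completer_waves = {"A": [], "B": []}

    wave = 0
    while open_slots:
        wave += 1
        returned = []
        for arm in open_slots:
            if rng.random() < p_drop[arm]:
                returned.append(arm)  # slot is handed back and refilled next wave
            else:
                completer_waves[arm].append(wave)
        open_slots = returned

    return {arm: sum(w) / len(w) for arm, w in completer_waves.items()}


mean_wave = simulate_waves()
print(mean_wave)
```

Both arms end with exactly `n_per_arm` completers, but under this model the expected recruitment wave of a completer is 1 / (1 - dropout probability), so completers in the high-attrition arm come from later waves on average.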
I have three questions:
- What do people think about this approach?
- Is there an analysis that could salvage data from a trial that used this approach?
- Are there better ways to ensure group sizes are relatively balanced after removing participants with missing outcome data (or is this goal fundamentally misguided)?