I’m working with a prospective collaborator to design a cluster randomized trial in which the primary endpoint will be “achievement of goals” for trainees (cluster = institution). They would like the primary outcome to be “number of goals achieved” (edit for clarity: analyzed at the level of each trainee, i.e. if the first trainee sets 8 goals and achieves 5, their response = 5; if the second trainee sets 10 goals and achieves 3, their response = 3).
One problem is that, under the program structure, trainees do not all have the same number of goals (some have as few as 5-7, others have 20 or more). I have been advocating that we instead use each goal as the unit of analysis (outcome = yes/no for whether the goal is achieved), with a hierarchical model that accounts for clustering of goals within trainees and of trainees within institutions.
The other problem is that I’m not sure how to write the power and sample size calculation for this, since I’ll have to explain that there are two levels of clustering and state the assumptions we make about how strongly goal achievement is correlated within a trainee and within an institution. I would love some feedback or any useful resources.
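To make the two-level clustering concrete, here is a rough sketch of the kind of calculation in question, not a settled method. It assumes equal cluster sizes (every trainee has the same number of goals and every institution the same number of trainees), a simple two-proportion comparison with a normal approximation, and two hypothetical correlation parameters: `rho_trainee` (correlation between two goals of the same trainee) and `rho_institution` (correlation between goals of different trainees at the same institution). Under those assumptions the variance of an institution's mean is inflated by 1 + (m - 1)·rho_trainee + m·(k - 1)·rho_institution, where m is goals per trainee and k is trainees per institution. All function names and the numeric inputs are illustrative.

```python
import math
from statistics import NormalDist


def three_level_design_effect(goals_per_trainee, trainees_per_institution,
                              rho_trainee, rho_institution):
    """Variance inflation for goals nested in trainees nested in institutions.

    Assumes equal cluster sizes at both levels (a simplification here,
    since the real number of goals per trainee varies).
    """
    m, k = goals_per_trainee, trainees_per_institution
    return 1 + (m - 1) * rho_trainee + m * (k - 1) * rho_institution


def institutions_per_arm(p_control, p_treatment, goals_per_trainee,
                         trainees_per_institution, rho_trainee,
                         rho_institution, alpha=0.05, power=0.80):
    """Approximate institutions needed per arm for a two-sided test."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Goals per arm if every goal were independent (unpooled variance).
    variance_sum = (p_control * (1 - p_control)
                    + p_treatment * (1 - p_treatment))
    n_independent = z ** 2 * variance_sum / (p_treatment - p_control) ** 2
    # Inflate for clustering, then convert goals to whole institutions.
    deff = three_level_design_effect(goals_per_trainee,
                                     trainees_per_institution,
                                     rho_trainee, rho_institution)
    goals_per_institution = goals_per_trainee * trainees_per_institution
    return math.ceil(n_independent * deff / goals_per_institution)


# Illustrative inputs only: 10 goals per trainee, 20 trainees per institution.
deff = three_level_design_effect(10, 20, rho_trainee=0.2, rho_institution=0.05)
n_inst = institutions_per_arm(0.5, 0.6, 10, 20,
                              rho_trainee=0.2, rho_institution=0.05)
print(f"design effect = {deff:.2f}, institutions per arm = {n_inst}")
```

Because the number of goals per trainee actually ranges from about 5 to 20 or more, plugging in the mean is optimistic: unequal cluster sizes generally reduce efficiency, so the result is best read as a lower bound and checked by simulation from the planned hierarchical model.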