I’m working with a prospective collaborator to design a cluster randomized trial where the primary endpoint is going to be “achievement of goals” for trainees (cluster=institutions). They would like to make the primary outcome “number of goals achieved” (edit for clarity: analyzed at the level of each trainee, i.e. if the first trainee sets 8 goals and achieves 5, their response=5; if the second trainee sets 10 goals and achieves 3, their response=3).
One problem is that in the program structure, the trainees do not have the same number of goals (some have as few as 5-7, others have 20 or more). I have been advocating that instead, we should use each goal as the unit of analysis (outcome=yes/no if the goal is achieved) with a hierarchical model accounting for clustering by institution and then trainees nested within institutions.
The other problem is that I’m not sure how to easily write the power and sample size calculation for this, since I’ll have to explain that there are two levels of clustering and spell out the assumptions we will have to make about how strongly goal achievement is correlated within a trainee and then within an institution. Would love some feedback or any useful resources.
The last paragraph is the really hard part, so I’ll try to ignore it.
You could make the goal the unit of analysis but I’d lean towards another approach. Analyze the number of goals achieved using the proportional odds model, and have as adjustment covariates a regression spline in the number of goals set. Or use a 2nd or 3rd order polynomial. Flexible modeling of the number of goals set should allow for adjustment for that, and allow one to answer a question such as “did two individuals who began with the same number of goals tend to achieve the same number of goals?”
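To make the suggestion concrete, here is one way the model could be written down. This is only a sketch: the symbols are my own notation (Y_ij = goals achieved by trainee i in institution j, g_ij = goals set, T_j = treatment indicator), and the cubic polynomial is just one of the flexible forms mentioned above. It does not yet account for clustering by institution, which would need something extra such as a random intercept per institution.

```latex
\operatorname{logit} \Pr\left(Y_{ij} \ge k \mid T_j, g_{ij}\right)
  = \alpha_k + \beta\, T_j + \gamma_1 g_{ij} + \gamma_2 g_{ij}^2 + \gamma_3 g_{ij}^3,
  \qquad k = 1, \dots, K
```

Under this form, exp(β) is the common odds ratio for achieving at least k goals, for any k, between two trainees who set the same number of goals.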
I did give this some consideration, and from the responses I have been getting from the collaborator, I think this sounds more palatable to them. They seem to like “number of goals achieved” as the outcome, and I did float the idea that we could do this if adjusting for the number of goals set. As you alluded…
Makes a lot of sense. I’ll bounce that back to them and see if it strikes their fancy, but still would enjoy any further discussion on the subject here for the sake of learning.
So how would you calculate power for this? Simulation I presume?
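For what it's worth, a simulation for this kind of design might look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names, the normal random effects for institution and trainee on the logit scale, the 5 to 20 range for goals set, and above all the analysis, which is a simple permutation test on institution-level mean proportions. That test is a stand-in to keep the sketch self-contained; it is not the proportional odds model discussed above, which you would fit in each simulated dataset instead.

```python
import math
import random
from statistics import fmean


def simulate_cluster_means(rng, clusters_per_arm, trainees_per_cluster,
                           log_odds_ratio, sd_institution, sd_trainee,
                           baseline_logit=0.0, goals_range=(5, 20)):
    """Return (control, treated): one mean proportion of goals achieved per institution."""
    arms = []
    for treated in (0, 1):
        cluster_means = []
        for _ in range(clusters_per_arm):
            u = rng.gauss(0.0, sd_institution)  # institution random effect (assumed normal)
            proportions = []
            for _ in range(trainees_per_cluster):
                v = rng.gauss(0.0, sd_trainee)  # trainee random effect (assumed normal)
                logit = baseline_logit + treated * log_odds_ratio + u + v
                p = 1.0 / (1.0 + math.exp(-logit))
                goals_set = rng.randint(*goals_range)  # unequal numbers of goals
                achieved = sum(rng.random() < p for _ in range(goals_set))
                proportions.append(achieved / goals_set)
            cluster_means.append(fmean(proportions))
        arms.append(cluster_means)
    return arms[0], arms[1]


def permutation_p_value(rng, control, treated, n_perm=200):
    """Two-sided permutation test on the difference in institution-level means."""
    observed = abs(fmean(treated) - fmean(control))
    pooled = list(control) + list(treated)
    k = len(control)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(fmean(pooled[k:]) - fmean(pooled[:k])) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


def estimate_power(n_sims=200, alpha=0.05, seed=1, **design):
    """Share of simulated trials in which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        control, treated = simulate_cluster_means(rng, **design)
        if permutation_p_value(rng, control, treated) <= alpha:
            rejections += 1
    return rejections / n_sims
```

Calling `estimate_power(clusters_per_arm=8, trainees_per_cluster=10, log_odds_ratio=0.5, sd_institution=0.3, sd_trainee=0.5)` gives a power estimate for one scenario; running it with `log_odds_ratio=0.0` is a quick check that the type I error is near the nominal level. The variance components are the inputs that need the most justification.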
Were there no clustering or covariate adjustment, the R posamsize functions would do the trick.
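As a rough illustration of the unclustered calculation, here is a sketch of Whitehead's (1993) approximation for the proportional odds model, which as far as I know is the formula behind posamsize in Hmisc. The Python function names are made up for this example. The second function applies the usual design effect 1 + (m - 1) × ICC as a crude inflation for clustering by institution; that step assumes equal cluster sizes and is my addition, not something posamsize does, and neither step accounts for covariate adjustment.

```python
import math
from statistics import NormalDist


def whitehead_total_n(p_avg, odds_ratio, alpha=0.05, power=0.80, fraction=0.5):
    """Approximate total N for a two-arm proportional odds comparison.

    p_avg: outcome category probabilities averaged over the two arms (sum to 1).
    fraction: share of subjects allocated to the first arm.
    """
    if abs(sum(p_avg) - 1.0) > 1e-6:
        raise ValueError("p_avg must sum to 1")
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    a = fraction / (1 - fraction)  # allocation ratio between arms
    spread = 1 - sum(p ** 3 for p in p_avg)  # penalty for mass piled into few categories
    return 3 * (a + 1) ** 2 * z ** 2 / (a * math.log(odds_ratio) ** 2 * spread)


def inflate_for_clustering(n_independent, trainees_per_institution, icc):
    """Multiply by the design effect 1 + (m - 1) * ICC (equal cluster sizes assumed)."""
    return n_independent * (1 + (trainees_per_institution - 1) * icc)
```

For example, `inflate_for_clustering(whitehead_total_n([0.25] * 4, 2.0), 11, 0.05)` takes four equally likely outcome categories, an odds ratio of 2, 11 trainees per institution and an ICC of 0.05. All of those numbers are placeholders, not values from this trial.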
A colleague on the Palliative Care Research Cooperative pointed me to a nice shiny app, including references, for calculating power for clustered trials. I haven’t used it or looked very closely at it, but you may find it useful.