I am helping with the planning of a clinical trial looking at an intervention in a particular group of newborns.

The primary interest is in the effect on cognitive delay, which will be assessed using a standardised measure at 2 years of age. Mortality is expected to be rare but not negligible (1-5%), which will lead to some missing data on the cognitive endpoint.

I’ve looked into some options in this scenario and concluded that a composite endpoint seems to be a good fit. In particular, a worst-rank score analysis (where deaths are assigned a score lower than the worst score on the functional outcome, and the analysis is rank-based) appears to be appropriate. I found @f2harrell’s paper on power calculations for this type of outcome.
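To make the tied-deaths version of this concrete, here is a minimal Python sketch of what I have in mind. The function names and the toy data are my own and purely illustrative. It assumes higher scores are better, complete follow-up, and a normal approximation with tie correction for the rank-sum statistic. The floor score is taken from the pooled data so that deaths in both arms share the same worst rank.

```python
import math
from collections import Counter

def midranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def worst_rank_scores(scores, died):
    """Give every death a common score just below the worst observed score.

    `scores` must be the pooled scores from BOTH arms, so that the floor
    is the same for all deaths. Scores for deaths are ignored (may be None).
    """
    floor = min(s for s, d in zip(scores, died) if not d) - 1
    return [floor if d else s for s, d in zip(scores, died)]

def rank_sum_test(x, y):
    """Wilcoxon rank-sum z and two-sided p (normal approximation, tie-corrected).

    z > 0 means group x tends to have higher (better) scores than group y.
    """
    n, m = len(x), len(y)
    N = n + m
    pooled = list(x) + list(y)
    w = sum(midranks(pooled)[:n])          # rank sum of group x
    mean = n * (N + 1) / 2                 # its expectation under H0
    ties = sum(t**3 - t for t in Counter(pooled).values())
    var = n * m / 12 * ((N + 1) - ties / (N * (N - 1)))
    z = (w - mean) / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical toy data: None marks a child who died before the 2-year test.
treated_scores = [97, 105, 90, 110, None, 99]
treated_died = [False, False, False, False, True, False]
control_scores = [92, 85, None, 101, 78, None]
control_died = [False, False, True, False, False, True]

ranked = worst_rank_scores(treated_scores + control_scores,
                           treated_died + control_died)
n_treated = len(treated_scores)
z, p = rank_sum_test(ranked[:n_treated], ranked[n_treated:])
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With only a handful of children per arm the normal approximation is obviously crude; the toy data are there to show the mechanics, not to suggest a realistic sample size.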

But there are some complications that have made me wonder about possible extensions of this general approach.

- A binary cognitive outcome. The rank-based approach works with a single functional outcome, and you can either treat the deaths as tied (all getting the same score) or untied (using their survival time). In this case, the actual cognitive measure is multivariate, and the current plan is to take the outcome as a (pre-defined) low score on *any* of the domains. A simple solution would be to use the tied-deaths approach, which would leave us with a three-level ordinal outcome: 0 (death < 2 y) / 1 (alive, low score) / 2 (alive, high score). Would there be a more efficient approach, if all of the components of the cognitive outcome are important?
- Loss to follow-up. There is definitely a possibility of missing data due to loss to follow-up, either (a) not knowing survival status at 2 years, or (b) knowing that the child is alive, but being unable to perform the cognitive testing. If we were using a univariate cognitive outcome, and the untied deaths approach, it seems like both of these could be handled using a survival analysis method, treating them as censored data (although we would need to be careful about informative censoring, especially for (b)), so would a log-rank test be appropriate? Weighted towards later ‘times’ because primary interest is in the scores?
- Adjusting for covariates. We plan to stratify randomisation by site, and so the analysis should stratify by site as well. A stratified rank-sum or log-rank test could be used, but if we have other continuous covariates, would it be reasonable to use an ordinal regression model (if using the 0/1/2 outcome) or Cox model (if accounting for loss to follow-up)? In each case, loosening the proportional odds / hazards assumption?
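
For the simple version of the first and third points, this is roughly what I picture: code each child as 0/1/2 and compare arms with a site-stratified rank-sum (van Elteren-type) test, weighting each site by 1/(N_h + 1). Again a sketch only. The function names, domain cut-offs and toy data are hypothetical, it ignores loss to follow-up, and it assumes every site has children in both arms and at least some variation in outcome.

```python
import math
from collections import Counter

def ordinal_outcome(died, domain_scores, cutoffs):
    """0 = died before 2 y; 1 = alive, low score on any domain; 2 = alive, no low score."""
    if died:
        return 0
    low_any = any(s < c for s, c in zip(domain_scores, cutoffs))
    return 1 if low_any else 2

def midranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def van_elteren(strata):
    """Stratified rank-sum test.

    `strata` is a list of (treated_outcomes, control_outcomes) pairs, one per site.
    Returns z (positive if treated outcomes tend to be better) and a two-sided p.
    """
    num, var = 0.0, 0.0
    for treated, control in strata:
        n, m = len(treated), len(control)
        N = n + m
        pooled = list(treated) + list(control)
        w = sum(midranks(pooled)[:n])              # treated rank sum within site
        ties = sum(t**3 - t for t in Counter(pooled).values())
        v = n * m / 12 * ((N + 1) - ties / (N * (N - 1)))
        weight = 1 / (N + 1)                       # van Elteren weight for this site
        num += weight * (w - n * (N + 1) / 2)
        var += weight**2 * v
    z = num / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical cut-offs for three cognitive domains, and two example children.
cutoffs = [80, 80, 80]
print(ordinal_outcome(False, [90, 70, 95], cutoffs))  # low on the second domain
print(ordinal_outcome(True, [], cutoffs))             # died before testing

# Hypothetical 0/1/2 outcomes at two sites: (treated, control).
sites = [([2, 2, 2, 1], [0, 1, 1, 2]),
         ([2, 1, 2], [1, 2, 0])]
z, p = van_elteren(sites)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

This only handles stratification by site. Continuous covariates would need the regression models mentioned above, which is really where my question lies.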

I’d love any advice about any of these aspects, and especially if you have seen them used before. This is only at the grant application stage, so space is at a premium and it’s important to provide a convincing argument to a reviewer who might not be familiar with the specific area.

Thanks