Extensions of 'worst-rank' score methods with outcomes truncated by death

mdonoghoe · August 16, 2021, 12:23am

I am helping with the planning of a clinical trial looking at an intervention in a particular group of newborns.

Primary interest is in the effect on cognitive delay, which will be assessed using a standardised measure at 2 years old. Mortality is expected to be rare but not negligible (1-5%), obviously leading to some missing data on that cognitive endpoint.

I’ve looked into some options in this scenario and concluded that a composite endpoint seems to be a good fit. In particular, a worst-rank score analysis (where deaths are assigned a score lower than the worst score on the functional outcome, and the analysis is rank-based) appears to be appropriate. I found @f2harrell’s paper on power calculations for this type of outcome.

But there are some complications that have made me wonder about possible extensions of this general approach.

1. A binary cognitive outcome. The rank-based approach works with a single functional outcome, and you can either treat the deaths as tied (all getting the same score) or untied (using their survival time). In this case, the actual cognitive measure is multivariate, and the current plan is to take the outcome as a (pre-defined) low score on any of the domains. A simple solution would be to use the tied-deaths approach, which would leave us with a three-level ordinal outcome: 0 (death < 2 y) / 1 (alive, low score) / 2 (alive, high score). Would there be a more efficient approach, if all of the components of the cognitive outcome are important?
2. Loss to follow-up. There is definitely a possibility of missing data due to loss to follow-up, either (a) not knowing survival status at 2 years, or (b) knowing that the child is alive, but being unable to perform the cognitive testing. If we were using a univariate cognitive outcome, and the untied deaths approach, it seems like both of these could be handled using a survival analysis method, treating them as censored data (although we would need to be careful about informative censoring, especially for (b)), so would a log-rank test be appropriate? Weighted towards later ‘times’ because primary interest is in the scores?
3. Adjusting for covariates. We plan to stratify randomisation by site, and so the analysis should stratify by site as well. A stratified rank-sum or log-rank test could be used, but if we have other continuous covariates, would it be reasonable to use an ordinal regression model (if using the 0/1/2 outcome) or Cox model (if accounting for loss to follow-up)? In each case, loosening the proportional odds / hazards assumption?
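
To make the tied-deaths coding from the binary-outcome point concrete, here is a minimal Python sketch of an unstratified worst-rank analysis on the 0/1/2 outcome. The function names and the counts are hypothetical, made up purely for illustration, and the sketch ignores site stratification, covariates and loss to follow-up.

```python
from collections import Counter

# Tied-deaths coding: death is assigned the worst level of the ordinal outcome.
DEATH, LOW, HIGH = 0, 1, 2


def midranks(values):
    """Return the 1-based rank of each value, with ties sharing their average rank."""
    counts = Counter(values)
    rank_of = {}
    below = 0  # number of observations strictly below the current level
    for level in sorted(counts):
        n = counts[level]
        rank_of[level] = below + (n + 1) / 2
        below += n
    return [rank_of[v] for v in values]


def worst_rank_wmw(control, treated):
    """Wilcoxon-Mann-Whitney U for the treated arm and the concordance probability.

    The concordance probability is P(treated > control) + 0.5 * P(tie).
    """
    ranks = midranks(control + treated)
    n0, n1 = len(control), len(treated)
    rank_sum_treated = sum(ranks[n0:])
    u = rank_sum_treated - n1 * (n1 + 1) / 2
    return u, u / (n0 * n1)


# Hypothetical counts per arm (illustration only, not trial data).
control = [DEATH] * 3 + [LOW] * 40 + [HIGH] * 57
treated = [DEATH] * 2 + [LOW] * 30 + [HIGH] * 68

u_stat, concordance = worst_rank_wmw(control, treated)
print(u_stat, concordance)
```

With these made-up counts the concordance probability is about 0.56, i.e. a randomly chosen treated child has a better (or equally good, counted as half) outcome than a randomly chosen control child roughly 56% of the time.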

I’d love any advice about any of these aspects, and especially if you have seen them used before. This is only at the grant application stage, so space is at a premium and it’s important to provide a convincing argument to a reviewer who might not be familiar with the specific area.

Thanks

f2harrell · August 16, 2021, 12:29pm

I think things clear up when you get down to the basic data: in a given week, what is the worst thing that happened to the infant? A longitudinal ordinal model can analyze such data, handling censoring, death, and a variety of other issues. A longitudinal model does not make you choose whether an early mild event is worse than a late bad event. See COVID-19 for detailed examples.

mdonoghoe · August 16, 2021, 9:27pm

Thanks for your response Prof Harrell. I very much like the longitudinal ordinal models but I am not sure they can be used here: the primary outcome that the investigators are interested in is cognitive delay, the assessment of which (via Bayley 4) requires an ~hour-long session with a qualified assessor. So it is only really feasible to do at a single time point.

f2harrell · August 16, 2021, 9:58pm

In the special case where vital status follow-up does not extend beyond the time at which the Bayley 4 is assessed, you can use a single ordinal measurement that has death as the worst level.

mdonoghoe · August 16, 2021, 10:30pm

Thanks. And given the possibility of right-censoring (not knowing vital status at 2y; or knowing they are alive but not being able to assess the Bayley 4), would that be best analysed with a log-rank test? Or, if wanting to account for covariates, a proportional hazards model?

f2harrell · August 17, 2021, 1:30am

An ordinal model can allow for interval and left and right censoring. The Bayesian proportional odds model function blrm in the R rmsb package handles that.
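
As a rough illustration of what censoring means for an ordinal outcome (this is a sketch of the likelihood idea only, not of how blrm is called or implemented): a censored child is known only to have an outcome within a range of levels, so the likelihood contribution is the total probability of that range. The parameterisation below, P(Y >= k) = expit(lp - cutpoint_k), and all names and numbers are assumptions chosen for the example.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(cutpoints, lp):
    """P(Y = k), k = 0..K, under a proportional odds model.

    Assumed parameterisation: P(Y >= k) = expit(lp - cutpoints[k - 1]),
    with cutpoints strictly increasing and lp the linear predictor.
    """
    exceed = [1.0] + [expit(lp - c) for c in cutpoints] + [0.0]
    return [exceed[k] - exceed[k + 1] for k in range(len(cutpoints) + 1)]


def loglik_contribution(cutpoints, lp, lower, upper):
    """Log-likelihood for a subject known only to have lower <= Y <= upper."""
    probs = category_probs(cutpoints, lp)
    return math.log(sum(probs[lower:upper + 1]))


# Hypothetical parameter values; levels are 0 = death, 1 = alive/low, 2 = alive/high.
cutpoints = [-3.0, 0.0]
lp = 0.5

observed_high = loglik_contribution(cutpoints, lp, 2, 2)    # Bayley 4 assessed, high score
alive_untested = loglik_contribution(cutpoints, lp, 1, 2)   # known alive, no Bayley 4
status_unknown = loglik_contribution(cutpoints, lp, 0, 2)   # vital status unknown at 2 y

print(observed_high, alive_untested, status_unknown)
```

In this sketch a child who is known to be alive but was not tested still contributes information (Y is 1 or 2, so death is ruled out), while a child whose status is entirely unknown contributes nothing. Both statements rely on the censoring being non-informative.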

pmbrown · August 25, 2021, 4:05pm

this open access paper considers functional scale + mortality outcomes and contrasts modelling alternatives: Comparing methods to combine functional loss and mortality in clinical trials for amyotrophic lateral sclerosis - PMC, it’s a simulation study though and it’s not immediately obvious how many timepoints they considered. i prefer modelling to the composite approach

davidcnorrismd · August 31, 2021, 11:36am

Are the investigators really planning to throw away all the information from 2 years of longitudinal development of the children? That seems totally out of step with any reasonable conception of child development, to say nothing of essential statistical principles regarding efficient use of information.

mdonoghoe · August 31, 2021, 9:51pm

It does appear that they are not planning to collect developmental outcomes in the interim period, except for a couple of secondary outcomes at 2–4 months.

This is not my area of expertise, but the impression that I had was that it was important to use an internationally validated scoring system (the trial is aiming to open across 10+ countries), and given the costs of administering Bayley 4, it would only be feasible at a single time point.

If you know of interim measures that could be collected more easily, I’d love to be able to suggest them to the investigators. I guess integrating two different types of outcomes (one collected longitudinally, one — of primary interest — only at the end of follow-up) could be done in a longitudinal ordinal model, with careful thought about how they relate to one another?

f2harrell · August 31, 2021, 11:00pm

Yes, and as I mentioned elsewhere it is sometimes possible to incorporate interval censoring for a subrange of an ordinal scale when that part is not collected.

davidcnorrismd · September 1, 2021, 11:58am

Pediatrics is absolutely chock-full of milestones. Here is a recent attempt to update normative descriptions, for example. Your investigators will not be lacking in knowledge of such milestones, but rather in imagination as to how they might be assimilated into the analysis. It is in respect to the latter that the statistician has a supreme opportunity to make a contribution.

mdonoghoe · September 1, 2021, 10:02pm

Thanks very much both.

Do either of you have examples of assimilating & analysing various types of outcomes like this (e.g. parent-reported milestones, collected regularly + paediatrician-administered assessment, performed once)?

The difference I am seeing from @f2harrell’s examples earlier in this thread is that we have different measures of the same underlying thing (child development), rather than measures of different outcomes that can be directly compared and ranked (home / hospitalised / ventilated / death). My first thought is some kind of latent-variable model for each component, which could take the form of an ordinal regression model. Individuals who reach the 2 year timepoint would inform the correlation between the components.

But it would be good to know if I am on the right track, and if not, how other people have approached it.

f2harrell · September 2, 2021, 2:56am

That’s a very interesting setup and I hope someone with more experience with that will respond. Multivariate regression would essentially use the first canonical variate constructed from multiple outcomes, i.e., the linear combination of them that has the highest R^2 against the best linear combination of the right hand side variables. That is not clinical enough. A user-defined combination scale is the only other thought that yields a univariate response for each time point. Perhaps better is a shared random effect across two models, if you can scale both responses to a common scale.

davidcnorrismd · September 2, 2021, 11:46am

As far as I know, state-space modeling is the accepted framework for conceptualizing this type of data assimilation to latent variables. A state-space model specifies system dynamics in terms of latent state variables that evolve according to dynamical laws with unknown parameters; various measurements are available, from which we may hope to infer the latent state history and/or the process parameters [1,2].
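
In generic notation, such a model might be written roughly as follows. The symbols are illustrative assumptions, not taken from the cited papers: x_t is the latent developmental state at time t, y_t^{(j)} is measurement type j (say, a parent-reported milestone or the Bayley 4), T_j is the set of times at which type j is actually collected, and θ collects the unknown parameters.

```latex
\begin{aligned}
x_0 &\sim \pi_\theta(x_0) && \text{(initial latent state)} \\
x_t \mid x_{t-1} &\sim f_\theta(x_t \mid x_{t-1}) && \text{(latent state dynamics)} \\
y_t^{(j)} \mid x_t &\sim g_\theta^{(j)}\bigl(y_t^{(j)} \mid x_t\bigr), \quad t \in T_j && \text{(measurement model for type } j\text{)}
\end{aligned}
```

Under this sketch, a measure taken only once (T_j containing just the 2-year visit) and measures taken regularly both inform the same latent trajectory through their own measurement models.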

For inspiration, I might suggest browsing the bibliography for the R pomp package. For example [3] assimilates anonymized mobility data and hospital discharge data to a model of COVID-19 epidemiology; see p.5 of its Appendix. The field of psychometrics has ample precedent for latent-variable modeling, of course, and probably can claim much credit for developing the methods [4]. But I’m unaware of any dynamical models of this sort in psychometrics. In designing this trial, you may have an opportunity to advance the field.

[1] Künsch HR. Particle filters. Bernoulli. 2013;19(4):1391-1403. doi:10.3150/12-BEJSP07
[2] Kantas N, Doucet A, Singh SS, Maciejowski J, Chopin N. On Particle Methods for Parameter Estimation in State-Space Models. Statist Sci. 2015;30(3):328-351. doi:10.1214/14-STS511
[3] Wang X, Du Z, Johnson KE, et al. Effects of COVID-19 Vaccination Timing and Risk Prioritization on Mortality Rates, United States. Emerg Infect Dis. 2021;27(7). doi:10.3201/eid2707.210118
[4] Meehl PE. A Funny Thing Happened to Us on the Way to the Latent Entities. Journal of Personality Assessment. 1979;43(6):564-581. doi:10.1207/s15327752jpa4306_2