Bayesian Proportional Odds with repeated measures

micah · June 6, 2024, 8:39pm

I’m trying to find the best approach to analyze data that contain multiple follow-up measurements. The outcome variable is the Knee injury and Osteoarthritis Outcome Score (KOOS). The KOOS is the mean of numerous Likert items, converted to a value between 0 and 100. I’ve previously used the Bayesian proportional odds model blrm from the rmsb package. However, I’m not sure whether it can account for serial data. brm from the brms package can model repeated measures, but with an ordinal model (family = cumulative("logit")) it gives the following error: Error: Family ‘cumulative’ requires either positive integers or ordered factors as responses.

For clarity, this is the model that gives the error:
f <- brm(KOOS.Pain ~ s(age) + s(BMI) + sex + (1 | subjectId) + immediate.relief + visit + surgery + KOOS.Pain.baseline, data = d, family = cumulative("logit"))

and this model “works”, but I’m not sure if it is accounting for the correlation between a subject’s visits:

f <- blrm(KOOS.Pain ~ rcs(age, 3) + rcs(BMI, 3) + sex + cluster(subjectId) + Immediate.relief + visit + surgery + KOOS.Pain.baseline, data = d)
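
The brm error quoted above is about the response type: the cumulative family expects an ordered factor or positive integers, whereas KOOS.Pain is a numeric score. A minimal sketch of one possible workaround follows, using a made-up toy data frame. The values and the new column name KOOS.Pain.ord are assumptions for illustration, not part of the original data.

```r
# Toy data standing in for the real dataset (values are invented)
d <- data.frame(
  subjectId = rep(1:3, each = 2),
  KOOS.Pain = c(55.6, 61.1, 72.2, 72.2, 88.9, 100)
)

# Convert the numeric score to an ordered factor; the levels are the
# sorted distinct observed values, so the ordering of scores is preserved
d$KOOS.Pain.ord <- factor(d$KOOS.Pain, ordered = TRUE)

str(d$KOOS.Pain.ord)
nlevels(d$KOOS.Pain.ord)

# The ordered column could then serve as the response, e.g. (not run):
# brm(KOOS.Pain.ord ~ ... + (1 | subjectId), data = d,
#     family = cumulative("logit"))
```

This only addresses the error message. With many distinct KOOS values the ordinal model would likely need many thresholds, which may be slow, and a random intercept is a different way of handling within-subject correlation than the Markov approach suggested in the replies below.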

arthur_albuquerque · June 12, 2024, 3:14pm

What you want is a first-order Markov ordinal model, see examples by @f2harrell in https://hbiostat.org/proj/covid19/

The ORBITA-Cosmic trial has also applied this model.

f2harrell · June 13, 2024, 12:53pm

And now our tutorial article is out: https://onlinelibrary.wiley.com/doi/10.1002/sim.10133

arthur_albuquerque · June 14, 2024, 9:00pm

It is one of the best tutorial papers I have ever read.

micah · June 19, 2024, 9:41pm

Thanks both. I’ll take a look at those papers/sites.

micah · July 30, 2024, 10:13pm

This may not be the appropriate place for this question, but it is a follow-up to my previous query. As per Frank’s published tutorial, I conditioned the model for the current outcome (Y) on the previous outcome (Yprev) for each subject. However, as I have missing data, I also ran aregImpute to create multiple imputations of my dataset. It seems to me that multiple imputation should be performed first and the Yprev variable created afterwards; otherwise you would be imputing values for Yprev that should simply be copied from Y. That being the case, how would you create Yprev given the output from aregImpute?

f2harrell · August 2, 2024, 12:55pm

The way that yprev should be imputed is the way you would impute longitudinal data in general. There are at least three approaches:

1. Use predictive mean matching on the tall and thin dataset, ignoring within-person correlations, i.e., use aregImpute or mice the usual way.
2. Make the dataset wide instead of tall and thin, and impute the missing times by imputing the missing variables in the standard way. This assumes regular measurement times and probably works best.
3. Use strictly the other longitudinal measurements for the person to do a within-person imputation, looking at the measurements both forwards and backwards in time. Stata has a nice module for doing this; I haven’t explored R for this.
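
As a rough illustration of the second approach, the sketch below reshapes a toy tall-and-thin dataset to wide form with base R. The data values and the column names y and visit are invented for the example, and the aregImpute call is shown only as a commented-out assumption about how the wide data might then be used.

```r
# Toy tall-and-thin data: one row per subject per visit (values invented)
tall <- data.frame(
  subjectId = rep(1:3, each = 3),
  visit     = rep(1:3, times = 3),
  y         = c(50, 60, NA, 70, NA, 80, 65, 75, 85)
)

# One row per subject, one column per visit: y.1, y.2, y.3
wide <- reshape(tall, idvar = "subjectId", timevar = "visit",
                v.names = "y", direction = "wide")

print(wide)

# Each missing visit is now an ordinary missing variable, so a standard
# imputation call could be applied, e.g. (not run, assumes Hmisc is loaded
# and a realistically sized dataset):
# a <- aregImpute(~ y.1 + y.2 + y.3, data = wide, n.impute = 5)
```

In a real analysis the baseline covariates would presumably sit alongside the per-visit columns in the imputation formula, and the completed data would be reshaped back to long form before modelling.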

micah · August 5, 2024, 4:32pm

Taking your first approach, I would run aregImpute on the tall and thin dataset that includes yprev. This would impute values for yprev that do not always match the actual y from the previous timepoint. Should I not worry about this?

f2harrell · August 5, 2024, 8:30pm

You would impute Y (the current record’s response), not yprev. You would use an imputed Y only where it is needed as yprev for the following record, and would leave a record as having missing Y when the current Y is missing.
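
One way to read this advice, sketched on invented toy data: lag a completed copy of Y within each subject to build yprev, while leaving the current Y as observed. The column y.imp is a hypothetical stand-in for the completed response from a single imputation; how it is extracted from the aregImpute output is not shown here.

```r
# Toy long data (values invented).
# y is the observed response; y.imp is a hypothetical completed version
# of y from ONE imputation (observed values kept, NAs filled in).
d <- data.frame(
  subjectId = rep(1:2, each = 3),
  visit     = rep(1:3, times = 2),
  y         = c(50, NA, 70, 60, 65, NA),
  y.imp     = c(50, 58, 70, 60, 65, 72)
)
d <- d[order(d$subjectId, d$visit), ]

# Lag the completed response within subject: yprev is the previous
# visit's (possibly imputed) y, and NA at each subject's first visit
d$yprev <- ave(d$y.imp, d$subjectId,
               FUN = function(v) c(NA, v[-length(v)]))

# The current response stays as observed: y is NOT replaced by y.imp,
# so records with a missing current y remain missing
print(d[, c("subjectId", "visit", "y", "yprev")])
```

Under this reading, the step would be repeated for each imputation, with the model fitted to each completed dataset. At each subject's first visit yprev is NA in this toy example; in practice the baseline measurement might fill that role.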