RMS Ordinal Regression for Continuous Y

Regression Modeling Strategies: Ordinal Regression for Continuous Y

This is the 15th of several connected topics organized around chapters in Regression Modeling Strategies. The purposes of these topics are to introduce key concepts in the chapter and to provide a place for questions, answers, and discussion around the chapter’s topics.

Overview | Course Notes

Additional links

RMS15

Q&A From May 2021 Course

  1. To evaluate whether BMI or separate height and weight measurements do better, why is the ratio of the log height and log weight coefficients -2.4? How did you compute -2.4? Is it always -2.4? Why did you use a log-log model rather than a linear model? Answer: log BMI = log weight - 2 × log height, so on the log scale BMI assumes that the ratio of the log height coefficient to the log weight coefficient is -2. We check that assumption by fitting log height and log weight as separate variables and looking at the ratio of their estimated coefficients.
  2. Can you give a reference on how to interpret a nomogram? What do points and total points mean? Answer: See this beautiful post on Cross Validated: "Clarifications regarding reading a nomogram".
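
The check described in item 1 can be sketched in base R as follows. This is only an illustration on simulated data: the variable names, sample size, and the true exponent of -2.4 built into the simulation are assumptions made for the example, not the course dataset or its results.

```r
set.seed(2)

# Simulated (hypothetical) data: height in cm, weight in kg
n      <- 500
height <- rnorm(n, mean = 170, sd = 10)
weight <- exp(log(22) + 2 * log(height / 100) + rnorm(n, sd = 0.15))

# Hypothetical outcome that depends on weight / height^2.4 on the log scale
y <- 1 + 0.8 * log(weight) - 0.8 * 2.4 * log(height) + rnorm(n, sd = 0.1)

# Fit log weight and log height as separate predictors
fit <- lm(y ~ log(weight) + log(height))
b   <- coef(fit)

# Ratio of the log height coefficient to the log weight coefficient.
# BMI = weight / height^2 would imply a ratio of -2.
ratio <- unname(b["log(height)"] / b["log(weight)"])
ratio
```

If the estimated ratio is far from -2, BMI is not the best way to combine height and weight for that outcome, which is the point of fitting the two log terms separately.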

I have a question regarding contrasts in orm models for continuous outcomes. I have built an ordinal regression model for a continuous outcome (minutes of vigorous physical activity per week) that is bounded at 0 and skewed. Quite a few observations are 0, about 26%. As far as I know, these issues should not pose a problem for this kind of model.

The model contains one focal categorical predictor with four groups and some adjustment variables. The book and course notes nicely explain how to get predicted values and plots for various functions (the mean, median, quantiles etc.). I have no problem creating predicted values with compatibility intervals for each group separately. However, my goal is to present contrasts/differences between predicted values for the four groups. For example, I’d like to say something like this (assuming that the model contains only the groups, BMI and age included with splines):

Fixing BMI at 25 and age at 30, the estimated difference between the medians of group 1 and 2 is XY (95%-CI: …).

I couldn’t figure out how to do this using Predict or contrast.

Or would you recommend presenting contrasts of coefficients? I find that harder to interpret compared to predicted values.

On a different note, I’d appreciate an example using ExProb with conf.int. The function returns the error "must specify X if conf.int > 0", but the documentation of ExProb doesn’t state what X is in this case.

I’m less clear about compatibility intervals right now, but contrast() will provide the contrasts you seek, e.g., contrast(fit, list(group=1, bmi=25, age=30), list(group=2, bmi=25, age=30)).

The trick with ExProb is that you have to see which arguments it creates in the generated function, and if the first argument is not the linear predictor, you need to create a little wrapper function that transfers its first argument to the appropriate one in the generated function.

Thanks Frank.

I think I can get the contrasts on the linear predictor scale, but I’m struggling to convert them to differences of predicted means or medians. For example:

qu <- Quantile(mod) # mod is an orm-model
med <- function(x) qu(0.5, x)

Predict(mod
        , vgroup = c("0", "1") 
        , fun = med
        , conf.int = 0.95
)

  vgroup age sex      BMI education_level living_with_others_y_n location     yhat    lower    upper
1      0  41   2 23.23563               5                      1        1 2849.239 2433.784 3311.387
2      1  41   2 23.23563               5                      1        1 2559.827 2011.234 3243.239

The difference of the predicted medians between the groups is 2849.24 - 2559.83 = 289.41. Applying contrast:

ctrst <- rms::contrast(
  mod
  , list(vgroup = c("1"))
  , list(vgroup = c("0"))
  , conf.int = 0.95
)

age sex      BMI education_level living_with_others_y_n location   Contrast      S.E.     Lower    Upper
1  41   2 23.23563               5                      1        1 -0.1502855 0.1666094 -0.476834 0.176263
     Z Pr(>|z|)
1 -0.9    0.367

The contrast of -0.15 is identical to the coefficient for vgroup=1, since vgroup=0 is the reference level. But supplying fun to the print function, in order to get a difference in medians instead of a difference in coefficients, doesn’t seem to give the correct contrast:

print(ctrst, fun = med)

 age sex      BMI education_level living_with_others_y_n location Contrast S.E.    Lower    Upper    Z
1  41   2 23.23563               5                      1        1 1899.218   NA 1498.216 2415.777 -0.9
  Pr(>|z|)
1    0.367

The difference of 1899.22 is much higher than expected. What am I missing here?

My apologies. fun= is an argument to contrast.rms (not print), but it’s only implemented for (1) a Bayesian model fitted with rmsb::blrm or (2) a simple transformation such as the anti-log. If you don’t want to do this with a Bayesian model (which provides exact inference for complex derived parameters such as the mean and quantiles), the best bet is to bootstrap the estimated differences to get a bootstrap confidence interval. The rms::bootcov function will produce a helpful matrix of bootstrap regression coefficients, but that doesn’t let you plug them directly into the derived parameter calculation, so you might as well bootstrap the orm fits themselves.
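
A minimal sketch of bootstrapping whole orm fits to get a percentile interval for a difference in medians might look like the following. It runs on simulated data, and the data, the model formula, the helper name median_diff, the covariate settings, and the choice of B = 200 with a simple percentile interval are all assumptions made for illustration rather than a recommended analysis. It relies on Quantile() returning a function whose first two arguments are the quantile and the linear predictor, as used earlier in this thread.

```r
library(rms)

set.seed(1)

# Simulated stand-in data (hypothetical; not the data discussed above)
n <- 300
d <- data.frame(
  vgroup = factor(sample(c("0", "1"), n, replace = TRUE)),
  age    = rnorm(n, mean = 40, sd = 10)
)
# Skewed, non-negative outcome
d$y <- round(exp(5 + 0.3 * (d$vgroup == "1") + 0.01 * d$age + rnorm(n)))

# Covariate settings at which the two group medians are compared
nd <- data.frame(
  vgroup = factor(c("0", "1"), levels = levels(d$vgroup)),
  age    = 40
)

# Refit orm on a dataset and return median(group 1) - median(group 0).
# Returns NA if a fit fails so that one bad resample does not stop the loop.
median_diff <- function(dat) {
  tryCatch({
    fit <- orm(y ~ vgroup + age, data = dat)
    qu  <- Quantile(fit)                    # function of (q, lp, ...)
    lp  <- predict(fit, nd, type = "lp")    # linear predictor for each row of nd
    med <- qu(0.5, lp)
    unname(med[2] - med[1])
  }, error = function(e) NA_real_)
}

# Point estimate from the original data
est <- median_diff(d)

# Nonparametric bootstrap: resample rows, refit, recompute the difference
B    <- 200
boot <- replicate(B, median_diff(d[sample(nrow(d), replace = TRUE), ]))

# Percentile bootstrap interval
ci <- quantile(boot, c(0.025, 0.975), na.rm = TRUE)

est
ci
```

Because each resample gets its own orm fit and its own Quantile() function, the intercepts always match the outcome values present in that resample. More resamples, or a different bootstrap interval type, may be preferable in a real analysis.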

Thank you for this clarification, I really appreciate your help. I’ll look into bootstrapping the whole orm model.