RMS Ordinal Logistic Regression

Regression Modeling Strategies: Ordinal Logistic Regression

This is the 13th of several connected topics organized around chapters in Regression Modeling Strategies. The purposes of these topics are to introduce key concepts in the chapter and to provide a place for questions, answers, and discussion around the chapter’s topics.

Overview | Course Notes

Additional Resources

RMS13

2 Likes

I have several questions regarding the concordance statistic, in the setting of a multivariable proportional odds logistic model and specifically how it is calculated by lrm(), calibrate() from the rms package. This question may be generalised to: “Which of the discrimination indexes returned by lrm() apply to the whole model, and which are specific to each outcome level?”

Regarding the c-statistic [(Sommers’ D + 1)/2], I imagine it is some generalisation of the proportion of all pairs of patients in which Yj >= Yi when betaXj >= betaXi, but how is the intercept included? Does the C-statistic vary for different levels of the outcome or is there one global C-statistic given for the entire proportional odds model, and if so, is there even any interpretation or value for a outcome level specific C-statistic?

1 Like

Good question. D_{xy} and c are whole-Y measures of pure predictive discrimination. They are based on all possible pairs of observations with different Y values. Since in the PO model the intercepts apply equally to all X, the full linear predictor and linear predictor ignoring the intercepts have a 1-1 relationship to each other so you can ignore the intercepts.

1 Like

I assume this would also apply to the optimism-corrected calibration slope too, but not the optimism corrected intercept, Emax nor Brier’s score - is this correct?

Yes I think that’s correct

1 Like

The majority of my audience (clinicians) have difficulty understanding the output of the proportional odds model. Is there any reason why I shouldn’t convert the parameter point estimates and intervals from OR into the actual outcome (in my case KOOS scores) and only present those data?

2 Likes

No reason not to, you just have several choices. Ordinal regression starts with estimation of exceedance probabilities. These lead to estimating the whole distribution of Y | X. For a truly continuous Y you can summarize the whole distribution with the predicted median if you must. For continuous or discrete Y that is approximately interval-scaled you can predict the mean. These are obtained by the Quantile and Mean functions applied to an ordinal regression fit object in the rms package.

2 Likes

to identify patients who would most benefit from a medication review by a pharmacist before discharge we reviewed 939 discharge letters ( of different patients) for drug related problems. Since not all errors are equally important we used an ordinal scale with 5 levels , level 5 being worst. I have 2 continues independent variables ( modeled with RCS, k=4) and 4 categorical independent variables.

  1. is it reasonable to combine levels of Y of in certain levels there were too few observations? 399 patients had no errors, level 1 = 11 patients , level 2= 255, level 3=166, level 4= 103, level 5= 5
  2. I used RMS, and I know how to get the predicted probabilities for a new patient , but I want to extract the formula, because I want to use it in a clinical software ,to get predictions and decide which patiens discharge letter to review. ( for example those with probability above 0.5 to be in the highest two Y levels). is there a code to do that? I can do it by myself but the RCS part is a little bit complicated

Thanks

You said ‘dependent variable’ in two spots where I assume you meant ‘independent variable’.

If you are willing to make the proportional odds assumption then it is not a good idea to combine categories unless you have evidence that the categories being combined are equivalent.

You can easily get the predicted probabilities you want. For the case of \Pr(Y \geq 4) you need to use just the second-to-last intercept. You can get an algebraic form of the fitted model using the latex function (which works when using R markdown or when directly making a \LaTeX report. To get the form of just the linear predictor part of the fitted model you can use Function(fit object) on any system. The generated R code will use just one intercept (I forget whether it’s the first or the middle one) and you can insert whichever intercept you want.

1 Like

thanks.
I edited the post, of course it is “independent”.

I am doing a study using OLR. The study tries to assess the satisfaction of ground level stakeholders (scale of 1(extremely dissatisfied) to 5(highly satisfied). The independent variables of the model are treated as continuous scale variables (variables are valued on a scale from 1 to 3). The model was developed using spss. It fulfills the proportionality odds ratio and the misclassification rate of the model is around 50%. I understand that it is not a classification model, apart from the goodness of fit of the model, the pseudo r square value and fulfilling the test of parallel lines assumption, is there any other method to validate the model?

When having an ordinal scale dependent variable, is there a need to use accuracy scoring rules to predict whether the output will fall in a satisfied category or dissatisfied category of population?

1 Like

Hi Sunena, as you mentioned ordinal logistic regression (OLR) is not a classification method so classification accuracy is not something to evaluate. You have look at the proportional odds assumption, and some methods for assessing that are anti-conservative, so if you didn’t find a problem it’s probably OK. Pseudo R^2 tells you about (mainly) predictive discrimination, not about goodness of fit per se. That leaves possible goodness-of-fit assessments based on checking linearity and absence of interaction. If the predictors are 3-levels each you might do a likelihood ratio \chi^{2} test comparing your existing model to one in which 2 indicator variables are created for each of the predictors. This will give you a global test of linearity.

For validating the model in terms of it not overfitting the data you can use the “optimism bootstrap” which is easy to do with the R rms package’s lrm and validate functions.

1 Like

I just wanted to clarify a few more doubts regarding this. According to my study, the
Dependent variable is satisfaction of the respondent: Likert scale of 1 (very dissatisfied) to 5 (very satisfied). There are 23 IV’s (they have been valued on a scale of 1 to 4 in some case, 1 to 3 in others),

Some of the Independent variables are:

  1. V1: Preservation of form and built fabric:

qualitative valuation quantitative valuation descriptive valuation
(by respondents)

    Good                                      4                                Well maintained 
   Average                                  3                            Moderately maintained 
      Bad                                      2                                 Lightly maintained
    Worst                                     1                                    Not maintained
  1. V2:Environmental enhancement

    qualitative valuation quantitative valuation descriptive valuation

           Good                                 3                           Heritage buildings and its 
                                                                                 surroundings are well maintained
    
          Average                            2                            Heritage buildings are identifiable 
                                                                                     among new constructions
    
              Bad                               1                            Unable to identify the historic built 
                                                                              fabric due to complex constructions|
    

I had treated the variables as covariates while doing it in spss since they were turned into scale values. Is that the correct way, or should they be converted into dummy variables since they were categorical? , as i was not able to get a valid answer.

The output i received upon using them as covariates in OLR is as follows

Is this the right method? I am not able to find any data pertaining to the type of IV’s used in my case, which is why the query.
Sir, I was not able to understand the part regarding the global test of linearity also.

Thanking in advance

Use of indicator variables so as to not assume linearity would more than double the number of parameters to estimate. Unless your sample size is say > 1000 I wouldn’t do that.

I am analyzing chemicals data that is continuous and extremely skewed(heavy right tail). Not normal. Some folks suggested doing Quantile regression but what I understand is that Quantile regression is inefficient. So should I propose to my group members that we look into Ordinal Multinomial regression and not Quantile regression?

Quantile regression is good if N is very large and if there are very few ties in the data and no detection limits. Multinomial is the opposite of ordinal. Multinomial means that the ordering of Y values is not used. Ordinal regression works great for continuous Y and for Y having lots of ties, and it can handle single detection limits.

1 Like

Perfect. Thanks Frank.

Hi Prof Harrell,

I’m going through the PO Markov example for dataset 1 (not assuming PO for treatment, no treatment*time interaction), and the estimate for ‘treat:1’ is labelled as the transition odds ratio for recovery. Just wondering if this was a typo because the recovered state is y=0 (recovered), not y=1 (alive, not recovered). Shouldn’t it be the transition odds ratio for “alive but not recovered” when we look at ‘treat:1’?

Many thanks.

You’re probably correct.

1 Like

What makes the above example using dataset 1 a Markov model? This example did not seem to include a variable for Previous State. The other Markov models (from COVID-19) that you developed have a variable for Previous State, and this makes it a first-order Markov model. Sorry, I’m quite new to Markov models and am hoping to understand the different ways of applying it. Many thanks.