The binary model will give probabilities that depend on the frequency distribution of individual outcome categories within the coarsened category combinations. And it depends on the single threshold being ‘magical’, i.e., one every expert would
The partial proportional odds model would be better than either, but it may require knowing ahead of time which predictors are likely not to satisfy proportional odds.
This study tested the effect of Fluoxetine on functional recovery after stroke, which was measured at 6 months using the modified Rankin Scale (mRS; 0-5). The authors used ordinal (PO) logistic regression for the analysis. Their model was the following:
\text{mRS}_{6m} \sim \text{Group} + \text{Time} + \text{Predicted Outcome} + \text{Motor Deficit} + \text{Aphasia}
where,
Time = time since stroke (2-8 days vs 9-15 days)
Predicted Outcome = probability of mRS of 0-2 at 6 months (≤0.15 vs >0.15), obtained from a prediction model
Motor Deficit = presence of motor deficit at randomization (yes vs no)
Aphasia = presence of aphasia at randomization (yes vs no)
I see the dichotomization of Time and Predicted Outcome as a problem. The very use of a predicted probability as a covariate, in place of the baseline mRS for instance, is problematic in my view. It seems that they didn’t have the baseline mRS available to use in the model. Also, they didn’t assess whether the PO assumption was appropriate.
To do a study of mRS without collecting mRS at baseline is a key design flaw. And the PO assumption is the least of their worries. The flaws you pointed out, especially dichotomization of a continuous risk measure, are huge. A spline in logit risk should have been used. Dichotomizing time is pretty bad also.
I saw that you mentioned on a different post that when we have an ordinal covariate (mRS) it is better to model it as nominal instead of as ordinal. Does that apply here too?
fun= goes with contrast not print. And contrast will use the bootcov output to get approximate CLs on differences in means. The problem with some bootstrap samples not including some Y levels is a very tricky one. Sometimes it’s OK to give bootcov a group=Y type of argument to balance the bootstrap but performance of this has not been studied. At other times a bit of rounding of Y will suffice. This is another reason to go Bayesian.
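To make the "some bootstrap samples miss some Y levels" problem concrete, here is a minimal sketch in plain Python rather than rms. It assumes that "balancing" means resampling within each level of Y so that every level keeps its original count; the toy outcome vector and the helper names `plain_bootstrap` and `stratified_bootstrap` are made up for illustration. It only shows the resampling mechanics and says nothing about how well the resulting confidence limits perform, which, as noted, has not been studied.

```python
import random
from collections import Counter

def plain_bootstrap(y, rng):
    """Ordinary bootstrap: draw n row indices with replacement."""
    n = len(y)
    return [rng.randrange(n) for _ in range(n)]

def stratified_bootstrap(y, rng):
    """Resample with replacement separately within each level of y,
    so each level keeps exactly its original number of observations."""
    by_level = {}
    for i, v in enumerate(y):
        by_level.setdefault(v, []).append(i)
    idx = []
    for members in by_level.values():
        idx.extend(rng.choice(members) for _ in members)
    return idx

rng = random.Random(1)
# Hypothetical ordinal outcome with sparse top levels (not real data)
y = [0] * 40 + [1] * 30 + [2] * 25 + [3] * 4 + [4] * 1

# Count plain bootstrap samples (out of 200) that lose at least one Y level
n_missing = sum(
    len({y[i] for i in plain_bootstrap(y, rng)}) < len(set(y))
    for _ in range(200)
)

# A stratified sample always reproduces the original level counts
strat = stratified_bootstrap(y, rng)
print(n_missing, Counter(y[i] for i in strat) == Counter(y))
```

With a level that has a single observation, a sizeable share of plain bootstrap samples drop it, so the refitted model has fewer intercepts than the original fit; the stratified version avoids that by construction.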
I reworked the model in R to get better clarity, but I am still not able to tell whether there are issues with the model. I ran the Brant test to check that the proportional odds assumption holds, but I am getting the following result:
In brant(modeltest) :
24111 combinations in table(dv,ivs) do not occur. Because of that, the test results might be invalid.
Does this mean that I will have to go for the partial proportional odds model or the generalized ordered logit model?
I am not able to open this link, as it might not be accessible here in India. Is there an alternative I could use, such as a PDF of the page or something similar?
Thank you for the link, it was really helpful. Is there any specific method in R that helps in checking whether the observations obtained from the sample population are unbiased while building the model, apart from the sampling technique chosen alone?
I’m running an ordinal regression model (orm) with the following formula, where HAM-D is the Hamilton Rating Scale for Depression:
orm(HAM-D ~ rcs(age) + sex + other covariates)
I’m having difficulty explaining the model output to some collaborators without framing it in terms of cut-offs of the HAM-D score. Could I ask you a couple of questions?
Can you direct me to a resource that explains how to calculate the estimated difference in exceedance probability and SEs for various cut-offs? For example, calculating Pr(\text{HAM-D} > 7 | \text{age} = 30) - Pr(\text{HAM-D} > 7 | \text{age} = 20).
Given the proportional odds (PO) assumption, would it be accurate to report the beta coefficients as “an increase in 1 unit of covariate A is associated with an exp(Beta)-fold increase in the odds of a HAM-D score > 7” instead of “odds of having a higher HAM-D score or HAM-D > j”?
Any suggestions or corrections would be greatly appreciated.
Great questions. This has an example showing how to get the whole predicted Y | X distribution. And you can use the ExProb function in rms to derive a function to compute any desired exceedance probability like \Pr(Y \geq y | X). You might compute that for a variety of y.
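For intuition about what such an exceedance probability is, here is a minimal sketch of the arithmetic in plain Python, not the rms implementation. It assumes a logit-link proportional odds model written as \Pr(Y \geq y | X) = \text{expit}(\alpha_y + X\beta), with one intercept per Y level above the lowest. The intercepts, the age coefficient, and the 5-level toy scale are hypothetical values chosen for illustration, not estimates from any fit. For an integer-valued score, \Pr(Y > 7) is the same as \Pr(Y \geq 8).

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def exceedance_prob(intercepts, lp, y):
    """Pr(Y >= y | X) under a proportional odds model.

    intercepts: dict mapping each Y level above the lowest to alpha_y
    lp:         linear predictor X*beta, excluding the intercept
    """
    return expit(intercepts[y] + lp)

# Hypothetical intercepts for a toy scale with levels 0..4 (not real estimates)
intercepts = {1: 2.0, 2: 1.0, 3: 0.0, 4: -1.2}
beta_age = 0.03  # hypothetical log odds ratio per year of age

lp20 = beta_age * 20
lp30 = beta_age * 30

# Difference in Pr(Y >= 3) between age 30 and age 20
diff = exceedance_prob(intercepts, lp30, 3) - exceedance_prob(intercepts, lp20, 3)
print(round(diff, 4))
```

Note that the difference in probabilities depends on the cutoff and on the other covariate values, unlike the odds ratio. The sketch gives point estimates only; standard errors for such differences are not covered here.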
What is perhaps more helpful to collaborators is to get predicted mean Y | X, which the Mean function (full name Mean.orm) makes easy to do. In all our work in orthopedics, where various functional status/disability scales are used, we routinely presented partial effect plots with the estimated mean Y on the y-axis.
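The predicted mean is just the Y values weighted by their model-based probabilities. Below is a minimal sketch of that calculation in plain Python, again assuming a logit-link proportional odds model with \Pr(Y \geq y | X) = \text{expit}(\alpha_y + X\beta); the helper names `cell_probs` and `mean_y` and all numeric values are hypothetical, and this is not the code behind Mean.orm.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def cell_probs(levels, alphas, lp):
    """Pr(Y = level | X) for each level, from the exceedance probabilities.

    levels: Y values sorted ascending
    alphas: intercepts for Pr(Y >= levels[1]), ..., Pr(Y >= levels[-1]),
            so len(alphas) == len(levels) - 1 and alphas are decreasing
    lp:     linear predictor X*beta, excluding the intercept
    """
    exceed = [1.0] + [expit(a + lp) for a in alphas] + [0.0]
    return [exceed[j] - exceed[j + 1] for j in range(len(levels))]

def mean_y(levels, alphas, lp):
    """E(Y | X) = sum over levels of level * Pr(Y = level | X)."""
    return sum(v * p for v, p in zip(levels, cell_probs(levels, alphas, lp)))

# Hypothetical toy scale and intercepts (not real estimates)
levels = [0, 1, 2, 3, 4]
alphas = [2.0, 1.0, 0.0, -1.2]
beta_age = 0.03

for age in (20, 40):
    print(age, round(mean_y(levels, alphas, beta_age * age), 3))
```

Evaluating `mean_y` over a grid of ages gives the kind of curve one would put on a partial effect plot with the estimated mean on the y-axis.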
On your second question, an accurate way to state an odds ratio, when proportional odds is being assumed as you are doing, is for example:
Assume the age range you want to use is the inter-quartile range and that corresponds to ages 20 and 40: The age 40 : age 20 odds ratio for HAM-D \geq any specific number is … with a footnote that the “specific number” is any HAM-D value other than the lowest possible value. Then you can add:
e.g. OR for HAM-D \geq 7, which equals OR for HAM-D \geq 10
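A quick numerical check of why the odds ratio can be stated for "any specific number": under proportional odds the intercept cancels out of the ratio, so every cutoff gives the same value. This is a sketch in plain Python with hypothetical intercepts and a hypothetical age coefficient, assuming the logit-link form \Pr(Y \geq y | X) = \text{expit}(\alpha_y + X\beta).

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# Hypothetical intercepts for cutoffs on a toy scale (not real estimates)
alphas = {1: 2.0, 2: 1.0, 3: 0.0, 4: -1.2}
beta_age = 0.03  # hypothetical log odds ratio per year of age

# Age 40 : age 20 odds ratio for Y >= y, computed separately at each cutoff y
ors = {
    y: odds(expit(a + beta_age * 40)) / odds(expit(a + beta_age * 20))
    for y, a in alphas.items()
}

for y, value in ors.items():
    print(y, round(value, 6))          # identical at every cutoff
print(round(math.exp(beta_age * 20), 6))  # exp(beta * (40 - 20))
```

All cutoffs return exp(beta × 20), which is the single age 40 : age 20 odds ratio being reported.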