RMS Ordinal Regression for Continuous Y

FurlanLeo · October 13, 2023, 2:47pm

Thank you Professor.

Like this?
f <- orm(Y ~ rcs(Age, 3) + Sex + Time + X +
X %ia% rcs(Age, 3) + X %ia% Sex + X %ia% Time,
data = d, x = T, y = T)

Unfortunately, it doesn’t work.
It keeps returning the same error message.

f2harrell · October 13, 2023, 3:29pm

If you attach a saveRDS(…) binary data frame that is a minimal dataset that creates that error message (simulated or a small subset of the real data) I can debug.

FurlanLeo · October 13, 2023, 6:44pm

Thank you!
Already created a “subset.rds” file.
Please, how do I attach/send it?

f2harrell · October 13, 2023, 7:07pm

Sorry I thought attaching was built in to the system. You can email me at fh@fharrell.com or store the file somewhere and include the link here

FurlanLeo · October 13, 2023, 7:29pm

Just sent you the file via email.
Thanks!

f2harrell · October 13, 2023, 7:52pm

It worked fine under rms 6.8-0 under linux. I’m pretty sure it works under 6.7-0 also. The formula to use is

f <- orm(Y ~ rcs(Age, 3) + Sex + Time + X +
           X %ia% Age + X*(Sex + Time),
         data = d, x = T, y = T)

Check you don’t have X or any of the other variables in your global environment.

FurlanLeo · October 16, 2023, 10:56am

Many thanks Professor.
What’s the underlying logic/rule for this?

Also, I have another question on the function contrast().
When I estimate a contrast from a PO model (orm()), the effect on Y is on the odds ratio scale.
How do I get a contrast on the mean scale?

c <- contrast(f, list(X = 'Positive'), list(X = 'Negative'))
print(c, fun = exp)

When I do this I get a very asymmetric confidence interval.

f2harrell · October 16, 2023, 12:16pm

Not clear on which logic/rule you are asking about.

If using a Bayesian model with rmsc::blrm you can get exact posterior distributions for means or differences in means. For a frequentist model with rms::orm if you run bootcov to get bootstrap repeated coefficients you can get bootstrap intervals for these using contrast and Mean.

FurlanLeo · October 16, 2023, 7:50pm

Thanks Professor.
I mean the logic/rule underlying the representation of interaction terms in the model so that bootcov() works fine.

FurlanLeo · October 16, 2023, 9:02pm

For this model, contrast() on the bootstrap model (bootcov()) is not working (same error message as above):

f <- orm(Y ~ rcs(Age, 3) + Sex + Time + X1 + rcs(X2, 3) +
X2 %ia% Sex + X2 %ia% Time + X2 %ia% X1,
data = data, x = T, y = T)

It seems there’s a problem with bootcov() because of the interaction terms.
I just sent you (via email) another subset.rds file with the respective variables, in case you want to have a look at the issue.

f2harrell · October 20, 2023, 8:59pm

bootcov ran fine for me with your model and with the new data frame you emailed. I still wonder if your global environment is somehow cluttered. Or it’s a subject change I made in the package since uploading to CRAN. You can see the changes at GitHub - harrelfe/rms: Regression Modeling Strategies

PaulGS · November 6, 2023, 10:40pm

I am reposting this here on Frank’s request.

I have some lab data consisting of calculated tumour volumes from 4 group (1 control, 3 treated at varying doses). There is however, a cohort/day effect we want to account for. The distributions are quite skewed, so we log-transformed before running a mixed model (lmer) with treatment as a fixed effect and cohort/day as a random effect.

For comparison, I have tried to run a mixed-effects proportional odds model using clmm() from the ordinal package. The results are quite different though and I’m not sure I’m setting up the data properly. When using ordinal regression with a continuous outcome is it just as simple as converting your outcome to a factor variable prior to running the model? The thing that’s not sitting right with me is that there will then be a unique factor level outcome for every individual (maybe I’m overthinking things).

Thanks.

f2harrell · November 7, 2023, 12:48pm

With semiparametric models you don’t need to have all levels of Y present in all groups. And converting to a factor should work, just print levels() of the variable to verify they are in the order you expect.

To make sure you specify the model optimally, describe further the experimental design, which experimental units are measured more than once, and what is the measurement time schedule.

ordinal can have problems with its random effects multidimensional integration approximations. I’ve seen this especially with a non-huge number of independent subjects each measured only twice. It’s always a good idea to run a Bayesian random effects model and checking the resulting posterior median coefficients against ordinal and also checking the Bayesian posterior median of the standard deviation of the random effects. This can be done with rmsb::blrm or brms; See the long vignettes here.

Random effects models assume strange correlation patterns. Alternatives are discussed here and here.

PaulGS · November 8, 2023, 2:03am

Thanks Frank.

68 mice were allocated to 4 treatment groups - 17 in each. Four mice (one from each group) were euthanised and measured at a time (at least on different days).

I think my conversion of the outcome (tumour volume) to a factor is correct:

> dat_long$value_fac <- factor(dat_long$value, ordered = T)
> levels(dat_long$value_fac)
 [1] "0.095" "0.232" "0.245" "0.266" "0.335" "0.395" "0.424" "0.434" "0.463" "0.466" "0.528" "0.591"
[13] "0.597" "0.636" "0.655" "0.658" "0.675" "0.682" "0.685" "0.699" "0.72"  "0.73"  "0.797" "0.854"
[25] "0.855" "0.879" "1.034" "1.036" "1.132" "1.167" "1.211" "1.259" "1.333" "1.359" "1.362" "1.432"
[37] "1.437" "1.487" "1.526" "1.563" "1.594" "1.683" "1.812" "1.926" "2.021" "2.401" "2.461" "2.509"
[49] "2.518" "2.574" "2.64"  "2.756" "3.053" "3.201" "3.473" "3.623" "3.686" "3.736" "4.082" "4.104"
[61] "5.056" "5.062" "6.265"

I have never used Bayesian models before but I finally got brms installed and working properly after some error resolution. I am curious about the disparity in results. The model results in brief are:

1. lmer (log-transform continuous outcome)

lmer(log(value) ~ group + (1|cohort), data = dat_long)

Fixed effects:
                            Estimate Std. Error      df t value Pr(>|t|)  
(Intercept)                   0.3231     0.2143 27.5576   1.507   0.1431  
group1                       -0.1318     0.1793 44.5412  -0.735   0.4663  
group2                       -0.3218     0.1721 44.2788  -1.870   0.0681 .
group3                       -0.1463     0.1755 44.4060  -0.833   0.4091

2. clmm (continuous outcome as ordinal)

clmm(value_fac ~ group + (1|cohort), data = dat_long)

                             Estimate Std. Error z value Pr(>|z|)    
group1                       -0.461575   0.004267  -108.2   <2e-16 ***
group2                       -0.984198   0.004008  -245.6   <2e-16 ***
group3                       -0.472331   0.004178  -113.0   <2e-16 ***

3. brm (continuous outcome as ordinal - Bayesian approach)

brm(value_fac ~ group + (1|cohort), family = cumulative("logit"), data = dat_long)

Population-Level Effects: 
                         Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
group1                   -0.50      0.67    -1.77     0.80 1.00     3197     3149
group2                   -0.98      0.63    -2.23     0.22 1.00     3282     2765
group3                   -0.48      0.62    -1.68     0.77 1.00     3627     3035

If I read this correctly, the estimates from the log-transformed model are not statistically significant, but they are from the mixed-effects ordinal model. Then if you look at the brm model, the estimates are similar to the clmm model but the associated errors are wildly different, and I presume the brm credible? intervals all containing 0 indicate that none of these coefficients are significant (or however one interprets this in a Bayes context), so this is more in line with the log-transformed model. Is that about right?

I don’t understand why the SE’s with the clmm model are so low giving rise to the high statistical significance…

f2harrell · November 8, 2023, 12:52pm

You can’t directly compare a linear model and an ordinal one, but the main point is that you are getting exactly the problem I saw with random effects in a frequentist ordinal model leading to silly far too small standard errors. The approximations are breaking down. Assuming the posterior sampling diagnostics are good for the Bayesian analysis, Bayesian random effects models give you exact inference. It’s time to retire frequentist hierarchical models in many cases. This is especially true when the frequentist approach gives you no warnings other than very small SEs.

Bayesian model standard deviations of posterior distributions mean something very different than SEs in frequentist models, but a rough comparison isn’t bad.

sscogges · November 8, 2023, 6:01pm

Touching on what Frank said, the clmm function uses the Laplace approximation as a default. You could try using more accurate quadrature methods to see if that gives you more reasonable standard errors. In the call to clmm, set the nAGQ argument to a higher value (say something between 5 and 10).

@PaulGS This could be a really interesting teaching example for others. Would you be willing to make the full data available at some point?

f2harrell · November 8, 2023, 7:17pm

I would really like to see the result of trying different approximations.

PaulGS · November 9, 2023, 3:31am

Thanks Frank and @sscogges. I have to admit I always thought Bayesian analyses were way beyond me, but the more I read about it, the more I am getting inspired to try this in my day to day work. I especially like how the interpretation of the credible interval is straightforward, rather than having to be careful in how you phrase the frequentist confidence interval in describing results. I have also tried a couple of old glmer models that I had struggled with convergence issues with, and these just seemed to work in brms. Clearly I have a lot to learn and learning about priors and how to specify a good one is part of that - I have just been running models at the moment with the default which I presume is an uninformative prior.

One question relating to what you touched on. Do you think Bayesian regression should be considered for ALL small samples or is just those relating to ordinal distributions?

I realised I have already kind of posted the data, so I guess it won’t matter if I post the dataset (I have stripped off the group labels).

Please have a play if you like and let us know what you find @sscogges.

dat <- structure(list(cohort = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 
                         3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 
                         7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 
                         11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 13L, 14L, 
                         14L, 14L, 14L, 15L, 15L, 15L, 15L, 16L, 16L, 16L, 16L, 17L, 17L, 
                         17L, 17L), group = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                        1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                        1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                        1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 
                                                        1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), levels = c("1", 
                                                                                                                    "2", "3", "4"), class = "factor"), value = c(0.597, 0.232, 0.395, 
                                                                                                                                                                 0.466, 1.132, 1.359, 0.424, NA, 1.683, 5.056, 0.879, 1.526, 3.736, 
                                                                                                                                                                 6.265, 2.64, 4.104, 1.167, 0.797, 0.72, 1.926, 0.655, NA, 0.699, 
                                                                                                                                                                 0.658, 1.437, 1.034, 1.333, 1.036, 2.021, 1.563, 1.812, 1.432, 
                                                                                                                                                                 1.211, 1.259, 2.574, 2.461, 5.062, 3.623, 2.401, 3.053, 2.518, 
                                                                                                                                                                 4.082, 3.686, 3.473, 3.201, NA, 0.682, 0.528, 0.591, 0.245, 0.675, 
                                                                                                                                                                 0.463, 0.266, 0.335, 0.636, 0.73, 0.855, 0.434, 0.095, 0.685, 
                                                                                                                                                                 1.594, 1.359, 1.362, 0.854, NA, 2.756, 2.509, 1.487)), row.names = c(NA, 
                                                                                                                                                                                                                                      -68L), class = c("tbl_df", "tbl", "data.frame"))

sscogges · November 9, 2023, 7:25am

Thanks @PaulGS! First, I confirmed that I could replicate the results you already reported. Then I ran the clmm function using adaptive GH quadrature instead of the Laplace approximation. Using the quadrature method, even with only 2 points, gives results very similar to what you were getting from brm:

Cumulative Link Mixed Model fitted with the adaptive Gauss-Hermite                                                                                                             
quadrature approximation with 2 quadrature points                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
Coefficients:                                                                                                                                                                  
              Estimate Std. Error z value Pr(>|z|)                                                                                                                             
group_factor2  -0.5322     0.6735  -0.790   0.4294                                                                                                                             
group_factor3  -1.0535     0.6403  -1.645   0.0999                                                                                                                            
group_factor4  -0.5365     0.6252  -0.858   0.3909

and increasing to 10 points doesn’t change much:

Cumulative Link Mixed Model fitted with the adaptive Gauss-Hermite                                                                                                             
quadrature approximation with 10 quadrature points                                                                                                                             
                                                                                                                                                                               
                                                                                                                                             
                                                                                                                                                                               
Coefficients:                                                                                                                                                                  
              Estimate Std. Error z value Pr(>|z|)                                                                                                                             
group_factor2  -0.5357     0.6725  -0.797    0.426                                                                                                                             
group_factor3  -1.0413     0.6365  -1.636    0.102                                                                                                                             
group_factor4  -0.5341     0.6234  -0.857    0.392

PaulGS · November 9, 2023, 9:43am

hmm, so clmm does still work ok if you specify the appropriate numerical integration approach then. I guess the trap is that running the model out of the box would lead you to the wrong conclusions.