BBR Session 11: Nonparametric Tests Using the Proportional Odds Model

This is a place for questions, answers, and discussion about session 11 of the Biostatistics for Biomedical Research airing 2020-01-24 introducing the proportional odds model as a generalization of the Wilcoxon and Kruskal-Wallis tests. Session topics are listed here. The video will be in the YouTube BBRcourse Channel and will be directly available here after the broadcast. Power calculations are covered, and Bayesian ordinal logistic regression is briefly introduced.

1 Like

Is there a good way to use orm with no covariates?

Trying to use orm for this to obtain CI on the mean/median as shown in the course video, but without any covariates:

y <- round(runif(100,0,10))
dd <- datadist(y)
options(datadist = 'dd')
f <- orm(y~1)

Get the following error: Error in fitter(X, Y, offset = offs, penalty.matrix = penalty.matrix, :
object ‘kof’ not found

Sorry it doesn’t handle that. It should have. Not sure if lrm does either. The fit object does have the -2 log likelihood for the intercept-only model.

1 Like

i’m not very familiar with R. If i want to see the actual code behind the R package orm, how can i see it? I can only find the documentation: Maybe i can find it here somewhere?: cheers

Little bit off-topic maybe, but for a lot of R functions (especially from downloaded packages) you can simply view the code by typing the command without parentheses:


Results in the following output in the console (I only show the first 10 lines)

function (formula, data, subset, na.action = na.delete, method = "", 
    model = FALSE, x = FALSE, y = FALSE, linear.predictors = TRUE, = FALSE, penalty = 0, penalty.matrix, tol = 1e-07, 
    eps = 0.005, var.penalty = c("simple", "sandwich"), scale = FALSE, 
    call <-
    var.penalty <- match.arg(var.penalty)
    if (!missing(penalty) || !missing(penalty.matrix)) 
        stop("penalty not yet implemented")

The following post on Stack Overflow goes into quite more detail for other situations:

1 Like

Best to go to the master source on github:


is there any guidance regarding adequate sample size given the no. of intercepts to be estimated? In the Harrell et al. Stat Med paper they consider small (eg n=50) samples in their simulations but estimate “a few intercept values”. And although in the analysis of CD4 counts they estimate 831 intercepts, they have n=4776. I will soon find out because i will estimate CPMs with number intercepts = around 0.5 x n (with n<100). cheers
[i initially posted this Q here: Quality of life surveys and proportional odds models, but maybe more suitable in this thread]

Just as in estimating the empirical cumulative distribution function or running the Wilcoxon test, the number of ties in the data actually work against you. The fewer ties, the better everything works. Fewer ties \rightarrow more intercepts. There is no penalty for intercepts because they are order-constrainted unlike all the \beta s.

1 Like

I’m a little confused about the Kruskal Wallis test being a specific case of the Proportional Odds (PO) model. Section 7.6 of BBR says that the PO model is a generalization of the Kruskal Wallis test. When I mess around with some toy data, the p-values between the Kruskal Wallis test and PO model Likelihood Ratio test seem rather different. Are there different assumptions between the two methods that would explain this difference in p-values?

For example, with the following toy dataset

A <- c(0.05, 0.1, 0)
B <- c(1.5, 1.56, 6.9)
C <- c(5.9, 1.3, 1.4)
samples <- data.frame(
   grp = c(rep("A", 3), rep("B", 3), rep("C", 3)), 
   value = c(A, B, C)
kruskal.test(value ~ grp, data=samples)

	Kruskal-Wallis rank sum test

data:  value by grp
Kruskal-Wallis chi-squared = 5.9556, df = 2, p-value = 0.05091

orm(value ~ grp, data=samples)
Model Likelihood            Discrimination            Rank Discrim.    
                          Ratio Test                   Indexes                  Indexes       
 Obs              9    LR chi2     13.07    R2                      0.775    rho     0.843    
 Distinct Y       9    d.f.            2    g                      10.170                     
 Median Y       1.4    Pr(> chi2) 0.0015    gr                  26102.571                     
 max |deriv| 0.0003    Score chi2  10.37    |Pr(Y>=median)-0.5|     0.384                     
                       Pr(> chi2) 0.0056                                                      
         Coef     S.E.    Wald Z Pr(>|Z|)
[deleted intercepts]
 grp=B    20.3396 77.5439  0.26  0.7931  
 grp=C    18.3268 77.5333  0.24  0.8131  

There’s a sizable difference between Kruskal Wallis vs PO model (p=.05091 vs p=0.0015). I wasn’t expecting for the p-values to be exactly the same due to different estimation methods, but still surprised by the difference. What would be the cause of this difference?

You have perfect separation in the data, which points out how inaccurate the \chi^2 distribution is for K-W in some cases. K-W should be hugely significant with your dataset. See if you can get an exact p-value somehow (permutation-test-like). The score test from PO is the one that’s equivalent. I’d stake my money on the results from the PO model fit. You might also try a two-sample subset because wilcox.test will give exact p-values there.

:new: Correction: I didn’t look at your data closely enough. You don’t have perfect separation.

1 Like

Thanks for the help. On a related question, a colleague uses Graphpad Prism, which recommends Kruskal Wallis (exact p-value computation) and Dunn’s test for subsequent pairwise comparisons.

The BBR notes don’t mention Dunn’s test and instead suggest using the PO model and its coefficients/p-values. Do you have any thoughts on using Dunn’s test vs the PO model coefficients?

My understanding is that Dunn’s includes a multiple comparisons correction while the PO model does not, and I’m not sure if this multiple comparisons correction is needed as I usually don’t see multiple comparisons corrections for regression coefficients. Interestingly Dunn’s test seems to yield decently smaller p-values than the PO model.

dunnTest(value ~ grp, data=samples)
Dunn (1964) Kruskal-Wallis multiple comparison
  p-values adjusted with the Holm method.

  Comparison         Z    P.unadj      P.adj
1      A - B -2.385139 0.01707266 0.05121798
2      A - C -1.639783 0.10105026 0.20210051
3      B - C  0.745356 0.45605654 0.45605654

thanks. I’ll just note: using nlmixed in sas (because i have random effects for patients - repeated measures) it took about a day to run and the results ‘make sense’. The nlmixed requires array statements for the intercepts, cumulative probs etc and specification of the likelihood (depending on the link function - i’ve done logit, probit and cloglog). I’ll share the code once it’s cleaned up in case some people are not using R. I ought to use the orm R package to validate the sas estimates too …

confused why on BBR p7-25 the probabilities are given as “>=” and the intercepts are decreasing. For the CDF i expect the reverse. Can I assume the data were ranked from largest to smallest? (although I can’t see this in the R code). cheers

The only way I know to do Dunn’s test here is to run it from Wilcoxon tests. The pairwise Wilcoxon tests are not consistent with the K-W test. So stick with tests on the parameters in the single unified PO model. Better: Compute simultaneous confidence intervals for the multiple pairwise PO contrasts (log odds ratios).

1 Like

Don’t use that. Use this:

1 Like

It doesn’t matter whether you rank the data from small to large or large to small. I’m just stating the PO model using a parameterization that makes the binary logistic model be a special case. Most other people use P(Y \leq y) whereas I state it in terms of P(Y \geq y).

1 Like