Experimenting with a Bayesian Partial Proportional Odds Model

arthur_albuquerque · March 25, 2026, 8:39pm

Imagine we have data from a RCT comparing two strategies (experimental = PCT and control = UC). These strategies aid physicians to decide whether a patient needs treatment with antibiotics or not. We then measure the number of days patients were treated with antibiotics for up until 28 days and if they were dead or not at the end of follow-up of 28 days.

When Y = 0 , the patient was alive at the end of follow-up and did not use antibiotics.
When 1 \ge Y \le 28, the patient was alive at the end of follow-up and Y represents the total duration of therapy in days.
When Y = 29, the patient died.

Thus, there are three different process happening and thus a single estimand won’t probably capture potential independent differences between treatments.

From this scenario, I could imagine at least these three questions:

Did the experimental strategy decrease the risk of getting treated with antibiotics?
Did it decrease the risk of death?
In patient that did get treated with antibiotics and survived, did it reduce the total duration of therapy?

This example motivated me to fit a constrained partial proportional odds model to separely estimate treatment effect to answer these questions.

I simulated a dataset with 5453 patients (2715 in UC group) with ~0.2 in PCT and ~0.1 mortality rates in UC. Many patients didn’t get any antibiotics (Y=0) and many got around 7 days because this is a common duration for antibiotics.

I then fitted a Bayesian model in Stan with weakly-informative priors for treatment effects and vague for intercepts:

Stan code

data {
  int<lower=1> N;                   
  int<lower=3> K;                   
  array[N] int<lower=0, upper=K-1> Y; 
  vector[N] X;                      
  vector<lower=0>[K] dirichlet_conc; 
  int<lower=0, upper=1> prior_only; 
}

parameters {
  simplex[K] pi;                    
  real beta_duration;               
  real tau_avoidance;               
  real tau_death;                   
}

transformed parameters {
  vector[K] cp = cumulative_sum(pi);
  vector[K-1] alpha = logit(cp[1:(K-1)]);
  
  vector[K-1] beta_c;
  beta_c[1] = beta_duration + tau_avoidance;
  for (c in 2:(K-2)) {
    beta_c[c] = beta_duration;
  }
  beta_c[29] = beta_duration + tau_death;
}

model {
  pi ~ dirichlet(dirichlet_conc);  
  beta_duration ~ normal(0, 0.82);
  tau_avoidance ~ normal(0, 0.82);
  tau_death ~ normal(0, 0.82);

  if (!prior_only) {
    for (n in 1:N) {
      vector[K-1] mu = X[n] * beta_c; 
      int y_idx = Y[n] + 1; 
      
      if (y_idx == 1) {
        target += log_inv_logit(alpha[1] - mu[1]);
      } else if (y_idx == K) {
        target += log1m_inv_logit(alpha[K-1] - mu[K-1]);
      } else {
        target += log_diff_exp(
          log_inv_logit(alpha[y_idx] - mu[y_idx]),
          log_inv_logit(alpha[y_idx-1] - mu[y_idx-1])
        );
      }
    }
  }
}

generated quantities {
  array[N] int<lower=0, upper=K-1> Y_rep;
  
  // ==========================================
  // 1. POSTERIOR PREDICTIVE SIMULATIONS
  // ==========================================
  for (n in 1:N) {
    vector[K] theta; 
    vector[K-1] mu = X[n] * beta_c;
    
    theta[1] = inv_logit(alpha[1] - mu[1]);
    for (k in 2:(K-1)) {
      real raw_prob = inv_logit(alpha[k] - mu[k]) - inv_logit(alpha[k-1] - mu[k-1]);
      theta[k] = raw_prob > 0 ? raw_prob : 1e-10; 
    }
    real last_prob = 1 - inv_logit(alpha[K-1] - mu[K-1]);
    theta[K] = last_prob > 0 ? last_prob : 1e-10;
    
    theta = theta / sum(theta);
    Y_rep[n] = categorical_rng(theta) - 1; 
  }

  // ==========================================
  // 2. CLINICAL ESTIMANDS (Exact Probabilities)
  // ==========================================
  vector[K] P_UC;   
  vector[K] P_PCT;  
  
  // Calculate P_UC (Baseline)
  P_UC[1] = inv_logit(alpha[1]);
  for (k in 2:(K-1)) {
    real raw_prob = inv_logit(alpha[k]) - inv_logit(alpha[k-1]);
    P_UC[k] = raw_prob > 0 ? raw_prob : 1e-10;
  }
  real last_prob_uc = 1 - inv_logit(alpha[K-1]);
  P_UC[K] = last_prob_uc > 0 ? last_prob_uc : 1e-10;
  P_UC = P_UC / sum(P_UC); 
  
  // Calculate P_PCT (Treated)
  P_PCT[1] = inv_logit(alpha[1] - beta_c[1]);
  for (k in 2:(K-1)) {
    real raw_prob = inv_logit(alpha[k] - beta_c[k]) - inv_logit(alpha[k-1] - beta_c[k-1]);
    P_PCT[k] = raw_prob > 0 ? raw_prob : 1e-10;
  }
  real last_prob_pct = 1 - inv_logit(alpha[K-1] - beta_c[K-1]);
  P_PCT[K] = last_prob_pct > 0 ? last_prob_pct : 1e-10;
  P_PCT = P_PCT / sum(P_PCT); 

  // --- Estimand A: Absolute Risk Differences (Boundaries) ---
  real ARD_Avoidance = P_PCT[1] - P_UC[1]; 
  real ARD_Mortality = P_PCT[K] - P_UC[K]; 

  // --- Estimand B: Conditional Mean Days (1 <= Y <= 28) ---
  real cond_sum_days_UC = 0;
  real cond_prob_treated_UC = 0;
  real cond_sum_days_PCT = 0;
  real cond_prob_treated_PCT = 0;
  
  for (k in 2:(K-1)) {
    real day_val = k - 1; 
    cond_sum_days_UC += day_val * P_UC[k];
    cond_prob_treated_UC += P_UC[k]; 
    cond_sum_days_PCT += day_val * P_PCT[k];
    cond_prob_treated_PCT += P_PCT[k];
  }
  
  real Cond_Mean_Days_UC = cond_sum_days_UC / cond_prob_treated_UC;
  real Cond_Mean_Days_PCT = cond_sum_days_PCT / cond_prob_treated_PCT;
  real Diff_Cond_Mean_Days = Cond_Mean_Days_PCT - Cond_Mean_Days_UC;

  // --- Estimand C: Odds Ratios ---
  real OR_Avoidance = exp(-(beta_duration + tau_avoidance)); 
  real OR_Duration  = exp(-beta_duration);                   
  real OR_Survival  = exp(-(beta_duration + tau_death));     
}

Prior Predictive Checks:

Now for the actual model:

Diagnostics

All R-hats were really close to 1.0, N_eff/N were all above 0.8 and chains mixed well.

Posterior Predictive Check

Pretty good.

Results

And, of course, we can get odds ratios for each cutoff (Y = 0; Y = 29, and between):

Final remarks

I am no expert in Stan (closer to a newbie actually), thus I did use AI to help me write Stan code. My prior belief is that there ARE mistakes in the code, so please help me finding these and fixing it. Please! This is purely experimental.

Hope you enjoy it!

Full code: Bayesian partial PO · GitHub

arthur_albuquerque · March 25, 2026, 8:41pm

Disclaimer: this data structure and research question was heavily influenced by this RCT.

I extracted data (an approximation) from this figure and simulated some mortality rates.

f2harrell · April 2, 2026, 11:26am

This is really good work and I hope that others will respond with input. One suggestion is to work this out with a frequentist model using the VGAM package. I’m going to email someone who has been working in a related area to see if he’ll respond here.

philb · April 24, 2026, 5:12pm

This is very fascinating. I have a couple of thoughts.

My main wondering is whether CPPO is struggling to fit the mortality part of your data. Your code helpfully reports the posterior median of OR_Survival as being 1.59 (comparing UC to PCT), which is quite a bit smaller than the empiric OR of 2.25 (based on a 20% versus 10% mortality rates in the UC and PCT groups, respectively), so not a great fit there. And your model predicts a ~6.0% absolute mortality difference, which is also less than the observed 10% value. I think the reason is that the tau_death parameter in the CPPO model can only shift probability mass from the death category to the category immediately below (survivors who receive 28 days of antibiotics), but the data do not support that that is what is happening; rather, the patients who are rescued seem to be distributed across the other categories. So CPPO is sort of splitting the difference: in your Posterior Predictive Check plot, the dot for PCT at the highest category is resting pretty solidly above the observed data in the bar. I had a similar problem recently and we posited that the people who were rescued from the highest category (also mortality) would likely be distributed across the other outcomes (adverse events in our case; antibiotic treatment days in your case) rather than all being moved to the category immediately below death. To achieve this, we proposed a joint model: one modeling the probability of mortality and another conditional proportional odds model for the remaining categories, given that the patient survived, and then conducted a 2df test on both treatment effects. This completely separated the treatment effect on mortality from the treatment effect on adverse events. This would help you directly answer the third question you posed (“In patient that did get treated with antibiotics and survived, did it reduce the total duration of therapy?”). And as a nice side benefit there are no hard constraints on the value of the treatment effects in this joint model formulation, like there are for tau_avoidance and tau_death (see first possible typo below for more on this).

Possible typos

As you suspected, I think there may be a few typos in the STAN program, although I don’t think any are impacting your fitted model. The first is that tau_avoidance and tau_death have natural constraints on their values in order to satisfy standard probability requirements: \tau_{death} \leq \alpha_{K-1} - \alpha_{K-2} and \tau_{avoidance} \geq \alpha_1 - \alpha_2

I think these constraints would ideally be built into the code. I’m guessing this is also why your generated quantities block has a check for whether the raw_prob is greater than 0. If you add this constraint you shouldn’t ever encounter probabilities < 0 in your model. Second, on line 3 the lower bound of K should be 4 not 3; when K=3, I think you will only be able to identify two of \beta_{duration}, \tau_{avoidance}, and \tau_{death}. Your code would also not work as expected in the for loop on lines 23-25, since K-2 is less than 2 in this case:

for (c in 2:(K-2)) {
    beta_c[c] = beta_duration;
}

A third typo is that immediately after this for loop, you have beta_c[29] = beta_duration + tau_death; but I think you want beta_c[K-1] = beta_duration + tau_death. Again I don’t think this is affecting your model fit because in your case 29 = K-1.

Not a typo, but I wonder if the program would be a bit faster if you avoided defining mu in your model statement and instead replaced each instance of mu[k]with X[n]*beta_c[k]. I’m not 100% confident in that, though.

f2harrell · April 27, 2026, 12:16pm

Phil I’m so glad you joined the discussion! I wish we could give an award for helpfulness. I think you’ve won it for this year.

We often think of X affecting an ordinal scaled response Y as shifting patients along the Y spectrum. When adding non-PO to the fullest, e.g., adding k-2 degrees of freedom for a binary X where Y has k levels, the X effect can be in any direction for any level of Y. When the non-PO is constrained, you’re showing that there are constraints on what X is allowed to do for a specific Y even if the constraint that is lifted is targeted exactly at that level of Y. That was the wakeup call for me, and I still cling to hoping that the constraint is not too severe. The example being discussed here raises concern for me.

In going to the “no constraints for death” dual models you propose it would be nice if we could constrain all the covariate effects to be the same for both models, other than the one target covariate (such as treatment). That would bring stability as well as ease of getting a predicted unconditional quantity. On the other hand, risk factors for nonfatal outcomes can be much different from those for death in many situations.

When using a conditional independence argument as you are using (which is what Kaplan-Meier and Cox PH use) I assume that this is very easy to justify. We do need to combine the models to get causal estimates, i.e., intent to treat estimands such as P(good function and alive | X). I assume that is not a problem, right? Except for confidence intervals being complex, perhaps (but Bayesian uncertainty intervals probably come straight from posterior sampling from a joint stream from the two models).