Cox PH Assumption and Bootstrapped Coefficients

I recently read “Why Test For Proportional Hazards” by Stensrud et al. The article describes why hazard ratios are usually not proportional for medical interventions, and continues to say that when the proportional hazard assumption fails the standard errors of the Cox model with more than one covariate will be biased. The authors also warn that tests for proportional hazard violations may be underpowered and that large p-values can lull analysts into a false sense of satisfied assumptions.

To circumvent the test of proportional hazards, the authors suggest bootstrapping the Cox coefficients to give a better estimate of standard error but they don’t describe the exact bootstrap procedure. Would the following bootstrap procedure properly estimate confidence intervals without relying on the proportional hazards assumption? Are there better procedures for confidence interval estimation?

boot.cox <- function(df, indices) {
  samples <- df[indices, ]
  fit <- coxph(Surv(time, status) ~ age + factor(sex), data=samples)
lung.boot <- boot(lung, boot.cox, 2000), index = 1, type = "bca"), index = 2, type = "bca")

I’m also curious what others have to say about testing for the proportional hazards assumption, as it seems to be common practice from the tutorials and textbook discussions of Cox regression that I’ve seen.


yes, it’s a good point. Assessment of the assumption often ends up in a supplementary doc, or maybe only the reviewers see it. I think the idea is not to “solve” non-proportional hazards but to model it, and present it

edit: just a quick google search: “However, if this assumption is violated,it does not necessarily prevent analyst from using Cox model. The current paper presents two ways of model modification in case of non-proportional hazards: introducing interactionsof selected covariates with function of time and stratification model”

1 Like

This seems incredibly strange to me for two reasons:

  • If the model doesn’t fit the data, the coefficients don’t mean what you think they mean and bootstrapping doesn’t solve that problem
  • If all you want to do is to get standard errors that are robust to model specification (and again want to ignore the problem with interpreting the \beta s), you can instantly get the answer with the robust sandwich estimator which is easy to use in R and other systems

The estimate from the proportional model can be interpreted as some weighted average of the time dependent parameter. Bootstrap is then used to the SD for this average

What is the reason you would want to do that?

Not sure that I will do that but it is nice to know that one gets some average with some uncertainty. Personally I think that non proportionality is interesting because it may reveal something about the association, eg that the effect of some intervention decreases over time. Btw I cannot remember the methodological paper that shows that one gets an average over time, I think the first author name is something like Xu or Xi

Michael Schemper has a good paper about that. But another angle is that if PH is not known to hold, just spend extra parameters to capture this uncertainty, e.g., add treatment x log time as a time-dependent covariate.

1 Like

Thanks! I’ll look it up. Do you think that log time is good if follow up begins at zero or do you prefer some other simple function of time. Maybe a spline or fractorial polynomial perhaps. Cheers

A spline is best, if you have the number of events to support it. There are many new ways to do this with the rstanarm survival package in R. Robert Gray had a nice paper on penalized splines in modeling non-PH: