Cox Proportional Hazards

pablo · April 8, 2022, 11:50am

Hi,

I would like to ask whether there is a threshold for the maximum number of censored individuals, given the total number of participants in a study. To give an example, 11 of my 46 patients have died and the rest are still alive (censored). Is there a rule for how many events the cohort needs in order to get robust results from the Cox PH model?
Thanks in advance!

f2harrell · April 8, 2022, 2:12pm

Estimating a hazard ratio to within a multiplicative (fold change) margin of error of 1.2 requires 462 events. You can compute the margin of error you would get with only 11 events. It will be something like 3, i.e., you can’t nail down the true hazard ratio to within a factor of 3.
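For readers who want to see where numbers like these can come from, here is a minimal Python sketch. It rests on assumptions that are not stated in the post: two groups with the events split equally between them, so that the variance of the log hazard ratio is roughly 4 divided by the total number of events, and a 0.95 confidence level (z = 1.96). The function names are made up for illustration.

```python
import math

Z_95 = 1.96  # assumed normal critical value for 0.95 confidence


def hr_margin_of_error(total_events, z=Z_95):
    """Multiplicative margin of error for a hazard ratio.

    Assumes two groups with events split equally, so that
    var(log HR) is approximately 4 / total_events.
    """
    se_log_hr = math.sqrt(4.0 / total_events)
    return math.exp(z * se_log_hr)


def events_for_margin(margin, z=Z_95):
    """Total events needed to reach a given multiplicative margin of error,
    under the same equal-split assumption (returned unrounded)."""
    return 4.0 * (z / math.log(margin)) ** 2


if __name__ == "__main__":
    print(events_for_margin(1.2))   # events needed for a margin of 1.2
    print(hr_margin_of_error(11))   # margin achievable with 11 events
```

Under those assumptions, a target margin of 1.2 works out to roughly 462 events, and 11 events give a margin a little above 3, in line with the figures quoted above.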

pablo · April 11, 2022, 8:49am

Thank you very much for the quick response.
In my case, the goal is to estimate the hazard in just one group (multivariate Cox regression with continuous independent variables); I don’t have two groups to estimate a ratio between. Should I keep the same rule in mind for this analysis too?

f2harrell · April 11, 2022, 11:11am

Multivariate means multiple dependent variables; I think you meant multivariable. With only one group, it sounds as if you are developing a multivariable prediction model (you didn’t state your goals). For developing a model there are excellent sample size papers by Richard Riley et al. in Statistics in Medicine. A crude approximation is that you need 15 events per candidate variable. You do not have enough events to look at even a single candidate variable.
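The crude 15-events-per-candidate-variable approximation is simple arithmetic, sketched below in Python. This only encodes the rule of thumb as quoted; it is not a substitute for the sample size methods in the Riley et al. papers, and the function names are hypothetical.

```python
EVENTS_PER_CANDIDATE = 15  # crude rule of thumb quoted in the post


def max_candidate_variables(events, events_per_variable=EVENTS_PER_CANDIDATE):
    """Largest number of candidate variables the rule of thumb allows."""
    return events // events_per_variable


def events_needed(n_candidates, events_per_variable=EVENTS_PER_CANDIDATE):
    """Minimum number of events the rule of thumb asks for."""
    return n_candidates * events_per_variable


if __name__ == "__main__":
    print(max_candidate_variables(11))  # candidate variables supported by 11 events
    print(events_needed(1))             # events needed for one candidate variable
```

With 11 events the integer division gives zero candidate variables, which is the sense in which there are not enough events for even one.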

pablo · April 11, 2022, 12:34pm

Yes, you are right. I meant multivariable.
I have a group of 46 observations (33 right-censored, 13 died) and 10 variables/measurements of blood characteristics.
I used the coxph function in R (survival package) to fit a Cox proportional hazards regression model.
The dependent variable is the survival object Surv(time to death, event death).
The independent continuous variables are the blood measurements.
My goal is to identify which of the variables in the model could be protective or risk factors for this group of patients, and to visualize the result using a forest plot.
From your reply, I understand that with this low number of events (13) I should use just one independent variable. Is that correct?

f2harrell · April 11, 2022, 1:38pm

No, it’s a bit worse than that. The sample is not adequate for the analysis. You might show some descriptive statistics, but you should not run any statistical tests, try to describe variable importance, or build a model.