When is censoring too high?

i feel that there must be a better reference than this:

Censoring of survival data is also of important influence on research result. Too high rate of censor will be lower accuracy and effectiveness of analysis result of an analytical model, increasing risk of bias. Hence, the rates of censor should be reported in articles. The result shows no articles report the censoring rate, but many articles have the phenomenon of excessive rate of censoring. For example, the calculation shows the study done by Xuexia et al[32] has censored rate up to 84%, severely influencing the results.

the paper is here: Reporting and methodological quality of survival analysis in articles published in Chinese oncology journals. But their reference 32 is impossible to track down. I am dealing with datasets with massive N and thus the censoring rate is high. My impulse is to use logistic regression but the convention within the literature is to apply Cox regression. Any thoughts, opinions or papers you can point to eg using simulations to evaluate how methods perform? cheers

edit: and here is another dubious source, an antiquated math message board (query from 2007):

When there is a large proportion of censored observations, the
likelihood function is skewed towards parameter values which yield long
lifetimes. Also, the likelihood function is less sharply peaked (i.e. greater
range of plausible values). These two features make conventional maximum
likelihood inference less useful.
My advice to you is to take a Bayesian approach. In addition to the
likelihood function, it may be possible to exploit information from
the shape of the age distribution of the population at present; the age
distribution is an observable consequence of the survival function
(and the birth rate, if that is not stationary), and so it should be
possible to incorporate the present age distribution in the likelihood
function. (You heard it here first, folks. A few years ago I tried to follow
through on this idea and didn’t get far.)

I don’t get any of that. If over a study period [0, τ] the probability of an event is 0.01, you’ll have 0.99 of observations censored. The likelihood function is still perfectly valid, and the survival curve S(t) within that interval may be very well estimated. The log-likelihood may become more non-quadratic, so there is more need for likelihood ratio tests instead of Wald tests.
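This point is easy to check with a small simulation (a toy example of my own, not from any of the papers quoted above): exponential event times with a low hazard give roughly 99% administrative censoring, yet a hand-rolled Kaplan–Meier estimate of S(τ) stays essentially unbiased. The hazard, follow-up time, and sample size here are made up for illustration.

```python
import math
import random

random.seed(42)

# Assumed toy setup: exponential event times with hazard lam,
# administrative censoring at time tau. About 99% of subjects are
# censored, yet S(tau) is still estimated accurately.
lam, tau, n = 0.01, 1.0, 100_000

times, events = [], []
for _ in range(n):
    t = random.expovariate(lam)            # true event time
    if t >= tau:
        times.append(tau); events.append(0)  # administratively censored
    else:
        times.append(t); events.append(1)    # event observed

# Kaplan-Meier estimate of S(tau), computed by hand
order = sorted(range(n), key=times.__getitem__)
s, at_risk, i = 1.0, n, 0
while i < n:
    t = times[order[i]]
    d = c = 0
    while i < n and times[order[i]] == t:   # group tied times
        if events[order[i]]:
            d += 1
        else:
            c += 1
        i += 1
    if d:
        s *= 1 - d / at_risk                # KM factor at event times only
    at_risk -= d + c

true_s = math.exp(-lam * tau)
cens_rate = 1 - sum(events) / n
print(f"censoring rate ~ {cens_rate:.3f}")
print(f"KM S(tau) = {s:.4f},  true S(tau) = {true_s:.4f}")
```

The censoring rate is about 0.99, and the Kaplan–Meier estimate of S(τ) lands within simulation noise of the true value, which is the point: heavy administrative censoring by itself does not bias the survival estimate over the observed interval.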

At issue with censoring is

  • censoring needs to happen for a reason unrelated to impending risk of the event and
  • when the total sample size is small or the number of uncensored observations is small, one cannot estimate the hazard rate with precision and one cannot do relative comparisons

Think about a study in which 10,000 patients are followed for 1y and no one has an event. The survival curve S(t) is 1.0 until at least 1y and the compatibility interval for it will be very, very tight. This will be a great outcome. It’s just that you can’t compare two treatments on a relative scale (e.g., hazard ratio) when no one has an event. But a confidence interval for the difference in survival probability will work just fine.
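For the zero-event case described above, the upper confidence bound is simple to compute directly (this is the standard exact one-sided binomial bound and its "rule of three" approximation, not something taken from the thread's references):

```python
# If 0 events are observed among n subjects all followed for the full
# interval, the exact one-sided 95% upper confidence bound for the event
# probability solves (1 - p)^n = 0.05, i.e. p = 1 - 0.05**(1/n).
# The familiar "rule of three" approximates this as 3/n.
n = 10_000
exact_upper = 1 - 0.05 ** (1 / n)
rule_of_three = 3 / n
print(f"exact 95% upper bound: {exact_upper:.6f}")   # ~ 0.0003
print(f"rule of three 3/n:     {rule_of_three:.6f}")
```

With 10,000 event-free patients the event probability is bounded below about 0.0003, which is what "very, very tight" means concretely.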

Don’t switch to logistic regression because of censoring. Logistic regression has no way to handle variable follow-up time, and loses time-to-event information.


really appreciate your comments. will respond more fully when i have thought more about it and looked into it further. i stumbled upon this simulation study relating to the non-PH situation: Bias Of The Cox Model Hazard Ratio:

For random and late censoring the estimate decreases (higher bias) with increasing censoring proportion but remains stable relative to sample size.


Good point. I think this is a bit more of a goodness of fit issue than a pure censoring issue. This is a different issue, but I think in general we need to move to capturing uncertainty in proportional hazards, proportional odds, normality, etc. using full Bayesian models.


As Frank said, some censoring is expected, but I think what they are referring to is censoring before the end of the study (e.g., withdrawal of consent, loss to follow-up). When this type of censoring is high it may indicate problems with the design and the quality of the data. We may not know the exact details of why we lost a patient, but if the reason is related to the outcome (e.g., the patient is too ill to come to a follow-up appointment), we have informative censoring and the results will be biased. If censoring prior to study completion is low, the possibility of bias is minimal; but if it is high and many cases may be informatively missing (which we cannot know), your estimates may be biased.


Occasionally it’s helpful to run a Cox model to see predictors of time until censoring, after adjustment for date of subject entry into the study. This can help identify subject or clinical site characteristics associated with inadequate follow-up. When analyzing time until censoring, observations are censored at the time of the original event.


i have a dataset now with >99% censoring (administrative censoring, i.e. caused by termination of the study/follow-up, not loss to follow-up)… i used to consider 85% high!


Dr. Frank Harrell, can you suggest a reference that comments more on the topic: “when the total sample size is small or the number of uncensored observations is small, one cannot estimate the hazard rate with precision and one cannot do relative comparisons”? Thank you in advance.

The variance of the log hazard ratio is very close to 1/a + 1/b, where a and b are the numbers of events in each group. If either a or b is small, the variance is large.
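A small illustration of that approximation (the hazard ratio and event counts here are hypothetical, chosen only to show the scale of the effect):

```python
import math

def loghr_ci(loghr, a, b, z=1.96):
    """Approximate 95% CI for a hazard ratio using the event-count
    variance approximation var(log HR) ~ 1/a + 1/b."""
    se = math.sqrt(1 / a + 1 / b)
    return math.exp(loghr - z * se), math.exp(loghr + z * se)

# Hypothetical HR = 0.8, with 100 events per arm vs only 10 per arm
for a, b in [(100, 100), (10, 10)]:
    lo, hi = loghr_ci(math.log(0.8), a, b)
    print(f"{a} vs {b} events: 95% CI ({lo:.2f}, {hi:.2f})")
```

With 100 events per arm the interval is moderately wide; with 10 events per arm it spans from strong benefit to near doubling of hazard, regardless of how many censored patients were enrolled.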


Thank you very much for your reply, Dr. Frank Harrell. Is it possible for you to suggest a reference where I can consult the formulas of Cox regression, including the "1/a + 1/b" you mentioned in your answer? Sorry, I’m still learning this method. Thank you in advance.


Thank you for your reply Dr. Frank Harrell.