I feel that there must be a better reference than this:

Censoring of survival data also has an important influence on research results. Too high a censoring rate lowers the accuracy and effectiveness of an analytical model's results and increases the risk of bias. Hence, censoring rates should be reported in articles. Our results show that no articles reported the censoring rate, yet many exhibited excessive censoring. For example, our calculations show that the study by Xuexia et al. [32] had a censoring rate of up to 84%, severely influencing its results.

The paper is: Reporting and methodological quality of survival analysis in articles published in Chinese oncology journals. But their reference 32 is impossible to track down. I am dealing with datasets with a massive N, and hence a high censoring rate. My impulse is to use logistic regression, but the convention within the literature is to apply Cox regression. Any thoughts, opinions, or papers you can point to, e.g. simulation studies evaluating how these methods perform under heavy censoring? cheers
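As a rough sanity check on whether a high censoring rate by itself biases a likelihood-based survival estimate, here's a small simulation sketch. It's pure Python and assumes an exponential survival model with administrative censoring; the rate (0.1) and cutoff (1.74, giving roughly the 84% censoring mentioned above) are made-up illustration values, not anything from the paper:

```python
import random

def simulate_censored_mle(rate=0.1, cutoff=1.74, n=100_000, seed=42):
    """Simulate exponential survival times, apply administrative
    censoring at `cutoff`, and return (censoring fraction, MLE of rate).
    For the exponential model the MLE is events / total follow-up time."""
    rng = random.Random(seed)
    events = 0
    total_time = 0.0
    for _ in range(n):
        t = rng.expovariate(rate)
        if t <= cutoff:          # event observed before the cutoff
            events += 1
            total_time += t
        else:                    # right-censored at the cutoff
            total_time += cutoff
    return 1 - events / n, events / total_time

censor_frac, rate_hat = simulate_censored_mle()
print(f"censored: {censor_frac:.1%}, estimated rate: {rate_hat:.4f}")
```

In this toy setup the estimate stays close to the true rate even with ~84% of observations censored; what heavy censoring costs you is precision (far fewer events, so wider intervals), which is a separate issue from bias.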

Edit: and here is another dubious source, an antiquated math message board (query from 2007):

When there is a large proportion of censored observations, the likelihood function is skewed towards parameter values which yield long lifetimes. Also, the likelihood function is less sharply peaked (i.e. greater range of plausible values). These two features make conventional maximum likelihood inference less useful.

My advice to you is to take a Bayesian approach. In addition to the likelihood function, it may be possible to exploit information from the shape of the age distribution of the population at present; the age distribution is an observable consequence of the survival function (and the birth rate, if that is not stationary), and so it should be possible to incorporate the present age distribution in the likelihood function. (You heard it here first, folks. A few years ago I tried to follow through on this idea and didn’t get far.)
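The "less sharply peaked" claim in that post is at least easy to check numerically. Here's a sketch (pure Python, exponential model again, with a made-up rate of 0.1 and a cutoff of 1.74 chosen to give roughly 84% censoring) that measures the width of the range of rates whose log-likelihood lies within 2 of the maximum, with and without heavy censoring:

```python
import math
import random

def loglik_width(rate=0.1, cutoff=None, n=2_000, seed=1):
    """Width of the interval of rate values whose exponential
    log-likelihood lies within 2 of the maximum -- a rough proxy
    for the 'range of plausible values'."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        t = rng.expovariate(rate)
        if cutoff is not None and t > cutoff:
            data.append((cutoff, 0))   # censored observation
        else:
            data.append((t, 1))        # observed event

    def ll(lam):
        # Events contribute log(lam) - lam*t; censored obs only -lam*t.
        return sum(d * math.log(lam) - lam * t for t, d in data)

    grid = [0.01 + 0.0005 * i for i in range(600)]  # rates 0.01 .. 0.31
    vals = [ll(lam) for lam in grid]
    top = max(vals)
    plausible = [lam for lam, v in zip(grid, vals) if v > top - 2]
    return max(plausible) - min(plausible)

print(loglik_width(cutoff=None))   # no censoring: narrow interval
print(loglik_width(cutoff=1.74))   # ~84% censored: wider interval
```

With the same sample size, the heavily censored data give a noticeably wider plausible interval, consistent with the quoted point: the likelihood flattens because the effective number of events, not the nominal N, carries the information.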