# Should I impute person-time before conducting proportional hazards analysis?

Hi all. I will be conducting a proportional hazards analysis with my data. I’m planning on using multiple imputation to deal with missingness in covariates; however, I’m unsure whether it makes sense to impute person-time as well. Doing so would mean that individuals who have 0 follow-up (i.e., have baseline data only) would not need to be dropped from the analysis. I have been unable to find examples of this or even discussions of why this might not be a good idea. Any input would be welcome.

what proportion of the subjects have missing covariate data?

Re imputing survival: what is the survival outcome, e.g. is it a composite of events analysed as time-to-first?

Normally they would censor at time 0, I think, i.e. if the analysis plan demands that all randomised patients be in the analysis. You notice this when a Kaplan-Meier plot doesn’t start at 1 at time 0 but at some value just below 1.

You notice this when a Kaplan-Meier plot doesn’t start at 1 at time 0 but at some value just below 1.

Censoring at time zero (or any other time) won’t cause the K-M curve to drop; that only happens if an event occurs.

It will change the effective numbers at risk. If they’re missing at random, this will just reduce the sample size. If there are systematic differences between the arms in the reasons why people dropped out at time zero, and those differences are related to prognosis, then there’s a problem.
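To make the point about censoring concrete, here is a minimal sketch of the product-limit (Kaplan-Meier) estimator in plain Python. The follow-up times and event indicators are made up for illustration, and the risk set at an event time is assumed to be everyone whose observed time is at least that event time. Under that convention, people censored at time 0 never enter a risk set for a later event, so the estimated curve is identical with and without them.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimator)."""
    # Count events (not censorings) at each time.
    deaths = Counter(t for t, e in zip(times, events) if e)
    surv = 1.0
    curve = []
    for t in sorted(deaths):
        # At risk just before t: everyone still under observation at t.
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical follow-up times (years) and indicators (1 = event, 0 = censored).
times = [2.0, 3.0, 3.0, 5.0, 7.0]
events = [1, 0, 1, 1, 0]

# Same cohort plus three people with baseline data only, censored at time 0.
times_with_zero = times + [0.0, 0.0, 0.0]
events_with_zero = events + [0, 0, 0]

base = kaplan_meier(times, events)
extended = kaplan_meier(times_with_zero, events_with_zero)

print(base)
print(extended)
```

The two printed curves should match, and neither has a step at time 0, because a step only appears where an event occurs.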

@pravleen I’d worry less about how to include them and more about whether their missingness is introducing bias. You could use imputation to ask some what-ifs as a sensitivity analysis but I’d suggest not for the primary analysis.

I’m assuming the problem is that the analysis plan prespecifies a comprehensive analysis population. You see this often, as people copy-paste ‘ITT = all randomised’ from one SAP to another, then at the analysis stage they wonder how it can be satisfied.

Most covariates are missing for <5% but one is missing for ~50%. The outcome is incident hypertension.

This is a good point, Josie, and I do plan on doing sensitivity analyses to compare those lost to follow-up to those who were not. My hypothesis is that they are missing at random.

I have derived the person-time variable using study dates and currently everyone who was lost after baseline has person-time as missing. Based on this discussion, I wonder if I should assign them a 0 instead.

This amounts to excluding them from the analysis. It sounds like you have no analysis plan and therefore gain flexibility, but you could also be criticised for that. That said, it is not unusual to exclude those without post-baseline data; it’s in ICH E9, i.e. the definition of the full analysis set:

" … There are a limited number of circumstances that might lead to excluding randomised subjects from the full analysis set including … lack of any data post randomisation"

Personally I don’t like imputing a missing covariate for 50% of patients at all, but I have seen it done in e.g. NEJM. I’d consider dropping the covariate, depending on what it is.

I don’t work with clinical trial data so I had never seen the ICH E9 guidelines. But that was a helpful document.

this amounts to excluding them from the analysis.

You’re right, and I suppose that means it doesn’t matter whether I assign them a 0 or leave them missing. I think I will go ahead and exclude them from the full analysis set.
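As a rough check of why a person-time of 0 and an exclusion come to the same thing in a Cox model, here is a small sketch of the partial log-likelihood for a single covariate, written in plain Python. The data, the covariate values and the coefficient `beta` are all hypothetical, and the risk set at each event time is assumed to contain everyone whose observed time is at least that event time. Someone censored at time 0 is not an event and is in no risk set for a later event, so they contribute nothing.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Partial log-likelihood for one covariate x and coefficient beta."""
    ll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects add no term of their own
        # Risk set: everyone still under observation at this event time.
        risk = sum(
            math.exp(beta * x[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        ll += beta * x[i] - math.log(risk)
    return ll


# Hypothetical cohort: follow-up (years), event indicator, one covariate.
times = [2.0, 3.0, 4.0, 5.0, 7.0]
events = [1, 0, 1, 1, 0]
x = [1.2, 0.4, 0.9, 0.1, 0.3]

# Same cohort plus two people with baseline data only, given person-time 0.
times_zero = times + [0.0, 0.0]
events_zero = events + [0, 0]
x_zero = x + [2.5, -1.0]

beta = 0.7  # arbitrary value, just to evaluate the function
ll_excluded = cox_partial_loglik(beta, times, events, x)
ll_time_zero = cox_partial_loglik(beta, times_zero, events_zero, x_zero)

print(ll_excluded, ll_time_zero)
```

The two values should agree for any `beta`, so the fitted coefficient would not change either. How a given software package handles a follow-up time of exactly 0 is a separate question and worth checking in its documentation.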

Personally I don’t like imputing a missing covariate for 50% of patients at all, but I have seen it done in e.g. NEJM. I’d consider dropping the covariate, depending on what it is.

I appreciate that. It’s a clinically relevant covariate so I would like to keep it in the analysis. I also plan on repeating this analysis in other cohorts and would like to have homogeneous predictors throughout the analyses in order to compare them. I will be combining bootstrap and MI so that should help take care of some of the uncertainty.
