Validating and calibrating a Cox model when a categorical covariate has levels with few observations

Hi there,

I am opening this topic because of a problem I am facing when validating my Cox PH model with the validate() function.

My dataset consists of 666 observations, 501 events and 27 covariates (10 categorical and 16 continuous).

After developing the model, I run into problems when I try to validate() and calibrate() it, because three levels of the categorical covariate FLUID2 have few observations (AH, MEMT and X).

e.g. X matrix deemed singular for covariate FLUID2 value X


Although I did my best to regroup the categories, I still cannot run the bootstrap without issues.

One idea that came to mind is to run the bootstrap without the observations in those FLUID2 categories (AH, MEMT and X) while keeping them in the main model. But I do not know whether that would be correct.

I also do not know whether there are better ways to handle this issue, which some of you have surely faced before. Any help will be highly appreciated.

Very best regards and many thanks for your support,

Marc

This should have been placed in datamethods.org/rms5 but let’s give it a go.

You may have to combine infrequent levels. Hmisc::combine.levels can help with that. Or use the group= argument to calibrate and validate to do balanced bootstrap sampling. Specify something like group=mydataset$FLUID2.
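
To make the two suggestions concrete, here is a minimal sketch in R. It runs on simulated stand-in data, not the real dataset: the level names A and B, the level counts, the covariate age, the time and status columns, and B = 50 are all assumptions made up for illustration. Only FLUID2, its sparse levels AH, MEMT and X, and the name mydataset come from this thread.

```r
# Sketch only: simulated stand-in data, not the poster's dataset.
library(survival)
library(rms)

set.seed(1)
n <- 666
mydataset <- data.frame(
  # FLUID2 with three sparse levels (AH, MEMT, X); the level names A and B
  # and all counts are made up for illustration
  FLUID2 = factor(rep(c("A", "B", "AH", "MEMT", "X"),
                      times = c(360, 285, 8, 7, 6))),
  age    = rnorm(n, mean = 60, sd = 10)   # hypothetical continuous covariate
)
mydataset$time   <- rexp(n, rate = 0.1)   # hypothetical follow-up times
mydataset$status <- rbinom(n, 1, 0.75)    # 1 = event, 0 = censored

# Option 1: pool levels holding under 5% of the observations into "OTHER"
mydataset$FLUID2c <- Hmisc::combine.levels(mydataset$FLUID2, minlev = 0.05)
print(table(mydataset$FLUID2c))

# Option 2: keep the original levels and stratify the bootstrap on FLUID2,
# so every resample draws each level as often as it occurs in the data
fit <- cph(Surv(time, status) ~ FLUID2 + age, data = mydataset,
           x = TRUE, y = TRUE)   # x and y must be stored for resampling
val <- validate(fit, B = 50, group = mydataset$FLUID2)  # B = 50 is arbitrary
print(val)
```

If you go with the first option, refit the model with the pooled variable (FLUID2c here) before validating. With the second option, the same group= argument can be passed to calibrate(). Note that stratifying fixes the number of observations per level in each resample, not the number of events, so a level with very few events may still cause trouble.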

Hi, sorry for posting this in the wrong place.
Thank you for your help. I will continue this discussion at http://datamethods.org/rms5.

Thank you
