Method to obtain optimal cut-off point for survival outcomes

Hi all,

What method should I use to estimate a cutpoint for a quantitative variable X that splits patients into two groups with respect to the response Y, a survival / time-to-event outcome?

  • The classic approach I’ve seen is to plot the ROC curve and choose the cut-off closest to the perfect classification point. To use this, one has to transform the time-to-event data into a binary endpoint (e.g. survival at 3 years).

  • Maximally Selected Rank Statistics: I saw a function that does this in the {survminer} package. It calculates the log-rank statistic for every possible cutpoint and chooses the cutpoint with the largest statistic.

  • And others like… quartetic method, median selection (doi: 10.1111/COA.13689). I have not seen these methods yet.
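To make the maximally selected log-rank idea concrete, here is a minimal sketch in plain Python. It is not the {survminer} implementation; the function names, the simulated data, and the `min_frac` guard (which keeps either group from getting too small) are my own assumptions for illustration.

```python
import random

def logrank_chi2(times, events, high):
    """Two-group log-rank chi-square (1 df); `high` flags membership of group 1."""
    o_minus_e = 0.0
    var = 0.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)                      # at risk at t
        n1 = sum(1 for u, g in zip(times, high) if u >= t and g)  # at risk, group 1
        d = sum(1 for u, e in zip(times, events) if e and u == t)
        d1 = sum(1 for u, e, g in zip(times, events, high) if e and g and u == t)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the group-1 event count at time t.
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0

def max_logrank_cutpoint(x, times, events, min_frac=0.1):
    """Scan every observed value of x as a cutpoint; return (statistic, cutpoint)."""
    n = len(x)
    results = []
    for c in sorted(set(x))[:-1]:
        high = [v > c for v in x]
        k = sum(high)
        if min(k, n - k) >= min_frac * n:   # hypothetical guard against tiny groups
            results.append((logrank_chi2(times, events, high), c))
    return max(results) if results else (0.0, None)

# Simulated data (an assumption, not from any real study):
# the hazard triples when x > 6, with uniform censoring.
rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(120)]
true_t = [rng.expovariate(0.3 if v > 6 else 0.1) for v in x]
cens = [rng.uniform(0, 20) for _ in x]
times = [min(t, c) for t, c in zip(true_t, cens)]
events = [t <= c for t, c in zip(true_t, cens)]

stat, cut = max_logrank_cutpoint(x, times, events)
print(f"cutpoint = {cut:.2f}, log-rank chi-square = {stat:.2f}")
```

Because the statistic is the maximum over many candidate cutpoints, it should not be compared against an ordinary 1-df chi-square distribution; the naive p-value would be too small.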

Which of these would be a good method to use?

The general flavor of discourse around here tends to advise against cutpoints and dichotomizing continuous predictors. See: Categorizing Continuous Variables

Do you have a particular aspect of your research that makes finding an optimal cutpoint necessary?

Agreed that dichotomization is not the best idea. However, I have had my arm twisted before, so I did the following:

  1. Calculate some statistic (such as the log likelihood) from a Cox model for various cutpoints of X and choose the cutpoint with the highest log likelihood. For example, this was a plot from an analysis I put together: [plot not shown]

  2. Or use a tree-based method with X as the only predictor. I found it harder to extract the optimal cutpoint from the fitted tree.
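Option 1 above can be sketched roughly as follows. This is a plain-Python illustration, not the code behind the original analysis: it fits a Cox model with a single 0/1 covariate by Newton's method on the partial log-likelihood (Breslow handling of ties), then records the maximised log-likelihood for each candidate cutpoint. The function names, the step and range clamps, `min_frac`, and the simulated data are all assumptions.

```python
import math
import random

def cox_loglik_binary(times, events, z, max_iter=50):
    """Maximised Cox partial log-likelihood for one 0/1 covariate z.
    Returns (loglik, beta_hat)."""
    rows = []  # per distinct event time: (deaths, at risk with z=0, at risk with z=1)
    for t in sorted({u for u, e in zip(times, events) if e}):
        d = sum(1 for u, e in zip(times, events) if e and u == t)
        n0 = sum(1 for u, g in zip(times, z) if u >= t and not g)
        n1 = sum(1 for u, g in zip(times, z) if u >= t and g)
        rows.append((d, n0, n1))
    d1 = sum(1 for e, g in zip(events, z) if e and g)  # events with z=1

    beta = 0.0
    for _ in range(max_iter):
        w = math.exp(beta)
        grad, hess = float(d1), 0.0
        for d, n0, n1 in rows:
            p = n1 * w / (n0 + n1 * w)
            grad -= d * p
            hess -= d * p * (1 - p)
        if hess == 0.0:
            break
        # Clamp the step and beta so a monotone likelihood cannot overflow exp().
        step = max(-1.0, min(1.0, grad / hess))
        beta = max(-20.0, min(20.0, beta - step))
        if abs(step) < 1e-10:
            break
    w = math.exp(beta)
    loglik = beta * d1 - sum(d * math.log(n0 + n1 * w) for d, n0, n1 in rows)
    return loglik, beta

def scan_cutpoints(x, times, events, min_frac=0.1):
    """Return [(cutpoint, maximised log-likelihood)] for each admissible cutpoint."""
    n = len(x)
    profile = []
    for c in sorted(set(x))[:-1]:
        z = [v > c for v in x]
        k = sum(z)
        if min(k, n - k) >= min_frac * n:
            profile.append((c, cox_loglik_binary(times, events, z)[0]))
    return profile

# Simulated data (an assumption): hazard triples when x > 6, uniform censoring.
rng = random.Random(2)
x = [rng.uniform(0, 10) for _ in range(100)]
true_t = [rng.expovariate(0.3 if v > 6 else 0.1) for v in x]
cens = [rng.uniform(0, 20) for _ in x]
times = [min(t, c) for t, c in zip(true_t, cens)]
events = [t <= c for t, c in zip(true_t, cens)]

profile = scan_cutpoints(x, times, events)
best = max(profile, key=lambda row: row[1])
print(f"cutpoint with highest log-likelihood: {best[0]:.2f}")
```

Plotting `profile` (cutpoint on the x-axis, log-likelihood on the y-axis) gives the kind of picture described in option 1; a flat or jagged profile is itself a hint that no single cutpoint stands out.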

Whichever way you go, I strongly recommend getting a measure of variability somehow (e.g. using bootstrapping). When I put this together and showed the huge CIs for the ‘optimal’ cutpoint, I was able to dissuade folks from going down the route of dichotomization.
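For anyone wanting to try the bootstrap idea, here is a minimal sketch, assuming a log-rank-based selection rule and a simple percentile interval. The selection rule, function names, simulated data, and the choices of `n_boot` and `min_frac` are illustrative assumptions, not the original analysis.

```python
import random

def logrank_chi2(times, events, high):
    """Two-group log-rank chi-square, computed in one pass over sorted times."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n, n1 = len(times), sum(1 for g in high if g)   # currently at risk
    o_minus_e = var = 0.0
    i = 0
    while i < len(order):
        t = times[order[i]]
        d = d1 = out = out1 = 0
        while i < len(order) and times[order[i]] == t:   # everyone tied at t
            k = order[i]
            out += 1
            out1 += bool(high[k])
            if events[k]:
                d += 1
                d1 += bool(high[k])
            i += 1
        if d and n > 1:
            o_minus_e += d1 - d * n1 / n
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
        n -= out
        n1 -= out1
    return o_minus_e ** 2 / var if var > 0 else 0.0

def select_cutpoint(x, times, events, min_frac=0.2):
    """Cutpoint with the largest log-rank statistic, or None if none is admissible."""
    n = len(x)
    best_c, best_stat = None, -1.0
    for c in sorted(set(x))[:-1]:
        high = [v > c for v in x]
        k = sum(high)
        if min(k, n - k) < min_frac * n:
            continue
        stat = logrank_chi2(times, events, high)
        if stat > best_stat:
            best_c, best_stat = c, stat
    return best_c

def bootstrap_cutpoints(x, times, events, n_boot=100, seed=0):
    """Re-select the cutpoint on n_boot resamples of patients (with replacement)."""
    rng = random.Random(seed)
    n = len(x)
    cuts = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = select_cutpoint([x[i] for i in idx], [times[i] for i in idx],
                            [events[i] for i in idx])
        if c is not None:
            cuts.append(c)
    return sorted(cuts)

# Simulated data (an assumption): hazard triples when x > 6, uniform censoring.
rng = random.Random(3)
x = [rng.uniform(0, 10) for _ in range(100)]
true_t = [rng.expovariate(0.3 if v > 6 else 0.1) for v in x]
cens = [rng.uniform(0, 20) for _ in x]
times = [min(t, c) for t, c in zip(true_t, cens)]
events = [t <= c for t, c in zip(true_t, cens)]

cuts = bootstrap_cutpoints(x, times, events)
lo = cuts[round(0.025 * (len(cuts) - 1))]   # crude 2.5th percentile
hi = cuts[round(0.975 * (len(cuts) - 1))]   # crude 97.5th percentile
print(f"95% percentile interval for the cutpoint: [{lo:.2f}, {hi:.2f}]")
```

The important design point is that the whole selection procedure is rerun inside each resample, so the interval reflects the instability of the search itself and not just the sampling error of one fixed split.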

Great observations and approach. Confidence intervals for things that don’t exist do tend to be quite wide :)
