I am currently investigating the effect of a novel biomarker on recurrence/death following a malignant disease. The biomarker in this case is the titre of autoantibodies (continuous and heavily skewed).
First of all, to avoid bias, I wanted to identify patients positive for the autoantibodies mathematically rather than by judgment. I investigated positivity as a dichotomous variable, applying the outlier criterion (P75 + 1.5 × IQR) to identify those “positive” for the autoantibodies. I also modelled the titre as a continuous variable using restricted cubic spline (RCS) regression.
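For concreteness, the outlier rule above can be sketched in Python as follows. This is only a minimal illustration: the function name `upper_fence` and the titre values are made up, and NumPy's default linear-interpolation percentiles are assumed (other percentile definitions give slightly different fences).

```python
import numpy as np

def upper_fence(titres, k=1.5):
    """Return P75 + k * IQR (the upper outlier fence) for a 1-D array."""
    q25, q75 = np.percentile(titres, [25, 75])
    return q75 + k * (q75 - q25)

# Hypothetical titres for illustration only (not real data).
titres = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0])

fence = upper_fence(titres)   # cut-off implied by the outlier rule
positive = titres > fence     # boolean flag: "positive" for autoantibodies
print(fence, int(positive.sum()))
```

Note that the fence is computed from the sample itself, so it will move with every new cohort.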
I would also like to find a cut-off above which the antibodies “start having an effect on prognosis” (I know this is not the most statistically correct approach). I know ROC analysis can be used to calculate one, but it does not take censoring into account. Does this work with time-dependent ROC analysis, as proposed by Heagerty? Or do I simply look at the restricted cubic spline analysis and take the point where the curve crosses HR = 1?
Check out chapters 18 and 19 of Biostatistics for Biomedical Research for the risk-modelling alternative to your problem. So-called “cut points” do not reproduce across studies, and they discard information contained in the data collected.
I concur with the “treat a continuous variable as a continuous variable first” approach. If you know of other variables associated with the outcome after the disease, then you need to find out whether the biomarker adds prognostic value beyond them. You may then want to create a multivariable prediction model; its predictions then become your new “biomarker”.
However, I understand the need for cut-offs when clinical decisions need to be made. I first talk to the clinicians about what rate of false negatives and/or false positives they would accept. This, of course, depends on the consequences and side effects of treating or not treating. The discussion may yield something like a minimum sensitivity, which can then be used to identify thresholds. You will need bootstrapping or a similar resampling method to estimate the threshold with a confidence interval. I do this by finding, in each bootstrapped sample, the threshold at which there is, say, 98.5% sensitivity. This would mean that the lower bound of any confidence interval for the threshold will have >98.5% sensitivity.
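The bootstrap procedure described above might look roughly like the sketch below. It rests on several assumptions that are not in the post: the outcome is treated as binary (censoring is ignored), a patient tests positive when the marker is at or above the threshold, a percentile interval is used, and the data are simulated. The function names are hypothetical.

```python
import numpy as np

def threshold_for_sensitivity(marker_events, target=0.985):
    """Largest observed cut-off t with P(marker >= t | event) >= target."""
    x = np.sort(np.asarray(marker_events))
    n = len(x)
    # With t = x[k], at least n - k of the n events test positive,
    # so we need the largest k with (n - k) / n >= target.
    k = int(np.floor(n * (1 - target) + 1e-9))
    return x[k]

def bootstrap_threshold(marker_events, target=0.985, n_boot=2000, seed=0):
    """Point estimate and 95% percentile bootstrap interval for the cut-off."""
    rng = np.random.default_rng(seed)
    x = np.asarray(marker_events)
    boot = np.array([
        threshold_for_sensitivity(rng.choice(x, size=len(x), replace=True), target)
        for _ in range(n_boot)
    ])
    lo, hi = np.percentile(boot, [2.5, 97.5])
    return threshold_for_sensitivity(x, target), lo, hi

# Simulated marker values among patients WITH the event (illustration only).
events = np.random.default_rng(1).lognormal(mean=2.0, sigma=0.7, size=300)

point, lo, hi = bootstrap_threshold(events)
print(f"threshold {point:.2f} (95% CI {lo:.2f} to {hi:.2f})")
print("sensitivity at lower bound:", np.mean(events >= lo))
```

Under this convention a lower threshold can only raise sensitivity, so choosing the lower bound of the interval is the conservative option. The price is paid in specificity, which should be checked at the same threshold among patients without the event.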
I think it’s dangerous to speak of cutoffs at the clinical decision point. Clinicians are used to looking at compromises when it comes to blood pressure, cholesterol, etc. Just because a marker is new doesn’t mean it should be treated differently than blood pressure. One way to understand that a cutoff should not be sought until possibly the very last second before the decision is made is that the cutoff mathematically must be a function of the continuous values of all the other risk factors. Another way to think about this is that if you push a cutoff on a biomarker you’ll need to measure more biomarkers to make up for the information loss.
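To see why the cutoff must depend on the other risk factors, consider a toy risk model. The sketch below assumes a logistic model with one biomarker and one additional covariate (age); the coefficients, the 20% decision threshold, and the function names are all invented for illustration and do not come from any fitted model. A survival model would lead to the same qualitative conclusion.

```python
import math

def predicted_risk(marker, age, intercept, beta_marker, beta_age):
    """Risk from logit(risk) = intercept + beta_marker*marker + beta_age*age."""
    lp = intercept + beta_marker * marker + beta_age * age
    return 1.0 / (1.0 + math.exp(-lp))

def marker_cutoff(risk_threshold, age, intercept, beta_marker, beta_age):
    """Marker value at which predicted risk equals risk_threshold."""
    logit = math.log(risk_threshold / (1.0 - risk_threshold))
    return (logit - intercept - beta_age * age) / beta_marker

# Hypothetical coefficients (assumptions, not estimates from data).
coefs = dict(intercept=-6.0, beta_marker=0.8, beta_age=0.05)

# Marker cutoff implied by a fixed 20% risk threshold at three ages.
cutoffs = {age: marker_cutoff(0.20, age, **coefs) for age in (40, 60, 80)}
for age, cut in cutoffs.items():
    print(f"age {age}: marker cutoff {cut:.2f}")
```

The same risk threshold maps to a different marker cutoff at every age, so any single fixed cutoff is wrong for most patients.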
Agree with this notion. As an example, in this design we minimized information loss by using continuous utility functions representing risk-benefit trade-offs used for decision-making. It is generally a good strategy to avoid dichotomization at the statistical estimation step, when calculating clinically relevant probabilities, and when accounting for trade-offs using utilities.