I’m doing an external validation of a prognostic model. The model shows good discrimination (AUC 0.82) but poor calibration: O/E 1.58 (95% CI 1.3 to 1.8), calibration intercept 0.2 (95% CI −0.1 to 0.5), calibration slope 0.8 (95% CI 0.1 to 0.96).
My question is whether it is valid to run a decision curve analysis to assess clinical utility with this miscalibrated model, or whether I should do it only on a recalibrated model. I would appreciate any insights, or, if possible, literature that addresses the subject directly.
PS: It is an independent external validation of the KFRE in Peru, taking competing risks into account.
A bit technical, but this paper by one of the people who introduced decision curve analysis in medical studies deals specifically with the effects of (mis)calibration in decision analyses: https://journals.sagepub.com/doi/10.1177/0272989X14547233
They discuss the effects of miscalibration on decision curve performance in a handful of related publications as well.
In short, you can use decision curves with miscalibrated models, but their performance is generally worse than that of a (re)calibrated version of the same model. The poorer performance arises because, at a given threshold, fewer cases are correctly classified as cases and more non-cases are classified as cases. This misclassification lowers the net benefit, so the miscalibrated model generally performs worse. This is one of the advantages of decision curve analysis: miscalibration is weighed directly into the net benefit, which you then use to decide which (version of a) model yields more expected benefit if it were used.
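To make that concrete, here is a minimal sketch of how miscalibration flows into net benefit. This is simulated data, not the KFRE or your Peruvian cohort; the miscalibrated model is constructed as a monotone distortion of the true risks (slope 0.8 on the logit scale plus an intercept shift, loosely mimicking your reported slope and O/E), so both models have identical discrimination and only calibration differs.

```python
import numpy as np

def net_benefit(y, p, threshold):
    """Net benefit of treating patients with predicted risk >= threshold:
    NB = TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y)
    treat = p >= threshold
    tp = np.sum(treat & (y == 1))
    fp = np.sum(treat & (y == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

rng = np.random.default_rng(0)
n = 20000

# Simulated true risks on the logit scale; outcomes drawn from the true risk.
logit_true = rng.normal(-1.0, 1.2, n)
p_true = 1 / (1 + np.exp(-logit_true))
y = rng.binomial(1, p_true)

# Well-calibrated model: predicts the true risk exactly.
p_cal = p_true

# Miscalibrated model: same ranking (same AUC), but overpredicts with
# a calibration slope of 0.8 -- an illustrative assumption, not KFRE itself.
p_mis = 1 / (1 + np.exp(-(0.5 + 0.8 * logit_true)))

for pt in (0.1, 0.2, 0.3):
    print(f"threshold {pt}: NB calibrated = {net_benefit(y, p_cal, pt):.4f}, "
          f"NB miscalibrated = {net_benefit(y, p_mis, pt):.4f}")
```

At each threshold the miscalibrated model effectively treats patients down to a different true-risk cutoff than the one the threshold implies, so some patients are misclassified relative to the calibrated model and its net benefit is (generally) lower, even though the two models discriminate identically. That is exactly why the decision curve of the uncalibrated model is still interpretable: the penalty for miscalibration shows up directly in the curve.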
(Interestingly, a miscalibrated model can still provide more expected benefit than a well-calibrated but otherwise poorer-performing model. I think the papers linked above show some examples of this.)