The CARG-BC clinical prediction model recently published in JCO [1] strikes me as flawed on multiple levels: methodologically, and even decision-theoretically and ethically. In this post, I hope to explore specifically the gap between the methods employed and ‘best-practices’ prediction modeling such as might be done at the leading edge of ‘mainstream biostats’ as represented in this forum.
The paper is paywalled, and not even on Sci-Hub yet. But this screenshot of p.3 exposes most of the statistical suboptimalities that I can see, and it seems to fall within ‘fair use’ to post it:
I of course don’t even like the problem framing—why passively predict toxicities when you could actively avert them?!—but taking the authors’ clinical prediction problem for granted, what would be an ideal statistical approach?
[1] Magnuson A, Sedrak MS, Gross CP, et al. Development and Validation of a Risk Tool for Predicting Severe Toxicity in Older Adults Receiving Chemotherapy for Early-Stage Breast Cancer. J Clin Oncol. Published online January 14, 2021:JCO2002063. doi:10.1200/JCO.20.02063. PMID: 33444080.
See also several images from this tweet by Sumanta Pal: https://twitter.com/montypal/status/1350675255044395010