This is a problem when analyzing data the indirect way, using sensitivity and specificity. If, on the other hand, you are in forward-time predictive mode, where you estimate probabilities of disease on the basis of current data, I don’t think you need to do anything special with, for example, logistic regression.
To within the resolution of the available data, you use pre-test patient characteristics to model risk, and the predictors in the model should include indicators of what makes a patient tend not to get the ultimate test. So, based on the best available evidence, you’ve accounted for what needs to be accounted for. Unless I’m missing something.
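A minimal sketch of that forward-time approach on simulated data (the variable names, effect sizes, and the single `symptom` referral factor are all hypothetical, chosen only for illustration): a characteristic that drives both disease and referral to the final test is simply included as a predictor in the risk model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000

# hypothetical pre-test characteristics; `symptom` stands in for the kind of
# factor that also influences whether a patient gets the ultimate test
age = rng.normal(60, 10, n)
symptom = rng.binomial(1, 0.3, n)
logit = -6.0 + 0.05 * age + 1.5 * symptom
disease = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# forward-time predictive mode: model P(disease | pre-test data) directly,
# with the referral-related predictor included; no Sens/Spec machinery needed
X = np.column_stack([age, symptom])
model = LogisticRegression().fit(X, disease)
risk = model.predict_proba(X)[:, 1]
```

The point of the sketch is only that nothing special happens: the referral-related factor enters the model like any other covariate.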
Ben’s presentation is excellent. I was hoping to hear something about resistance to workup bias that results from having enough representation of patients of a certain type in the complete data. For example, if females seldom get the final diagnosis but you have 100 females in your dataset who did, the model might be OK.
It is indeed an excellent presentation! I showed it to my team two days ago.
For me the most disturbing points are point 8 for diagnosis (this thread) and point 9 for prognosis, which leads me to counterfactual predictions.
I sent Ben an email asking about the subject, and he sent me some links to work done by Joris A H de Groot and a related tutorial in R:
While I do agree that enormous efforts have been dedicated to estimating the wrong performance metrics (Sens, Spec), I do not agree that there is no need for corrections to fix verification bias - not necessarily for model development, but for model validation.
I use Lift / PPV conditional on PPCR (a flexible resource constraint). If I estimate Lift / PPV naively for PPCR = 0.05 (only 5% of the patients can be validated), I will get very different results, since the top 5% at risk in the validation set are very different from the top 5% in the general population.
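A small sketch of how such metrics can be computed (the function name and the toy data are mine, not from any package): rank patients by predicted risk, flag the top PPCR fraction, and take PPV among the flagged patients, with Lift defined as PPV over prevalence.

```python
import numpy as np

def ppv_lift_at_ppcr(risk, outcome, ppcr):
    """PPV and Lift among the top `ppcr` fraction of patients ranked by risk."""
    n_flag = max(1, int(round(ppcr * len(risk))))
    top = np.argsort(risk)[::-1][:n_flag]     # highest-risk patients
    ppv = outcome[top].mean()                 # precision in the flagged group
    prevalence = outcome.mean()
    return ppv, ppv / prevalence              # Lift = PPV / prevalence

# toy example in which the risk score genuinely tracks the outcome
rng = np.random.default_rng(1)
risk = rng.uniform(size=1000)
outcome = rng.binomial(1, risk)               # outcome more likely at higher risk
ppv, lift = ppv_lift_at_ppcr(risk, outcome, ppcr=0.05)
```

The verification-bias point then shows up directly: evaluating this on the verified subset ranks and flags a different top 5% than the general population would give.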
A propensity score looks like a reasonable solution to me (once again, so much effort for Sens and Spec):
I don’t see a role for propensity scores here (or almost anywhere else). And if corrections for verification bias are needed, it is better, and often possible, to include factors related to those corrections in the statistical model for the diagnostic outcome.