Prevalence and probability

In constructing diagnostic algorithms and/or calculating probabilities of myocardial infarction (MI) for patients being investigated for possible ACS in the emergency department, I have noticed a wide range of reported prevalences of MI (~2 to 20%). My discussions with clinical specialists suggest that some of this difference is due to cohort selection, but most of it reflects clinician behaviour (eg being more likely to investigate in a litigious culture) or the health system (eg very low risk patients being more likely to be cared for within primary care and never reaching the ED). That is, the assumption is that the true population prevalence of MI is similar (at least in Western countries), but that the population being assessed varies. The alluvial plots illustrate this.

The assumption is that the low-prevalence population includes many more people with a very low risk of MI being assessed in the ED who would not be assessed in the high-prevalence setting.

If we assume that the true population prevalence of MI is similar in both cases then when it comes to developing risk prediction models:

  1. Would developing a risk prediction model in the high-prevalence population be more likely to be generalisable to the low-prevalence population but not vice versa? [my guess is “yes” but I may be wrong]

  2. Under the assumption that the difference lies in the numbers of very low-risk patients who don’t have MI, is there a way to incorporate prevalence into a model, or to use it to adjust a model, so as to improve calibration when the model is applied in a cohort with a prevalence different from the one in which it was derived? [a rough sketch of what I mean follows the note below]

Note - if we are to provide probabilities to clinicians instead of a diagnostic algorithm, then we would expect clinicians to use a very low (eg ~1%) probability to rule out MI. However, I am concerned that if (as has been done) a model is developed with a pre-specified decision probability threshold, that threshold may not travel well.
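To make question 2 concrete, below is the kind of adjustment I have in mind: a rough sketch (not a recommendation) that shifts a model’s intercept on the logit scale by the difference between the derivation-cohort and target-cohort prevalences. This only works to the extent that the covariate effects transport and the case-mix difference really is captured by prevalence alone; all numbers are purely illustrative.

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def adjust_for_prevalence(p_model, prev_dev, prev_new):
    """Shift predicted risks from a model derived at prevalence `prev_dev`
    so their implied baseline matches a new cohort's prevalence `prev_new`.
    Assumes the covariate effects transport and only the intercept needs
    updating - a strong assumption, not a given."""
    lp = logit(np.asarray(p_model))             # linear predictor implied by the predictions
    offset = logit(prev_new) - logit(prev_dev)  # intercept shift on the logit scale
    return 1 / (1 + np.exp(-(lp + offset)))

# illustrative numbers only: a model derived at ~20% prevalence applied
# to an ED population where MI prevalence is ~2%
p_model = np.array([0.01, 0.05, 0.20, 0.60])
print(adjust_for_prevalence(p_model, prev_dev=0.20, prev_new=0.02))
```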


Nicely set up problem. I’m apparently one of the few quantitative methodologists who don’t believe in prevalence. Prevalence is an attempt to quantify a background probability: an unconditional probability of disease for an entire population. For certain resource allocation problems, this can be a useful quantity, but for individual decision making I think not. Because of extreme heterogeneity in the risk of a diagnosis of MI and most other diseases, e.g., due to presence and type of chest pain, age, sex, etc., I prefer to use risk models to directly model this heterogeneity, rather than updating an unknown gemisch of risks represented by a single average number, the prevalence of disease. I would like to see an approach where prevalence is avoided entirely.


I would investigate MRP (multilevel regression and poststratification).

I could be summarizing this poorly, but a multilevel model would allow you to quantify risk for each subpopulation you would like to investigate. I think there is a package called “survivalstan” that has Bayesian survival models implemented and would allow you to quantify risk probabilistically.
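Very roughly, something along these lines, just as a sketch: I’ve used PyMC here rather than survivalstan because the outcome in question is a binary MI diagnosis rather than a time-to-event, and all covariates, the site structure, and the data below are made up for illustration.

```python
import numpy as np
import pymc as pm

# hypothetical data: y = MI (0/1), made-up covariates, `site` indexes ED sites/cohorts
rng = np.random.default_rng(0)
n, n_sites = 2000, 8
site = rng.integers(0, n_sites, n)
age = rng.normal(60, 12, n)
troponin = rng.lognormal(2, 1, n)
true_logit = -4 + 0.03 * (age - 60) + 0.4 * np.log(troponin) + rng.normal(0, 0.5, n_sites)[site]
y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

with pm.Model() as model:
    # population-level effects
    intercept = pm.Normal("intercept", 0, 2.5)
    b_age = pm.Normal("b_age", 0, 1)
    b_trop = pm.Normal("b_trop", 0, 1)
    # site-level varying intercepts capture between-cohort differences in case mix
    sigma_site = pm.HalfNormal("sigma_site", 1)
    a_site = pm.Normal("a_site", 0, sigma_site, shape=n_sites)
    logit_p = intercept + a_site[site] + b_age * (age - 60) + b_trop * np.log(troponin)
    pm.Bernoulli("mi", logit_p=logit_p, observed=y)
    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```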

But regarding high risk and low risk: are you defining a threshold prior to analysis? Why not let a model, based on covariates such as hypertension, diabetes, demographics, etc, define who is high risk from the data?

I’m not meaning to imply the use of any threshold, but speaking to a matter of degree, in relative terms.

Rather than subpopulations I like to think of (mostly continuous) patient descriptors that move us continuously along the risk spectrum.

I wonder when prevalence might be useful for resource allocation?

I think about prevalence as an estimate of a pre-test probability given the characteristics of the subpopulation.

But we always have some additional knowledge about an individual’s characteristics that allows us to provide an individual pre-test probability, and we can assess its accuracy by calibration or net benefit (NB); poor calibration leads to negative NB.
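For reference, this is the net benefit calculation I have in mind (the usual decision-curve formula); the predictions and outcomes below are placeholders, and the ~1% threshold echoes the rule-out discussion above.

```python
import numpy as np

def net_benefit(y_true, p_pred, pt):
    """Net benefit of investigating patients with predicted risk >= pt:
    NB = TP/n - FP/n * pt/(1 - pt). Compare against 'treat all' and 'treat none' (NB = 0)."""
    y_true = np.asarray(y_true)
    treat = np.asarray(p_pred) >= pt
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - fp / n * pt / (1 - pt)

# placeholder data
y = np.array([0, 0, 1, 0, 1, 0, 0, 0])
p = np.array([0.005, 0.02, 0.40, 0.01, 0.15, 0.003, 0.08, 0.02])
prevalence = y.mean()
print("model:    ", net_benefit(y, p, pt=0.01))
print("treat all:", prevalence - (1 - prevalence) * 0.01 / 0.99)
```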

On resource allocation it tends to be trickier, because lift compares the ranking of a model to the ranking of a “random guess” whose PPV equals the prevalence.

But in my experience we almost always have a better baseline model. When I develop a prediction model, my real competitor is not the prevalence but some very simple, easy-to-use mental map used by the decision maker.
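To make the lift point concrete, a small sketch with made-up data: lift at the top k% is the PPV among the highest-ranked patients divided by the prevalence, and random selection has a PPV equal to the prevalence, i.e. a lift of 1.

```python
import numpy as np

def lift_at_top(y_true, p_pred, frac=0.1):
    """Lift = PPV among the top `frac` highest-risk patients / overall prevalence.
    Random selection has PPV equal to the prevalence, i.e. a lift of 1."""
    y_true, p_pred = np.asarray(y_true), np.asarray(p_pred)
    k = max(1, int(round(frac * len(y_true))))
    top = np.argsort(-p_pred)[:k]   # indices of the k highest predicted risks
    return y_true[top].mean() / y_true.mean()

# made-up example: ~5% prevalence, a crude "good" model that concentrates events at the top
rng = np.random.default_rng(1)
y = rng.binomial(1, 0.05, 1000)
p = np.clip(0.05 + 0.3 * y + rng.normal(0, 0.05, 1000), 0.001, 0.999)
print(lift_at_top(y, p, frac=0.10))
```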