individual risk

ralphstern · May 27, 2023, 12:27am

Conventional wisdom in medicine is that individual risk exists and that clinical prediction models estimate individual risk.

However, the relevant philosophy of probability literature offers little or no support for the existence of individual risk. Two obvious issues are that individual risk is not observable and that it runs into the reference class problem. The latter is the fact that any individual belongs to innumerable groups, defined by different combinations of risk factors, whose risks differ, resulting in a multitude of risks that could be assigned to the individual. Any credible discussion of individual risk needs to include the reference class problem, but there are fewer than 10 citations in PubMed. “On Individual Risk”, a critical review by an eminent statistician, concludes “in the end that concept remains subtle and elusive.” https://link.springer.com/content/pdf/10.1007/s11229-015-0953-4.pdf
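A toy sketch may make the reference class problem concrete. The cohort counts below are invented purely for illustration (they come from no study), and the factor names and helper functions are hypothetical. The code takes one person and computes the observed event proportion in every group that person belongs to, where each group is defined by a different subset of the three risk factors.

```python
from itertools import combinations

FACTORS = ("smoker", "diabetic", "over_60")

# Invented counts, for illustration only:
# (smoker, diabetic, over_60) -> (events, people)
COHORT = {
    (0, 0, 0): (10, 400),
    (0, 0, 1): (30, 300),
    (0, 1, 0): (8, 100),
    (0, 1, 1): (15, 60),
    (1, 0, 0): (20, 200),
    (1, 0, 1): (30, 100),
    (1, 1, 0): (6, 30),
    (1, 1, 1): (4, 10),
}


def class_risk(person, used):
    """Event proportion among cohort members matching `person` on the factors in `used`."""
    events = total = 0
    for profile, (e, n) in COHORT.items():
        if all(profile[i] == person[i] for i in used):
            events += e
            total += n
    return events / total


def reference_class_risks(person):
    """Risk of every reference class the person belongs to (one per subset of factors)."""
    risks = {}
    for k in range(len(FACTORS) + 1):
        for used in combinations(range(len(FACTORS)), k):
            label = ", ".join(f"{FACTORS[i]}={person[i]}" for i in used) or "no conditioning"
            risks[label] = class_risk(person, used)
    return risks


person = (1, 0, 1)  # a smoker, non-diabetic, over 60
risks = reference_class_risks(person)
for label, risk in risks.items():
    print(f"{label:40s} {risk:.3f}")
```

In this made-up cohort the same person is assigned eight different risks, ranging from 0.09 to about 0.31, depending only on which factors are used to define the group. With more factors the number of candidate groups grows as 2 to the power of the number of factors.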

A related issue is the interpretation of clinical prediction models. If individual risks don’t exist, they can’t be estimated by clinical prediction models. A more plausible interpretation is that they estimate the risk of groups defined by their combinations of risk factors. The clinical benefit of risk stratifying a population, which is selective allocation of preventive measures, does not depend on the interpretation. But any presentation of results to individuals needs to be more nuanced.

I am posting this so the reference class problem becomes more widely known in medicine and to invite a discussion of the conventional wisdom.

giuliano-cruz · May 27, 2023, 7:46am

I see it the same way. The term “individual risk” is really a simplification to avoid technicalities. David Spiegelhalter does a lot of work on this and, as I understand it, favors communicating risk as the proportion of events among people with similar characteristics.

Ultimately, we are really talking about conditional probabilities: random variables that depend on a set of patient characteristics.

A related provocation contrasts the interpretation of risk as a conditional probability between the prognostic and diagnostic settings. Although formally there’s really no distinction, as I understand, “individual risk” has a natural frequentist interpretation as a conditional probability only in the prognostic setting – because the event of interest is yet to happen. However, for diagnosis, the patient either does or does not have the disease, so directly interpreting the observed risk as a conditional probability may be akin to common misinterpretations of confidence intervals. Of course, this all depends on frequentist interpretations of probability.

R_cubed · May 27, 2023, 12:02pm

Actuaries have dealt with this type of problem for at least a century in the context of property casualty insurance; extensions of the mathematical methods to health contexts also exist. The relevant topic is known in insurance circles as credibility theory. I posted a number of links to the basics of credibility theory in another thread. There are very interesting relationships between credibility theory and meta-analysis.

f2harrell · May 28, 2023, 4:05pm

This is a great discussion and I am so glad to know about the Dawid paper. I’m halfway through reading it and feel that it’s a definitive paper on the subject so far.

I think we can drive ourselves in circles worrying about this. I tend to think of risk as an information measure using best available information down to the lowest available unit of measurement (typically the patient, sometimes single cells, proteins, or genes). IJ Good argued convincingly that probability is a function of the eye of the beholder and different observers have different information available to them. Likewise, we have different depths of information about patient outcomes as a function of the diagnosis, disease severity, and treatment being studied. Instead of making a hard boundary about group and individual risk I think of information availability on a continuum and would phrase our goal as quantifying available information to make the best decisions. We should be trying to estimate Pr(Y | X) to the highest X-resolution that is practical.

@giuliano-cruz I don’t quite see diagnosis as being that different from prognosis. Disease is currently absent/present but has not yet been revealed to us. So for practical purposes it’s more similar to prognosis.

giuliano-cruz · May 28, 2023, 6:08pm

For all practical purposes, I see it the same way, @f2harrell! These theoretical considerations, however, do nudge me away from the “hard frequentist” perspective and bring me closer to Bayesian interpretations. If I couldn’t interpret disease probability as a formal probability within any given framework, then I am ready to change frameworks. Same for confidence intervals.

ralphstern · May 29, 2023, 2:03pm

“The philosophy of probability has several major branches, the best known of which are frequency and subjective. These two philosophies are (mistakenly) associated with the two main branches of statistics — frequency and Bayesian. The temperature of the intellectual dispute concerning the meaning and applications of probability has been high in both philosophy and statistics.” Krzysztof Burdzy, Resonance: From Probability to Epistemology and Back, preface, 2016

Probability Wars:
The disputes mentioned have been described as probability wars. Many of these are fought over the search for a single, comprehensive, rigorous definition of probability. IJ Good’s monist endorsement of Bayesian or subjective probability is an example. However, one can derive insight from the discussions, e.g. learning about individual risk (described as single event probabilities) and the reference class problem, without becoming a combatant. Good’s synthesis contrasts with the more common dualist synthesis, i.e. frequentist probabilities for science (probability of a heart attack in 10 years in a defined population) and Bayesian or subjective probabilities for degrees of belief (probability of a specific individual becoming the next president). (Philosophy Bites: Hugh Mellor on Probability). Burdzy points out that Bayesian/frequentist interpretations of probability are separate from Bayesian/frequentist statistics; one can accept that probabilities in science are frequentist but reject frequentist statistics.

“The most standard notation, among mathematicians and statisticians, for the probability of A given B is P(A|B), and a familiar beginner’s error is to forget about B when discussing the probability of A.” Irving J. Good, Good Thinking: The Foundations of Probability and Its Applications, 2009 (republication of 1983 book)

Conditional Probabilities:
When clinical prediction models calculate probabilities, the tail (i.e., B) of the conditional probability is cut off and an apparently unconditional probability is output. This supports the conventional wisdom that the model is calculating an individual risk. It simultaneously leads to confusion when different models using different B’s calculate different individual risks. But of course, conditional probabilities will change when the conditions considered change, so lability is a given.

Diagnosis versus Prognosis:
Recognizing the distinction is helpful to understand the behavior of models. Since individuals do or do not have a disease, Pr(Y|X) for an individual should be expected to converge to 0 or 1 (the known unknowns) as more X is added. In contrast, since individual risk does not exist, Pr(Y|X) for an individual does not have a true value to converge to. Instead, individual risks should be expected to follow a random walk as more X is added. The benefit of more X is a more dispersed population risk distribution. Sometimes adding more X has little or no effect (e.g., adding C-reactive protein or polygenic risk scores to pooled cohort equations using the Framingham risk factors).
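The diagnostic half of this claim can be sketched with a small simulation. It is only an illustration under strong assumptions that are not part of the argument above: each added piece of X is a binary test result, results are conditionally independent given true disease status, and the sensitivity, specificity, prior, and function names are all made up. Under those assumptions, repeated Bayes updates push Pr(Y|X) toward 1 for a patient who has the disease and toward 0 for one who does not.

```python
import random


def update(prob, positive, sens=0.8, spec=0.8):
    """One Bayes update of Pr(disease) after a single binary test result."""
    if positive:
        num = sens * prob
        den = num + (1 - spec) * (1 - prob)
    else:
        num = (1 - sens) * prob
        den = num + spec * (1 - prob)
    return num / den


def diagnostic_path(has_disease, n_tests, prior=0.2, sens=0.8, spec=0.8, seed=0):
    """Pr(disease | results so far) as conditionally independent tests accumulate."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    prob, path = prior, [prior]
    for _ in range(n_tests):
        # Chance of a positive result depends on the (hidden) true status.
        p_positive = sens if has_disease else 1 - spec
        positive = rng.random() < p_positive
        prob = update(prob, positive, sens, spec)
        path.append(prob)
    return path


for status in (True, False):
    path = diagnostic_path(status, n_tests=200)
    print(f"has_disease={status}: start {path[0]:.2f}, end {path[-1]:.4f}")
```

The sketch covers only the diagnostic case, where a true 0/1 status exists for the probability to converge to. It does not attempt to model the prognostic case.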

f2harrell · May 29, 2023, 3:30pm

I think it’s fair to say that the information that is used to condition on when computing probabilities involves a subjective choice, and the information available to the probability estimator is often different from observer to observer. So probabilities are always subjective in my opinion, and the Bayesian approach is always appealing.