Confidence interval for positive/negative predictive value calculated from a theoretical prevalence

tamas.ferenci · April 17, 2020, 4:28pm

Assume we know the sensitivity and specificity of a diagnostic test from the usual 2\times 2 table. These are not the clinically useful metrics, so I want to convert them to positive and negative predictive values, which are the relevant (“forward information flow”) metrics. However, I want to show that these are not single, fixed values but depend on the prevalence, so I calculate them for a range of prevalences. (This assumes that sensitivity and specificity are fixed, which I know may be a risky assumption, but I believe it is the best I can do with a 2\times 2 table and nothing more.) This part is easy: the formulae are well known, being essentially Bayes' theorem.
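To make the starting point concrete, here is a minimal Python sketch of the Bayes-theorem conversion described above. The function names and the sensitivity/specificity values are made up for illustration; they do not come from any real test.

```python
def ppv(prevalence, sensitivity, specificity):
    """Positive predictive value via Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


def npv(prevalence, sensitivity, specificity):
    """Negative predictive value via Bayes' theorem."""
    true_neg = (1 - prevalence) * specificity
    false_neg = prevalence * (1 - sensitivity)
    return true_neg / (true_neg + false_neg)


# Illustrative (made-up) test characteristics, held fixed across prevalences
se, sp = 0.90, 0.95
for prev in (0.01, 0.05, 0.20, 0.50):
    print(f"prevalence={prev:.2f}  "
          f"PPV={ppv(prev, se, sp):.3f}  NPV={npv(prev, se, sp):.3f}")
```

With the same sensitivity and specificity, the PPV changes substantially as the assumed prevalence moves, which is the reason for reporting it over a range of prevalences.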

But the question is: how to calculate the confidence interval for PPV and NPV in the above setting? Better yet: is there any R package for this…?

I know it is easy to calculate the confidence interval for the particular prevalence implied by the 2\times 2 table, but note that the above setting is different, because in my question, the prevalence is an exogenously given input parameter (i.e., it has no sampling variability).

tamas.ferenci · April 19, 2020, 2:39pm

Professor Robert Newcombe kindly provided me with a solution to this problem, which I really appreciate, and I think it is worth sharing here in case anyone finds this thread in the future.

Take the example of PPV:

PPV=\frac{\pi\cdot se}{\pi\cdot se+\left(1-\pi\right)\cdot\left(1-sp\right)},

where \pi denotes the (now theoretical) prevalence, se is the sensitivity and sp is the specificity. At first glance this might look frightening (both se and sp appear in the denominator, and se also appears in the numerator), but there is a small trick that makes the whole question much easier: divide both the numerator and the denominator by se! Then only a single term contains these uncertain values. Note that \frac{se}{1-sp} even has its own name: it is usually called the positive likelihood ratio. Let’s denote it by PLR:

PPV=\frac{\pi\cdot se}{\pi\cdot se+\left(1-\pi\right)\cdot\left(1-sp\right)}=\frac{\pi}{\pi+\frac{1-\pi}{PLR}}=\frac{1}{1+\frac{1/\pi-1}{PLR}}.

Now the situation is much easier, because PLR=\frac{a/\left(a+b\right)}{c/\left(c+d\right)} (using the usual notation for 2 \times 2 tables, with a and b the test-positive and test-negative counts among the diseased, and c and d the same among the non-diseased), i.e., it is the ratio of two independent proportions. Obtaining a confidence interval for this is a well-known problem (see for example Olli Miettinen, Markku Nurminen: Comparative analysis of two rates. Statistics in Medicine, 4 (2), 213-226. DOI: 10.1002/sim.4780040211). R packages are readily available for this purpose (e.g. ratesci or PropCIs).
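As a rough illustration of this step, the sketch below computes a confidence interval for the PLR in plain Python. Note that it uses the simple log-method (Katz) interval, not the Miettinen-Nurminen score interval cited above, which is what one would more likely want in practice. The function name and the counts are made up for illustration, and the formula requires a > 0 and c > 0.

```python
import math
from statistics import NormalDist


def plr_log_ci(a, b, c, d, level=0.95):
    """PLR with a log-method (Katz) confidence interval.

    a, b: test-positive / test-negative counts among the diseased
    c, d: test-positive / test-negative counts among the non-diseased
    Requires a > 0 and c > 0 (the log method breaks down at zero counts).
    """
    sens = a / (a + b)            # sensitivity
    false_pos_rate = c / (c + d)  # 1 - specificity
    plr = sens / false_pos_rate
    # Standard error of log(PLR) for two independent binomial proportions
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return plr, plr * math.exp(-z * se_log), plr * math.exp(z * se_log)


# Made-up counts: 90 TP, 10 FN, 5 FP, 95 TN
estimate, lower, upper = plr_log_ci(90, 10, 5, 95)
print(f"PLR={estimate:.2f}, 95% CI: {lower:.2f} to {upper:.2f}")
```

The interval is symmetric on the log scale, so it is asymmetric around the point estimate on the original scale.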

Luckily, the transformation \frac{1}{1+\frac{1/\pi-1}{x}} is strictly monotonic (increasing) in x for any 0<\pi<1, so we can simply transform the endpoints of the confidence interval for PLR with the same formula to obtain the confidence interval for PPV.
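The endpoint transformation can be sketched as follows. The likelihood-ratio estimates and interval limits below are hypothetical numbers standing in for the output of whatever ratio-of-proportions method is used. The NPV part is my own analogue and is not spelled out above: it assumes the same trick applied to the negative likelihood ratio NLR = (1 - se)/sp, where the map is decreasing, so the endpoints swap.

```python
def ppv_from_plr(prevalence, plr):
    """PPV as a function of a fixed prevalence and the positive likelihood ratio."""
    return 1 / (1 + (1 / prevalence - 1) / plr)


def npv_from_nlr(prevalence, nlr):
    """NPV as a function of a fixed prevalence and the negative likelihood ratio."""
    return 1 / (1 + nlr * prevalence / (1 - prevalence))


def ppv_ci(prevalence, plr_lower, plr_upper):
    # Increasing in PLR: lower limit maps to lower limit
    return ppv_from_plr(prevalence, plr_lower), ppv_from_plr(prevalence, plr_upper)


def npv_ci(prevalence, nlr_lower, nlr_upper):
    # Decreasing in NLR: the endpoints swap
    return npv_from_nlr(prevalence, nlr_upper), npv_from_nlr(prevalence, nlr_lower)


# Hypothetical estimates and interval limits, for illustration only
plr, plr_lo, plr_hi = 18.0, 7.6, 42.4
nlr, nlr_lo, nlr_hi = 0.105, 0.059, 0.188

for prev in (0.01, 0.05, 0.20, 0.50):
    p_lo, p_hi = ppv_ci(prev, plr_lo, plr_hi)
    n_lo, n_hi = npv_ci(prev, nlr_lo, nlr_hi)
    print(f"prevalence={prev:.2f}  "
          f"PPV={ppv_from_plr(prev, plr):.3f} ({p_lo:.3f} to {p_hi:.3f})  "
          f"NPV={npv_from_nlr(prev, nlr):.3f} ({n_lo:.3f} to {n_hi:.3f})")
```

Because the prevalence is treated as a fixed input, all the uncertainty in the resulting intervals comes from the likelihood ratio.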

For details, see Robert G. Newcombe: Confidence Intervals for Proportions and Related Measures of Effect Size, CRC Press, 2012, especially pages 261 and 325.