Monte Carlo simulation for sensitivity and specificity determination


I am a veterinarian looking to do a diagnostic accuracy meta-analysis. Some papers report sensitivity and specificity, but many do not. They do, however, report the mean and SD of the biomarker in healthy and diseased animals. Is it possible to set a diagnostic threshold and then run a simulation with these parameters to calculate sensitivity and specificity? Many thanks for your statistical expertise.



This is a good question. I think the statistically minded would avoid any type of threshold, and recommend keeping the data on a continuous scale. Instead of a “diseased/not diseased” classification, the ideal method would be to output a probability of disease.

But the data you have available might have been improperly dichotomized, making it less useful.

If you haven’t done so already, check out Chapter 18 in Prof. Harrell’s Biostatistics for Biomedical Research, a.k.a. BBR (first item on the link provided). He explains in mathematical detail why looking for “cut points” or thresholds is arbitrary. It wastes about a third of the data in the best case. The link has a list of many of the free publications that he has kindly made available on the web.

You might also find his section on sensitivity and specificity, and the problems with them, useful.

Coincidentally, I was thinking about a similar problem. I do not have any good ideas at the moment. My intuition suggests using a bootstrap: create synthetic, independent samples from the reported means/variances of the diseased and not diseased subjects in each study.
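To make the idea of synthetic samples concrete, and to connect it to the original question about a threshold, here is a minimal sketch. It assumes the biomarker is normally distributed within each group, which the reported mean and SD alone cannot confirm, and that values above the threshold count as positive. The function names and all numbers are hypothetical. Under that normality assumption the simulation is not strictly needed, because sensitivity and specificity follow directly from the normal CDF; the simulated values should simply agree with the analytic ones.

```python
import random
from statistics import NormalDist


def analytic_sens_spec(mean_d, sd_d, mean_h, sd_h, threshold):
    """Sensitivity/specificity when values above the threshold test positive.

    Assumes a normal distribution in each group (an assumption, not a given).
    """
    sens = 1.0 - NormalDist(mean_d, sd_d).cdf(threshold)  # diseased above cut
    spec = NormalDist(mean_h, sd_h).cdf(threshold)        # healthy at/below cut
    return sens, spec


def simulated_sens_spec(mean_d, sd_d, mean_h, sd_h, threshold,
                        n=100_000, seed=1):
    """Monte Carlo version: draw synthetic animals and count classifications."""
    rng = random.Random(seed)
    diseased = [rng.gauss(mean_d, sd_d) for _ in range(n)]
    healthy = [rng.gauss(mean_h, sd_h) for _ in range(n)]
    sens = sum(v > threshold for v in diseased) / n
    spec = sum(v <= threshold for v in healthy) / n
    return sens, spec


# Hypothetical study: diseased 14 (SD 3), healthy 10 (SD 2), threshold 12.
exact = analytic_sens_spec(14.0, 3.0, 10.0, 2.0, threshold=12.0)
approx = simulated_sens_spec(14.0, 3.0, 10.0, 2.0, threshold=12.0)
print("analytic :", exact)
print("simulated:", approx)
```

If the biomarker is skewed (as many are), the same approach could be applied on a transformed scale, but only if the papers report summaries on that scale.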

The Probit regression model seems to be the most appropriate one to use in your particular case, if I understand Prof. Harrell’s writings correctly.

I am assuming that the studies you have available give reasonable estimates of the hypothetical population values, and that this is an appropriate application of the “plug-in principle” described by Bradley Efron and Robert Tibshirani in An Introduction to the Bootstrap.

Suppose we have 5 studies, giving 5 pairs of means/variances, where D denotes the reported mean/variance for diseased subjects and N the reported mean/variance for not diseased subjects:
(D_{1}[\bar{x}, s^2], N_{1}[\bar{x}, s^2]) ... (D_{5}[\bar{x}, s^2], N_{5}[\bar{x}, s^2])

If the number of studies is small, pair the data from D_{1} with each of N_{1} ... N_{5}, create synthetic data for each group, pool it, and run a regression on the synthetic data. Store the result. Do this for all possible pairs of diseased/not diseased, then look at the distribution of the bootstrap regression estimates.

If the number of studies is too large (likely more than 10), then just randomly sample pairs from the diseased and not diseased summaries instead of enumerating every pair.
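The pairing procedure described above can be sketched as follows. This is only an illustration under stated assumptions: normal distributions within each group, an arbitrary synthetic sample size (`n_per_group`, which in practice would probably be each study's reported n), and a small hand-written probit fit by Fisher scoring so that the example needs only the standard library. The names `fit_probit` and `synthetic_pair_fits` and all of the numbers are hypothetical.

```python
import itertools
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fit_probit(x, y, iterations=25):
    """Fit P(y=1 | x) = Phi(a + b*x) by Fisher scoring; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = a + b * xi
            # Clamp to keep the weights finite for extreme observations.
            p = min(max(STD_NORMAL.cdf(eta), 1e-9), 1.0 - 1e-9)
            dens = max(STD_NORMAL.pdf(eta), 1e-12)
            w = dens * dens / (p * (1.0 - p))   # Fisher scoring weight
            z = eta + (yi - p) / dens           # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 weighted least-squares normal equations.
        det = s_w * s_wxx - s_wx * s_wx
        a = (s_wxx * s_wz - s_wx * s_wxz) / det
        b = (s_w * s_wxz - s_wx * s_wz) / det
    return a, b


def synthetic_pair_fits(diseased, healthy, n_per_group=100, seed=1):
    """diseased/healthy: lists of (mean, variance), one entry per study."""
    rng = random.Random(seed)
    fits = []
    # Every diseased summary is paired with every not-diseased summary.
    for (m_d, v_d), (m_h, v_h) in itertools.product(diseased, healthy):
        x = [rng.gauss(m_d, math.sqrt(v_d)) for _ in range(n_per_group)]
        x += [rng.gauss(m_h, math.sqrt(v_h)) for _ in range(n_per_group)]
        y = [1] * n_per_group + [0] * n_per_group  # 1 = diseased
        fits.append(fit_probit(x, y))              # store this pair's result
    return fits


# Hypothetical (mean, variance) summaries from three studies.
diseased = [(14.0, 9.0), (15.0, 6.25), (13.5, 10.0)]
healthy = [(10.0, 4.0), (10.5, 5.0), (9.5, 4.0)]
fits = synthetic_pair_fits(diseased, healthy)
slopes = sorted(b for _, b in fits)
print(len(fits), "fits; slope range:", slopes[0], "to", slopes[-1])
```

When there are too many studies to enumerate, the `itertools.product` loop could be replaced by a random sample of pairs. Note that this treats every pairing as equally plausible and ignores differences between studies (assay, population, sample size), which may or may not be reasonable.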

Perhaps the experts here will reply now that I have bumped this. Is my intuition on the bootstrap reasonable here? If not, where did I go wrong?