Sample size consideration for establishing a reference interval

EpiLearneR · October 18, 2019, 5:37pm

Is there a formula for calculating the sample size needed to establish a reference interval? Reference levels for thyroid function tests differ between ethnic groups. We would like to establish reference levels for each trimester of pregnancy in our population.

f2harrell · October 18, 2019, 8:32pm

Reference intervals, being sample-based rather than risk-based, are highly dependent on how the sample is chosen. That is more important than anything else. There are sample size formulas available; I just don’t know the references. If you want to be distribution-free, the Harrell-Davis quantile estimator is recommended (shameless advertisement), and this requires something like 300-400 subjects per homogeneous sample. The sample size will depend on the acceptable margins of error in estimating the quantiles of interest.

Because they are not risk-based, reference intervals are not consistent with medical decision making. One of the many ways problems arise is that a lab value can sit just inside the upper (or lower) limit of normal while the patient at that level has elevated disease risk, unbeknownst to the physician or patient.

daszlosek · October 19, 2019, 2:43am

I agree with Frank that reference intervals are sample-based. The Clinical and Laboratory Standards Institute (CLSI) recommends a minimum of 120 subjects per homogeneous sample. This sample size assumes that your data follow a normal distribution.

The introduction to reference interval analysis that I am most familiar with is CLSI’s “EP28-A3c: Defining, Establishing, and Verifying Reference Intervals in the Clinical Laboratory”. This is the resource that my company (IDEXX, biomarker discovery and medical diagnostics) uses.

I have found some of the recommendations in the CLSI document a little concerning, such as how to handle outliers, how to compute reference intervals from small samples, and a suggested “robust” reference interval calculation that is not really that robust.

EpiLearneR · October 19, 2019, 9:58am

Thanks, Prof. We will read more about the Harrell-Davis quantile estimator and implement it. Thanks again.

EpiLearneR · October 19, 2019, 9:58am

Thanks. Let me go through those articles

f2harrell · October 19, 2019, 12:19pm

Please edit your earlier reply instead of adding separate consecutive replies.

Donald’s advice is excellent. I’ll just add that it’s a good idea not to assume normality of lab measurements, hence the need for sample sizes larger than 120. The Harrell-Davis estimator (see the hdquantile function in the R Hmisc package) is a little more efficient than the ordinary nonparametric sample quantile estimator, and converges to it as $n \rightarrow \infty$.

EpiLearneR · October 21, 2019, 1:40pm

Hi Prof, I couldn’t find any option to edit the first reply, so I deleted the second reply. I see the option to edit only in the original first post, and now the whole post appears washed out. Kindly forgive my ignorance.

FH: Sorry, the edit privilege may only start once you have more posts under your belt.

Elias_Eythorsson · December 8, 2023, 10:11am

Could you provide any thoughts on how one could work backwards from an acceptable margin of error for the 2.5th and 97.5th percentiles to a required sample size?

f2harrell · December 8, 2023, 2:20pm

There are some formulas that may be helpful for ordinary sample quantiles. The only ways I can think of right now for the H-D quantile estimator are simulation, to find out how to achieve a given half-width of a 0.95 confidence interval for, say, a 0.95 quantile, or using pilot data to compute the bootstrap SE as described in the paper and then applying the universal multiplier. The margin of error of almost any estimate based on independent observations is inversely proportional to the square root of the sample size.
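As a rough sketch of the pilot-data route, the square-root rule can be turned into arithmetic: if a pilot sample of size n0 gives a standard error SE0, then a target half-width w needs roughly n0 * (z * SE0 / w)^2 subjects. The function names below are hypothetical, reading the multiplier as the normal-approximation z = 1.96 is an assumption, and the demo uses the ordinary sample quantile from Python's statistics module as a stand-in estimator rather than H-D, purely to keep the block short.

```python
import math
import random
import statistics


def bootstrap_se(data, estimator, n_boot=500, seed=0):
    """Bootstrap standard error of `estimator` applied to `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and re-estimate each time
    estimates = [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]
    return statistics.stdev(estimates)


def required_n(pilot_n, pilot_se, target_half_width, z=1.96):
    """Sample size giving half-width z * SE <= target, assuming SE is proportional to 1/sqrt(n)."""
    if pilot_n <= 0 or pilot_se <= 0 or target_half_width <= 0:
        raise ValueError("all inputs must be positive")
    ratio = z * pilot_se / target_half_width
    return math.ceil(pilot_n * ratio ** 2)


if __name__ == "__main__":
    # Simulated pilot data, for illustration only (not real lab values)
    rng = random.Random(1)
    pilot = [rng.lognormvariate(0.0, 0.5) for _ in range(200)]

    # Stand-in estimator: ordinary 2.5th percentile (first of 39 cut points)
    def p025(sample):
        return statistics.quantiles(sample, n=40, method="inclusive")[0]

    se0 = bootstrap_se(pilot, p025)
    print("pilot SE:", se0)
    print("n for half-width 0.02:", required_n(len(pilot), se0, 0.02))
```

The scaling is only as good as the pilot SE, so a small pilot sample gives a noisy answer. It is worth repeating the calculation for both the 2.5th and 97.5th percentiles and taking the larger n, since the two tails of a skewed lab measurement can behave quite differently.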

Elias_Eythorsson · December 8, 2023, 5:00pm

I found this excellent paper https://onlinelibrary.wiley.com/doi/epdf/10.1002/sim.2177, which seems to give a closed form approach for sample size calculations for reference limits.

f2harrell · December 8, 2023, 5:45pm

Just make sure they didn’t assume a certain distribution shape.