Sample size in Epidemiological studies


Some time back an interesting discussion took place on inappropriate sample size in clinical trials (here: ANDROMEDA-SHOCK (or, how to intepret HR 0.76, 95% CI 0.55-1.02, p=0.06)).

I work as an epidemiologist and health economist. Since that discussion, I have been planning to create a short slide kit on how an incorrect sample size matters in the context of epidemiological studies. I have been looking for such studies for a while now but have not come across any. Is anyone aware of sample size issues in epidemiological studies (a small sample size leading to incorrect interpretation of the population impact of an intervention, the effectiveness of an intervention, …)?

I am using this platform. So in case I have described the issue in too much detail (or have not provided enough context), kindly let me know.

Thanks a lot for your help in advance!




If epidemiology means uncontrolled, retrospective etc. then I would have examples. I would have to come back to you though when less busy. Other statisticians working in academia might have examples, i.e. clients who present them with a spreadsheet of a dozen babies and some data on the mothers and say: what can you do with this? Some conferences (presenting mostly epidem results) demand post hoc power calculations from those submitting abstracts. It doesn’t make sense, because power computed from the observed effect just restates the p-value, but you can appreciate the impulse.
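
To illustrate why post hoc power adds nothing to an abstract, here is a minimal sketch, assuming a simple two-sided z-test at alpha = 0.05 (the function name `observed_power` is made up for this example). It treats the observed effect as if it were the true effect, which is what these "observed power" calculations usually do, and shows that the result is just a transformation of the p-value.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def observed_power(p_value, alpha=0.05):
    """Post hoc ('observed') power of a two-sided z-test, computed by
    treating the observed effect as if it were the true effect."""
    z_obs = _STD_NORMAL.inv_cdf(1 - p_value / 2)   # |z| implied by the p-value
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)    # rejection threshold
    # Probability of landing in either rejection region if the true z were z_obs
    upper = 1 - _STD_NORMAL.cdf(z_crit - z_obs)
    lower = _STD_NORMAL.cdf(-z_crit - z_obs)
    return upper + lower

for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.2f}")
```

Under these assumptions a p-value of exactly 0.05 always maps to an observed power of about 50%, and larger p-values always map to lower power, so the number carries no information beyond the p-value itself.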




Thanks a lot @PaulBrownPhD!

Yes. By epidemiology I mean uncontrolled studies, but not only retrospective ones; they can be prospective as well. The important point is that no intervention is introduced in a systematic manner as part of the study. For example, a study of vaccine effectiveness is an epidemiological study when advice to get vaccinated is not part of the study, but it becomes an interventional study when such advice is part of the study.

Thanks a lot for the link. It would be great if you could share some more studies when you have some time.

Thanks and Regards,



When people refer to epi studies they are usually talking about observational study designs such as cohort, case-control or cross-sectional studies.

Here’s a good reference from Dr. Madhukar Pai

@AmitBhavsar85 It would be nice to see slides discussing the interpretation of analyses using large sample sizes, e.g. a regression on 1 million cases from administrative health data.
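
As a rough illustration of the large-sample issue, here is a minimal sketch, assuming a two-group comparison of means with equal group sizes and unit variance (the helper `two_sided_p` and the effect size of 0.01 SD are hypothetical, chosen only for the example). It shows how the same negligible difference moves from "not significant" to "highly significant" purely because N grows.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(std_diff, n_per_group):
    """Two-sided z-test p-value for a standardised mean difference
    between two equal-sized groups (unit variance assumed)."""
    z = std_diff * sqrt(n_per_group / 2)
    return 2 * NormalDist().cdf(-abs(z))

tiny_effect = 0.01  # hypothetical: 1% of a standard deviation
for n in (1_000, 100_000, 1_000_000):
    print(f"n per group = {n:>9,}: p = {two_sided_p(tiny_effect, n):.3g}")
```

With a million cases per group the p-value is vanishingly small even though the effect is arguably irrelevant, which is why slides on large admin datasets might focus on effect sizes and interval estimates rather than significance.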



Because a funding application should refer to some existing data to support the proposed study, it may almost become ‘method’ to obtain some preliminary data in the clinic, then transition slightly by referring to said data as a ‘pilot study’, and then transition a little further and present the results at a conference with the caveat ‘requires replication’ or ‘encourages further research’. But really the results are very feeble and not worthy of a conference presentation (just my opinion). And it seems like this is what you’re trying to discern? I.e. some possible trend might emerge indicating an issue with power and over-confidence? I guess it’s possible given the circumstances. I think if you attend epidem conferences you will see a strange mix of analyses on registry databases of several million patients and then results based on two dozen patients(?)

Edit: you said “am looking for such studies for a while now”. I would just look at all the abstracts of an upcoming conference and see what the Ns look like. It could be interesting but will very much depend on the field, e.g. in cardiology they will be tens of thousands, in neonatology they may be less than 100.
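
To put those Ns in perspective, here is a minimal sketch, assuming a crude Wald interval for a single proportion (the function name `wald_ci_half_width` and the observed proportion of 0.30 are hypothetical). The Wald interval is known to behave poorly at small N, so treat this only as a rough picture of how precision scales with sample size.

```python
from math import sqrt

def wald_ci_half_width(p_hat, n, z=1.96):
    """Half-width of an approximate 95% Wald interval for a proportion."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

observed = 0.30  # hypothetical observed proportion
for n in (24, 100, 20_000):
    hw = wald_ci_half_width(observed, n)
    print(f"N = {n:>6,}: {observed:.2f} +/- {hw:.3f}")
```

With two dozen patients the interval spans roughly 18 percentage points either side of the estimate, while with tens of thousands it is well under one percentage point, so the same headline number supports very different conclusions.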



You might also consider that epidem studies sometimes have too much power, leading to confident assertions: The risk of death from suicide is inversely related to BMI in middle-aged and older adults. One would not expect this to be a pre-specified hypothesis; also, they might have done a cumulative risk analysis.