Sample size and power determination in high dimensional observational studies

Elisabetta · February 24, 2021, 5:30pm

Hi all,
I am collaborating with some biologists and clinicians on writing a project proposal. Last year the proposal was not funded, and one of the reviewers commented on the absence of any consideration of “power calculations and sample size statistical evaluation”.
The main study aim is to develop and externally validate a prognostic survival model starting from different types of biomarkers (molecular and imaging biomarkers) and clinical information.
For model development we will take advantage of retrospective cohort data already available on about 220 patients (in whom we will measure the biomarkers), whereas for the external validation we are planning to enroll another 250 patients (in whom we will measure only the biomarkers included in the developed model). We have no possibility of increasing these numbers.
The number of biomarkers to evaluate for inclusion in the model might be around 600 to 700. To develop the model, different strategies will be considered, all comprising some form of variable selection or dimensionality reduction.
How can we respond to the reviewer's comment? I see several issues here: high dimensionality, different types of biomarkers, no real a priori hypothesis, the observational nature of the studies, and a survival outcome. Considering all these factors together, I could not find any suitable method for sample size calculation or (a posteriori) power evaluation.
Any suggestions and references would be much appreciated.
Kind regards,
Elisabetta

scboone · February 24, 2021, 7:55pm

Richard Riley and others have recently published a series of papers on sample size calculations for prognostic/prediction models. His website contains links to several of them and might be a good starting point for further guidance:

They also published one specifically for prognostic models:
Minimum sample size for developing a multivariable prediction model: PART II - binary and time-to-event outcomes

Not an answer to all your questions, but I hope this might at least help a bit!
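
Since the sample sizes here are fixed, one way to use that paper is to turn its shrinkage criterion around and ask how many candidate parameters a given n can support. The sketch below is only that one criterion, n = p / ((S - 1) ln(1 - R2_CS / S)), with target shrinkage S = 0.9. The function names are made up for illustration, and the R2_CS values (anticipated Cox-Snell R squared) are pure assumptions that would need to come from prior literature or pilot data. The paper describes further criteria, and none of this directly handles penalized or high-dimensional selection.

```python
import math

def riley_min_n(p, r2_cs, shrinkage=0.9):
    """Smallest n for which the expected uniform shrinkage factor is at
    least `shrinkage`, given p candidate parameters and an anticipated
    Cox-Snell R squared (hypothetical helper, one criterion only)."""
    denom = (shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage)
    return math.ceil(p / denom)

def riley_max_params(n, r2_cs, shrinkage=0.9):
    """Inverse of the above: largest number of candidate parameters
    that a fixed sample size n supports under the same criterion."""
    return math.floor(n * (shrinkage - 1.0) * math.log(1.0 - r2_cs / shrinkage))

if __name__ == "__main__":
    n_dev = 220  # development cohort size from the original post
    for r2 in (0.1, 0.2, 0.3):  # assumed values, not estimates
        print(f"R2_CS={r2}: max candidate parameters = {riley_max_params(n_dev, r2)}")
```

With 600 to 700 candidate biomarkers the result will be far below the number of candidates for any plausible R2_CS, which is itself something that can be reported to the reviewer alongside the planned variable selection or dimension reduction strategy.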

Elisabetta · February 26, 2021, 1:37pm

Thank you very much. On the contrary, it was very helpful!
Elisabetta