I would like to ask about a predictive model for drug-related problems (DRPs) at discharge. I want to develop such a model for the four internal medicine departments of a secondary, 490-bed teaching hospital in Hadera, Israel.

I am planning to use logistic regression and random forest, and will consider 10-20 candidate parameters. I am planning to calculate the minimum sample size that satisfies the criteria recommended in "Stat Med. 2019 Mar 30; 38(7): 1276–1296.". Approximately 16,000 patients are discharged per year from these departments. If I go for 10 candidate parameters, a shrinkage of 0.9 and R2 CS-adj = 0.1, I calculate that I will need at least 900 to 1000 individuals. I thought of selecting the individuals at random, and in order to represent all departments and all seasons I calculated that I should randomly select 21 patients per department per month.
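
For reference, the arithmetic behind these numbers can be sketched as below. This is only a rough check, not the full procedure from the cited paper: it covers just the shrinkage criterion, n = P / ((S - 1) * ln(1 - R2_CS-adj / S)), and the function names `min_n_shrinkage` and `stratified_total` are made up for illustration. The paper's other criteria depend on the outcome proportion, which is not stated here, so they are left out.

```python
import math


def min_n_shrinkage(n_params: int, r2_cs_adj: float, shrinkage: float = 0.9) -> int:
    """Minimum n so that the expected uniform shrinkage factor is at least `shrinkage`.

    Shrinkage criterion only; the criteria that need the outcome
    proportion are NOT implemented in this sketch.
    """
    if not 0.0 < shrinkage < 1.0:
        raise ValueError("shrinkage must lie strictly between 0 and 1")
    if not 0.0 < r2_cs_adj < shrinkage:
        raise ValueError("r2_cs_adj must lie strictly between 0 and shrinkage")
    n = n_params / ((shrinkage - 1.0) * math.log(1.0 - r2_cs_adj / shrinkage))
    return math.ceil(n)


def stratified_total(n_departments: int, n_months: int, per_cell: int) -> int:
    """Total sample size when `per_cell` patients are drawn per department per month."""
    return n_departments * n_months * per_cell


if __name__ == "__main__":
    # Values taken from the question: 10 parameters, shrinkage 0.9, R2 CS-adj 0.1.
    print("shrinkage criterion n:", min_n_shrinkage(10, 0.1, 0.9))
    # 4 departments x 12 months x 21 patients per department per month.
    total = stratified_total(4, 12, 21)
    print("stratified total:", total)
    # Fraction of the roughly 16,000 yearly discharges that would be sampled.
    print("sampling fraction:", round(total / 16000, 3))
```

With the values above, the shrinkage criterion on its own comes out somewhat below 900, so the 900 to 1000 figure presumably reflects the remaining criteria or a safety margin. The required n scales linearly with the number of candidate parameters, so using 20 instead of 10 would roughly double it.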

I would like to ask whether this logic is sound, and how you would advise me to handle this study.

Thanks,

Elias