Although it was published just over 20 years ago, I think this paper on Bayesian model averaging complements the discussion and recommendations in RMS. Its introduction explains the problems with prediction based on a single model (and, by implication, with step-down methods that use the data to select the “best” model).
Their discussion of frequentist solutions in the last section is an apt description of RMS, and much has been done from this perspective since the paper was written.
It may have been mentioned in the many RMS references, but I was not able to find it despite looking this morning.
This discusses computational aspects that simplify the implementation of Bayesian Model Averaging.
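
To make the idea concrete, here is a minimal sketch of one common computational shortcut: approximating posterior model probabilities with BIC weights and averaging predictions over all candidate models instead of picking a single "best" one. This is my own illustration, not necessarily the approach taken in either reference. It assumes ordinary least squares with Gaussian errors, equal prior probability for every model, and a predictor count small enough to enumerate every subset; the function names `fit_ols` and `bma_predict` are hypothetical.

```python
import itertools
import numpy as np


def fit_ols(A, y):
    """Least-squares fit of y on design matrix A; returns (coefficients, BIC)."""
    n, k = A.shape
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    # Gaussian-error BIC up to an additive constant shared by all models.
    bic = n * np.log(rss / n) + k * np.log(n)
    return beta, bic


def bma_predict(X, y, X_new):
    """Average predictions over all predictor subsets, weighted by exp(-BIC/2).

    Assumption: equal prior model probabilities, so the normalized BIC weights
    serve as approximate posterior model probabilities.
    """
    n, p = X.shape
    # Prepend an intercept column; every candidate model keeps it.
    D = np.column_stack([np.ones(n), X])
    D_new = np.column_stack([np.ones(X_new.shape[0]), X_new])

    preds, bics = [], []
    for r in range(p + 1):
        for subset in itertools.combinations(range(p), r):
            cols = [0] + [j + 1 for j in subset]
            beta, bic = fit_ols(D[:, cols], y)
            preds.append(D_new[:, cols] @ beta)
            bics.append(bic)

    bics = np.array(bics)
    # Subtract the minimum BIC before exponentiating for numerical stability.
    w = np.exp(-0.5 * (bics - bics.min()))
    w /= w.sum()
    return w @ np.array(preds), w


# Usage on simulated data: only the first predictor truly matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=50)
pred, weights = bma_predict(X, y, X[:5])
```

The averaged prediction carries the model uncertainty that a step-down procedure discards when it reports only the selected model. Full enumeration grows as 2^p, so this sketch is only practical for a handful of predictors.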