RMS Causal Inference

jsaho · May 28, 2023, 2:59pm

Hi @f2harrell and @Drew_Levy. I would like to ask about some terminology and concepts, and how they relate to the framework presented in the RMS course.

Frameworks for prognostic research:
My understanding of prognostic research is largely based on the framework proposed in the PROGRESS series of publications, and even more on a publication by Kent et al. 2020, “A conceptual framework for prognostic research” https://bmcmedresmethodol.biomedcentral.com/articles/10.1186/s12874-020-01050-7. I found it very helpful, also for a general understanding of the literature and published work (especially Table 1 in the Kent et al. paper).
However, Chapter 1 of RMS (Uses of models: hypothesis testing, estimation, prediction) does not seem to make a further distinction within the “prediction” type of model strategies. This made me wonder about your perspectives on the above frameworks, and whether there are reasons why you do not distinguish, for example, between studies/purposes about “association”/“predictor finding” and “prediction model development”.

Distinction between causal/etiology studies vs. prediction studies and implications for adjustment:
A further distinction is often made between causal/etiological and prediction research, as described, for example, by van Diepen et al. 2017, “Prediction versus aetiology: common pitfalls and how to avoid them” https://academic.oup.com/ndt/article/32/suppl_2/ii1/3056968?login=true. I think this distinction is related to the distinction between the “estimation” and “prediction” uses of models made in RMS Chapter 1. According to van Diepen et al., confounding and the corresponding adjustment are not an issue in prediction research, whereas they are an issue in causal/etiological research. However, as I understood from the course, even if the purpose of a study is purely prognostic, we should always be concerned about proper adjustment where possible. Is this correct, and how does it relate to the position of van Diepen et al.? Their perspective is fairly intuitive to me, and I would appreciate any thoughts on how it relates to the RMS course.

I also tried to use DAGs to think about confounder adjustment in both cases, etiological and prognostic studies. As I understand it, in etiological research we are interested in the effect of a specific variable X on an outcome Y, and in that case we of course need to adjust for confounders in order to estimate the effect of interest correctly. This is how I understand the use and application of DAGs as introduced in the course. However, when we want to develop a prediction model, we are interested in a whole set of variables (X1, X2, etc.) that can predict the outcome Y. So how do I know what to adjust for when there is no focus on a specific effect? In other words, how would I draw a DAG in this case? Do I need to consider confounders of all effects between the X’s and Y simultaneously? I would be very grateful if you could help me untangle this confusion.
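
To make the etiological half of the question concrete, here is a minimal simulated sketch of the single-exposure case. Everything in it is an illustrative assumption rather than something taken from the course or the cited papers: the DAG is Z → X, Z → Y, X → Y, the true effect of X on Y is set to 1.0, and the helper names `slope` and `residuals` are hypothetical. It only shows why adjustment matters when one specific effect is the target; it does not settle what to do when the whole set of X’s is used for prediction.

```python
import random

def slope(x, y):
    """OLS slope of y on x, with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals of y after regressing it on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

random.seed(0)
n = 20000

# Assumed data-generating process: Z confounds the X -> Y relationship.
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [1.0 * xi + 2.0 * zi + random.gauss(0, 1) for xi, zi in zip(x, z)]

# Regressing Y on X alone mixes the effect of X with the effect of Z.
unadjusted = slope(x, y)

# Adjusting for Z: residualize X and Y on Z, then regress the residuals.
# This equals the coefficient of X in the regression of Y on X and Z.
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"unadjusted slope: {unadjusted:.2f}")
print(f"adjusted slope:   {adjusted:.2f}")
```

Under these assumed coefficients the unadjusted slope should land near 2.0 and the adjusted slope near the true value of 1.0. Note that the unadjusted model is still a usable predictor of Y from X within this population, which is the intuition behind the van Diepen et al. position.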

Thanks a lot for any feedback!