Hi Prof Harrell,
I have not attended your course (I really wish I could attend one in the future), and I hope it is okay for me to post a question here.
I am referring to Section 4.12 of the RMS book and the different models it describes:
- Developing Predictive Models
- Developing Models for Effect Estimation
- Developing Models for Hypothesis Testing
I noted from various posts on datamethods [1][2] that univariable analysis is not useful, and that using it to pre-screen variables for inclusion in a multivariable analysis is not appropriate.
My questions are:
- Should this practice be avoided for all 3 types of models, or can it be applied to some of them?
- In the question posted here [1], I understand that the variables were chosen solely on the basis of significant p-values (for which the cut-off of 0.30 is arbitrary). Suppose instead that we generate a Table 1 or a univariable analysis to get some idea of what should be included or excluded, and then, in the multivariable analysis, include all the statistically significant variables as well as others that are considered clinically important (based on previous studies or clinical practice). Would this be okay?
Additionally, how are the different models defined in the book? I read Shmueli, G. (2010). To explain or to predict? Statistical Science, 25, 289–310, but I wonder whether the book's definitions differ from the ones given there. In summary, as I understand them:
- Explanatory modelling:
  * Used for testing causal hypotheses, i.e. questions of cause and effect.
  * Here, it is important to identify the role of each variable on the specific causal pathway for the study question (confounder, collider, mediator, or effect modifier).
- Predictive modelling:
  * Used for the purpose of predicting new or future observations. The goal is to predict the output value (Y) for new observations given their input values (X).
- Descriptive modelling:
  * Used for summarizing or representing the data structure in a compact (parsimonious) manner.
  * In other words, it is used for capturing the association between the dependent and independent variables rather than for causal inference or for prediction.
I hope to learn more from you. Thanks!
Regards,
Hanis