Comparison of predictive models

huifangliang · April 29, 2022, 2:30pm

A comparative study will assess the incidence rate of disease X across several inflammatory diseases (rheumatoid arthritis, ulcerative colitis, atopic dermatitis). We want to build predictive models for incident disease X. Several questions arise.

1. To compare a risk factor for disease X (e.g., hypertension) across the different inflammatory diseases, how should we build the predictive model in each disease cohort? If a full model is used in each cohort, some predictors may not be significant. If a parsimonious model is used in each cohort, some predictors may not make it into the model.
2. For predictive modeling, which is better in this case: Poisson or Cox regression?
3. For the model chosen in question 2, can we split the data into a training set, a validation set, and a test set in SAS? Is there a reference on this?
Thank you so much.

huifangliang · May 5, 2022, 3:07am

Were my questions too stupid? I wonder why nobody answers them.