Comparison of predictive models

A comparative study will assess the incidence rate of a disease X across cohorts of patients with different inflammatory diseases (rheumatoid arthritis, ulcerative colitis, atopic dermatitis). We want to build predictive models for incident disease X. Several questions arise.

  1. To compare a risk factor for disease X (e.g., hypertension) across the inflammatory diseases, how should we build the predictive model in each disease cohort? If the full model is fitted in each cohort, some predictors may not be significant. If a parsimonious model is fitted in each cohort, some predictors may not be retained in the model.
  2. For predictive modeling, which is the better choice in this case: Poisson or Cox regression?
  3. For the model chosen in question 2, can we split the data into a training set, a validation set, and a test set in SAS? Is there a reference on this?
    Thank you so much.
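
To make question 3 concrete, here is a minimal sketch of the kind of split I have in mind. It is written in Python rather than SAS, purely for illustration. The 60/20/20 proportions, the seed, the function name `split_ids`, and the made-up patient IDs are all assumptions on my part, not part of the study design. The split is done on unique patient IDs so that each patient falls into exactly one set.

```python
import random

def split_ids(ids, train_frac=0.6, valid_frac=0.2, seed=2024):
    """Randomly partition unique IDs into training, validation, and test lists.

    The fractions and seed are illustrative assumptions; whatever is left
    after the training and validation sets goes to the test set.
    """
    unique_ids = sorted(set(ids))      # one entry per patient, in a stable order
    rng = random.Random(seed)          # local generator so the split is reproducible
    rng.shuffle(unique_ids)
    n_train = round(len(unique_ids) * train_frac)
    n_valid = round(len(unique_ids) * valid_frac)
    train = unique_ids[:n_train]
    valid = unique_ids[n_train:n_train + n_valid]
    test = unique_ids[n_train + n_valid:]
    return train, valid, test

# Hypothetical cohort of 100 patients with IDs 1..100
patient_ids = list(range(1, 101))
train, valid, test = split_ids(patient_ids)
print(len(train), len(valid), len(test))
```

Presumably this would be done separately within each disease cohort, so that every cohort is represented in all three sets.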

Were my questions too stupid? I wonder why nobody answers them.