A comparative study will assess the incidence rate of disease X across several inflammatory diseases (rheumatoid arthritis, ulcerative colitis, atopic dermatitis). We want to build predictive models for incident disease X. Several questions arise:
- To compare a risk factor for disease X (e.g., hypertension) across the inflammatory disease cohorts, how should we build the predictive model in each cohort? If a full model is fitted in each cohort, some predictors may not be statistically significant. If a parsimonious model is fitted in each cohort, some predictors may be dropped, so the risk factor of interest might not appear in every cohort's model.
- For predictive modeling in this setting, which is the better choice: Poisson regression or Cox regression?
- For the model chosen in the previous question, can we split the data into a training set, a validation set, and a test set in SAS? Is there a reference on this?
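To make the last question concrete, here is a rough sketch of the kind of three-way split I have in mind. It is written in Python for illustration only, since the SAS equivalent is what I am asking about. The function name `split_ids`, the 60/20/20 proportions, the seed, and the patient IDs 1 to 1000 are all assumptions made up for this example, not part of the actual study.

```python
import random

def split_ids(patient_ids, train_frac=0.6, valid_frac=0.2, seed=42):
    """Randomly partition patient IDs into training, validation, and test sets.

    Hypothetical helper: the fractions and the seed are illustrative assumptions.
    """
    # Deduplicate so that each patient lands in exactly one set.
    ids = sorted(set(patient_ids))
    # A seeded generator makes the split reproducible.
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = int(len(ids) * train_frac)
    n_valid = int(len(ids) * valid_frac)
    train = ids[:n_train]
    valid = ids[n_train:n_train + n_valid]
    test = ids[n_train + n_valid:]  # whatever remains goes to the test set
    return train, valid, test

# Made-up patient IDs 1..1000, standing in for one disease cohort.
train, valid, test = split_ids(range(1, 1001))
print(len(train), len(valid), len(test))
```

The split is done on patient IDs rather than on rows so that a patient with several records cannot end up in more than one set. Presumably it would be run separately within each disease cohort.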
Thank you so much.