My book covers this in great detail. Here are the main issues:
- Parsimony and Occam’s razor improve results only if the parsimonious model is completely pre-specified. For your purpose, AIC would be useful only if you compared two completely pre-specified models.
- Any attempt to force parsimony by examining relationships between X and Y creates only the illusion of benefit and usually makes predictions worse.
- The data do not contain enough information for selecting the “right” model with any decent probability. The situation becomes worse if predictors are correlated with each other.
- Keeping non-significant variables in the model, unless you are using something like the lasso, is the only way to get accurate standard errors and accurate residual variance estimates.
- Your full model is so simple that trying to make it simpler seems an especially bad use of anyone’s time.
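The claim that the data cannot reliably identify the “right” model, especially with correlated predictors, is easy to demonstrate by simulation. The sketch below (my illustration, not part of the original answer; sample size, effect sizes, and correlation are all arbitrary choices) runs all-subsets selection by AIC on repeated datasets where the true predictor set is known, and counts how often selection recovers exactly that set:

```python
# Illustrative simulation: how often does all-subsets AIC selection
# recover the true predictor set when predictors are correlated?
# All settings (n, p, beta, rho, reps) are arbitrary for demonstration.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 100, 5, 200
true_set = {0, 1}                         # only x0 and x1 truly matter
beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
rho = 0.7                                 # pairwise predictor correlation
cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
L = np.linalg.cholesky(cov)

def aic(y, X):
    """AIC for an OLS fit with an intercept (Gaussian log-likelihood)."""
    X1 = np.hstack([np.ones((len(y), 1)), X])
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    rss = float(np.sum((y - X1 @ b) ** 2))
    return len(y) * np.log(rss / len(y)) + 2 * X1.shape[1]

# enumerate every subset of the p candidate predictors, including the empty one
subsets = [s for k in range(p + 1) for s in itertools.combinations(range(p), k)]

hits = 0
for _ in range(reps):
    X = rng.standard_normal((n, p)) @ L.T      # correlated predictors
    y = X @ beta + rng.standard_normal(n)
    best = min(subsets, key=lambda s: aic(y, X[:, list(s)]))
    hits += (set(best) == true_set)

rate = hits / reps
print(f"AIC picked exactly the true predictor set in {rate:.0%} of {reps} datasets")
```

Even with strong true effects, the selected model frequently includes noise variables or differs across replications, which is the instability the bullet points above describe.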
Look at my RMS course notes for an example where I use the bootstrap to obtain 0.95 confidence limits for the importance rankings of predictors. Even in a high signal-to-noise example, the confidence intervals are wide. This is a good way to show the difficulty of the task before you.
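The bootstrap exercise can be sketched in a few lines. This is a minimal illustration, not the RMS course code: it ranks predictors by absolute OLS t-statistic (the course uses other importance measures) on each bootstrap resample, then takes percentile limits of each predictor’s rank; the data and coefficients are made up:

```python
# Minimal sketch: bootstrap 0.95 confidence limits for predictor
# importance ranks. Importance here = absolute OLS t-statistic.
# Data, coefficients, and B are illustrative, not from any real example.
import numpy as np

rng = np.random.default_rng(1)
n, p, B = 200, 4, 500
X = rng.standard_normal((n, p))
y = X @ np.array([1.0, 0.7, 0.4, 0.1]) + rng.standard_normal(n)

def abs_t(X, y):
    """Absolute t-statistics of the slopes from an OLS fit with intercept."""
    X1 = np.hstack([np.ones((len(y), 1)), X])
    XtX_inv = np.linalg.inv(X1.T @ X1)
    b = XtX_inv @ X1.T @ y
    resid = y - X1 @ b
    s2 = resid @ resid / (len(y) - X1.shape[1])
    se = np.sqrt(s2 * np.diag(XtX_inv))
    return np.abs(b[1:] / se[1:])          # drop the intercept

ranks = np.empty((B, p), dtype=int)
for i in range(B):
    idx = rng.integers(0, n, n)            # resample rows with replacement
    t = abs_t(X[idx], y[idx])
    ranks[i] = p - t.argsort().argsort()   # rank 1 = largest |t|

lo, hi = np.percentile(ranks, [2.5, 97.5], axis=0)
for j in range(p):
    print(f"x{j}: 0.95 rank interval [{lo[j]:.0f}, {hi[j]:.0f}]")
```

Predictors with similar effect sizes typically get overlapping, wide rank intervals, which makes the point: the data rarely support a confident importance ordering, let alone a confident choice of which variables to drop.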
Variable selection/model selection is not as valuable as model specification.