I’ve been reading about evaluating the predictive performance of models with internal validation techniques, but the discussions usually don’t mention Bayesian models. I’m assuming this is because Bayesian models can produce full predictive distributions as well as point predictions, which is worth its own discussion. If we focus only on point predictions, does it make sense to apply the same techniques (split-sample, cross-validation, bootstrap, etc.) to evaluate the predictive performance of Bayesian models? Is the full posterior necessary only for probabilistic predictions and for checking model fit?

In my case, I have a Bayesian additive model and I want to compare its point predictions to those of a neural network (NN). My goal is to see whether the NN’s predictions are substantially better than those of the more interpretable model. To make this comparison, I plan to evaluate the squared error loss of the Bayesian model’s MAP predictions and of the NN’s predictions on bootstrap samples of the data. Does this seem appropriate?
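
To make the plan concrete, here is a minimal sketch of the kind of comparison I have in mind. It is an illustration under stated assumptions, not my actual code. It uses the out-of-bag variant of the bootstrap: both models are refit on each resample and scored on the rows left out of it, which is one of several ways to bootstrap a loss estimate. All function names are hypothetical. `ridge_map_fit_predict` is a stand-in for the Bayesian model's MAP (the MAP of a linear model with a Gaussian prior, which coincides with ridge regression), and `knn_fit_predict` is a simple flexible placeholder standing in for the NN.

```python
import numpy as np


def ridge_map_fit_predict(X_train, y_train, X_test, prior_precision=1.0):
    """Stand-in for the Bayesian model: MAP of a Gaussian-prior linear model."""
    # Add an intercept column; for simplicity the prior shrinks it as well.
    A = np.column_stack([np.ones(len(X_train)), X_train])
    B = np.column_stack([np.ones(len(X_test)), X_test])
    penalty = prior_precision * np.eye(A.shape[1])
    w = np.linalg.solve(A.T @ A + penalty, A.T @ y_train)
    return B @ w


def knn_fit_predict(X_train, y_train, X_test, k=5):
    """Placeholder for the NN: average the targets of the k nearest training rows."""
    sq_dist = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(sq_dist, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)


def oob_bootstrap_compare(X, y, model_a, model_b, n_boot=200, seed=0):
    """Refit both models on each bootstrap resample, score on out-of-bag rows."""
    rng = np.random.default_rng(seed)
    n = len(y)
    mse_a, mse_b = [], []
    for _ in range(n_boot):
        in_bag = rng.integers(0, n, size=n)            # sample rows with replacement
        oob = np.setdiff1d(np.arange(n), in_bag)       # rows not drawn this round
        if oob.size == 0:
            continue                                   # nothing to score on
        pred_a = model_a(X[in_bag], y[in_bag], X[oob])
        pred_b = model_b(X[in_bag], y[in_bag], X[oob])
        mse_a.append(np.mean((y[oob] - pred_a) ** 2))
        mse_b.append(np.mean((y[oob] - pred_b) ** 2))
    mse_a, mse_b = np.array(mse_a), np.array(mse_b)
    diff = mse_a - mse_b                               # paired: same rows for both models
    return {
        "mse_a": mse_a.mean(),
        "mse_b": mse_b.mean(),
        "diff_mean": diff.mean(),
        "diff_interval": np.percentile(diff, [2.5, 97.5]),
    }


if __name__ == "__main__":
    # Synthetic data purely for illustration.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(100, 3))
    y = X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.3, size=100)
    print(oob_bootstrap_compare(X, y, ridge_map_fit_predict, knn_fit_predict))
```

Because both models are scored on the same out-of-bag rows in every replicate, the difference in squared error is paired. I would read the percentile range of that difference as a rough indication of variability across resamples rather than as a formal confidence interval.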