Comparing prediction models without likelihoods

Hi everyone.

This feels like a basic question, but we have not found a clear answer.

We are studying the added value of a novel predictor within a prediction model.
The baseline and augmented models are both “xgboost” models (gradient boosting with decision trees as base learners), and so have no clear notion of likelihood.

We want to be able to say that the novel biomarker “improves” the prediction in an NHST framework.
The standard “machine learning” way is to compare AUROC values.
We are familiar with the work (I think by Margaret Pepe) showing that the statistical test for comparing AUROCs is essentially a less powerful version of the likelihood ratio test.
We are also familiar with Prof. Harrell’s position that the LR test is the go-to test for these kinds of questions.
The problem is, again, that the model has no defined notion of likelihood.
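For context, the LR test is straightforward when the nested models do have likelihoods. A minimal sketch for a nested logistic regression (all data here is simulated and purely illustrative; the optimizer choice is an assumption, not a recommendation):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def neg_log_lik(beta, X, y):
    # negative log-likelihood of a logistic regression model
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta) - y * eta)

def fit_logistic(X, y):
    # fit by direct maximum likelihood (BFGS on the negative log-likelihood)
    res = minimize(neg_log_lik, np.zeros(X.shape[1]), args=(X, y), method="BFGS")
    return res.x, -res.fun  # coefficients, maximized log-likelihood

def lr_test(X_base, X_aug, y):
    """LR test of nested logistic models: columns of X_base are a
    subset of the columns of X_aug (the novel predictor is added)."""
    _, ll0 = fit_logistic(X_base, y)
    _, ll1 = fit_logistic(X_aug, y)
    stat = 2.0 * (ll1 - ll0)
    df = X_aug.shape[1] - X_base.shape[1]
    return stat, chi2.sf(stat, df)

# illustrative simulated data with a genuinely informative new predictor
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)                      # baseline predictor
x2 = rng.normal(size=n)                      # candidate novel predictor
eta = -0.5 + 1.0 * x1 + 0.8 * x2
y = rng.binomial(1, 1 / (1 + np.exp(-eta)))
ones = np.ones((n, 1))
X_base = np.hstack([ones, x1[:, None]])
X_aug = np.hstack([ones, x1[:, None], x2[:, None]])
stat, p = lr_test(X_base, X_aug, y)
```

The difficulty in the question is exactly that this recipe has no direct analogue for tree ensembles, where the fitted object is not a maximized likelihood.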

What then is the preferred approach to perform such a hypothesis test?

Thanks in advance,
Noam Barda


An excellent question. Take a look at the measures here. If you want to get a confidence interval for the amount of improvement due to addition of a new predictor, consider bootstrapping one of those indexes.
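One way to carry out the bootstrap suggestion: resample subjects with replacement and recompute the chosen improvement index on each resample. Below is a minimal sketch using the reduction in Brier score as the index (the index choice, the percentile interval, and the stand-in "models" are all illustrative assumptions; any of the indexes from the linked measures could be substituted):

```python
import numpy as np

def brier(y, p):
    # mean squared error of predicted probabilities
    return np.mean((p - y) ** 2)

def boot_ci_improvement(y, p_base, p_aug, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the reduction in Brier score when
    moving from the baseline to the augmented model
    (positive values mean the new predictor helps)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample subjects with replacement
        diffs[b] = brier(y[idx], p_base[idx]) - brier(y[idx], p_aug[idx])
    return np.quantile(diffs, [alpha / 2, 1 - alpha / 2])

# illustrative data: an informative augmented model vs. a null baseline
rng = np.random.default_rng(42)
n = 1000
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-2 * x))    # strong single predictor
y = rng.binomial(1, p_true)
p_aug = p_true                       # stand-in for augmented-model predictions
p_base = np.full(n, y.mean())        # stand-in for baseline-model predictions
lo, hi = boot_ci_improvement(y, p_base, p_aug)
```

In practice `p_base` and `p_aug` would be out-of-sample predictions from the two xgboost models; note that for honest intervals the model refitting would ideally happen inside the bootstrap loop as well.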


Dear Prof.,

Thank you for the quick answer. The article is very helpful and seems to answer our question precisely.


Apart from AIC, BIC, and χ²/df, what other measures are good for assessing model fit / model comparison when modeling count data, e.g., with a negative binomial model?


Good question, and I don’t have a definitive answer, but I would look for a way to compare the entire fitted and observed empirical distributions.
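One concrete way to do that comparison is to line up observed per-count frequencies against the fitted pmf (the idea behind rootograms). A minimal sketch with simulated overdispersed counts and a deliberately misspecified Poisson fit (the data and the moment-based fit are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# simulate overdispersed counts (negative binomial, illustrative only)
y = rng.negative_binomial(n=2, p=0.4, size=1000)

def observed_vs_fitted(y, dist, max_k=None):
    """Tabulate the observed count distribution next to a fitted
    parametric pmf, for a rootogram-style comparison."""
    if max_k is None:
        max_k = y.max()
    ks = np.arange(max_k + 1)
    obs = np.array([(y == k).mean() for k in ks])  # empirical frequencies
    fit = dist.pmf(ks)                             # fitted probabilities
    return ks, obs, fit

# fit a Poisson by matching the mean (its MLE)
pois = stats.poisson(y.mean())
ks, obs, fit_pois = observed_vs_fitted(y, pois)
# with overdispersed data, the Poisson fit puts too little mass at zero
# and in the tail; a negative binomial fit would track obs more closely
```

Plotting `obs` and `fit_pois` against `ks` (or their square roots, as in a rootogram) makes the lack of fit visible count by count, rather than compressing it into a single index.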

Good suggestion. Thank you!