First of all, sorry for what is perhaps a novice question, but it has been bothering me for a couple of weeks now.

Basically, I have some data I wish to use with a few (probit-based) models to see whether they "work" on my data or not. My outcome is binary, and when I feed each model the values it requires, it returns a predicted risk. My question is: how do I evaluate the goodness of fit of each model? Quite a few people (including a reviewer) have suggested the Hosmer-Lemeshow test, but from what I have read it seems the test should be avoided. I know it is commonly used in my field (I've seen it done many times), but that doesn't make it better if it's actually not a good tool.
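For concreteness, here is a minimal sketch of the Hosmer-Lemeshow statistic as it is usually described (grouping by deciles of predicted risk and comparing observed vs. expected events per group). This is my own illustrative implementation, not from any particular package, and the degrees of freedom (g − 2 below) are the convention for a model fitted to the same data; for an external model other conventions exist.

```python
import numpy as np
from scipy.stats import chi2


def hosmer_lemeshow(y, p, g=10):
    """Hosmer-Lemeshow chi-square test (illustrative sketch).

    y : array of 0/1 outcomes
    p : array of predicted risks in (0, 1)
    g : number of groups (deciles of risk by default)
    """
    y = np.asarray(y)
    p = np.asarray(p)
    # sort by predicted risk and split into g roughly equal-sized groups
    order = np.argsort(p)
    stat = 0.0
    for idx in np.array_split(order, g):
        n = len(idx)
        observed = y[idx].sum()          # observed events in the group
        expected = p[idx].sum()          # sum of predicted risks
        pbar = expected / n              # mean predicted risk
        stat += (observed - expected) ** 2 / (n * pbar * (1 - pbar))
    # conventional df for a model fitted on the same data: g - 2
    pvalue = chi2.sf(stat, g - 2)
    return stat, pvalue
```

Note that the statistic (and hence the p-value) changes with `g`, which is one of the standard objections to the test.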

So how should I go about this?

Also, can this be visualized in some way? Since the outcome is binary, there is no direct way to pair a predicted probability with a 0 or a 1, unless I form groups and use the group estimates instead, and then check whether the regression line is close to the unity line. But as you are probably all aware, the grouping itself can be a problem: one can make the plot look as good or as bad as one likes by changing the group size. I am not sure this can even be done in a principled way?
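To make the grouping idea above concrete, here is a sketch of the binned calibration summary I have in mind: group observations by quantiles of predicted risk, then compare mean predicted risk to the observed event rate per bin (which one would plot against the unity line). The function name and binning scheme are my own choices for illustration; as noted, the picture depends on the number of bins.

```python
import numpy as np


def binned_calibration(y, p, bins=10):
    """Mean predicted risk vs. observed event rate per risk quantile bin.

    Returns two arrays suitable for plotting against the unity line.
    """
    y = np.asarray(y)
    p = np.asarray(p)
    # bin edges at quantiles of the predicted risks
    edges = np.quantile(p, np.linspace(0, 1, bins + 1))
    idx = np.clip(np.searchsorted(edges, p, side="right") - 1, 0, bins - 1)
    pred, obs = [], []
    for b in range(bins):
        mask = idx == b
        if mask.any():
            pred.append(p[mask].mean())   # mean predicted risk in the bin
            obs.append(y[mask].mean())    # observed event rate in the bin
    return np.array(pred), np.array(obs)
```

Plotting `obs` against `pred` with a 45-degree reference line gives the grouped calibration plot described above; rerunning with different `bins` values shows how sensitive the impression is to the grouping.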

Thanks in advance.