Converting a continuous output to a risk score category and selecting the optimal number of bins

Can you elaborate or point me to any resources? I have a bone mineral density (BMD) measurement but, for regulatory reasons, want to convert it to a score from 1 to N. I want to display the probability of having low BMD (i.e. the observed prevalence in my test set) for each score.
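To make the question concrete, here is a minimal sketch of what I have in mind. It assumes equal-frequency (quantile) bins, which is only one possible choice, and that the cut points are fitted on a development set and then applied unchanged to the test set. The function names and the toy numbers are made up for illustration, and the labels are assumed to be coded 1 = low BMD, 0 = not low.

```python
from bisect import bisect_right
from statistics import quantiles


def fit_bin_edges(values, n_bins):
    """Return the n_bins - 1 interior cut points at equally spaced quantiles."""
    return quantiles(values, n=n_bins, method="inclusive")


def to_score(value, edges):
    """Map a continuous value to an integer score in 1..len(edges) + 1.

    Score 1 holds the smallest values, so it is the lowest-BMD group.
    """
    return bisect_right(edges, value) + 1


def prevalence_by_score(values, labels, edges):
    """Observed fraction of positive labels (1 = low BMD) within each score."""
    counts = {}
    for value, label in zip(values, labels):
        score = to_score(value, edges)
        n, positives = counts.get(score, (0, 0))
        counts[score] = (n + 1, positives + label)
    return {s: positives / n for s, (n, positives) in sorted(counts.items())}


# Hypothetical toy data, not real measurements.
dev_values = [0.55, 0.62, 0.70, 0.74, 0.81, 0.88, 0.93, 1.01, 1.10, 1.18]
test_values = [0.58, 0.66, 0.79, 0.90, 1.05, 1.15]
test_labels = [1, 1, 0, 1, 0, 0]

edges = fit_bin_edges(dev_values, n_bins=3)
for score, prevalence in prevalence_by_score(test_values, test_labels, edges).items():
    print(f"score {score}: observed prevalence of low BMD = {prevalence:.2f}")
```

With this setup, choosing N comes down to a trade-off: more bins give finer scores but fewer test cases per bin, so the per-score prevalence estimates get noisier.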

By the way, you were right on this. The threshold didn’t generalize, which is why I am now trying to provide probabilities in the form of a risk score. If I were to calibrate the model to provide real-world probabilities, wouldn’t I need to bin them in some manner?

Thanks!