As I said, I reach for quantiles when I’ve got a scale that is arbitrary – where a scale difference of a given size has no real-life interpretation.

My preference, of course, is for calibration. We have a paper in submission at the moment that looks at levels of student burnout. Authors have suggested, based on cumulative distributions, that scores could be divided into tee-shirt sizes (low/moderate/high), but this seems to me dogged by circular reasoning: you assume that a certain proportion of the population will be experiencing high burnout, and then declare, on the basis of a cutoff score, that the hypothesised proportion exists!

We’ve taken the tack of calibrating the scores against probability of depression on a standard self-completion instrument that pretty closely conforms to clinical diagnostic criteria. So our ‘high’ cutoff is interpretable as ≥50% probability of caseness on the depression scale. This allows us to give prevalences in categories that have a non-arbitrary interpretation.
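The calibration idea can be sketched in code. This is a minimal illustration, not the paper's actual analysis: the variable names (`burnout`, `depressed`), the simulated data, and the model are all my assumptions. The point is simply that once you regress caseness on the scale score, the 'high' cutoff falls out as the score at which the predicted probability of caseness crosses 50%.

```python
# Hypothetical sketch: calibrate a burnout score against probability of
# depression caseness. All data below are simulated for illustration.
import numpy as np

rng = np.random.default_rng(0)
burnout = rng.uniform(0, 40, 500)                 # hypothetical scale scores
p_true = 1 / (1 + np.exp(-(burnout - 22) / 4))    # assumed true relationship
depressed = rng.binomial(1, p_true)               # 0/1 caseness on instrument

# Logistic regression fitted by Newton-Raphson (no extra dependencies)
X = np.column_stack([np.ones_like(burnout), burnout])
beta = np.zeros(2)
for _ in range(25):
    eta = np.clip(X @ beta, -30, 30)              # guard against overflow
    p = 1 / (1 + np.exp(-eta))
    W = p * (1 - p)
    beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (depressed - p))

# P(caseness) = 0.5 exactly where the linear predictor is zero,
# so the 'high' cutoff is -intercept/slope:
cutoff = -beta[0] / beta[1]
print(f"'High' cutoff (P(caseness) >= 0.5): {cutoff:.1f}")
```

The resulting category boundary is non-arbitrary in exactly the sense described above: 'high' means at least an even chance of caseness on the depression instrument.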

Of course, we could have calibrated burnout against another variable – I’d be interested in calibrating it against course drop-out, but we have a low-dropout setting.

However, in modelling relationships, I struggle to get interpretable measures of effect size. If you were looking at the effect of self-stigma on the probability of compliance with treatment in people with tuberculosis (a current project, so the question is not at all hypothetical), how would you express the strength of the relationship?
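One candidate answer, offered tentatively rather than as the answer: report the average marginal effect (AME) – the mean change in predicted probability of compliance per unit of self-stigma, averaged over the sample – which stays on the probability scale rather than the odds scale. Everything below is an illustrative assumption, including the variable names and the simulated relationship.

```python
# Hedged sketch: average marginal effect of a standardised self-stigma
# score on probability of treatment compliance. Data are simulated.
import numpy as np

rng = np.random.default_rng(1)
n = 800
stigma = rng.normal(0, 1, n)                        # standardised self-stigma
p_comply = 1 / (1 + np.exp(-(1.0 - 0.6 * stigma)))  # assumed true relationship
comply = rng.binomial(1, p_comply)                  # 0/1 compliance

# Logistic regression fitted by Newton-Raphson (no extra dependencies)
X = np.column_stack([np.ones(n), stigma])
beta = np.zeros(2)
for _ in range(25):
    eta = np.clip(X @ beta, -30, 30)                # guard against overflow
    p = 1 / (1 + np.exp(-eta))
    W = p * (1 - p)
    beta += np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (comply - p))

# AME: derivative of P(comply) with respect to stigma, beta1 * p * (1 - p),
# averaged over the observed sample
p_hat = 1 / (1 + np.exp(-X @ beta))
ame = np.mean(beta[1] * p_hat * (1 - p_hat))
print(f"Average marginal effect per SD of self-stigma: {ame:.3f}")
```

The AME reads as 'each SD of self-stigma shifts the probability of compliance by about this many percentage points on average', which is arguably more interpretable to a clinical audience than an odds ratio.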

Oh – piece of trivia. I once spoke with Garrow, he of “Treat obesity seriously” – the book that put obesity on the map as a health issue. I took the opportunity to ask him how he got the neat cutpoints for BMI – 19-25 = normal, 25-30 = overweight, 30+ = obese. He said to me “They were easy to remember and they looked about right, based on my own experience”. I was delighted to realise that subsequent intense scrutiny has shown that Garrow was pretty much on the money!