In various allied health disciplines, there is a growing push from academia to use “validated, evidence based” scales and outcome surveys.
A common one is the Disabilities of the Arm, Shoulder and Hand questionnaire – the DASH, as we therapists refer to it.
Essentially, there are 30 Likert items, of which at least 27 must be answered for the scale to be scored. The item responses are averaged, and this raw score is converted to a number from 0 (no perceived disability) to 100 (total perceived disability). A “minimum clinically important difference” is defined as a 12.75%–17.23% change in score.
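As I understand the published scoring rule, items are rated 1 (no difficulty) to 5 (unable), up to 3 of the 30 items may be skipped, and the average response is rescaled onto 0–100. A minimal sketch of that rule (my own reconstruction, not an official implementation):

```python
def dash_score(responses):
    """DASH-style score from a list of item values 1-5, with None for skipped items.

    Returns None if more than 3 of the 30 items are missing, else
    ((mean item response) - 1) * 25, which maps the 1-5 range onto 0-100.
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < 27:
        return None  # too many missing items to score
    mean = sum(answered) / len(answered)
    return (mean - 1) * 25

print(dash_score([1] * 30))                  # all "no difficulty" -> 0.0
print(dash_score([5] * 30))                  # all "unable" -> 100.0
print(dash_score([3] * 28 + [None, None]))   # 28 items answered, mean 3 -> 50.0
```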
These are generally used to provide evidence to third-party payers that therapy is effective in an individual case and should be continued (or should be ended when goals are met or progress has stalled).
I’ve known for a long time that applying parametric statistics to Likert-type data like this is controversial. But it appears that those who defend the practice and those who oppose it no longer write in the same publications. Those who teach research methods to clinicians are generally the supporters of parametric analyses in these scenarios.
Front-line therapists will never understand the measurement issues unless they have an inquiring and contrarian mind.
Some of my concerns generally:
- The obvious problem of regression to the mean (in the sense of return toward central tendency).
- The initial score and the outcome score can clearly come from non-normal distributions. Scores at initial entry into a clinic for an injury are going to be skewed toward high values, and scores at discharge toward low ones.
It would seem to suffer all of the problems that change-from-baseline analyses have.
How does one know whether the finite-variance assumptions that justify parametric analyses are plausible here?
- Obviously, the possible scores (counting the permutations of item responses that produce each total) are approximately normally distributed, but that is just an artifact of the scale, not of the underlying phenomenon.
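The regression-to-the-mean worry in the first bullet can be made concrete with a small simulation. All numbers here are hypothetical: stable “true” disability plus measurement noise, no treatment effect at all, and patients selected for admission because their intake score was high:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: a stable "true" disability level plus independent
# measurement noise at intake and at discharge; treatment does NOTHING.
true = rng.normal(40, 10, size=100_000)               # true disability, 0-100 scale
intake = true + rng.normal(0, 8, size=true.size)      # noisy intake score
discharge = true + rng.normal(0, 8, size=true.size)   # noisy discharge score

# Patients enter the clinic because their intake score is high:
admitted = intake > 55

drop = intake[admitted].mean() - discharge[admitted].mean()
print(f"Mean 'improvement' with zero treatment effect: {drop:.1f} points")
```

The selected-high group shows a drop of several points on remeasurement purely because part of the high intake score was noise.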
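The last bullet's point, that an apparently bell-shaped score distribution can be a pure artifact of summing items, is easy to illustrate. With a made-up, heavily skewed item distribution (an assumption for illustration, not real DASH data), the 30-item average is far more symmetric than any single item:

```python
import numpy as np

rng = np.random.default_rng(1)

items = np.array([1, 2, 3, 4, 5])
probs = np.array([0.05, 0.10, 0.15, 0.30, 0.40])  # hypothetical skewed item responses

# Draw 30 item responses for each of 10,000 simulated patients.
resp = rng.choice(items, size=(10_000, 30), p=probs)
scores = (resp.mean(axis=1) - 1) * 25             # DASH-style 0-100 rescaling
item_sample = rng.choice(items, size=10_000, p=probs)

def skewness(x):
    """Sample skewness: third central moment over cubed standard deviation."""
    x = np.asarray(x, dtype=float)
    return ((x - x.mean()) ** 3).mean() / x.std() ** 3

print(f"single-item skewness:  {skewness(item_sample):.2f}")
print(f"30-item score skewness: {skewness(scores):.2f}")
```

The score histogram looks roughly normal only because averaging many items washes out the per-item shape; it says nothing about the underlying phenomenon.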
Could this simply be a case where the assumptions of parametric analysis are violated, but the error is small and not worth worrying about?
FYI – he is referring to this paper:
Analyzing Ordinal Data with Metric Models: What Could Possibly Go Wrong?
I have thought the information from the distribution of scores (i.e., the histogram) was interesting. Perhaps some sort of Wilcoxon signed-rank method would be preferable. I’m really not sure.
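For the paired intake/discharge setting, a Wilcoxon signed-rank test along these lines is one nonparametric option. The scores below are fabricated for illustration; the test only assumes the signed differences are roughly symmetric, not that the scores themselves are normal:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)

# Hypothetical paired intake/discharge DASH scores for 25 patients,
# with discharge scores generally lower (less disability) than intake.
intake = rng.uniform(40, 80, size=25).round(1)
discharge = (intake - rng.uniform(0, 30, size=25)).clip(0).round(1)

# Paired, rank-based test on the signed intake-discharge differences.
stat, p = wilcoxon(intake, discharge)
print(f"Wilcoxon signed-rank: W={stat:.1f}, p={p:.4g}")
```

Whether this is actually preferable to a paired t-test on these scores is exactly the open question; this just shows the mechanics.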