Considerations about the coefficient of variation in clinical practice

Let's assume we have a set of measurements made with a novel instrument and we want to identify which of them could be clinically meaningful among all the ones we have. This is a problem I have been encountering often since the advent of "-omics" in medicine, especially in radiology.

Let's assume I have a set of features that describe my biological property. I want to identify the ones that could be the most clinically meaningful, and I was thinking about using the coefficient of variation (CoV). Because the CoV is the standard deviation expressed relative to the mean, it should let me compare all the different features even if they have different means or scales. I would discard features with very high (for example >100%) or very low (for example <2%) variability, and keep the features with intermediate variability (let's say between 10% and 30%). I suppose this kind of variability could be clinically useful in differentiating between healthy and pathological subjects. Too low a variability can indicate that the feature is useless (for example, a feature that equals 1 for every subject), and the same goes for too high a variability (because I assume the feature is then unreliable from one measurement to the next). I couldn't find any reference indicating whether this reasoning is correct (and, if so, whether the proposed cutoffs are sensible) or, if not, why the assumption is wrong.
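To make the proposal concrete, here is a minimal sketch of the filter I have in mind, using only the Python standard library. The function names, the toy feature values, and the 10% to 30% window are my own illustrative assumptions, not validated cutoffs.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CoV (standard deviation / |mean|) as a percentage."""
    m = mean(values)
    if m == 0:
        # The CoV is undefined when the mean is zero.
        raise ValueError("CoV is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(m)


def select_by_cov(features, low=10.0, high=30.0):
    """Keep features whose CoV (in %) lies in [low, high].

    `low` and `high` are the hypothetical cutoffs from the question.
    """
    kept = {}
    for name, values in features.items():
        cov = coefficient_of_variation(values)
        if low <= cov <= high:
            kept[name] = cov
    return kept


# Hypothetical toy data: one measurement per subject for each feature.
features = {
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],      # no variability at all
    "moderate": [8.0, 10.0, 12.0, 9.0, 11.0],   # intermediate variability
    "erratic": [1.0, 50.0, 2.0, 80.0, 3.0],     # very high variability
}

selected = select_by_cov(features)
print(selected)
```

With these toy values only the "moderate" feature falls inside the window. Note that the filter never looks at who is healthy and who is not, so by itself it says nothing about whether the retained variability is related to disease.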

Does anybody have a reference about this kind of problem? What do you think?

Thank you very much


I may not understand your problem, but usually we desire a more direct approach such as using a prediction model, e.g., multivariable regression. There is a cool indirect approach due to Baggerly and Coombes, who reduced the number of features to model in a high-dimensional gene microarray setting by choosing the features that are most bimodal. If there is a mixture of phenotypes that you want to distinguish with gene expression, these effects must create some sort of bimodal distribution when you are masked to the outcome variable (such masking will prevent overfitting at that stage).
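As a rough illustration of what "ranking features by bimodality while masked to the outcome" could look like, here is a small sketch. It is not the Baggerly and Coombes procedure; it swaps in a simple moment-based bimodality coefficient, (skewness^2 + 1) / kurtosis, for which values above 5/9 (the value for a uniform distribution) are commonly read as a hint of bimodality. The function name and the toy data are assumptions for illustration only.

```python
from statistics import fmean


def bimodality_coefficient(values):
    """Moment-based bimodality coefficient: (skewness^2 + 1) / kurtosis.

    Uses plain (population) moments and non-excess kurtosis.
    No outcome labels are used, so the screening stays masked.
    """
    n = len(values)
    m = fmean(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    if m2 == 0:
        raise ValueError("coefficient is undefined for a constant feature")
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (skewness ** 2 + 1) / kurtosis


# Hypothetical toy features, one value per subject.
two_groups = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]  # two clear clusters
one_group = [1.0, 2.0, 3.0, 3.0, 3.0, 3.0, 4.0, 5.0]       # single central peak

bc_two = bimodality_coefficient(two_groups)
bc_one = bimodality_coefficient(one_group)
print(bc_two, bc_one)
```

Features would then be ranked by this coefficient and only the top ones carried forward to the prediction model. With small samples the coefficient is noisy, so treat it as a screening heuristic rather than a test.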
