Correlation coefficient

Josie24 · May 23, 2019, 3:42am

Hello,
In a retrospective clinical study using linear regression (a generalized least squares model), is a correlation coefficient of -0.34 weak or moderate?
I would appreciate your comments.

RonanConroy · May 23, 2019, 10:22am

Without knowing what your study is, what the correlation is measuring, or why you did the study in the first place, it’s impossible to tell. It’s a bit like asking us if €34 is good value or cheap.

Please tell us a little more about what you are doing, and why you want to know. I am suspicious of what are often called tee-shirt sizes. If a correlation of –0·34 doesn’t mean anything to a person, then calling it “weak” or “moderate” won’t really help.

In general, I dislike correlations. They are poor measures of effect size because they lose the original variable metrics. And they are hard to explain in plain language (or impossible, as in the case of Spearman’s rho) – hence the recourse to tee-shirt sizes, I believe.

So, more detail please!

ADAlthousePhD · May 23, 2019, 11:38am

Exactly. There is not nearly enough detail in the original post to answer this question. Please give some more information so we can better help you.

Jochen · May 23, 2019, 9:08pm

In addition to the very correct answers above, I’d like to stress that this value (-0.34) is just a point estimate. There is some uncertainty associated with this value, which might be expressed as a confidence interval. This interval could range from -0.9 to +0.6, in which case nothing much could be stated about the strength of the relationship, even if one could connect the numbers (like -0.34) to some degree of practical relevance.

I completely subscribe to Ronan’s statement that correlations are poor (I would say almost useless) measures.
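
To make the point about uncertainty concrete, here is a minimal Python sketch of an approximate 95% confidence interval for r = -0.34 using Fisher's z-transformation. The sample sizes are hypothetical, since the thread does not give one. The function name `pearson_ci` is made up for illustration, and the approximation assumes a Pearson correlation from roughly bivariate normal data.

```python
import math

def pearson_ci(r, n, z_crit=1.959964):
    """Approximate CI for a Pearson correlation via Fisher's z-transformation.

    z_crit defaults to the normal quantile for a two-sided 95% interval.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)               # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)     # approximate standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical sample sizes; the original post does not state n.
for n in (8, 30, 400):
    lo, hi = pearson_ci(-0.34, n)
    print(f"n={n:4d}: r=-0.34, approx. 95% CI ({lo:+.2f}, {hi:+.2f})")
```

With a very small sample the interval runs from strongly negative to clearly positive, so the same point estimate can carry very different amounts of information.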

f2harrell · May 24, 2019, 12:55pm

I agree with everything that’s been said. I do tend to use correlation coefficients when I’m in a hurry, or for computing the sample sizes needed to study relationships. For example, n=400 is needed to estimate r with a margin of error of ±0.1 with 0.95 “confidence”.
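
One way to arrive at a figure close to the one quoted above is the Fisher z approximation. The post does not say how the number was computed, so this is a sketch under that assumption. The helper names are hypothetical, and the margin is evaluated at r = 0, where the interval for r is widest.

```python
import math

def margin_of_error(n, r=0.0, z_crit=1.959964):
    """Half-width of the approximate 95% CI for r (Fisher z approximation)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    lo = math.tanh(z - z_crit * se)
    hi = math.tanh(z + z_crit * se)
    return (hi - lo) / 2.0

def smallest_n(target, r=0.0):
    """Smallest n whose margin of error does not exceed the target."""
    n = 4
    while margin_of_error(n, r) > target:
        n += 1
    return n

print(f"margin of error at n=400, r=0: {margin_of_error(400):.3f}")
print(f"smallest n for a margin of 0.1 at r=0: {smallest_n(0.1)}")
```

Under this approximation the margin shrinks as |r| grows, so planning for r = 0 is the conservative choice.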

RGNewcombe · May 28, 2019, 11:33am

Is a correlation coefficient of -0.34 weak or moderate? I fully take on board the criticisms previous respondents have raised. Nevertheless, the strength of quoting a correlation-type measure is that it is the nearest we’ll ever get to a one-size-fits-all scale-free measure of effect size. Especially if you make it non-parametric - enabling (informal) comparison of strength of effect for a wide range of pairwise associations.

The sign is always important to note carefully - here inverse - we always need to check carefully that the direction of any association is in line with what we would logically expect.

And the absolute magnitude, 0.34. In one sense, this is a very weak correlation - one of the variables is capable of explaining away 0.34*0.34 or less than one-eighth of the variation of the other. On the other hand, as correlations in epidemiology go, it’s much stronger than most correlations we normally encounter. Correlations tend to be low simply because most variables are influenced by numerous other variables.
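
The "less than one-eighth" figure is just r squared: 0.34 × 0.34 = 0.1156, which is below 1/8 = 0.125. The sketch below checks on simulated data (not data from any study discussed here) that r squared equals the share of variation accounted for by a simple straight-line fit. The helper names are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def pearson_r(x, y):
    """Sample Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def r_squared_of_line(x, y):
    """1 - SS_residual / SS_total for the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot

# Simulated data with a weak negative relationship (illustration only).
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [-0.34 * a + rng.gauss(0, 1) for a in x]

r = pearson_r(x, y)
print(f"r^2 = {r ** 2:.4f}, R^2 of fitted line = {r_squared_of_line(x, y):.4f}")
print(f"(-0.34)^2 = {(-0.34) ** 2:.4f}, compared with 1/8 = {1 / 8:.4f}")
```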

f2harrell · May 28, 2019, 11:44am

The wish to add labels such as “significant”, “strong”, “moderate”, “weak”, or a medical diagnosis (“diabetes” instead of just stating the severity of diabetes) does much more harm than good IMHO.

Josie24 · May 28, 2019, 11:10pm

I appreciate all of your responses. I have been working night shifts for a whole week, hence my delay in responding. I’ll provide more info as soon as I can.

jroon · May 29, 2019, 11:08am

Don’t forget the datasaurus!
https://cran.r-project.org/web/packages/datasauRus/vignettes/Datasaurus.html
The 13 distributions shown on that page all have almost the same summary statistics including the x-y correlation coefficient. And so you see interpreting a correlation coefficient by itself is not wise!
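
The same idea can be sketched without the datasauRus package. The Python below builds two synthetic datasets (not the Datasaurus data) with very different shapes whose correlation with x is -0.34 in both cases, up to floating-point rounding. The construction and the helper names are illustrative assumptions: a pattern is made orthogonal to x and then mixed with x in fixed proportions.

```python
import math

def standardize(v):
    """Centre to mean 0 and scale to (population) standard deviation 1."""
    m = sum(v) / len(v)
    c = [a - m for a in v]
    s = math.sqrt(sum(a * a for a in c) / len(c))
    return [a / s for a in c]

def pearson_r(x, y):
    xs, ys = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(xs, ys)) / len(xs)

def with_correlation(x, shape, r):
    """Return y built from `shape` whose sample correlation with x is r."""
    xs, ss = standardize(x), standardize(shape)
    b = sum(a * c for a, c in zip(xs, ss)) / len(xs)   # projection of shape on x
    # Part of the shape that is uncorrelated with x, rescaled to unit variance.
    resid = standardize([c - b * a for a, c in zip(xs, ss)])
    k = math.sqrt(1.0 - r * r)
    return [r * a + k * e for a, e in zip(xs, resid)]

x = [i / 50 - 2 for i in range(201)]                 # evenly spaced from -2 to 2
wave = [math.sin(3 * a) for a in x]                  # oscillating pattern
step = [1.0 if a > 0 else 0.0 for a in x]            # abrupt jump at zero

y_wave = with_correlation(x, wave, -0.34)
y_step = with_correlation(x, step, -0.34)

print(f"wave: r = {pearson_r(x, y_wave):+.2f}")
print(f"step: r = {pearson_r(x, y_step):+.2f}")
```

Plotting `y_wave` and `y_step` against `x` shows two relationships that nobody would describe in the same words, even though the coefficient is identical.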

RonanConroy · May 30, 2019, 3:51pm

The trouble is you cannot “just state the severity” of diabetes, or cholesterol, or blood pressure. Many biological parameters that vary on a continuous scale have no definable threshold. Risk of adverse events rises with each unit of cholesterol, with each unit of blood glucose.

Furthermore, the severity of a given cholesterol – 6 iu/L, say – depends on other factors – in the context of a 65-year-old with an SBP of 160 and a life of smoking, it’s associated with a substantial risk of a cardiovascular event. But in a young, nonsmoking woman with an SBP of 110, it’s of no account.

There is, I think, a problem underlying your problem, which is that it’s hard to wean clinicians away from a fixation on managing individual risk factors and to get them to manage total risk.

f2harrell · May 30, 2019, 5:20pm

The lack of a definable threshold is more reason to pursue this. I wasn’t speaking of severity in terms of impact but rather severity with regard to the isolated disease process, e.g., for hypertension what is the untreated SBP, or for diabetes the HbA1c (if you don’t have more details including various glucose levels).