My apologies, I am asking something very basic.

I am creating a basic descriptive summary table for the subjects in my study. There are three distinct participant groups: A, B, and C. I am summarizing their BMI and age, both continuous variables, with the usual statistics: mean, standard deviation, median, Q1, Q3, and range.

Subjects in subgroup A have a mean age of 50 and a standard deviation of 6.2.

Subjects in subgroup B have a mean age of 60 and a standard deviation of 20.7.

Subjects in subgroup C have a mean age of 55 and a standard deviation of 10.

Although subjects in subgroup B have a higher average age than those in subgroups A and C, their standard deviation is also higher (20.7). I am not sure how to interpret the average when the standard deviation is large. Any advice is highly appreciated. Thanks.

I see nothing wrong with simply reporting that **in your particular sample or data set** group B had a higher average age. This is the conditional, post-data perspective.

Read the section in BBR about the standard deviation not necessarily being a good summary stat.

> The mean and standard deviation are not descriptive of variables that have an asymmetric distribution such as variables with a heavy right tail (which includes many clinical lab measurements). Quantiles are always descriptive of continuous variables no matter what the distribution.
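To make the quoted point concrete, here is a minimal sketch in Python (standard library only). The data are simulated, not from the study: the lognormal "lab measurement", its parameters, and the helper name `summarize` are all assumptions chosen purely for illustration.

```python
import random
import statistics


def summarize(values):
    """Return mean, SD, and quartiles for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "q1": q1,
        "median": median,
        "q3": q3,
    }


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical right-skewed measurement: strictly positive lognormal values.
skewed = [rng.lognormvariate(1.0, 1.0) for _ in range(5000)]

summary = summarize(skewed)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")

# For comparison: the interval "mean - 1 SD" versus the smallest observed value.
print("mean - sd:", round(summary["mean"] - summary["sd"], 2))
print("minimum  :", round(min(skewed), 2))
```

With a heavy right tail the mean is pulled above the median, and "mean minus one SD" can land on values that never occur in the data, whereas Q1, the median, and Q3 always correspond to actual positions in the sample.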

What conclusions to draw depends on other factors (i.e., model assumptions and prior information). Is it possible that there is a relationship between variance and age that should be considered? Or is the higher variance merely a result of a much smaller sample in that group? Not much can be said about the implications of the data without context.
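On the small-sample question, a rough simulation can show how much a sample SD can move around by chance alone. This is only a sketch under stated assumptions: the population is taken to be normal with mean 55 and SD 10 (numbers borrowed from group C just for illustration), and the sample sizes and the function name `sd_spread` are hypothetical, since the actual group sizes were not given.

```python
import random
import statistics


def sd_spread(n, reps=2000, mu=55.0, sigma=10.0, seed=1):
    """Draw `reps` samples of size n from one normal population and
    return the 2.5th and 97.5th percentiles of the sample SDs."""
    rng = random.Random(seed)
    sds = sorted(
        statistics.stdev([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(reps)
    )
    return sds[int(0.025 * reps)], sds[int(0.975 * reps) - 1]


# Hypothetical group sizes; every sample comes from the SAME population.
for n in (5, 20, 100):
    low, high = sd_spread(n)
    print(f"n={n:>3}: 95% of sample SDs fall between {low:.1f} and {high:.1f}")
```

The smaller the group, the wider the range of SDs that the same underlying population can produce, so a large SD in a small group is weaker evidence of a truly more variable population.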

Thank you very much. I knew there was more to it than just looking at the mean or standard deviation, but I was unable to recall all the finer details. Thanks for pointing me to this resource. Very helpful, I will go through it. I am forgetting a lot of things very quickly nowadays.

Other helpful information would be the number of patients in each group and the distribution of the data (or at least the median and IQR). If the ages are in years, an SD of 20 with a mean of 60 looks odd unless the data are heavily skewed: with a roughly symmetric distribution, mean ± 2 SD would span about 20 to 100 years.
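One quick way to check whether a reported mean and SD are plausible is to ask what they would imply if the data were normal. The sketch below does that for the three groups in the question. The normality assumption, the age limits of 18 and 100, and the function name `implied_tail_fractions` are mine, picked only for illustration; the study's actual age range was not stated.

```python
from statistics import NormalDist

# Reported (mean, SD) of age from the question. Normality is assumed here
# only to see what these summaries would imply if it held.
groups = {"A": (50, 6.2), "B": (60, 20.7), "C": (55, 10)}


def implied_tail_fractions(mean, sd, low=18, high=100):
    """Share of a normal(mean, sd) distribution below `low` and above `high`."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(low), 1 - dist.cdf(high)


for name, (mean, sd) in groups.items():
    below, above = implied_tail_fractions(mean, sd)
    print(
        f"Group {name}: implied share below 18 = {below:.1%}, "
        f"above 100 = {above:.1%}"
    )
```

For group B, a normal distribution would put a couple of percent of subjects below 18 and a similar share above 100. If the study has no such subjects, the ages in that group are probably skewed or contain outliers, and the median and quartiles would describe them better.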

I agree, John: n is a factor that can definitely affect the distribution.