Differences between quartiles in two different conditions

cs1972 · January 25, 2021, 12:31pm

Hi everyone,

I have a question and I hope this is an appropriate forum for it. I have also posted the same question on Cross Validated (I don't think that violates any rule of this forum).

My question: I have calculated the difference between two independent groups at each quartile, together with the 95% confidence interval of that difference. These calculations were performed twice (once in the summer and once in the winter). Now I would like to know whether there is a statistically significant difference between the summer and winter estimates of the group difference at each quartile. How should I do that?

So far, I have checked whether each “winter difference” falls within the 95% CI of the corresponding “summer difference” (treating the two as significantly different if it does not). Is that correct?

If so, how should one interpret cases in which a “winter difference” falls within the 95% CI of the corresponding “summer difference”, but the “summer difference” does not fall within the 95% CI of that same “winter difference”?
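To make the asymmetry concrete, here is a minimal sketch with made-up numbers (the estimates and standard errors are hypothetical, not taken from any data in this thread). It uses simple normal-approximation intervals and, for the last step, assumes the summer and winter estimates are independent, which would not hold if the same subjects were measured in both seasons.

```python
import math

def wald_ci(estimate, se, z=1.96):
    """Normal-approximation 95% interval: estimate +/- z * se."""
    return (estimate - z * se, estimate + z * se)

# Hypothetical group differences at one quartile (illustration only).
summer_diff, summer_se = 5.0, 2.5   # wide interval
winter_diff, winter_se = 8.0, 1.0   # narrow interval

summer_ci = wald_ci(summer_diff, summer_se)
winter_ci = wald_ci(winter_diff, winter_se)

# The "is the other point estimate inside my interval?" check is asymmetric
# whenever the two intervals have different widths.
winter_in_summer = summer_ci[0] <= winter_diff <= summer_ci[1]
summer_in_winter = winter_ci[0] <= summer_diff <= winter_ci[1]

# A single interval for the difference of the two differences avoids this.
# ASSUMPTION: independent estimates, so the variances simply add.
double_diff = winter_diff - summer_diff
double_se = math.sqrt(summer_se ** 2 + winter_se ** 2)
double_ci = wald_ci(double_diff, double_se)

print(winter_in_summer, summer_in_winter, double_ci)
```

With these numbers the winter estimate lies inside the summer interval while the summer estimate lies outside the winter interval, so the two checks disagree; the interval for the double difference gives one answer instead of two.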

Thanks in advance for any possible answer.

f2harrell · January 25, 2021, 12:44pm

Just a few notes.

- It is not a good goal to seek “statistical significance”.
- It is not valid to compare two confidence intervals; instead, compute a confidence interval for the double difference.
- Better to put this in the context of a well-formulated unified statistical model so that we can see all the parameters involved and better understand their meaning.
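As a rough sketch of what a confidence interval for the double difference could look like, the code below uses a percentile bootstrap. Several things here are assumptions rather than anything stated in the thread: the function names are hypothetical, the two groups are treated as independent, each subject is assumed to have both a summer and a winter score (so subjects are resampled as pairs), and quartiles are computed by plain linear interpolation rather than with the Harrell-Davis estimator.

```python
import random
from statistics import quantiles

def quartiles(values):
    """Q1, median, Q3 by linear interpolation (NOT the Harrell-Davis estimator)."""
    return quantiles(values, n=4, method="inclusive")

def double_difference(a_summer, a_winter, b_summer, b_winter):
    """(A minus B in winter) minus (A minus B in summer), at each quartile."""
    a_s, a_w, b_s, b_w = (quartiles(x) for x in
                          (a_summer, a_winter, b_summer, b_winter))
    return [(a_w[k] - b_w[k]) - (a_s[k] - b_s[k]) for k in range(3)]

def bootstrap_ci(a_summer, a_winter, b_summer, b_winter,
                 n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap CI for the double difference at Q1, Q2, Q3."""
    # ASSUMPTION: position i in the summer and winter lists is the same subject.
    if len(a_summer) != len(a_winter) or len(b_summer) != len(b_winter):
        raise ValueError("each subject needs a summer and a winter score")
    rng = random.Random(seed)
    n_a, n_b = len(a_summer), len(b_summer)
    draws = []
    for _ in range(n_boot):
        # Resample subjects (not single scores) within each group, so the
        # summer/winter pairing of every subject is preserved.
        ia = rng.choices(range(n_a), k=n_a)
        ib = rng.choices(range(n_b), k=n_b)
        draws.append(double_difference(
            [a_summer[i] for i in ia], [a_winter[i] for i in ia],
            [b_summer[i] for i in ib], [b_winter[i] for i in ib]))
    alpha = 1.0 - level
    lo_idx = max(0, int(alpha / 2 * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    intervals = []
    for k in range(3):
        column = sorted(d[k] for d in draws)
        intervals.append((column[lo_idx], column[hi_idx]))
    estimate = double_difference(a_summer, a_winter, b_summer, b_winter)
    return estimate, intervals
```

Calling `bootstrap_ci(a_summer, a_winter, b_summer, b_winter)` returns the three point estimates and one `(lower, upper)` pair per quartile. If different subjects were measured in each season, the pairing step would not apply and each of the four samples would be resampled separately.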

cs1972 · January 25, 2021, 1:23pm

Dear Dr. Harrell,
Thank you very much for your fast reply.

- I completely agree with your comment about statistical significance. I may have expressed myself poorly: I am not “looking for” a p-value so much as trying to find out whether I can claim that there is a non-trivial difference between the summer and winter differences. Still, if that came together with a p-value, it would be convenient for possible publication.

- I see that the correct approach would be to compute a confidence interval for the double difference. In fact, that is what I have read about confidence intervals for differences between means. However, I am not sure how to carry out such a calculation in this case.

- To get help with that, I should probably provide more context. I apologize for not having done so in the first place, and I hope to do better now (please correct me if I have misunderstood which information I should provide).

My dependent variable is the score on a psychometric scale that yields a continuous value between 0 and 100. I used the qcomhd function for R to calculate the differences between the two groups at specified quantiles (in this case, the quartiles) of the observed scores. This gave me a bootstrap estimate of each quartile in each group, as well as an estimate of the between-group difference at each quartile with its 95% CI. As mentioned in my original post, I evaluated my sample at two different times (summer and winter), and I would like to address the following questions:

A) Differences between groups at each quartile in summer or winter. This is “solved” because that is what qcomhd calculates.

B) Whether the between-group differences at one quantile (e.g. the median) are similar to or different from those observed at other quantiles. I (probably incorrectly) assumed that I could do that by checking whether the confidence interval at the median contains the point estimate at another quantile (e.g. Q1). That is, are the differences at the median similar to or different from those observed at Q1?

C) The same kind of conclusion when comparing the same quantile across the summer and winter measurements. For instance, is the between-group difference at the median similar in summer and in winter?

I apologize for the length of this reply and for any errors or confusing expressions it may contain.

f2harrell · January 25, 2021, 10:19pm

I’m such a modeler that sometimes I have trouble understanding things that are not stated as models. Would it be possible to write down a model that includes a parameter for your major quantity of interest? It’s probably an interaction effect.
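As one possible illustration of what such a model might look like (a sketch only; the symbols are introduced here and are not from the thread), a quantile regression for the score $Y$ at quantile level $\tau$, with indicators $G$ for group and $S$ for season ($S = 1$ for winter), could be written as:

```latex
Q_{\tau}(Y \mid G, S) = \beta_0(\tau) + \beta_1(\tau)\, G + \beta_2(\tau)\, S + \beta_3(\tau)\, G S
```

Under this formulation the between-group difference at quantile $\tau$ is $\beta_1(\tau)$ in summer and $\beta_1(\tau) + \beta_3(\tau)$ in winter, so the interaction $\beta_3(\tau)$ is the double difference. As written, the model ignores the correlation between repeated measurements on the same subject, which would have to be handled separately.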

cs1972 · January 26, 2021, 8:25am

Dr. Harrell,

Thanks for your answer.

I do not know whether, or how, such a model could be built. I am not a statistician, just a researcher trying to find a correct way to analyze her data.

I say so because, in my field, a case like this would ordinarily be analyzed with a mixed-model two-way ANOVA (time x group). However, that would mean assuming normality (while the data are very skewed and the dependent variable is bounded between 0 and 100) and focusing exclusively on the group means (which does not make sense given the shape of the groups’ score distributions).

f2harrell · January 26, 2021, 12:41pm

I think that a model similar to that would be what you need. I think I’ve noticed recent work on random effects quantile regression. Better to me would be semiparametric ordinal regression with random effects or other ways to model dependency such as a Markov model.

cs1972 · January 26, 2021, 1:15pm

Ok, thanks. I will try to explore semiparametric ordinal regression with random effects and see whether my sample meets its requirements. Thank you very much for your expert advice.