Calculating SMD from geometric mean titers

tamas.ferenci · April 24, 2021, 7:29pm

A recent paper (“SARS-CoV-2 Neutralizing Antibodies: A Network Meta-Analysis across Vaccines”, published in Vaccines) performed a meta-analysis of different SARS-CoV-2 vaccines using data on neutralizing antibodies. While I feel that this approach is problematic on several levels (and I welcome any comments on this as well), here I focus on a single aspect: the methodology.

This paper is already getting a lot of attention as a way to rank the different available SARS-CoV-2 vaccines, so I feel it is critical to be very sure about their methodology.

Here are the relevant parts:

When needed, the arithmetic mean and standard deviation were calculated from the geometric mean, median, range, and sample size, as previously described [46].

(Ref. 46 is Hozo, S.P.; Djulbegovic, B.; Hozo, I. Estimating the mean and variance from the median, range, and the size of a sample. BMC Med. Res. Methodol. 2005, 5, 13. https://link.springer.com/article/10.1186/1471-2288-5-13)

Since the investigated studies assessed the same outcome (the level of SARS-CoV-2 neutralizing antibodies) by using different metrics and methods, the results of the network meta-analysis expressed as relative effect (RE) and 95% credible interval (95% CrI) were converted into the standardized mean difference (SMD = (difference in mean outcome between groups) × (standard deviation of outcome among participants)^{−1}). The SMD was also reported in agreement with the rules of thumb proposed by Cohen and the Cochrane Collaboration [49,50], namely, ≤0.5 represents a small effect, >0.5 to ≤0.8 a moderate effect, >0.8 to ≤1.3 a large effect, and >1.3 a very large effect.

Here are my questions:

Is it meaningful to use SMD for geometric mean titers?
If so, should we convert the geometric means to arithmetic means and calculate the SMD from those, or should the SMD simply be calculated from the geometric means?
Here is my first problem. The above excerpt suggests that geometric mean titers were converted into arithmetic means. The problem is that reference [46] has absolutely nothing to do with converting geometric means to arithmetic means; it is about converting a median (using the range and sample size) into a mean and variance. There is a well-known publication about converting geometric means to arithmetic means (Higgins, J. P., White, I. R., & Anzures‐Cabrera, J. (2008). Meta‐analysis of skewed data: combining results reported on log‐transformed or raw scales. Statistics in Medicine, 27(29), 6072-6092. https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.3427), but it is not cited at all in the paper. Thus, in my opinion, there are two possibilities: either the authors did convert geometric means to arithmetic means but simply cited the wrong paper, or they did in fact use only ref. [46] (i.e., did not convert geometric means to arithmetic means) and wrongly stated that they did this conversion. In my opinion, either would be a very minor and honest mistake, possibly just a typo, but interestingly, the corresponding author simply refused to answer this simple question.
And my second problem: I was simply unable to reproduce their numbers, no matter what approach I tried. Take the example of Ad5-nCOV (5 × 10^{10} viral particles IM), where they report an SMD of 0.80 (0.54-1.05) and give the following citation: Zhu, F.C.; Guan, X.H.; Li, Y.H.; Huang, J.Y.; Jiang, T.; Hou, L.H.; Li, J.X.; Yang, B.F.; Wang, L.; Wang, W.J.; et al. Immunogenicity and safety of a recombinant adenovirus type-5-vectored COVID-19 vaccine in healthy adults aged 18 years or older: A randomised, double-blind, placebo-controlled, phase 2 trial. Lancet 2020, 396, 479–488. If I read the paper correctly (but I’d appreciate a confirmation), they report a GMT increase from 4.0 to 18.3 (14.4–23.3) with n=129 participants. Naively, I’d convert it to SMD by assuming a standard deviation of zero for the baseline, and a standard deviation of
(23.3 - 14.4)/3.92 * sqrt(129) = 25.8
to the post-vaccination measurement. This’d imply an SMD of
(18.3 - 4.0) / 25.8 = 0.55.
In contrast, the paper reports an SMD of 0.80 (0.54 - 1.05).
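For anyone who wants to check the arithmetic, here is a minimal Python sketch of the naive calculation above. It assumes the reported interval is a symmetric 95% confidence interval for a mean with a normal critical value of 1.96; the function name `sd_from_ci` is made up for illustration and is not taken from the paper.

```python
import math


def sd_from_ci(lower, upper, n, crit=1.96):
    """Back out a standard deviation from a two-sided CI for a mean.

    Assumes the CI is mean +/- crit * SD / sqrt(n) (an assumption,
    not something the paper states).
    """
    standard_error = (upper - lower) / (2 * crit)
    return standard_error * math.sqrt(n)


# Values as read from Zhu et al. in the post above
gmt_pre, gmt_post = 4.0, 18.3
ci_lower, ci_upper, n = 14.4, 23.3, 129

sd_post = sd_from_ci(ci_lower, ci_upper, n)
smd_naive = (gmt_post - gmt_pre) / sd_post

print(f"SD = {sd_post:.1f}, SMD = {smd_naive:.2f}")
```

This gives roughly 25.8 and 0.55, matching the figures above. Passing a t quantile as `crit` instead of 1.96 changes the result only slightly at this sample size.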
Alternatively, if I try to convert the geometric mean to an arithmetic mean, then the mean and the CI on the (base-10) log scale are 1.262 (1.158 - 1.367), thus the standard deviation is
(1.367 - 1.158)/3.92 * sqrt(129) = 0.606.
Thus, using the method of Higgins et al., the mean on the raw scale is
10^(1.262 + 0.606^2/2) = 27.9
and the standard deviation is
sqrt( (10^(0.606^2) - 1) · 10^(2 · 1.262 + 0.606^2) ) = 32.15.
Thus, the SMD would be
(18.3 - 4.0) / 32.15 = 0.44,
even farther from the value the paper reported.

It is entirely possible that I made a mistake, so I again tried to contact the corresponding author, who labeled my attempts at reproduction a “speculation” and then flatly refused to communicate with me any further.

I’d appreciate your comments on these questions. I contacted the editorial office, who suggested that I submit my remarks as a letter. (I’m rather sad about this, as this whole story might be a simple misunderstanding on my part that could be cleared up in five minutes if they were willing to share their raw data or analysis script, or at least answer my questions, but as the corresponding author was completely unhelpful in this respect, I don’t think I have any other option than to submit this letter.)

Thank you in advance!

tamas.ferenci · May 25, 2021, 10:55pm

I unfortunately can’t edit the post, but there are mistakes, so I give an updated version here as a reply, should anyone in the future come across this post. (Most importantly, I should have used base-e logarithm, of course.)

Here is how the calculation that I now believe to be correct goes:

t=1.98 for this sample size.

Naively, I’d convert it to SMD by assuming a standard deviation of zero for the baseline, and a standard deviation of
(23.3 - 14.4)/(2t) * sqrt(129) = 25.5
to the post-vaccination measurement. This’d imply an SMD of
(18.3 - 4.0) / 25.5 = 0.56.
In contrast, the paper reports an SMD of 0.80 (0.54 - 1.05).

Alternatively, if I try to convert the geometric mean to an arithmetic mean, then for the post-vaccination measurement the mean and the CI on the (natural) log scale are 2.91 (2.67 - 3.15), thus the standard deviation is
(3.15-2.67)/(2t) * sqrt(129) = 1.38.
Thus, using the method of Higgins et al, the mean on the raw scale is
exp(2.91+1.38^2/2) = 47.5
and the standard deviation is
sqrt( (exp(1.38^2) - 1) · exp(2 · 2.91 + 1.38^2) ) = 113.8.

For the pre-measurement, we have 4.0 (-) which I assume to mean 4.0 (4.0-4.0), which in turn translates to 4 as arithmetic mean too.

Thus, the SMD would be
(47.5 - 4.0) / 113.8 = 0.38,
even farther from the value the paper reported.
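The log-normal route can be sketched in Python as follows. This is only an illustration under stated assumptions: titers are taken to be log-normally distributed, the interval is treated as a 95% CI for the mean on the natural-log scale, t is hard-coded to 1.98 as above, and the baseline is treated as a constant 4.0. The function name `lognormal_to_raw` is hypothetical.

```python
import math


def lognormal_to_raw(mu_log, sd_log):
    """Convert mean/SD on the natural-log scale to mean/SD on the raw scale,
    using the standard log-normal moment formulas."""
    mean_raw = math.exp(mu_log + sd_log**2 / 2)
    sd_raw = math.sqrt(
        (math.exp(sd_log**2) - 1) * math.exp(2 * mu_log + sd_log**2)
    )
    return mean_raw, sd_raw


t = 1.98          # t quantile used in the post for this sample size
n = 129
gmt_post, ci_lower, ci_upper = 18.3, 14.4, 23.3
baseline = 4.0    # assumed constant, so its arithmetic mean is also 4.0

# Mean and SD on the natural-log scale, backed out from the GMT and its CI
mu_log = math.log(gmt_post)
sd_log = (math.log(ci_upper) - math.log(ci_lower)) / (2 * t) * math.sqrt(n)

mean_raw, sd_raw = lognormal_to_raw(mu_log, sd_log)
smd = (mean_raw - baseline) / sd_raw

print(f"log-scale SD = {sd_log:.2f}")
print(f"raw mean = {mean_raw:.1f}, raw SD = {sd_raw:.1f}, SMD = {smd:.2f}")
```

Because this keeps unrounded intermediate values, the raw-scale mean and SD come out slightly different from the hand-rounded 47.5 and 113.8, but the SMD still rounds to 0.38.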

davidcnorrismd · May 26, 2021, 9:00am

Tamas, I’d encourage you to organize your calculations more systematically in a spreadsheet. Point-wise attempts at reproducing individual results from a paper may (paradoxically?) be more confusing than systematic attempts to reproduce a whole slew of results.

You might like to compare this reanalysis attempt, which I scripted in R Markdown. Because that effort was systematic, it revealed that some of the results were calculated correctly, and even provided a basis for speculation about how the wrong results might have been produced (e.g., by transcription errors).

The first branch-point in the diagnostic ‘decision tree’ in a case like this is whether the original authors used the right formula incorrectly/inconsistently, or just outright used the wrong formula.