Subgroup analysis and Meta-regression in Meta-analysis

Dear Scholars,

I hope you are all well. I am doing a proportion meta-analysis in Stata, and I need to double-check whether I did the subgroup analysis and meta-regression correctly. I am new to this forum, to Stata, and to meta-analysis, so I have more questions than answers.
Could someone with knowledge of Stata and/or meta-analysis be kind enough to comment on my output, please? I’ll be grateful. I have included only one subgroup analysis and one covariate (meta-regression) here. However, in the final analysis, there are four subgroups and four covariates. I’ll start interpreting the output once I receive some feedback.

Thank you for reading the post, and I look forward to hearing from some scholars soon!

Kind regards,


The following is a good list of things to consider:

  1. Average treatment effects may not represent individual level effects (ecological bias).
  2. Small number of studies per factor examined.
  3. Complexity involved in regressing treatment outcome on risk.

Meta-regression is often mentioned in the journals, but once you realize that most MAs have fewer than 20 studies, you see how limited a tool it is.

See also: Regression Modelling Strategies


I think you need to tell us first what you are studying and why you are pooling proportions.
Second, the CIs in your studies have lower limits below zero. Are these transformed or raw proportions? If raw, then tell us how you arrived at these limits.
Third, the inbuilt meta module in Stata is pretty basic and useless for proportions.
Please install metan by typing the following into the Stata command window:
ssc install metan
Once you answer these queries then perhaps we can assist better

Thank you, R-Cubed, for sharing your thoughts and the article. I read this article earlier, but it has limited information for proportion meta-regression. I like the Regression Modelling Strategies you shared. Thank you for that.


Dear Professor Doi,

Thank you for responding to my post. You note

‘I think you need to tell us first what you are studying and why you are pooling proportions’

I am working on an SR and meta-analysis estimating the incidence and prevalence of Meniere’s disease (MD) worldwide. It is a less prevalent disease. The latest study (2014) from the UK estimated a prevalence of 270 per 100,000 population, around 1 in 371. There are a few reasons why I am pooling proportions. Firstly, there is no global estimate available so far, and the epidemiology of MD is fiercely debated across the globe. There are a few other reasons as well.

Second, the CIs in your studies have lower limits below zero - are these transformed or raw proportions? If raw then tell us how you arrived at these limits?

I assume they are raw because when I included the Freeman-Tukey transformation (ftt) option in meta, it gave an error message saying, “ftt not allowed”. However, when I used metaprop to estimate the effect size and SE, I used ftt, and it did work. All the upper and lower CI bounds were between 0 and 1. Actually, I wanted to use the logit transformation but was struggling to find a command for it in the meta suite.

Third, the inbuilt meta module in Stata is pretty basic and useless for proportions
Please install metan by typing the following into the Stata command window:
ssc install metan

I have Metan installed, but I never used it before. Do you have the commands for ‘Metan’ by any chance? Or could you share some source material, please?

Once you answer these queries then perhaps we can assist better

Please, if you need more information, I am happy to provide it.

Any help is greatly appreciated

Kind regards,


Okay, that is clear. I have a philosophical position on meta-analysis of prevalence (when it should be done and when not), but I will bring that up last. Assuming that the meta-analysis is appropriate, it should always be conducted using transformed proportions, because that stabilizes the variance and keeps the confidence limits within the [0, 1] range of a proportion, which your output using raw proportions fails to do.
There are two transforms commonly used, FTT and logit, and FTT is preferred. FTT has received a lot of attention after we first elaborated on it:
Barendregt JJ, Doi SA, Lee YY, Norman RE, Vos T. Meta-analysis of prevalence. J Epidemiol Community Health. 2013 Nov 1;67(11):974-8.
There was then a critique by Schwarzer et al
Schwarzer G, Chemaitelly H, Abu-Raddad LJ, Rücker G. Seriously misleading results using inverse of Freeman-Tukey double arcsine transformation in meta-analysis of single proportions. Res Synth Methods. 2019 Sep;10(3):476-483.
and a response by us
Doi SA, Xu C. The Freeman-Tukey double arcsine transformation for the meta-analysis of proportions: Recent criticisms were seriously misleading. J Evid Based Med. 2021 Dec;14(4):259-261.
There is also a new paper out that has a criticism from a different angle
Röver C, Friede T. Double arcsine transform not appropriate for meta-analysis. Res Synth Methods. 2022 Jul 15.
although I don’t believe the flaw they have detected meaningfully impacts the use of the transform.
With the historical perspective out of the way, I suggest we take the data from the Schwarzer paper as an example of the use of the FTT transform in metan.
First update metan by typing
ssc install metan, replace
then run the following:
input str3 studyname long n int cases byte qi
"S1" 217154 422 1
"S10" 16557 32 1
"S13" 676 1 1
"S18" 44 1 1
"S26" 29 1 1
end
This will add the Schwarzer data to Stata (the closing end is needed so that Stata stops reading the lines that follow as data),
then run the meta-analysis as follows:
metan cases n , pr model(ivhet \ re ) transform(ftukey, iv) study( study ) forestplot(astext(85) textsize(120) boxscale(55) spacing(1.2) leftjustify range(0.1 7) dp(3)) denom(1000) extraline(yes) hetinfo(isq h)

This will put everything in perspective and you can ask questions after you review this

These authors recommend a logit transform over arcsine. I think they make a good case.

They refer to regression models, and in that situation I agree. For meta-analysis, variance stabilization is better with the FTT than with the logit transform.

Also, FTT needs a well-considered back-transform, and that is where Schwarzer et al slipped up: they used the harmonic mean, which we had flagged in 2013 as not recommended. Metan in Stata has our back-transform as an option, and it is used in the code I posted above.

Why don’t those considerations also apply to meta-regression? The simulations they present suggest otherwise.

From the discussion

In this paper, we demonstrated using theory, examples, and simulations that logistic regression and its random-effects counterpart have advantages over analysis of arcsine-transformed data in power and interpretability. For binomial data, power tended to be higher when using a logistic regression approach than arcsine-transformed linear models. In addition, the logit function has a much simpler interpretation, while avoiding the possibility of nonsensical predicted values. For non-binomial proportions, there was never any theoretical reason to use the arcsine transform in the first place, and we instead suggest using the logit transform. It is important to recognize that these ideas apply equally well to proportions collected in ANOVA designs as to those collected in a regression context.

They are models of individual participant data; my guess is that the performance metrics of interest differ in this case.
Coming back to meta-analysis: we are interested in error (lower MSE) and error estimation (coverage at the nominal level), and given that variance stabilization is better with the FTT, it has to perform better. The problem with this transform lies in the back-transform, and that has led to many of the issues reported for the FTT.

There are several known issues with the logit transform:
a) The variance of the logit-transformed proportion depends additionally on the event counts.
b) Within-study variances are treated as fixed and known values in MA, while the event counts are not, thus violating one of the assumptions in MA.
c) Since both the logit proportion and its variance depend on the event counts, they are correlated, ending up in gross bias in MA, especially of studies with small samples.

Thus, variance stabilization is clearly expected to be better with FTT, without any need to cite studies.
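
To make points (a) to (c) concrete, the two transforms and their usual approximate within-study variances can be written out as below. This is only a sketch using the commonly quoted forms; the symbols x (number of cases in a study) and n (its sample size) are introduced here for illustration.

```latex
\begin{align*}
\text{logit:}\quad & y = \ln\frac{x}{n - x},
  & \operatorname{Var}(y) &\approx \frac{1}{x} + \frac{1}{n - x},\\[4pt]
\text{FTT:}\quad & t = \arcsin\sqrt{\frac{x}{n + 1}} + \arcsin\sqrt{\frac{x + 1}{n + 1}},
  & \operatorname{Var}(t) &\approx \frac{1}{n + 0.5}.
\end{align*}
```

The logit variance contains the event count x, so it moves together with the estimate itself, whereas the FTT variance depends only on n.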

Suffice it to say that while I have no dog in this fight, there is a growing literature that disagrees with you.

The idea that the logistic can be sufficient at the individual study level, but not at the meta-analytic level, strikes me as self-refuting. If an individual is attempting to compress his/her knowledge via a regression on studies, and accepts the logit as valid at the individual level, then combining the information in sufficient statistics via logistic modelling directly follows, since logits can be combined by addition. How to model the heterogeneity is another matter.

After a bit more reading, I’ll come back and attempt a more formal proof of this.

Okay, let's see what you find. Perhaps limit your proofs to a standard pairwise aggregate data MA, as most meta-regression is simply a weighted linear regression with an ES as the dependent variable, and I do not see much point in debating that.

What has been debated a lot is the best transform in the standard MA, and so far no one has raised a convincing argument against FTT, though many have tried and continue to try, as noted in the citations I posted.

Coverage probability for n = 50 from:
Interval Estimation for a Binomial Proportion. Lawrence D. Brown, T. Tony Cai and Anirban DasGupta. Statistical Science, Vol. 16, No. 2 (May, 2001), pp. 101-117

The oscillation in coverage for very small proportions can be fixed (see our paper), but the logit interval, for the same coverage, is much wider.


Dear Prof. @s_doi and @R_cubed,
Thank you both for sharing your thoughts and relevant literature. @s_doi I read your articles earlier, and that was the basis for changing from Logit to FTT. It was truly helpful.
I am really enjoying the debate. I believe such healthy debates could empower early career researchers to make informed decisions on what transformation should be used and why. Great stuff!

I think it is me. I executed the commands you gave, but unfortunately Stata said “'cases' cannot be read as a number.” Whatever command I execute, Stata displays the same message: “'xyz' cannot be read as a number.” What could be the problem? However, I was able to run a transformed analysis using the following command and produce a forest plot.

. metan cases Samplesize, pr model(ivhet \ re ) transform(ftukey, iv)

I can see that all my CIs are within 0 and 1. How can I do a subgroup analysis while keeping these CIs between 0 and 1? And would the same principle apply when performing meta-regression? For meta-regression, I believe we need to use the ‘regress’ command. Can we do a subgroup analysis in metan?
On another note, does ‘ivhet’ stand for heteroscedasticity of the instrumental variable?
Plenty of things are new for me. Therefore, I have lots of questions for experts.
I am thankful for all your valuable comments @R_cubed, @s_doi.

Kind regards,


The command you used is fine, as the remaining options only affect the display of the forest plot. Note that study(study) was meant for you to put the variable that holds the study name in your dataset in parentheses instead of “study”. range(0.1 7) is the range of the proportions across the forest plot and, given denom(1000), is expressed as cases per 1000 population; not tweaking these for your data could lead to an error. Adding by(subgroup) will give you the subgroup analysis, where “subgroup” is the variable that holds your subgroup indicator.
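
As a sketch of that last point, the call might look like the following. It reuses only options already shown in this thread; the grouping variable name subgroup is hypothetical (it is not part of the Schwarzer example data), and studyname is the study label variable from the example dataset.

```stata
* Sketch only: "subgroup" is a hypothetical variable holding the subgroup indicator;
* "studyname" is the study label variable from the example dataset above.
metan cases n, pr model(ivhet \ re) transform(ftukey, iv) study(studyname) by(subgroup)
```

The forest plot display options from the earlier command can be added back once the basic call runs on your data.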

transform(ftukey, iv) implements the Miller back-transform with the Barendregt-Doi modification. If you use transform(ftukey), i.e. just ftukey alone, then you get the results Schwarzer was complaining about, and this also gives you the same result as metaprop. In 2013 we had warned against using this, but the warning remained largely ignored; when Schwarzer et al wrote their paper they did cite our paper but ignored the change we had suggested. This change was implemented in Stata after our rebuttal was published.

IVhet is a fixed effect model replacement for the RE model that removes the overdispersion seen with the RE model. Although it is a fixed effect model, it can be used for heterogeneous data. It is just an alternate model that I recommend everyone use in lieu of the RE model whose assumptions seem to me to be questionable at best when used in MA.

There is a metareg command in Stata but I will not recommend it as it only allows RE weights. To use IVhet weights you simply run
regress FTT x1 x2 [aw=1/v], vce(robust)
where v is the variance of each FTT transformed proportion from each study and x1 and x2 are moderator variables
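
For anyone who wants to see where FTT and v could come from, here is a minimal sketch. It assumes the dataset holds cases and n per study plus two moderators named x1 and x2 (placeholder names), and it uses one commonly quoted form of the double arcsine transform and its approximate variance; if you take the transformed values from other software, check that the scale matches.

```stata
* Sketch only: cases, n, x1 and x2 are assumed to exist in the dataset.
* Freeman-Tukey double arcsine transform of each study's proportion
generate double FTT = asin(sqrt(cases/(n + 1))) + asin(sqrt((cases + 1)/(n + 1)))
* Approximate variance of the transformed proportion (depends on n only)
generate double v = 1/(n + 0.5)
* Weighted regression with inverse-variance analytic weights, as in the command above
regress FTT x1 x2 [aw=1/v], vce(robust)
```

The coefficients are on the transformed scale, so any predicted values need to be back-transformed before they are read as proportions.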

Without overwhelming the OP, I’ll try to briefly sketch out the disagreement I have
with Doi’s representation of the literature on this issue.

Starting from first principles – I take Herman Chernoff’s philosophy as a worthwhile

With the help of theory, I have developed insights and intuitions that prevent me from
giving weight to data dredging and other forms of statistical heresy. This feeling of freedom
and ease does not exist until I have a decision theoretic, Bayesian view of the problem.
I am a Bayesian decision theorist in spite of my use of Fisherian tools.

For Bayesians, use of decision theory as a formal tool even applies to the design of experiments.
I view meta-analysis as a tool to derive the most informative experiment, given goals and
resource constraints.

With this outlook, I find classical meta-analytic methods excessively reliant on the
metaphor of a “population” of studies, and the assumption of normality. [1][2]

Gene Glass, a pioneer in meta-analysis wrote: [1]

Third, the conception of our work that held that “studies” are the basic, fundamental unit of a research program may be the single most counterproductive influence of all. This idea that we design a “study,” and that a study culminates in the test of a hypothesis and that a hypothesis comes from a theory - this idea has done more to retard progress in educational research than any other single notion.

Summary of Criticisms

  1. In his dismissal of the logit method, he failed to distinguish between the classic 2 step proportion combination procedures, and the more recently proposed 1 step GLMMs (Generalized Linear Mixed Models) [3-5]. This is directly relevant as the OP mentioned meta-regression, with [4] providing a good example on how to proceed.

  2. Limiting the discussion to classic 2 step methods, his advocacy of the Freeman-Tukey double arcsine variance stabilization transformation is inadequate. In order for the synthesis to be useful, the estimate must be converted back from the combination scale to the [0, 1] interval. This is trivial for the closest competitor, the arcsine transform, which is never mentioned in his papers but is discussed in [3-5] and recommended by the authors in [6].

The Freeman-Tukey transformation converges to the arcsine in large samples, but is not defined in a meta-analytic context with multiple proportions, as the authors mentioned in [7] (this paper was also noted above). Considering Glass’s quote above, the fact this transformation is so reliant on how sample sizes are averaged leads me to skepticism of its value in this context.

I’d agree variance stabilization can be valuable, but the double arcsine is too complicated without clear
benefit over the arcsine.
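
To illustrate the difference in complexity, the two back-transforms can be sketched as follows. Here \bar{y} and \bar{t} denote the pooled estimates on the arcsine and double arcsine scales, and n is the single sample size that has to be supplied to the double arcsine inverse; this notation is introduced here for illustration, and the second line is the inverse usually attributed to Miller.

```latex
\begin{align*}
\text{arcsine:}\quad & \hat{p} = \sin^{2}\bar{y},\\[4pt]
\text{double arcsine:}\quad & \hat{p} = \frac{1}{2}\left[\,1 - \operatorname{sgn}(\cos\bar{t})
  \sqrt{1 - \left(\sin\bar{t} + \frac{\sin\bar{t} - 1/\sin\bar{t}}{n}\right)^{2}}\,\right].
\end{align*}
```

The arcsine inverse needs no sample size at all, while the double arcsine inverse requires choosing one value of n to stand in for studies of different sizes.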


  1. Glass, Gene Meta-analysis at 25. Self-published Jan 2000 Archived at:

  2. Jackson, D, White, IR. When should meta-analysis avoid making hidden normality assumptions?
    Biometrical Journal. 2018; 60: 1040-1058.

  3. Lin, L, Xu, C. Arcsine-based transformations for meta-analysis of proportions: Pros, cons, and alternatives. Health Sci Rep. 2020; 9999:e178.

  4. P. J. Shi, H. S. Sandhu, H. J. Xiao. “Logistic Regression is a Better Method of Analysis Than Linear Regression of Arcsine Square Root Transformed Proportional Diapause Data of Pieris melete (Lepidoptera: Pieridae),” Florida Entomologist, 96(3), 1183-1185 (1 September 2013).

  5. Lin L, Chu H. Meta-analysis of Proportions Using Generalized Linear Mixed Models.
    Epidemiology. 2020 Sep;31(5):713-717. doi: 10.1097/EDE.0000000000001232. PMID: 32657954; PMCID: PMC7398826.

  6. Kulinskaya E, Morgenthaler S, Staudte RG. Meta Analysis: A Guide to Calibrating and Combining Statistical Evidence. Wiley 2008.

  7. Röver C, Friede T. Double arcsine transform not appropriate for meta-analysis. Res Synth Methods. 2022 Jul 15. arXiv:2203.04773.


The only comment I will make is regarding my use of the FTT, as the rest is your opinion, which of course you are entitled to hold.

You are right, I have never really given much weight to the usual arcsine-square-root transformation, as Freeman and Tukey created a much better variance stabilizing version by summing over the two arcsine values. Also, Lin and Xu, whom you quote above, are coauthors of mine, and Xu co-authored the rebuttal (Doi SA, Xu C. The Freeman-Tukey double arcsine transformation for the meta-analysis of proportions: Recent criticisms were seriously misleading. J Evid Based Med. 2021 Dec;14(4):259-261.)

There is no objective evidence to date (I would be happy to see it if you have some) that the GLMM outperforms the standard RE aggregate data approach in MA if the right simulation approach is used. What has been done by most (including Chu, whom you quote above and who is also a co-author of mine) is to simulate the way that the data will be analysed, i.e. assume random effects in data generation and then analyse using a random effects assumption. This is nothing more than a self-fulfilling prophecy that I have criticized previously (Doi SAR. Examining how meta-analytic methods perform. Res Synth Methods. 2022 May;13(3):292-293.)


The complaint of Röver and Friede is that the conversion of the double arcsine combined estimate back to the proportion scale can lead to an estimate that is outside the actual range of the data when the sample sizes are drastically different. That is a serious criticism that the single arcsine (which is the limiting value of the double) does not have.

The point of transforming proportions to a different scale, and then back to a proportion, is to have a common scale for combination. Variance stabilization is an important feature, but with the Freeman-Tukey there is no common scale, only overlapping ones, when there are multiple sample sizes.

As for GLMM, this is from the abstract of Lin and Chu’s Meta-analysis of Proportions Using Generalized Linear Mixed Models:

In general, GLMMs led to smaller biases and mean squared errors, and higher coverage probabilities than two-step methods. Many software programs are readily available to implement these methods.


Röver and Friede are correct, but the error is at the extremes and is so small that it lacks practical significance. There is always a trade-off in methods, and a superficial read of the paper does sound alarming if the context is ignored. It's the classic cycle of EBM that we teach: there are studies that suggest coffee causes cancer and also studies that suggest coffee protects against cancer. An in-depth understanding is required to make a recommendation, and simply quoting from these authors does not resolve the issue, as it's usually much more complicated than a need to take sides.