I am posting this for input on whether there is any flaw in my reasoning behind the proposed procedure. It incorporates ideas from dozens of papers and posts. Following the ideas in A Theory of Experimenters, this description covers how an individual with no need to convince an external audience might use it. I can imagine some adaptations that would bring it in line with Robert Matthews’ Analysis of Credibility, to model a dialogue with a skeptical audience.

> Statistical practice changes slowly because the teaching of statistics changes slowly … Once sociologists and physicians have learned about significance levels well enough to use them, a major reorganization of the thought process is required to adapt to decision theoretic or Bayesian analysis … With the help of theory, I have developed insights and intuitions that prevent me from giving weight to data dredging and other forms of statistical heresy. This feeling of freedom and ease does not exist until I have a decision theoretic, Bayesian view of the problem. I am a Bayesian decision theorist in spite of my use of Fisherian tools.
>
> Herman Chernoff, in commentary on “Why Isn’t Everyone a Bayesian?” by Bradley Efron (1986)

**Procedure Description**

**Inputs:**

- Sequence of signed test statistics and/or two-sided p-values
- Prior and posterior odds
- ~~Sequence of Posterior Odds changes as number of studies increase,~~ Sequence of changes in \alpha ~~allocation~~ as studies increase (see this article on adaptive \alpha levels)
- Expected power for an individual study
- Step procedure (Holm or Hochberg)
- P-value combination procedure (Stouffer-Liptak, Fisher, Logit)

**Output:**

~~Statement of Posterior Odds~~ A statement that s of N studies indicate a clear direction of effect, where

- s = H(r) + C(b)
- H(r) = number of null hypotheses rejected by the Hochberg step-up method (or Holm step-down)
- C(b) = -1 or +1 if the combination procedure, applied to the studies not rejected by the step procedure, rejects at the remaining \alpha; 0 otherwise.
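The output definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a definitive implementation: it assumes the inputs are signed z statistics, uses Hochberg for the step procedure and Stouffer-Liptak for the combination, and takes the sign of C(b) from the combined z. The function names and the `alpha_step` / `alpha_comb` parameters are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def two_sided_p(z):
    """Two-sided p-value for a z statistic."""
    return 2.0 * (1.0 - _NORM.cdf(abs(z)))


def hochberg_rejections(pvals, alpha):
    """Indices rejected by the Hochberg step-up procedure at level alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ascending p-values
    k_max = 0
    # Find the largest k (1-based) with p_(k) <= alpha / (m - k + 1).
    for k in range(1, m + 1):
        if pvals[order[k - 1]] <= alpha / (m - k + 1):
            k_max = k
    return set(order[:k_max])


def clear_direction_count(z_stats, alpha_step, alpha_comb):
    """Return (H, C, s) with s = H + C, following the definitions in the post."""
    pvals = [two_sided_p(z) for z in z_stats]
    rejected = hochberg_rejections(pvals, alpha_step)
    h = len(rejected)

    # Combine only the studies the step procedure did not reject.
    rest = [z for i, z in enumerate(z_stats) if i not in rejected]
    c = 0
    if rest:
        z_comb = sum(rest) / sqrt(len(rest))  # unweighted Stouffer-Liptak
        if two_sided_p(z_comb) <= alpha_comb:
            c = 1 if z_comb > 0 else -1  # assumed: sign = combined direction
    return h, c, h + c


# Made-up z statistics, purely for illustration.
print(clear_direction_count([3.5, 1.2, 1.0, 0.8, -0.2], 0.04, 0.01))
```

Swapping in Holm, Fisher, or the logit method would only change `hochberg_rejections` or the combination step; the bookkeeping for s stays the same.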

The familywise \alpha level will be set using Bayesian reasoning. Following Cheng and Sheng, \alpha will be allocated between a step procedure and a combination procedure. The input to the combination procedure will be the remaining statistics not rejected by the step procedure. Following Naaman and Pericchi and Pereira, the familywise \alpha will decline as the number of studies increases. ~~In addition, less error will be allocated to the combination procedure, permitting more power for the sequential testing procedure to find a local effect.~~

All values will be specified by the user prior to data collection as part of the SAP (statistical analysis plan).
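One possible way to make "set using Bayesian reasoning" concrete is sketched below. It assumes the pre-experimental rejection odds of Bayarri et al., where the odds of a true effect given a rejection equal prior odds × power / \alpha, and solves for the \alpha that reaches a target value of those odds. The function names, the `step_share` parameter, and the numbers are hypothetical. The schedule by which \alpha declines with the number of studies is not specified in this post, so it is left out.

```python
def alpha_for_target_odds(prior_odds, power, target_odds):
    """Alpha such that prior_odds * power / alpha equals target_odds.

    Assumes the rejection-odds relation; the result is capped at 1.
    """
    if prior_odds <= 0 or target_odds <= 0 or not 0 < power <= 1:
        raise ValueError("odds must be positive and power must lie in (0, 1]")
    return min(1.0, prior_odds * power / target_odds)


def split_alpha(alpha, step_share):
    """Split a familywise alpha between step and combination procedures.

    step_share is the (hypothetical) fraction given to the step procedure;
    the two parts always sum to alpha, as in a Bonferroni-style split.
    """
    if not 0 <= step_share <= 1:
        raise ValueError("step_share must lie in [0, 1]")
    return alpha * step_share, alpha * (1.0 - step_share)


# Illustrative values only: even prior odds, 80% expected power, target 20:1.
alpha = alpha_for_target_odds(prior_odds=1.0, power=0.8, target_odds=20.0)
alpha_step, alpha_comb = split_alpha(alpha, step_share=0.5)
print(alpha, alpha_step, alpha_comb)
```

Fixing these values in the SAP before data collection keeps the allocation from being tuned after seeing the results.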

**Rationale:** Contrary to Chernoff’s quote, current discussions of a scientific replication crisis indicate that “significance” levels are not understood well enough to be used correctly in large areas of science. This suboptimal state of affairs persists because:

> Classical Fisherian significance testing is immensely popular because it requires so little from the scientist…
>
> Bradley Efron, Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction (p. 48)

This major reorganization of thought begins by placing elementary statistical procedures in a Bayesian context. Particularly important are corrections for multiplicity and the information fusion approach via p-value combination, both of which have Bayesian and information-theoretic justifications.

Omnibus p-value procedures remain useful and research into extensions of these methods is ongoing.

Hedges and Olkin (1985) note:

> An important application of omnibus test procedures is to screen for any effect … Alternatively, combined test procedures can be used to combine effect size analyses based on different outcome variables.

In the context of meta-analysis, it is far from unusual to have what would ordinarily be a small sample (i.e., no more than 20 studies). This limits the application of preferred methods (i.e., effect size aggregation or meta-regression) when an explanation of variability would be useful. Rather than produce a misleading effect size estimate from heterogeneous studies, it would be preferable to compare and contrast the local results of individual studies in order to design more informative ones given current knowledge.

The following procedure builds upon Cheng and Sheng (2017).

Their procedure divides the familywise \alpha between two uncorrelated p-value combination procedures. This \alpha division is essentially a Bonferroni adjustment for multiple tests.
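To illustrate the Bonferroni-style division, here is a minimal sketch that tests two combination procedures at \alpha/2 each and rejects the global null if either one rejects. Which two procedures Cheng and Sheng pair is not stated in this post; Fisher and Stouffer-Liptak are used here only because both appear in the input list. The function names are hypothetical, and p-values must be strictly positive.

```python
from math import exp, factorial, log, sqrt
from statistics import NormalDist


def fisher_combined_p(pvals):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k df under the null.

    For even degrees of freedom the chi-square tail has a closed form,
    so no external library is needed.
    """
    k = len(pvals)
    half_x = -sum(log(p) for p in pvals)  # half of the chi-square statistic
    return exp(-half_x) * sum(half_x ** i / factorial(i) for i in range(k))


def stouffer_combined_p(z_stats):
    """Unweighted Stouffer-Liptak: two-sided p-value of sum(z) / sqrt(k)."""
    z = sum(z_stats) / sqrt(len(z_stats))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def reject_global_null(pvals, z_stats, alpha):
    """Reject if either combined test is significant at alpha / 2."""
    return (fisher_combined_p(pvals) <= alpha / 2
            or stouffer_combined_p(z_stats) <= alpha / 2)


# Made-up inputs: four studies with the same modest positive z statistic.
z_stats = [1.4, 1.4, 1.4, 1.4]
pvals = [2.0 * (1.0 - NormalDist().cdf(abs(z))) for z in z_stats]
print(fisher_combined_p(pvals), stouffer_combined_p(z_stats))
```

Because each test runs at \alpha/2, the union bound keeps the overall type I error at or below \alpha whatever the dependence between the two combined statistics.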

Taking guidance from Bayarri, Benjamin, Berger, and Sellke (2016), Pericchi and Pereira (2016), and Naaman (2016), this Bayes-Frequentist procedure will overcome limitations of classical combination procedures and provide a more intuitive Bayesian interpretation.

Related threads: