Some thoughts on uniform prior probabilities when estimating P values and confidence intervals

I had been working on post 279 for hours and posted it before I had seen your editorial note. I had replied to @R_cubed in the hope that he or anyone else could pinpoint any errors precisely. I am grateful to @EvZ and @R_cubed for their comments and have thanked them repeatedly. However, I have been unable to understand precisely where my maths goes wrong and why I get the right answers for the wrong reasons. I have of course learnt a lot from their comments already.

I’ve spent a lot of time on this and I’m disappointed that I wasn’t able to get through to you. I think the problem is that you are stuck in a tunnel where you keep trying to prove that you are right. Your post 279 is more of the same. You ask me to pinpoint your mistakes, but I’ve already done that. The overarching problem is that your reasoning is just too loose and sloppy. You are only fooling yourself with pictures and vague analogies.

In light of Frank’s editorial notes, I think it is now time to consider the possibility that you might be wrong and I am right. Have a look at my posts 178, 204 and 215. See if you can pinpoint any mistakes in my posts. Note that in my posts everything is defined very clearly and all the steps are spelled out to make it easy to find mistakes (if there were any).

So, it is now time to accept the following fact which I have proven multiple times. Assume the uniform distribution on \beta and

b \mid \beta,v_1 \sim N(\beta,v_1) \quad \text{and}\quad b_\text{repl} \mid \beta,v_2 \sim N(\beta,v_2).

Also assume b and b_\text{repl} are conditionally independent given \beta. Then it follows that

P(b_\text{repl} / \sqrt{v_2} > 1.96 \mid b,v_1,v_2) = \Phi\left( \frac{b - 1.96\sqrt{v_2}}{\sqrt{v_1+v_2}} \right).

You also need to accept that “successful replication” is commonly taken to mean |b_\text{repl} / \sqrt{v_2}| > 1.96. Therefore it is reasonable to call this the “conditional probability of successful replication after observing the first study.”
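These assumptions can be checked numerically. Below is a minimal Monte Carlo sketch (the values b = 2.0, v_1 = 1.0, v_2 = 0.5 are illustrative, not taken from the thread): under the flat prior, \beta \mid b \sim N(b, v_1), so we simulate \beta, then b_\text{repl} \mid \beta \sim N(\beta, v_2), and compare the empirical frequency of b_\text{repl}/\sqrt{v_2} > 1.96 with the closed-form \Phi expression.

```python
import math
import random
from statistics import NormalDist

random.seed(0)

b, v1, v2 = 2.0, 1.0, 0.5   # illustrative values, not from the thread
n_sims = 200_000

hits = 0
for _ in range(n_sims):
    # Flat prior on beta plus b | beta ~ N(beta, v1)  =>  beta | b ~ N(b, v1)
    beta = random.gauss(b, math.sqrt(v1))
    # Replication draw: b_repl | beta ~ N(beta, v2)
    b_repl = random.gauss(beta, math.sqrt(v2))
    if b_repl / math.sqrt(v2) > 1.96:
        hits += 1

mc = hits / n_sims
closed_form = NormalDist().cdf((b - 1.96 * math.sqrt(v2)) / math.sqrt(v1 + v2))
print(round(mc, 3), round(closed_form, 3))  # the two should agree to ~2 decimals
```

With these illustrative inputs both the simulation and the closed form give a replication probability of roughly 0.69.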

Numerically, this conditional replication probability does not agree very well with reality. The reason is that the uniform prior on \beta does not represent reality very well. To address this, we estimated an empirical prior in my paper with Goodman.

You are essentially claiming that you get more realistic results without making the model more realistic. However, it only looks that way because of your math mistakes.


Thank you again for your patience.

(1) Is it my failure to have accepted the above that is my precise mathematical mistake?
(2) May I assume that given my assumptions (right or wrong) there are no algebraic or numerical errors?

Is it my failure to have accepted the above that is my precise mathematical mistake?

That is not a grammatically well-formed question, so I don’t know what you mean. But please don’t bother to explain, because I’m not interested in what you mean.

May I assume that given my assumptions (right or wrong) there are no algebraic or numerical errors?

You may not. Specifically, you keep claiming (e.g. your post 271) that the model we agreed on (cf. my previous post) implies

P(b_\text{repl} / \sqrt{v_2} > 1.96 \mid b,v_1,v_2) = \Phi\left( \frac{b}{\sqrt{v_1+v_2}} -1.96 \right).

That is false.
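The difference between the two expressions is easy to exhibit numerically. A sketch with illustrative values (b = 2.0, v_1 = 1.0, v_2 = 0.5, not taken from the thread):

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf
b, v1, v2 = 2.0, 1.0, 0.5  # illustrative values, not from the thread

# Correct: subtract 1.96*sqrt(v2) from b *before* standardizing by sqrt(v1+v2)
correct = Phi((b - 1.96 * math.sqrt(v2)) / math.sqrt(v1 + v2))
# Claimed: subtract 1.96 *after* standardizing b by sqrt(v1+v2)
claimed = Phi(b / math.sqrt(v1 + v2) - 1.96)

print(round(correct, 3), round(claimed, 3))
```

With these inputs the two expressions give roughly 0.69 versus 0.37, so they are plainly not the same function of b, v_1 and v_2.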

As Frank is telling you, you should go back and study what @R_cubed and I have been explaining to you. Further arguing is not appreciated. At this point you are abusing this platform.


I was asking questions, not arguing. Thank you for the discussion.

I would be grateful for a reference to a published source regarding the above so that I can do further reading on how |b_\text{repl} / \sqrt{v_2}| > 1.96 leads logically to P(z_\text{repl} > 1.96 \mid b,s,n_1,n_2) = \Phi\left( \frac{b - 1.96\frac{s}{\sqrt{n_2}}}{\sqrt{\frac{s^2}{n_1} + \frac{s^2}{n_2}}} \right).

By removing one of your Z scores from the numerator of the formula for a standardized Z score (which is merely a weighted sum of N(\beta, \sigma^2) random variables divided by the square root of its variance), you break the algebraic property of closure, so your result can no longer be interpreted as a standardized Z score.

Note that equally weighting a set of k Z scores continues to satisfy the algebraic closure property, because we will be dividing by \sqrt{k}. We could easily remove your standard error term by substituting new weight variables and applying the axioms from the algebra of random variables, where each score gets a weight of w_i and the denominator has the value \sqrt{\sum_{i=1}^{k} w_i^2}.
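The closure claim can be sanity-checked by a short simulation (the weights below are arbitrary, chosen for illustration): a weighted sum of independent standard normals, divided by \sqrt{\sum_i w_i^2}, should again be standard normal.

```python
import math
import random

random.seed(1)
w = [0.5, 1.0, 2.0]                            # arbitrary illustrative weights
denom = math.sqrt(sum(wi * wi for wi in w))    # sqrt of sum of squared weights

n_sims = 100_000
draws = []
for _ in range(n_sims):
    z = [random.gauss(0.0, 1.0) for _ in w]    # independent standard normals
    draws.append(sum(wi * zi for wi, zi in zip(w, z)) / denom)

mean = sum(draws) / n_sims
var = sum((d - mean) ** 2 for d in draws) / n_sims
print(round(mean, 2), round(var, 2))  # should be close to 0 and 1
```

Note that with equal weights w_i = 1 the denominator reduces to \sqrt{k}, matching the equally weighted case above.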

You need to study the basic facts from the algebra of random variables and their application to statistical procedures that were referenced in these posts:


I was hoping not to bother anyone further. However, in answer to your question, I assume that by b_1 you mean the observed raw effect size that I term b, and that b_2 is a possible b_{\text{repl},i}. Now, I don’t assume that \beta = b; rather, I specify a normal distribution with variance v_1 of all possible \beta_j conditional on b. For each possible \beta_j in this normal distribution, I specify another normal distribution of all possible b_{\text{repl},i} with variance v_2, conditional on that \beta_j. These two distributions with variances v_1 and v_2 are convolved to form a single normal distribution for b_{\text{repl},i} conditional on the observed raw effect size b, with variance v_1 + v_2 and se_{1,2} = \sqrt{v_1 + v_2}. So b divided by se_{1,2} gives a z-score. I subtract the critical value z^* = 1.96 from this z-score to give another z-score, b/se_{1,2} - 1.96, so that P(z_\text{repl} > 1.96 \mid b, s, n_1, n_2) = \Phi(b/se_{1,2} - 1.96). I hope that this makes sense and that I have not made any errors.

You continue to use the term “convolution” but don’t seem to apply the rules correctly.

In probability theory, the probability distribution of the sum of two or more independent random variables is the convolution of their individual distributions.

If you will note, a “convolution” of two normally distributed variables is equivalent to the addition rule for the algebra of random variables.

That is precisely what I am doing. I note that Erik did the same thing with \sqrt{v_1 + v_2} in his expression.

But it makes no sense to divide 1 Z score by the sum of independent variance terms. Your result has no interpretation from a math or stats point of view.

If you look at the formulas closely, you will see your error. Your problem is your specification of the problem after the first study is observed. On the probit scale, information (in a mathematical sense) is defined as a shift from a standard normal N(0,1) distribution. So the problem remains:

How do you distinguish, using the difference between the two estimates, the cases:

  1. they come from the same distribution
  2. they come from different ones?

I don’t do that. I divide a raw effect size b by a standard error se_{1,2} for a normal distribution (formed from a basic convolution) to give a z-statistic, which I would have thought is very basic.

You simply can’t take one measurement and divide by 2 different error terms and expect it to make any sense.

The only data available are b, s and n_1 from the original study. I assume that s is the same for the second, replicating study, but choose n_2. All I know about \beta_j is its possible normal distribution conditional on b. Also, all I know about the b_{\text{repl},i} is their possible values conditional on each possible \beta_j. I am using different assumptions but, as far as I can see, sticking closely to the rules of probability.

As you say, the process of convolving two distributions with two different error terms, represented by v_1 and v_2, is standard practice, and it is also done often by Erik to give \sqrt{v_1 + v_2}.

He was likely typing fast and didn’t apply the markup correctly. Look at my definitions and explain to me why they don’t apply. Any statistic you develop must adhere to the constraints of the algebra of random variables, otherwise we can’t say anything about their distribution.

I doubt that @EvZ made a mathematical mistake due to a typing error. Dividing a raw effect size by the square root of the sum of unequal variances is standard practice in frequentist significance testing and could not be regarded as a mathematical mistake. This also involves convolving two distributions of results (e.g. from the treatment and control arms of an RCT).

Please specify which of your definitions (and where they are) that I should look at and I will do so.
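The standard practice mentioned here, comparing the treatment and control arms of an RCT, can be sketched as a two-sample z statistic in which the two variance terms add before the square root is taken. The group summaries below are made up purely for illustration:

```python
import math
from statistics import NormalDist

# Illustrative group summaries (not taken from the thread)
mean_t, var_t, n_t = 5.2, 4.0, 50   # treatment arm
mean_c, var_c, n_c = 4.1, 6.0, 40   # control arm

# Difference of two independent normal means: unequal variance terms add
se = math.sqrt(var_t / n_t + var_c / n_c)
z = (mean_t - mean_c) / se
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_two_sided, 4))
```

The denominator here has exactly the \sqrt{v_1 + v_2} form under discussion, with v_1 = s_1^2/n_1 and v_2 = s_2^2/n_2 for the two arms.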

Maybe I made a typo somewhere. But in any case, assume the flat prior on \beta and

b \mid \beta,v_1 \sim N(\beta,v_1) \quad \text{and} \quad b_\text{repl} \mid \beta,v_2 \sim N(\beta,v_2)

where b and b_\text{repl} are conditionally independent given \beta.

  1. Since b \mid \beta,v_1 \sim N(\beta, v_1) and \beta has the uniform distribution, it follows that \beta \mid b, v_1 \sim N(b, v_1).

  2. Since b_{\text{repl}} \mid \beta,v_2 \sim N(\beta, v_2) and b and b_\text{repl} are conditionally independent given \beta, it follows that b_{\text{repl}} \mid b,v_1,v_2 \sim N(b, v_1 + v_2).

  3. Standardize the (conditional) distribution of b_\text{repl} by first subtracting the (conditional) mean (which is b) and then dividing by the (conditional) standard deviation (which is \sqrt{v_1+v_2}).
    P(b_\text{repl} > 1.96\, \sqrt{v_2} \mid b,v_1,v_2) = P\left(\frac{b_\text{repl} - b}{\sqrt{v_1+v_2}} > \frac{1.96\, \sqrt{v_2} - b}{\sqrt{v_1+v_2}} \mid b,v_1,v_2\right)

  4. Conditionally on b,v_1 and v_2, (b_\text{repl} - b)/\sqrt{v_1+v_2} has the standard normal distribution. So, we conclude that
    P(b_\text{repl} > 1.96\sqrt{v_2} \mid b,v_1,v_2) = \Phi\left(\frac{b - 1.96\, \sqrt{v_2}}{\sqrt{v_1+v_2}} \right)
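Step 2, the convolution of \beta \mid b \sim N(b, v_1) with b_\text{repl} \mid \beta \sim N(\beta, v_2), can be checked by simulation. A sketch with illustrative values b = 2.0, v_1 = 1.0, v_2 = 0.5 (not from the thread):

```python
import math
import random

random.seed(3)
b, v1, v2 = 2.0, 1.0, 0.5   # illustrative values, not from the thread
n_sims = 100_000

# Two-stage sampling: beta | b ~ N(b, v1), then b_repl | beta ~ N(beta, v2)
b_repl = [random.gauss(random.gauss(b, math.sqrt(v1)), math.sqrt(v2))
          for _ in range(n_sims)]

mean = sum(b_repl) / n_sims
var = sum((x - mean) ** 2 for x in b_repl) / n_sims
# Step 2 predicts b_repl | b ~ N(b, v1 + v2), i.e. mean 2.0 and variance 1.5
print(round(mean, 2), round(var, 2))
```

The sample mean and variance of the simulated b_\text{repl} should match b and v_1 + v_2 respectively.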

Since b_\text{repl}/\sqrt{v_2} has the standard normal distribution when \beta=0, “successful replication” is commonly understood to mean that b_\text{repl}/\sqrt{v_2} > 1.96.


Here is what I want you to think about given the following fact about random variables from the normal distribution.

A general formula for weighting standardized Z scores is \frac{\sum_{i=1}^{n} w_i Z_i}{\sqrt{\sum_{i=1}^{n} w_i^2}}

How do you derive a statistic that distinguishes the 2 cases (which I will repeat for the third time):

  1. b1 and b2 come from the same distribution
  2. b1 and b2 come from different distributions

You are restricted to representing information on the probit scale, which is merely a shift from N(0,1).

If you pay careful attention, EvZ’s proof is entirely consistent with the formula I noted above.

Addendum: in the expression @HuwLlewelyn quoted from @EvZ, \sqrt{v_1+v_2}, the sum is computed first and then the square root is taken, owing to the parentheses, so it is still mathematically correct, although it might not look like the formulas in textbooks.

You simply can’t break the numerator in a summation of normal random variables and expect the result to make any sense.

I can’t understand this. In order that I might understand it better, I would need you to define b1 and b2 for me, give numerical examples, and explain under what circumstances they do and do not come from different distributions. You would also need to explain why your objections don’t apply to the identical calculation of b / \sqrt{v_1 + v_2} during significance testing with unequal variances. I don’t really want to burden you further, though. However, when I re-examine my reasoning, I can’t find anywhere that it makes no sense. Maybe you can pinpoint something in Post 288.

This has been done a half dozen times.

You can’t say \frac{2+3}{5} = \frac{2}{5} + 3, which you should immediately recognize is false.