BBR Session 5: One- and Two-Sample Methods for Means and One-Sample Methods for Proportions

This is a topic for questions, answers, and discussions about session 5 of the Biostatistics for Biomedical Research web course airing on 2019-11-15. Session topics are listed here.


Hi Frank!

Sorry for being so out-of-sync with the course; I just managed to complete the fifth session, which was a rather math-heavy topic for a non-statistician. I guess that’s why the discussion here hasn’t exactly been brewing :slight_smile: Anyway, I would like to ask for some clarification regarding the prior assumptions in these calculations.

In the part “Decoding the Effect of the Prior” you do address this to some extent, but I think I would need a little more elaboration to properly understand it. What I can decode from your examples:
(1) Flat priors with infinite standard deviations don’t impose many constraints on the data (not surprising)
(2) Only very constrained prior distributions (small SD) would make large posterior effect sizes very unlikely (and thus increase the required n by a lot); more reasonable assumptions don’t actually add much extra n.

From a clinician’s standpoint, making a reasonable estimate of μ0 (e.g., a treatment effect) and σ0 (a believable interval of potential values) seems achievable in the right context (see the valuable discussion here). However, what I would feel much more uncertain about is the shape of the distribution itself. I understand that when using frequentist methods these assumptions are made implicitly, without having to specify them case by case. However, with the Bayesian examples of BBR, assumptions such as switching to the gamma distribution and then specifying the α and β for this distribution seem arbitrary. Could one make a similar “sensitivity analysis” for the choice of prior distribution and its additional characteristics? Are there maybe some rules of thumb for choosing appropriate distributions (and parameters) for certain types of variables in clinical contexts (e.g., time-to-event, ordinal scales, etc.)?

In the 2019 O’Hagan paper on elicitation of expert opinion, he talks about fitting distribution curves to the three elicited estimates but doesn’t give much guidance as to how one could avoid overfitting. Are there any other relevant sources I should read on this, primarily from a non-statistician’s perspective?

Thank you again!
Aron

This is an excellent question, Aron. I have not worried as much about the shape as I have about a certain summary of the shape: the tail area quantifying the chance of a big effect. A sensitivity analysis of the type you mentioned would be useful, e.g., if the exceedance probability I just alluded to were held constant but the shape of the prior were changed in various reasonable ways, what happens to the posterior exceedance probability?
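Something like the following minimal R sketch shows the idea (the numbers and the particular priors are purely illustrative, not from the course): hold the prior probability that the effect exceeds a cutoff fixed, switch the prior’s shape from a normal to a heavier-tailed Student-t, and recompute the posterior exceedance probability by numerical integration, assuming a normal likelihood for an observed mean effect.

# Hypothetical sensitivity check: hold the prior P(mu > 1) fixed, change the
# prior's shape, and see how the posterior P(mu > 1) moves.
post_exceed <- function(prior_dens, ybar, se, cut = 1) {
  lik  <- function(mu) dnorm(ybar, mu, se)
  marg <- integrate(function(mu) lik(mu) * prior_dens(mu), -Inf, Inf)$value
  integrate(function(mu) lik(mu) * prior_dens(mu), cut, Inf)$value / marg
}
ybar <- 1.5; se <- 0.75                          # hypothetical observed effect and its SE
s0 <- 2                                          # SD of a normal(0, s0) prior
p_tail <- pnorm(1, 0, s0, lower.tail = FALSE)    # prior P(mu > 1), held constant below
scale_t <- 1 / qt(p_tail, df = 4, lower.tail = FALSE)  # t(4) prior scaled to the same tail area
priors <- list(normal = function(mu) dnorm(mu, 0, s0),
               t4     = function(mu) dt(mu / scale_t, df = 4) / scale_t)
sapply(priors, post_exceed, ybar = ybar, se = se)  # posterior P(mu > 1) under each prior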

The only non-arbitrary way out of this is the situation where we have a large number of experts or prior data sources that can be used to put together the entire shape of the prior, as Spiegelhalter has done in the past.

One small correction: priors don’t impose constraints on the data but rather constraints on unknown parameter values.

Hi Professor Harrell (may I address you as Frank?),

I, too, am out-of-sync with the course, but am making efforts to catch up! I enjoyed your discussion of calculating a sample size for a given precision (confidence interval width). The “power to detect a miracle” was pretty amusing. Probably because it’s true way too often. I’d like to write out my understanding of powering for precision and would appreciate any feedback (from anyone on the forums, of course).

The correct way to state the outcome of a sample size calculation for precision would be as follows:

  • If we were to run an infinite number of experiments, given a standard deviation SD and a target half-width DELTA, we would obtain 95% confidence intervals with half-widths less than or equal to DELTA 50% of the time.

I, personally, think better in terms of numbers and units rather than symbols, so I will put this in terms of the numbers used in your example from BBR5 (page 5-40, timestamp 58:10 in the video).

  • Question, form 1:
    Given a SIGMA of 10, what sample size would be required to give a 95% confidence interval with +/- DELTA of 1 (aka an interval width of 2), on average in the long run?

  • Question, form 2:
    Given a SIGMA of 10, what sample size would be required to have a 50% chance of getting a 95% confidence interval with +/- DELTA of 1 (aka an interval width of 2)? [Note this is a statement made before calculating the interval; once the interval is calculated, its width either is or is not within the target.]

Answer: A sample size of 384

It wasn’t part of your discussion, but I’m adding the “on average in the long run” and “50% chance” statements to bring out a nuance that I think is accurate, though it’s certainly where I’d like feedback.

To see the “50% chance” part, I do this calculation in R:

# solve for n: one-sample t test, delta = 1, SD = 10, two-sided alpha = 0.05, power = 0.50
power.t.test(n = NULL, delta = 1, sd = 10, sig.level = 0.05, power = 0.50,
             type = "one.sample", alternative = "two.sided")

One-sample t test power calculation

          n = 386.0696
      delta = 1
         sd = 10
  sig.level = 0.05
      power = 0.5
alternative = two.sided

[NOTE: I assume the 386 above vs the 384 in your example is due to rounding. If not, then there’s something else going on that I do not understand.]
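A quick check of the normal-approximation precision formula suggests where the small gap may come from (my guess only): 384 matches the z-based calculation, while power.t.test uses the t distribution, which asks for a couple more observations.

sigma <- 10; delta <- 1
(qnorm(0.975) * sigma / delta)^2   # 384.15, i.e. about 384 with the z-based formula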

If, however, one wanted to have, say, an 89% chance of getting a 95% confidence interval at least as narrow as the stated delta, then one would run the power calculation as:

# same call, but with power = 0.89 (the "89% chance" version)
power.t.test(n = NULL, delta = 1, sd = 10, sig.level = 0.05, power = 0.89,
             type = "one.sample", alternative = "two.sided")

One-sample t test power calculation

          n = 1017.296
      delta = 1
         sd = 10
  sig.level = 0.05
      power = 0.89
alternative = two.sided

Would that be a correct understanding, and, if not, where did I fall down?

Thank you.

Hi Brian and yes please call me Frank.

I think you are making it a bit more difficult than needed. Think about the case where we know the confidence interval width without needing any data, e.g. when \sigma is known and you are getting a confidence interval for the mean of a Gaussian distribution. The 0.95-level margin of error is 1.96 \times \frac{\sigma}{\sqrt{n}}. There is no chance involved, and we can compute n exactly to yield a margin of error of \epsilon. Next, when \sigma is not known we estimate it from the ultimate data, but we might assume that n is large enough so that the margin of error in estimating \sigma can be partially ignored. I.e., we approximately know the margin of error for estimating \mu in advance and go through the earlier logic. It is true that in general we might do a better job of taking uncertainty about \sigma into account when estimating n.
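As an illustrative sketch of partially accounting for that (not from the course, and still treating \sigma as if it were the eventual sample SD): plan with the t critical value the eventual interval will actually use, iterating because that critical value itself depends on n.

sigma <- 10; eps <- 1                        # assumed sigma and target margin of error
n <- (qnorm(0.975) * sigma / eps)^2          # z-based starting value, about 384
for (i in 1:10) n <- (qt(0.975, df = ceiling(n) - 1) * sigma / eps)^2
ceiling(n)                                   # about 387, close to the 386 from power.t.test above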

Your later part, where you look at power calculations, doesn’t seem on target when just talking about precision.

Hi Frank. Thanks for the notes, and what you say makes sense. I agree that what I wrote above could be considered overly complicated for practical application, but my approach is generally to try and understand something to the fullest detail possible, then scale back and relax conditions once I’m confident that my relaxing of conditions isn’t overly relaxing, if I can put it that way. If I don’t have that deeper understanding, I get nervous about making more assumptions!

That’s why I was thinking in terms of quantifying how often the confidence interval width calculated from the data will be larger than the target width and how often it would be smaller than the target width.

So, as far as I understand things, using the calculations you presented will result in a confidence interval with a greater interval length than targeted 50% of the time, because sometimes you get a sample with a greater StDev (and sometimes you get a sample with a smaller StDev). I know this can be compensated for by increasing N, but I’m trying to understand a way to quantify that. I think powering for precision is far, far better than the typical use of power, and I want to understand it as fully as possible.

Perhaps I’m getting hung up on the dichotomization between “greater than the target CI width” and “less than the target CI width”? If the n is large enough, then the width calculated from the data will be close enough to the ‘true’ width for all practical purposes? Hmm. Now I’m thinking that’s what you are saying? I admit I’m still somewhat hazy.

And actually, I’d like to understand how to sample-size an experiment for precision of a posterior distribution of a parameter, and here I’m thinking of the 95% highest posterior density interval (or 89%, etc.), but I feel as though I should, basically, understand what happens with a flat prior before moving on to weakly or moderately informative priors.

Any advice on that front (sizing for HPDI with informative prior) as well? I assume it would be purely simulation based?

Thanks again, I appreciate your feedback even if I don’t fully grasp it.

To me, sizing for HPD interval or CI width is relevant, but I’m having a hard time understanding the relevance of uncertainty in the sizing.

I think there’s some background I’m missing in order to formulate my thoughts better at the moment, so I’m going to set this question aside (in my mind) for now, and perhaps revisit it when I’ve completed the entirety of your course and have read more on precision and power. If you have any favorite articles/papers on the subject that you’d recommend reading, I’d appreciate any pointers. Thanks!

Don’t know if this helps but under a symmetric distribution the sample mean will be below the true population mean half the time, and above it half the time.
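A quick simulation sketch (illustrative, assuming normally distributed data with \sigma = 10 and the n of roughly 386 from the power = 0.5 calculation above) shows the analogous behavior for the realized interval: the 95% CI half-width lands at or below the target of 1 roughly half the time, because the sample SD falls below \sigma about half the time.

set.seed(1)
n <- 386; sigma <- 10; target <- 1
half_width <- replicate(10000, {
  s <- sd(rnorm(n, 0, sigma))                # sample SD varies around sigma
  qt(0.975, df = n - 1) * s / sqrt(n)        # realized 95% CI half-width
})
mean(half_width <= target)                   # roughly 0.5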