Should one derive risk difference from the odds ratio?

Yes, the one above
Methods A & B are not based on the constancy of these across populations, but rather on the premise that these methods can be used to update probabilities based on evidence. Whether Method A or B is the correct form of evidence is what is under discussion.

Huw, you are absolutely correct that Method A assumes the OR is the same in the population we have results for and in the population we’d like to generalize to, and that Method B assumes the same for the RR, if we want these methods to actually be coherent. Not coherent in the technical sense, just in the sense of actually having a chance of producing the correct answer. This is obvious from examining the equations behind the two methods.

From Suhail’s description of Method A

Convert r0 in new population to odds and multiply by ratio of RRs (i.e. odds ratio) obtained from study in original population:

\begin{align} \frac{P(Y_1 | X_1 = 0)}{1 - P(Y_1 | X_1 = 0)} &\times \frac{P(Y_2 | X_2 = 1)/(1 - P(Y_2 | X_2 = 1))}{P(Y_2 | X_2 = 0)/(1 - P(Y_2 | X_2 = 0))} \\ & := \gamma \end{align}

where the := symbol just indicates that we’re assigning the name \gamma to the quantity on the left-hand side.

The next step in the method is to treat \gamma as if it’s the odds for the outcome under treatment in the new population, and convert that to a probability to obtain P(Y_1 | X_1 = 1). But treating \gamma as if it’s the odds of the outcome under treatment in the new population only makes sense if we assume that the odds ratio is the same in the new population and the old population, since that’s what makes the relevant cancellations actually happen:

\begin{align} &\gamma := \\ &\frac{P(Y_1 | X_1 = 0)}{1 - P(Y_1 | X_1 = 0)} \times \frac{P(Y_2 | X_2 = 1)/(1 - P(Y_2 | X_2 = 1))}{P(Y_2 | X_2 = 0)/(1 - P(Y_2 | X_2 = 0))} = \\ &\frac{P(Y_1 | X_1 = 0)}{1 - P(Y_1 | X_1 = 0)} \times \frac{P(Y_1 | X_1 = 1)/(1 - P(Y_1 | X_1 = 1))}{P(Y_1 | X_1 = 0)/(1 - P(Y_1 | X_1 = 0))} = \\ &\frac{P(Y_1 | X_1 = 1)}{1 - P(Y_1 | X_1 = 1)} \end{align}

where the equality between lines 2 and 3 follows from assuming that the ORs are the same in the two populations. Without that assumption, \gamma is just some arbitrary quantity that has no guarantee of actually aligning with the odds of the outcome under treatment in the new population.
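
As a concrete check, here is a minimal numeric sketch of Method A in Python (the risks are hypothetical, chosen only to make the arithmetic visible):

```python
def method_a(r0_new, r1_old, r0_old):
    # Convert the new-population baseline risk to odds, multiply by the study OR,
    # then convert the result back to a probability.
    study_or = (r1_old / (1 - r1_old)) / (r0_old / (1 - r0_old))
    gamma = (r0_new / (1 - r0_new)) * study_or
    return gamma / (1 + gamma)

# Original (study) population: r0 = 0.20, r1 = 0.40  ->  OR = 2.67, RR = 2.0
# New population baseline risk: 0.50
print(round(method_a(0.50, 0.40, 0.20), 3))   # 0.727
```

The 0.727 equals P(Y_1 | X_1 = 1) only if the OR of 2.67 also holds in the new population; if instead the RR of 2.0 were the transportable quantity, the RR-based calculation would give 0.50 × 2.0 = 1.0, a very different answer.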

Similar calculations can be done for Method B to show how it assumes that the RRs are the same in the two populations.
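
For completeness, here is the analogous chain for Method B, written under my reading of that method (multiply the baseline risk in the new population by the RR obtained from the study); the name \delta is just a label introduced for this sketch:

\begin{align} \delta &:= P(Y_1 | X_1 = 0) \times \frac{P(Y_2 | X_2 = 1)}{P(Y_2 | X_2 = 0)} \\ &= P(Y_1 | X_1 = 0) \times \frac{P(Y_1 | X_1 = 1)}{P(Y_1 | X_1 = 0)} \\ &= P(Y_1 | X_1 = 1) \end{align}

where the equality between the first and second lines holds only if the RR is the same in the two populations; otherwise \delta is again just an arbitrary quantity.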

Edit: Of course, which, if any, of these methods produces the correct answer will depend entirely on the reasonableness of those assumptions about the OR and RR.

1 Like

I’m sorry. What I meant was that you ‘assumed for the sake of argument’ that the OR or RR was the same in Norway and Wales, not that you ‘assumed either to be true’. What I thought we were doing was ‘What If’ modelling in the spirit of the title of Hernan and Robins. I regard all these mathematical exercises as models to be compared with subsequent observations.

1 Like

In that case I misunderstood and agree with you!
Addendum: Huw, that’s why I used a saturated model, so that we ignore random error and pretend there are no possible artifacts in these samples.

These discussions have been very useful, and so our next-generation preprint on this issue is now available here for comments.

1 Like

From the paper:

… the RR when looked at from the diagnostic testing angle, is now understood to be a likelihood ratio and therefore a ratio of two odds not … simply considered as a ratio of two conditional probabilities…

I have no idea why you are converting proportions to odds ratios here. This flies in the face of decades of mathematical work on the statistical properties of estimators and experiments.

In an experimental design context, the only thing we have control over is the error probabilities, which are more informative than likelihoods.

After the data are collected, an observed proportion can be useful to estimate a future probability.

Pawitan, paraphrasing Fisher, wrote:

whenever possible to get exact results, we should base inference on probability statements, otherwise they should be based on the likelihood.

Pawitan, Y. (2001). In all likelihood: statistical modelling and inference using likelihood. Oxford University Press. p. 15

The Bayes Factor has a frequentist interpretation of \frac{1- \bar{\beta}}{\alpha} , which is a ratio of probabilities.

Diagnostic testing is merely a specific case of the broader hypothesis testing problem, where:

Type I error = 1 - spec
Type II error = 1 - sens
Power = sens = 1 - \beta

Information is formalized as a (pseudo) distance metric of 2 probability distributions.
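
To make that mapping concrete, here is a minimal sketch in Python (the sens/spec values are hypothetical, chosen only to illustrate the algebra):

```python
# Map hypothetical sensitivity/specificity to hypothesis-testing quantities.
sens, spec = 0.80, 0.95

alpha = 1 - spec            # Type I error (false positive rate)
beta = 1 - sens             # Type II error (false negative rate)
power = 1 - beta            # equals sens

rejection_ratio = power / alpha   # (1 - beta) / alpha
print(round(alpha, 3), round(beta, 3), round(power, 3), round(rejection_ratio, 1))
# 0.05 0.2 0.8 16.0
```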

2 Likes

Side note: The definitions of sens and spec are not that useful because they imply that these are solely properties of the test (and not of the patient) and that they are constant.

2 Likes

I guess you mean proportions to odds. Proportions and odds are mathematically equivalent representations of the same underlying information.

From the paper:

However, only the odds ratio allows generalization beyond the sample because it is variation independent.

This statement is just false. Sander, Anders and I have all given examples where effect measures other than the odds ratio can be used for generalizing results to another population.

2 Likes

No one disputes it is valid to compute odds from probabilities, or probabilities from odds. Your problem is that odds ratios are compatible with a range of probability assignments, and from this perspective, lose information.

An experiment that has a 1 - \beta of 0.8 and an \alpha of 0.05 has a prospective likelihood of 16 to 1. An experiment that has a power of 0.08 at an \alpha of 0.005 also has a prospective likelihood of 16 to 1, but the sample size needed for the former is much different from what is needed for the latter.
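
A quick sketch of that arithmetic (Python with scipy; the standardized effect size of 0.5 and the one-sided z-test approximation are my own illustrative assumptions):

```python
from scipy.stats import norm

def prospective_lr(power, alpha):
    # The (1 - beta) / alpha ratio discussed above.
    return power / alpha

def n_per_group(power, alpha, effect_size):
    # Normal-approximation sample size for a one-sided two-sample z-test.
    return 2 * ((norm.ppf(1 - alpha) + norm.ppf(power)) / effect_size) ** 2

for power, alpha in [(0.80, 0.05), (0.08, 0.005)]:
    print(round(prospective_lr(power, alpha), 1), round(n_per_group(power, alpha, 0.5)))
# 16.0 49   -> same prospective ratio of 16,
# 16.0 11   -> very different sample sizes
```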

You are looking at this from a different perspective from mine. If I assume that these probabilities are determined by continuous test thresholds then for each scenario:

| Sen  | Spe   | pLR | nLR   | OR    | AUC   |
|------|-------|-----|-------|-------|-------|
| 0.8  | 0.95  | 16  | 0.211 | 76    | 0.974 |
| 0.08 | 0.995 | 16  | 0.925 | 17.30 | 0.896 |

Thus your value of 16 is just one piece of the picture, and this has been the problem with being too focused on a single aspect. I tend to see everything as interrelated, as mentioned in the paper, and then the big picture emerges.
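
As a quick check of the table, the pLR, nLR and OR columns follow directly from sens and spec (the AUC column depends on the distributional model assumed, so it is not recomputed here):

```python
# Recompute the likelihood-ratio and odds-ratio columns of the table above.
for sens, spec in [(0.80, 0.95), (0.08, 0.995)]:
    plr = sens / (1 - spec)        # positive likelihood ratio
    nlr = (1 - sens) / spec        # negative likelihood ratio
    dor = plr / nlr                # diagnostic odds ratio
    print(round(plr, 2), round(nlr, 3), round(dor, 1))
# 16.0 0.211 76.0
# 16.0 0.925 17.3
```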

Hi Dr. Doi,

Can I clarify that you’re emphasizing the following: when we see that an RCT-derived odds ratio has “moved” with adjustment for different factors, we should not infer that this signals the presence of underlying “confounding bias” (or “random confounding”, a term that seems to be used by epidemiologists but shunned by statisticians…)? Rather, we should recognize that, in an experimental (as opposed to observational) context, movement of the OR with adjustment can simply reflect the impact of an important (but non-confounding) prognostic factor?

I read this paper a couple of years ago and had difficulty reconciling some of its assertions with things I’ve read on the Datamethods site:

For example (my emphasis in bold):

“In summary, prognostic imbalance does not on average jeopardize internal validity of findings from RCTs, but if neglected, may lead to chance confounding and biased estimate of treatment effect in a single RCT. To produce an accurate estimate of the treatment-outcome relationship conditional on patients’ baseline prognosis, balanced or unbalanced PFs with high predictive value should be adjusted for in the analysis. Covariate adjustment slightly reduces precision, but improves study efficiency, when PFs are largely balanced. Once chance imbalance in baseline prognosis is observed, covariate adjustment should be performed to remove chance confounding.”

I’d be interested (if you have the time) to hear your opinion on the use of the terms “bias” and “confounding” in this paper (which I believe was written by epidemiologists) and whether this usage comports with the way these terms are used by statisticians. It seems important that clinicians not infer that an RCT result is “biased” when bias is not actually present; this practice could lead to discounting of potentially compelling trial results…

Perhaps you are right if you read this literally, but most readers will interpret it to mean ‘However, the odds ratio is more reliable for generalizing…’. I take a strong view which may be unnecessary. Jazeel, please update this.

Your own columns of sens/spec as ratios of probabilities vs. your OR column should show you that your claim that “interpreting the RR as a ratio of probabilities is inconsistent with Bayes’ Theorem” is wrong, since the only difference between the two scenarios is a constant of proportionality. The discriminant information in this scenario is indeed 16 to 1, as the likelihood principle would indicate.

2 Likes

Hi Erin, yes, you are right: in the context of an RCT-derived OR, such movement reflects the impact of an important non-confounding prognostic factor.

Regarding covariate adjustment in an RCT: because randomization is finite, residual imbalance in covariates after randomization may occur. Perhaps that is what the researchers are referring to as ‘chance confounding’, and I believe Ian Shrier has a paper on this in which he says confounding by chance occurs with the same frequency in RCTs and observational studies. Because such imbalance is a form of random error that is reflected in the statistical uncertainty presented through the analysis, these covariates do not need to be adjusted for.

Prognostic covariates that have a large impact on the primary outcome, however, are a different issue: they need to be adjusted for even in an RCT if the closest empirical estimate of the individual treatment effect is the desired goal of the analysis of the RCT data. The latter adjustment has nothing to do with confounding but rather is a method of dealing with HTE that Frank calls ‘risk magnification’.
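
To illustrate that last point, here is a small simulation sketch (entirely hypothetical data; numpy and statsmodels assumed available). Treatment is randomized and the prognostic factor is independent of treatment, so there is no confounding, yet the adjusted and unadjusted ORs differ because of non-collapsibility:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200_000
x = rng.integers(0, 2, n)                    # randomized treatment
z = rng.integers(0, 2, n)                    # strong prognostic factor, independent of x
logit = -1.0 + 1.0 * x + 2.0 * z             # true conditional log-OR for x is 1.0
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

unadj = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
adj = sm.Logit(y, sm.add_constant(np.column_stack([x, z]))).fit(disp=0)

print(np.exp(unadj.params[1]))   # marginal OR, attenuated toward 1
print(np.exp(adj.params[1]))     # conditional OR, close to exp(1) = 2.72
```

Neither estimate is ‘biased’; they are simply answers to different (marginal vs conditional) questions.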

1 Like

Our paper does not agree with that assessment. What is the evidence to support your position?

Simply looking at your mathematical claims and comparing them to the definitions of:

Discriminant information (discrete case):
I(P||Q) = - \sum_{x \in X} P(x)log(\frac{Q(x)}{P(x)})

The fraction is a log transformation of a ratio of probabilities at a particular x, while the information is a probability-weighted sum of those log-transformed probability ratios.
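
For instance, a minimal sketch of that sum for two hypothetical Bernoulli distributions (say, the distribution of a positive/negative test result in diseased vs non-diseased patients; the numbers are made up):

```python
import math

def discriminant_information(p, q):
    # I(P||Q) = -sum_x P(x) * log(Q(x) / P(x)), the probability-weighted sum above.
    return -sum(px * math.log(qx / px) for px, qx in zip(p, q))

P = [0.80, 0.20]   # diseased:     P(T+), P(T-)
Q = [0.05, 0.95]   # non-diseased: P(T+), P(T-)
print(round(discriminant_information(P, Q), 2))   # 1.91 nats
```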

The Bayes Factor as defined in the James Berger et al. paper I’ve linked to in a number of threads (including above):

The Bayes factor is virtually identical to the KL information divergence metric, in that it is the integrated likelihood, with a uniform prior being a special case where the Bayesian posterior and the frequentist “confidence” distribution (the set of all intervals with \alpha \in [0, 1]) coincide in large samples.

Brockett, P. L. (1991). Information theoretic approach to actuarial science: A unification and extension of relevant theory and applications. Transactions of the Society of Actuaries, 43, 73-135. PDF

These (to me) do not mean anything in terms of supporting the notion that a single likelihood ratio has any utility as a measure of discrimination. I think a data example is needed to get your point across.

Your own example shows that ratios of probabilities, when framed in terms of power and Type I error, can be constant across scenarios, while the odds ratios computed from those same probabilities are not.

The fundamental relationship between frequentist alpha levels and Bayesian update from prior to posterior is shown via simple algebra.

Starting from Berger et al.’s Rejection Ratio paper, the pre-posterior odds needed for an experiment are:

O_{pre} = \frac{\pi_{0}}{\pi_1} \times \frac{1 - \bar\beta }{\alpha}

The term with \pi can be thought of as a skeptical prior odds \lt 1, with the entire expression being the ratio of true to false rejections of the reference null hypothesis.

O_{pre} can be seen as the posterior odds (\gt 1) needed to confidently reject the null hypothesis, based not only on the design of this single study but also on a prior assessment of the relative merit of the hypotheses.

\alpha = \frac{1-\beta}{Odds(\theta|data) \times Odds(\theta)}

In Berger’s GWAS example, the scientists did not need the posterior odds to be greater than 1 in order to claim a discovery; simply going from \frac{1}{100,000} prior odds on a gene-disease relation to \frac{1}{10} (leaving high posterior odds that there was no relationship even after the data were seen) was considered worthwhile from a scientific decision-analysis standpoint.
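
The arithmetic behind that example, as I read it (the power value below is purely illustrative and not taken from the paper):

```python
# Rearranging O_pre = prior_odds * (1 - beta) / alpha for the GWAS numbers above.
prior_odds = 1 / 100_000        # skeptical prior odds of a gene-disease relation
posterior_odds = 1 / 10         # odds considered worth reporting

rejection_ratio = posterior_odds / prior_odds   # required (1 - beta) / alpha
power = 0.5                                     # illustrative assumption
alpha = power / rejection_ratio

print(round(rejection_ratio), round(alpha, 8))   # 10000  5e-05
```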

Are you disputing that the Bayes factor is a ratio of frequentist error probabilities? If so, why?

The expression you have typed is incorrect; please have a close look at our paper. Also, using alpha and beta will be confusing for everyone, so I recommend sticking to TPR and FPR. Our example shows no such thing; if you think it does, please explain how.