The switch risk ratio is not used for prediction. Use the OR-based model to get individual risk predictions and differences from them. Bypass that issue.
The authors say about their example in Table 3:
“In this table, the risk ratio and the odds ratio differ substantially between the different populations, while the survival ratio is exactly equal and the risk difference is almost equal in every group. This is not a coincidence; the mechanism of action that we assumed imposes sufficient structure to make the effect of Penicillin stable across groups, but only on the survival ratio scale”
The problem is that the survival ratio (RR(not_Y)) was fixed by the authors at 0.99 (see the adapted table below), so why the surprise that it is constant!
Table 3 adapted

| r0(not_Y) | RR(not_Y) | r1(not_Y) | OR(not_Y) | r0(Y) | r1(Y) |
| --- | --- | --- | --- | --- | --- |
| 0.9000 | 0.9900 | 0.8910 | 0.9083 | 0.1000 | 0.1090 |
| 0.9500 | 0.9900 | 0.9405 | 0.8319 | 0.0500 | 0.0595 |
| 0.9800 | 0.9900 | 0.9702 | 0.6644 | 0.0200 | 0.0298 |
| 0.9900 | 0.9900 | 0.9801 | 0.4975 | 0.0100 | 0.0199 |
| 0.9950 | 0.9900 | 0.9851 | 0.3311 | 0.0050 | 0.0150 |
| 0.9990 | 0.9900 | 0.9890 | 0.0901 | 0.0010 | 0.0110 |
It is quite clear that when OR(not_Y) is about 0.5, r1(Y) is roughly double r0(Y), and when OR(not_Y) decreases to about 0.1, r1(Y) is roughly ten times r0(Y).
Clearly there is no utility to the survival ratio, and this should be obvious because the ratio of the two relative risks equals the OR, i.e. RR(Y) ÷ RR(not_Y) = OR(Y).
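As a quick numeric check of this identity, here is a minimal Python sketch using row 4 of the adapted table above (the variable names are mine, chosen to match the column headers):

```python
# Check the identity RR(Y) / RR(not_Y) = OR(Y) on one row of the adapted Table 3.
r0_y, r1_y = 0.0100, 0.0199           # baseline and treated risks of the outcome Y
rr_y = r1_y / r0_y                    # RR(Y) = 1.99
rr_not_y = (1 - r1_y) / (1 - r0_y)    # RR(not_Y) = 0.99, the survival ratio
or_y = (r1_y / (1 - r1_y)) / (r0_y / (1 - r0_y))
print(rr_y / rr_not_y, or_y)          # both ≈ 2.0101, matching the identity
```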
We did not set the survival ratio to 0.99. We made the following assumptions, reflecting claims about biology:
1. The effect of treatment is determined only by switches of type B
2. In people with switches of type B, treatment is a sufficient cause of the outcome
3. In people who do not have switches of type B, treatment has no effect on the outcome
4. The prevalence of switches of type B is 1% in all populations
5. The prevalence of B is independent of the baseline risk in all populations
(Note that monotonicity is implied by these assumptions and therefore not listed separately)
From these assumptions, it follows that the survival ratio is 0.99 in all populations. This is not a circular argument. You can reason about the plausibility of assumptions one through five by considering your understanding of biology. If you believe the assumptions, stability of the survival ratio follows logically.
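To make the deduction concrete, here is a minimal Python sketch (p_b and r0 are my notation for the prevalence of type-B switches and the baseline risk) showing that the survival ratio comes out at 0.99 for every baseline risk:

```python
# Assumptions (1)-(5): type-B switches have prevalence 1% in every population,
# independent of baseline risk; treatment is a sufficient cause of the outcome
# in carriers of B, and has no effect in non-carriers.
p_b = 0.01

for r0 in [0.10, 0.05, 0.02, 0.01, 0.005, 0.001]:
    # A treated person avoids the outcome only if they would have avoided it
    # untreated AND they do not carry a type-B switch.
    r1 = r0 + (1 - r0) * p_b
    survival_ratio = (1 - r1) / (1 - r0)  # algebraically equal to 1 - p_b
    print(f"r0={r0:.3f}  r1={r1:.5f}  survival ratio={survival_ratio:.4f}")
```

The survival ratio equals 1 − p_b = 0.99 regardless of r0, which is exactly the stability claim.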
You can of course argue that you don’t believe the assumptions. For example, you can argue that switches of type B have different prevalence between men and women. If that is the case, you would need to condition on sex as an effect modifier.
The point we are making is that if you set up the analysis using the variant of the relative risk that was suggested by Sheps, and define heterogeneity as deviations from that measure of effect, it becomes orders of magnitude easier for humans to reason about why there might be heterogeneity, and to account for it in the analysis (by conditioning on effect modifiers and using partial identification methods when necessary).
Let me make one final comment in response to this:
“Clearly there is no utility to the survival ratio, and this should be obvious because the ratio of the two relative risks equals the OR, i.e. RR(Y) ÷ RR(not_Y) = OR(Y).”
This isn’t even an argument. If you wrote something like this on an exam, you would fail logic 101. This is just a random string of words, with no connection between the correct premise and the incorrect conclusion.
Let me make this perfectly clear: my career is literally on the line here; I am leaving academia unless I can convince the academic community of the soundness and importance of these ideas. It is not at all helpful to deal with senior academic statisticians who claim that my work has “no utility” without even making an attempt at a sound argument.
Our biological models are not dependent on observed outcomes. Sometimes, those models reflect biological mechanisms that are sufficiently asymmetric (in part for evolutionary reasons) that Sheps’ preferred variant of the relative risk is stable. The switch risk ratio is simply a convenient mathematical object that will be stable given such a biological model; you do not need to give it a realistic interpretation. What matters is whether the underlying biological models reflect reality, and those models are not functions of the observed data.
Is this not mathematically equivalent to saying that the survival ratio is 0.99 in all populations? Taking the example in your preprint:
0.005 + 0.995 × 0.01 = 1 − 0.995 × 0.99
That is why I said you fixed the survival ratio at 0.99. I think that we need to be more pragmatic and less philosophical about this.
You are here pointing out that our conclusions follow logically from our assumptions. This is correct, that’s how deductive reasoning works. If that wasn’t the case, it would mean that we had done something wrong.
In your previous post, you accused me of begging the question (assuming the conclusion). I then showed you that the premises in our argument can be evaluated based on biological plausibility.
And now I’m being asked to be less philosophical? Why? The philosophical argument works, and I don’t see any pragmatic reason for doing anything else. Why would I prefer a “pragmatic” approach that doesn’t work over a philosophical argument that works?
(Edited to add: Our assumptions are not equivalent to the conclusion. Our assumptions imply the conclusion. The conclusion does not imply our assumptions.)
On what basis can you conclude this when the math is clear on this point? This is an honest question and not a rhetorical one.
Please look up the definition of “equivalence”. In the standard usage of this term, it means “bidirectional implication”. At least that is how the term is used in basic logic.
The math is very clear that our assumptions (1) through (5) (which I discussed in my previous post here), imply stability of the survival ratio. This is what we intended, if this wasn’t true, our work would be invalid.
Stability of the survival ratio does not imply our assumptions (1) through (5). Therefore, there is no bidirectional implication.
Completely agree, and this example of neoadjuvant chemotherapy in unresectable and resectable pancreatic cancer highlights why neither RR(Y) nor RR(notY) is a useful measure on its own. Only when both are considered do we get rid of their numerical dependency on the baseline risk.
Unresectable (status at 9 months)

| | Dead | Alive | Total | RR(Y) | 1/RR(notY) | OR |
| --- | --- | --- | --- | --- | --- | --- |
| No chemo | 90 | 10 | 100 | 2.0 | 5.5 | 11.0 |
| Chemo | 45 | 55 | 100 | | | |

Resectable (status at 9 months)

| | Dead | Alive | Total | RR(Y) | 1/RR(notY) | OR |
| --- | --- | --- | --- | --- | --- | --- |
| No chemo | 44 | 56 | 100 | 2.0 | 1.4 | 2.8 |
| Chemo | 22 | 78 | 100 | | | |
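For transparency, here is a short Python sketch (the function and variable names are mine) that recomputes the three measures from the two tables, with Chemo as the reference group:

```python
# Recompute RR(Y), 1/RR(notY) and OR from the two 2x2 tables above.
def effect_measures(dead_nochemo, alive_nochemo, dead_chemo, alive_chemo):
    p0 = dead_nochemo / (dead_nochemo + alive_nochemo)  # risk of death, no chemo
    p1 = dead_chemo / (dead_chemo + alive_chemo)        # risk of death, chemo
    rr_y = p0 / p1                                      # RR(Y)
    rr_not_y = (1 - p0) / (1 - p1)                      # RR(notY), the survival ratio
    odds_ratio = rr_y / rr_not_y                        # identity: OR = RR(Y) / RR(notY)
    return rr_y, 1 / rr_not_y, odds_ratio

print(effect_measures(90, 10, 45, 55))  # unresectable: (2.0, 5.5, 11.0)
print(effect_measures(44, 56, 22, 78))  # resectable:   (2.0, ~1.39, ~2.79)
```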
Now you’re just throwing stuff at the wall to see what sticks. There isn’t even an argument here. How does an arbitrary table prove anything? You are not giving us a causal model, and not even an interpretation of what is going on. In this table, it even looks like RR(Y) rather than your preferred odds ratio is stable between strata…
I like seeing this discussion continue but ask you guys to tone it down a bit. Thanks.
Frank, I will do my best to tone it down.
However, I will note that your request carries an implicit allegation that I am being unreasonably disagreeable. When my work is being discussed publicly, I think I have a moral right to ask potential participants to stay out of the conversation unless they have the ability to understand my arguments. This is particularly important when those participants have academic credentials that outrank mine, which means there is a real possibility that casual observers take them seriously.
Since my goal here is to convince others about the correctness of my work, I am required to point out why Prof. Doi’s counterarguments fail. Because his errors are so elementary, this comes across as hostile and defamatory.
I acknowledge that if I am wrong and Prof. Doi is right, what I have said here is defamatory. If that is the case, it is not just morally problematic, I may have real legal liability. Certainly, it should mean the end of my academic career: It would be entirely appropriate to ban me from having an academic position at any reputable institution, if I had incorrectly accused a fellow scientist of failure to understand elementary logic.
This is a risk that I am aware of. I am confident enough in the correctness of what I’m saying that I am willing to say it anyway. I beg you to please at least consider the possibility that I’m right. In a hypothetical world in which I am right about the switch risk ratio, do you understand why it is necessary to point out the errors in the counterarguments as explicitly as possible, even if that makes the person who made those counterarguments look bad?
I appreciate the response Anders. I was not referring to correctness of anyone’s argument and was not meaning to imply that there is anything wrong with disagreement. I just want all of us to be careful in terms of choice of words. One of this site’s principles is that we criticize ideas, not people. Let’s depersonalize this a bit.
This is an interesting paper, titled “Revisiting the relationship between baseline risk and risk under treatment”, that questions whether the ‘effect model’ describing the relationship between baseline risk and risk under treatment is linear, i.e. whether the ‘relative risk’ is constant.
I have been reflecting on this thread, and the reason there is so much disagreement is that those who believe in the OR hold it as the reference, and those who believe in the RR hold that as the reference, when demonstrating problems with the other measure. This is not going to resolve the disagreement, so what we need is mathematical proof that the numerical value of the effect measure in question is indeed measuring the strength of association between the independent variable(s) and the binary outcome. I will now present this for the OR, as follows:
The area under the curve (AUC), which ranges from 1 (perfect discrimination) down to 0.5 (no discrimination), is clearly a measure of how strongly the independent variable(s) discriminate between the two levels of a binary outcome (whether causal or not can be ignored for now). Therefore the odds ratio, if it is to measure the strength of an effect, say of treatment, must be directly related to the AUC mathematically. It turns out that this can be proven quite easily, as follows:
√OR = AUC / (1 − AUC), if OR ≥ 1

Therefore:

ln(OR) = 2 × logit(AUC), if OR ≥ 1
This is a perfectly linear and, more interestingly, monotonic relationship, and therefore the OR is measuring the strength of the association. I believe someone pointed out that there is no such monotonic relationship for the RR, but I will leave that for someone else to demonstrate.
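As a side note, the two formulas above are at least internally consistent with each other; this minimal Python sketch verifies only that algebra, not the claim that the AUC is the right benchmark:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

# If sqrt(OR) = AUC / (1 - AUC), then AUC = sqrt(OR) / (1 + sqrt(OR)),
# and taking logs gives ln(OR) = 2 * logit(AUC).
for odds_ratio in [1.5, 2.0, 4.0, 11.0]:  # OR >= 1, as stated
    auc = math.sqrt(odds_ratio) / (1 + math.sqrt(odds_ratio))
    assert math.isclose(math.log(odds_ratio), 2 * logit(auc))
    print(f"OR={odds_ratio:5.2f}  AUC={auc:.4f}")
```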
I think the disagreement is less a math issue than a context issue. I think there is broad agreement that OR is a very reasonable effect measure for RCTs and most observational studies.
Do you have a counterargument to @Sander’s comment here:
Sander: Yes, for the same reasons I was never a fan of log-risk regression either, except in some very special cases where (due to sparse data and resulting nonidentification of baseline log-odds) it could provide more efficient and useful risk assessments than logistic regression, and without the boundary and convergence problems it hits in ordinary ML fitting.
You need to be more precise about what it means to “measure the strength of association”. In my view, all effect measures are measures of the strength of association, and the only thing that matters is whether there is any reason to expect the strength of association to be stable between groups on that scale.
Stability will clearly not be a general mathematical property of any effect measure: I can promise you that for any effect measure, there will exist data on some exposure-outcome relationship that disproves the claim that the effect measure is always stable.
Stability is therefore situational, and needs to be evaluated separately for each exposure-outcome relationship. What distinguishes these exposure-outcome relationships from each other is what the variables mean biologically, and how they relate to each other. In other words, it depends on the causal structure. If you want to convince me of stability of the odds ratio for any particular exposure-outcome relationship, you therefore have to propose a biological narrative that explains how this stability arises for that particular biological relationship.
You seem to be under the impression that there must exist some Platonic ideal for the magnitude of the effect, and then you equate that Platonic ideal with the area under the curve. I don’t see any argument for why the AUC is the uniquely correct measure of how strongly two variables are correlated.
(Edited; I deleted an incorrect paragraph here, AUC curves are outside my area of expertise. The deleted paragraph was not central to the original argument)
This is an argument against the log-risk regression model. I am not arguing in favor of the log-risk regression model.
I very strongly dispute the claim that there is any kind of broad agreement that the OR is a reasonable measure of effect.
There is broad agreement among @f2harrell, @Sander, and Stephen Senn, based upon their peer-reviewed writings and discussions here.
When I say “broad agreement” I mean “a good default strategy that will work most of the time.” I do not infer any of them think there is a dominating effect measure, in the formal, decision theory sense.
If you disagree with them, your arguments would be interesting to all following this thread.
I would think a “broad agreement” would include not just three specific statisticians, but also a large majority of epidemiologists, trialists, and medical doctors.
I don’t know if you follow Twitter, but there is a recurring controversy between epidemiologists/doctors on one side (arguing in favor of the risk difference or the risk ratio) and traditionally trained statisticians on the other (arguing in favor of the odds ratio). This discussion goes back many years and flares up several times a year. Coincidentally, statisticians find themselves in a similar debate with economists, who prefer linear probability models (using an identity link function, implying additive effects) over logistic models.
Most statisticians are, of course, absolutely convinced that they understand the reason for these disagreements: They are sure the disagreement happens because epidemiologists and doctors aren’t as good mathematicians as them, and therefore find it challenging to interpret the odds ratio.
I assure you that this is not the reason that I, or any of my epidemiologist friends, dislike odds ratios. Rather, it is that we find non-collapsibility to be extremely troubling, and/or that we cannot imagine any causal model that would result in stability of the odds ratio. I made an attempt to formalize this argument in the appendix to the manuscript that was linked earlier in this thread. Check out the appendix section called “An impossibility theorem for the odds ratio”.