@AndersHuitfeldt, perhaps it is too late for a retraction, as you have already contacted the editor and the author seems unconvinced. You should therefore submit a letter in which you explain your objections. This enables a future researcher to cite your letter when confronted with Doi et al., e.g. in a review.

In my field, there is a faulty statistical paper (it suggests a number of analyses based on circular reasoning) published in a high-ranking medical journal. It is really annoying and difficult to argue against doing these analyses, as reviewers can be quite illiterate when it comes to statistics.

Good luck – and don’t take this too seriously (it is only work).

Furthermore, a companion post on PubPeer would alert anyone (with the browser extension installed) that there is a comment on the paper.

Thank you Esben. I have already sent a rapid response to BMJ Evidence-Based Medicine, they have not published it yet and have told me that they are analyzing the situation first.

I understand some of you think I am overreacting. I hope you can see my perspective, that this is about much more than “only work” to me. Just to make sure you understand why I am doing this, I am going to tell my story:

10 years ago, when I was a doctoral student at the Harvard School of Public Health, I had an idea that I thought was going to change the fundamentals of medical statistics and clinical decision making. It has since turned out that this idea has been rediscovered multiple times, as variations around the same theme, starting with Mindel C. Sheps in 1958. However, the idea is largely unknown among academics, and I contributed significantly to the theory by clarifying the underlying causal model and how this relates to work from other academic disciplines.

Very early after I completed my doctoral degree, I concluded that I have no interest in staying in academia unless I could work on theoretical methodological questions that arise from this idea. It seemed very clear to me that this touches on matters that are both important and of great interest to the academic community: If we had a clear answer for how to choose between effect measures, we would not only resolve the never-ending Twitter discussions between statisticians and epidemiologists/clinicians, we could also potentially resolve very significant theoretical disagreements between economists and statisticians about linear probability models. Moreover, this idea touches very directly on transportability/generalizability, which has become a central focus in methods research in the years since I started working on it (though with a dominant paradigm that in my view misses the point, by focusing on counterfactual distributions rather than effect measures).

With that background, I decided to very literally gamble my career on the correctness and importance of the idea. This gamble did not seem like a big deal at the time: I assumed that I would get a fair hearing, and that if there was a problem with the idea, the flaw would be spotted and I could leave academia with my head held high and no hard feelings.

That is, unfortunately, not what happened. I have been ignored by almost everyone, and met with countless rejections from journals and conferences, without any attempt to identify a flaw in the argument. From my perspective, I am simply being blocked by hostile reviewers from having a platform for making the argument. I therefore end up constantly disappointing my employers with lack of research output/productivity, and having to move to a new country where someone is willing to give me a chance. Meanwhile, I am constantly being advised that if I want to stay in academia, I need to start working on something else. Which is probably accurate practical advice, but the conditional clause about wanting to stay in academia simply does not apply. After almost a decade of working on this, I am at very real risk of becoming permanently unemployable due to my obsession, despite holding both a medical degree and a doctoral degree.

I will never be at peace with giving up, unless someone identifies a critical flaw in my argument, or points to an alternative approach that works significantly better. All that I ask for is that competent statisticians and epidemiologists evaluate my arguments. These scientists are, of course, in part acting rationally: they assign a low prior probability to my papers being significant, and they therefore have little incentive to put in the work that would be required to fully evaluate their merits.

This is the context I find myself in when people like Suhail Doi make explicit and public claims on scientific forums that my ideas are “wrong”, based on an incoherent paper that somehow sailed through peer review, making a mockery of the process that has held me back for so many years. Due to the credibility he holds by being a Professor, his public claims can be expected to even further lower the prior probability other statisticians assign to my work being worth engaging with.

I am therefore forced into finding a way to make it common knowledge that Suhail’s arguments do not work. Ideally, we would have a functioning marketplace of ideas in which high-status statisticians could back up my claims. For whatever reason, that is unfortunately not happening in this situation (probably because nobody reads papers anymore, nor do they care whether they are correct). My only recourse is to insist that BMJ Evidence-Based Medicine ask their statistics experts to evaluate it for retraction.

For people who do not know the full background, it may appear as if I am acting with unnecessary hostility. I want to be clear that this is not a reflection of my usual temperament, but that I was put in quite an unusual and extreme psychological situation here, with much more at stake than “just work”.

My only advice is to not be tied to one thing. It’s fine to emphasize one thing but best to be attached to multiple ideas and to practice multiple methodologies.

As I said above, I have received variations of this advice many times. Unfortunately, the utility function is not up for grabs: If changing the emphasis of my research to this extent is necessary, I would rather leave academia.

But before I give up, it is important to me that my work has been thoroughly evaluated. I can’t live with giving up unless I can pinpoint what was wrong with the idea.

In a NewsHour retrospective on Serena Williams, a sportswriter noted she “left it all on the court.” Thank you for your courage in doing the same here, Anders. I take it that the definitive expression of your hypothesis is here? https://arxiv.org/abs/2106.06316v5

I see I had marked up v1 extensively, but with intervening v3 & v4 both marked ‘Major update’ I am presumably far behind the current v5. I am glad to see James Scanlan dropped from your refs.

That’s not what I’m suggesting. It’s fine to have an emphasis. But having multiple other areas that get a significant amount of your time, and learning a variety of tools is a good recipe.

This is going to be my last post on this thread, and I will leave it with some suggestions for @AndersHuitfeldt:

a) Being extremely rude and unprofessional in communication signals a lack of seriousness, even if one has a point – in this case the point is also lacking, which makes it worse.

b) It does not help one’s reputation or career to take the lazy route of making a litany of defamatory comments about a paper, or to beg editors to retract a paper to serve vested interests or personal views. Neither does it help to write a letter that contains nothing more than personal opinion and send it to the editors.

c) It only pays off to do the hard work of writing a counter-point, based on scientific concepts, that addresses the specific issues one thinks need to be raised; a key requirement is the ability to discuss disagreements logically and rationally. For examples, see the three responses in the journal after my paper that started this thread.

d) As expected, the behavior in a) above leaves no room for scientific progress, and the main responses thus far over the last 34 posts confirm this, since they have just resulted in people taking advantage of this situation to engage in philosophical (or in some cases crude) swipes at each other.

I don’t think there will be much to add to this discussion until some third party weighs in to resolve the actual methodological disagreement. My response to your accusations will depend very significantly on the final consensus about whether I was right to call for retraction of your paper.

In a hypothetical world where it turns out that I was wrong, I can assure you that an apology will be forthcoming and that I will withdraw from methods research. I do however think that this hypothetical is highly unlikely, and I urge you to start thinking about your course of action if I am proven right.

This thread raises the spectre of a tweet from Vineet Tiruvadi:

If you start with the wrong framework then the ability to do complex analyses may seem like it's giving insight, but what you're mostly doing is studying how wrong your framework is #academictwitter #scitwitter #medtwitter pic.twitter.com/2Y6ZgQDtFL

— Vineet Tiruvadi (@vineettiruvadi) February 21, 2021

Perhaps this entire research programme has devolved into studying *the search* for a mythical One Effect Measure, losing touch with the real underlying problems of clinical decision-making and risk communication?

Anders, as heavy as the notational burden of your paper (v5) is, I would like to see it support *formally* placing your §2 “formaliz[ation] of how such individualization is done in practice” into the ideal context where “direct evidence for personalized risk under intervention” (§1.1) is *not* absent. Is the particular heuristic you adopt (applying an effect function g_\lambda : \mu_0 \rightarrow \mu_1) totally generic? Does it emerge *formally* as somehow ‘natural’? Or is it merely one of a much larger class of heuristics for which examples could be provided? I’m struck by the usefulness of categorical concepts to Maclaren & Nicholson’s presentation in this paper. Have you considered a categorical formulation of your ideas?

I do not believe that my research program falls into this trap. I am very clear that effect heterogeneity is the default situation, and that homogeneity should only be invoked in special situations where this corresponds to reasonable beliefs about biological mechanisms.

It is not fully generic, in the sense that it is possible for two groups to have the same μ0 but different μ1. However, any kind of reasoning based on effect measure stability (for any parameter) can trivially be represented in this way, making it a very flexible and general representation of how findings from randomized trials are used in practice to inform individual-level decision making (approaches based on the risk difference, risk ratio, odds ratio, etc. are all special cases). Moreover, if an effect measure λ happens to be stable between groups, it is easy to prove that the approach based on its associated effect function gλ will lead to correct results when transporting results between those groups.
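To make the “special cases” concrete, here is a minimal sketch (my own illustrative numbers, not from any paper in this thread) of three standard effect functions gλ, each mapping a baseline risk μ0 to a predicted risk under treatment μ1, applied to a new baseline risk where the stable-measure assumptions disagree:

```python
# Three common effect measures lambda, each inducing an effect function
# g_lambda that maps a baseline risk mu0 to a predicted treated risk mu1.

def g_risk_difference(mu0, rd):
    """Transport assuming a stable risk difference: mu1 = mu0 + rd."""
    return mu0 + rd

def g_risk_ratio(mu0, rr):
    """Transport assuming a stable risk ratio: mu1 = mu0 * rr."""
    return mu0 * rr

def g_odds_ratio(mu0, odds_ratio):
    """Transport assuming a stable odds ratio."""
    treated_odds = odds_ratio * mu0 / (1 - mu0)
    return treated_odds / (1 + treated_odds)

# Hypothetical trial: mu0 = 0.1, mu1 = 0.3, so RD = 0.2, RR = 3,
# OR = (0.3/0.7) / (0.1/0.9) = 27/7. Applied to a new population with
# baseline risk 0.5, the three heuristics give different answers:
print(g_risk_difference(0.5, 0.2))   # 0.7
print(g_risk_ratio(0.5, 3.0))        # 1.5 -- an impossible risk
print(g_odds_ratio(0.5, 27 / 7))     # 27/34, about 0.794
```

The divergence at the new baseline risk is exactly why the choice between effect measures matters for transportability.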

I am unfamiliar with categorical formulations, and will have to read up on them. As always, there are multiple mathematically equivalent ways to formulate the same ideas. I am always willing to try new formulations in order to communicate better with the statistical community (with a preference for simplicity whenever possible). So far, I have tried three equivalent formulations (counterfactual outcome state transition parameters based on counterfactual variables, causal pie models, and modified causal DAGs), but I am always willing to try new formulations if this helps the reader. Ideally, I would need a coauthor who is familiar with such formulations to attempt this.

I am not going to make a judgment about this because I haven’t studied the issues deeply enough. But I am going to call a halt to **ALL** posts that even **HINT** at personal judgements. Everyone: stick to the science and stick to clear, non-esoteric examples. I will delete any post that includes any form of attack on another person from this point on.

My rapid response has now been posted at https://ebm.bmj.com/content/early/2022/08/10/bmjebm-2022-111979.responses

The editors asked me to shorten my response, which led to deletion of the paragraph about why I consider it “not even wrong”. Unfortunately, the title from the original version was retained, which may be confusing to readers who are not familiar with mathematical lore.

The letter was also edited to change my request to appoint a statistician to evaluate the paper for retraction, such that it now instead reads as a request for clarification. I want to be clear that I stand by my insistence that such evaluation is necessary.

The key claim in the paper is a purported proof that *“the conventional interpretation of the risk ratio is in conflict with Bayes’ theorem.”* If this were truly proven, it should be very easy to find a statistician who has read and understood the paper, and who is willing to publicly stake their own credibility on the claim that this conjecture is true and proven. That is the implicit standard workflow of mathematical publishing: when a claim of a proof is published, it is assumed that there exist others who are willing to publicly defend the correctness of the claim if necessary. Is there anyone who is willing to do that in this case?

Statisticians taking up your challenge:

This whole situation is sociologically interesting, perhaps revealing cultural differences between mathematical and medical scientific traditions. In mathematics, the scientific advance is almost always a theorem and its proof, meaning that the manuscript contains absolutely everything relevant to evaluating its correctness. In contrast, in medicine, so much is dependent on trusting the authors on their claims about the data. This leads to very different cultures.

Doi et al have claimed a theorem and a proof, and got it published in a medical journal. I have questioned the soundness of their theorem, and publicly staked my reputation and my future in methods research on this. This is not an action I would take lightly. At the very least, I would have expected a number of statisticians to weigh in on the issue.

This manuscript is very clearly intended to be understood as a work of mathematics, in the sense that the contribution is fully deductive. It must be held to the standards of mathematics. When I cast doubt upon the published scientific record, one would think the community would give priority to resolving whether my accusation is correct. That cannot happen if everyone hides from the controversy.

Have you considered that maybe everyone is still burned out from the last time something like this happened?

Suhail Doi has now posted the following rapid response at the journal website. I reproduce it here in its entirety:

The problem in evidence-based medicine arises when we port relative risks derived from one study to settings with different baseline risks. For example, a baseline risk of 0.2 and treated risk of 0.4 for an event in a trial gives a RR of 2 (0.4/0.2) and the complementary cRR of 0.75 (0.6/0.8). Thus the ratio of LRs (RR/cRR) is 2/0.75 = 2.67. If applied to a baseline risk of 0.5 the predicted risk under treatment with the RR “interpretation” is 1.0 but with the ratio of LRs “interpretation” is 0.73. Here, the interpretation of the risk ratio as a likelihood ratio, using Bayes’ theorem, clearly gives different results, and solves the problem of impossible risks as clearly depicted in the manuscript and the example.

If, in our effort to highlight the need of this correct interpretation, we have used strong wording that annoyed the commentator we feel the need to express regret. We hope that the commentator could also feel similarly for his scientifically unbecoming choice of wording that culminated with “Doi’s Conjecture”.

In this response, Suhail finally admits (at least implicitly) that he is trying to make claims about transportability/generalizability, not about “interpretation”. Moreover, he also states that the argument depends on the property which we have called “closure”. This admission is helpful, as I can trivially win any discussion against Suhail if we agree that the goal is transportability, and that his argument relies only on closure. I am not going to repeat these arguments here, as they have been stated repeatedly in this thread and in several papers and preprints.

The response still makes no attempt to clarify what he means by “interpretation”. Interestingly, this response appears to suggest that when “interpreting” the relative risk as a likelihood ratio, the investigator is constrained to only using it for transportability purposes in the form of the ratio of the standard RR and the complementary cRR, as (RR/cRR). This object is usually known as the odds ratio. No attempt is made to establish why the interpretation of the relative risk as a likelihood ratio forces the investigator to use the odds ratio for transportability purposes.
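The identity being invoked here, that the ratio of the two “likelihood ratios” RR/cRR is simply the odds ratio, is easy to verify numerically, along with the arithmetic in the quoted response (a quick check with the response’s own numbers, baseline risk 0.2 and treated risk 0.4):

```python
# Numbers from the quoted rapid response: baseline risk 0.2, treated risk 0.4.
p0, p1 = 0.2, 0.4

rr = p1 / p0                      # risk ratio: 2.0
crr = (1 - p1) / (1 - p0)         # complementary risk ratio: 0.75
ratio_of_lrs = rr / crr           # "ratio of LRs": 8/3, about 2.67

# The same quantity computed directly as the odds ratio:
odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0))
print(abs(ratio_of_lrs - odds_ratio) < 1e-12)   # True: RR/cRR is the OR

# Applied to a new baseline risk of 0.5:
new_p0 = 0.5
risk_via_rr = new_p0 * rr                         # 1.0 -- an impossible risk
new_odds = ratio_of_lrs * new_p0 / (1 - new_p0)
risk_via_or = new_odds / (1 + new_odds)           # 8/11, about 0.73
```

So the “ratio of LRs interpretation” in the quoted response is arithmetically just transportability under a stable odds ratio.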

Suhail’s erroneous and unsubstantiated claim remains on record, as the key message of a published paper: that interpreting the relative risk as a relative risk (i.e. interpreting a spade as a spade) is “inconsistent with Bayes Theorem”. My position is unchanged: I consider retraction of this paper to be scientifically necessary.

Doi accuses me of scientifically unbecoming choice of wording, and expresses hope that I would express regret for this. I want to be clear that I regret nothing. I remind readers that Suhail Doi dragged me into this discussion, by tagging me publicly on this forum immediately after his paper was published in BMJ Evidence-Based Medicine, with a claim that this paper puts closure to our discussion about choice of effect measure. When I tried to exit the conversation, he dragged me back in with continued claims that his “math” proves unequivocally that no object derived from the complementary relative risk can ever have value as an effect measure (an implication of which is that my life’s work is worthless). What does he expect to happen - does he think I have some kind of ethical obligation to pussyfoot around the undeniable fact that his paper is incoherent and that its main message is an impossible and unjustified claim about Bayes Theorem?

This incident has caused me to lose a lot of respect for the integrity of the scientific publishing model and for academics working in medical statistics. The fact that this paper sailed through peer review is in itself reminiscent of the Sokal Affair. The facts that the journal doesn’t seem to care a single bit when I point out that the paper is incoherent, and that the academic community doesn’t care enough for anyone to go on record with their opinions, beggar belief. What is the point of having a formal system of peer review if we aren’t making a good faith attempt to verify the correctness of the scientific record?

My latest manuscript, “Mindel C. Sheps - Counted, dead or alive”, will appear as a commentary in Epidemiology in 2023, half a century after Sheps’ death in 1973. This manuscript highlights Sheps’ important contributions to the discussion about choice between effect measures, and is also an attempt to simplify arguments I have made elsewhere in support of Sheps’ conclusions. The final author manuscript is now available as a preprint on arXiv, at [2211.10259] Mindel C. Sheps: Counted, dead or alive