@zad & @Sander’s recent pair of arXiv papers [1,2] — on which, BTW, I’ve offered this PubPeer comment — have revived my interest in extending my earlier efforts to reproduce the Supplement. In a couple of posts at PubPeer this past week, I noted discrepancies between the Supplement and the paper’s Figure 1 and also posted results from a partial reproduction. Interestingly, this reproduction exhibits several discrepancies explainable as simple data entry errors occurring during (what I must suppose was) the authors’ hand-collation of ‘post hoc power’ calculations performed manually at the Stata console.
This paper thus gives every appearance of having reported a fundamentally irreproducible activity. Even if reproducible research (RR)—the statisticians’ sterile technique so to speak—has not yet been adopted within Surgery, certainly the concept of a reproducible sequence of actions is foundational to that field. Consider this extended quotation from [3]:
I recall once in the winter of 1964, 6 very young children with transposition of the great arteries were awaiting operation. Operative correction of this problem had not been well defined. Dr Kirklin had used the technique of Dr Senning from Switzerland, and in 1961, he published his experience with 11 pediatric patients; and although most of these patients survived for many years, operative mortality was still 15%. In 1 week, Dr Kirklin performed this operation on 2 of these 6 waiting infants; unfortunately, both died a few days after the operation. Dr Kirklin became noticeably sad and frustrated and informed his residents that he would not be operating for a few days and instructed us to notify the parents of the other 4 children to go home. He gave no other details. … Four days later, Dr Kirklin returned and told us to contact the 4 families to return to Rochester. All returned and were promptly operated on, recovered well, and left the hospital 1 week later; but we all noticed that he used a different operative technique. He informed us that he had visited his friend, Dr William Mustard, the Chief of Cardiac Surgery at Sick Children Hospital of Toronto, who had introduced a new procedure to correct the transposition of the great arteries and published it with good results in 1964. Thereafter, Dr Kirklin performed many of these procedures in 1 stage and published his excellent results in 1965. Dr Kirklin believed that in surgery there are only 2 causes of failure—lack of knowledge or human error—therefore, when confronted with his own “failure,” he sought a new solution for his “failure.” The previous anecdote was a good example.
So there seems to be ample basis for JSR to press for a Retraction & Replacement, with publication of the abstracted data along with a Stata script. Indeed, this episode might serve as a cautionary note that increases the visibility of RR principles among surgical researchers who may not have encountered them.
The reason I connect all this with [1,2] may be found in the following figure from my repro posting:
To the surgeon’s practical, ‘hands-on’ mentality, this figure should thoroughly expose the efforts in [4] as essentially a sham [statistical] procedure. All that error-prone typing into (and transcription from) the Stata console could have been accomplished with a scientific calculator: just calculate the surprisal $s = -\log_2 p$, convert from bits to bytes (divide by 8), and subtract 0.05. The only thing lost by this shortcut is the placebo effect of using the word ‘power’.
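For anyone who wants to try the calculator shortcut, here is a minimal Python sketch. It assumes ‘convert to bytes’ means dividing the bit count by 8. The function names are my own, and the formula is only the tongue-in-cheek approximation read off the figure, not a power calculation and not anything taken from [4].

```python
import math


def surprisal_bits(p: float) -> float:
    """Return the surprisal (S-value) s = -log2(p), in bits."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log2(p)


def calculator_shortcut(p: float) -> float:
    """Hypothetical helper: bits -> bytes (divide by 8), then subtract 0.05.

    This is the rough rule of thumb from the post, NOT a real
    'post hoc power' computation.
    """
    return surprisal_bits(p) / 8.0 - 0.05


if __name__ == "__main__":
    # Illustrative p-values only; these are not data from the paper.
    for p in (0.05, 0.25, 0.5):
        s = surprisal_bits(p)
        print(f"p = {p:<5}  s = {s:.2f} bits  shortcut = {calculator_shortcut(p):.2f}")
```

At p = 0.05 the surprisal is about 4.3 bits, so the shortcut gives roughly 0.49. The point of the exercise is that the whole thing is a fixed function of p and adds no information beyond the p-value itself.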
Furthermore—now in a fully constructive vein—this argument serves also to introduce the ‘compatibilist’ toolkit of Chow & Greenland, which I have begun to appreciate as an especially promising solution to the sorts of problems that one might suppose legitimately motivated the authors of [4].
1. Chow ZR, Greenland S. Semantic and Cognitive Tools to Aid Statistical Inference: Replace Confidence and Significance by Compatibility and Surprise. arXiv:1909.08579 [q-bio, stat]. September 2019. http://arxiv.org/abs/1909.08579. Accessed October 1, 2019.
2. Greenland S, Chow ZR. To Aid Statistical Inference, Emphasize Unconditional Descriptions of Statistics. arXiv:1909.08583 [q-bio, stat]. September 2019. http://arxiv.org/abs/1909.08583. Accessed October 1, 2019.
3. Aldrete JS. Dr John W. Kirklin (1917-2004): a unique surgeon. Surgery. 2010;148(5):1038-1039. doi:10.1016/j.surg.2010.07.044
4. Bababekov YJ, Hung Y-C, Hsu Y-T, et al. Is the Power Threshold of 0.8 Applicable to Surgical Science?—Empowering the Underpowered Study. Journal of Surgical Research. 2019;241:235-239. doi:10.1016/j.jss.2019.03.062