Agentic AI and Peer Review

trumanfrancis · June 2, 2026, 2:16pm

I have run some of my own papers through Claude for peer review. You can upload methodologic papers to refine its performance. Do you think statistical peer review could ultimately become automated, or is it too complex, and in some cases too domain-specific, to make this viable? Perhaps the answer is “time will tell”.

f2harrell · June 3, 2026, 7:30am

A loaded question. For most journals, AI stat peer review will be better than the current system. I see so many serious statistical design and analysis errors getting by human reviewers. It will depend on which skills we trust to hand to AI in forming a stat reviewer. I would like to be involved in such skill development.

JiaqiLi · June 4, 2026, 6:38am

AI is capable of many tasks, and its abilities are growing at an astonishing rate. I wonder to what extent work in academia and pharmaceutical companies could be replaced by AI. Examples include designing research protocols, conducting data analysis, and writing manuscripts, as well as drug discovery, clinical trials, and efficacy evaluation in pharmaceutical companies. I have heard that data analysis in the pharmaceutical industry is highly standardized, which makes it highly suitable for AI.

Furthermore, I believe these tasks should not be entirely delegated to AI; they require supervision by experienced researchers. This raises the question of how to train future researchers. Most researchers accumulate experience gradually through projects, starting with basic tasks. However, if basic tasks are handed over to AI, how will researchers gain experience? I do not believe AI will completely replace humans, but this will certainly be a massive transformation. How should academia and industry adapt to this change?

trumanfrancis · June 5, 2026, 11:13am

An analogy I have considered is that you could probably train someone to perform cardiac bypass surgery in 6 months to a year with YouTube videos and a little hands-on experience. What differentiates the actual professional is the fundamental understanding of anatomy, physiology and pathophysiology. The trainee would be fine until something was not congruent with the videos or the little experience they had. The professional is flexible and adaptable due to their fundamental understanding of the domain. AI still strikes me as being potentially overconfident and unable to adapt to situations beyond what it is trained on. One real advantage it has, pointed out by Vinay Prasad as he discussed a paper on AI-generated vs. physician-generated discharge instructions, is that the bot never fatigues.

Elias_Eythorsson · June 5, 2026, 12:56pm

I wonder whether the model moving forward should be screening with AI and confirmation/overread by human reviewers. The editorial board, or possibly a single reviewer, could decide to confirm the AI review unchanged, add to it, or override it and switch to old-school manual human review. I think we should then increase cultural and infrastructure investment in open-source post-publication peer review, such that the most controversial or important publications receive increased scrutiny, whilst less consequential papers are mostly reviewed by AI and the review published with the paper.

Something needs to be done; the current model doesn’t feel sustainable or like a good use of anyone’s time.

R_cubed · June 5, 2026, 3:07pm

Can someone explain the benefit of having what is, at best, a lossy compression algorithm do an initial review? There are a host of assumptions that go unchecked in any of these discussions.

A global collection of research mathematicians has signed and released the Leiden Declaration, which discusses the limitations of LLMs in mathematics. Those insights could certainly be extended to applied statistics.

From the first item listed:

Current automated techniques can produce plausible but unreliable (or even incorrect) arguments which are difficult to distinguish from correct mathematical proofs. This applies not only to informal arguments, but also to formalizations, where the difficulty lies in the translation between computer-encoded and human presentations of concepts. These fast-moving developments put our present system of review under increasing pressure, jeopardizing our ability to implement traditional standards for the correctness, transparency, and independent verifiability of proof.

Here is a news report on the declaration with a target audience of physical scientists: