I think our experience with publishing this paper actually illustrates why the current publication system is broken. We had nearly the smoothest possible experience: from Nature to Nature Neuroscience, we had the same reviewers, who were professional and fair. We didn't have any problems at all, but the whole process still took a year and a half. Often this type of paper can take 3 years to publish.
Lab work requires meticulous attention to detail; one wrong pipetting technique, and you've accidentally created a self-aware slime mold that now controls the lab's Wi-Fi.
We just released this consistency check into our core review flow. It will catch the big and little inconsistencies. And as always, you can filter comments by severity depending on how much detail you want.
Read more: https://t.co/GawG9js79A
We built a manuscript consistency check to catch places where a paper contradicts itself.
Found some interesting stuff when we tested it on a highly cited 2020 Nature Medicine paper on COVID-19 transmission dynamics.
A new Nature Medicine paper claims that frontier models (GPT 5.2, Gemini 3.1 Pro, Opus 4.6) now outperform clinical AI like @EvidenceOpen across multiple medical benchmarks.
I looked into it further with Reviewer3 and it has a few flaws that undermine its major conclusions:
@EvidenceOpen Also, there is near-negligible agreement between the human evaluators (described as "fair" in the paper) and potential self-preference bias in the LLM evaluators.
https://t.co/PiG0ErZLxP