Our joint study with EMBO is just out! We are proud to be at the forefront of this sea change. AI will reinforce the central role of scientists in this new era. Strong science should be seen! @tlemberger @EMBO @ReviewCommons
https://t.co/oyO2aL71V1
It’s finally out! Together with @EMBO and @ReviewCommons, we conducted a structured side-by-side comparison of human peer review and our AI scientific review.
Here’s what we did: Authors whose manuscripts had already received journal-agnostic review at Review Commons were provided with an independent AI review generated by @qedScience. The AI analysis was compared to the combined feedback of multiple human reviewers, not to a single report, and had no access to those reviews.
We then asked authors how they evaluate the strengths and limitations of both approaches, and how they would actually want to use AI.
The conclusion was clear: Scientists want AI feedback to strengthen their work IN ADVANCE, under their control. Not as a gatekeeper, but as a tool for constructive input.
That is exactly what we are building at q.e.d! We are on the authors' side.
q.e.d. is not working in isolation; we are collaborating with leading pro-scientist organizations, including EMBO (and other journals), Review Commons, and OpenRxiv (@biorxivpreprint), and are working closely with researchers across fields. At the same time, we are building an alternative model that puts agency directly in scientists’ hands.
Researchers should be the ones deciding when their work is ready to be shared. We are building the infrastructure to support that.
A pleasure doing this with the great Thomas Lemberger @tlemberger and Niv Samuel Mastboim @nivmast
Everyone’s hyped about “AI for Science” in 2025! At the end of the year, please allow me to share my unease and optimism, specifically about AI & biology.
After spending another year deep in biological foundation models, healthcare AI, and drug discovery, here are 3 lessons I learned in 2025.
1. Biology is not “just another modality.”
The biggest misconception I still see:
“Biology is text + images + graphs. Just scale transformers.”
No. Biology is causal, hierarchical, stochastic, and incomplete in ways that language and vision are not.
Tokens don’t correspond cleanly to reality.
Labels are sparse, biased, and often wrong.
Ground truth is conditional, context-dependent, and sometimes unknowable.
We’ve made real progress—single-cell, imaging, genomics, EHRs are finally being modeled jointly—but the hard truth is this:
Most biological signals are not supervised problems waiting for better loss functions.
They are intervention-driven problems. They demand perturbations, counterfactuals, and mechanisms, beyond just prediction.
Scaling obviously helps. But without causal structure, scaling mostly gives you sharper correlations.
2025 reinforced my belief that biological foundation models must be built around perturbation, uncertainty, and actionability, not just representation learning.
2. Benchmarks are holding biology back more than compute is.
Let’s be honest: Benchmarking in AI & biology is still broken.
Everyone reports SOTA. Everyone picks a different dataset slice.
Everyone tunes for a different metric. Everyone avoids prospective validation.
We’ve imported the worst habits of ML benchmarking into a domain where stakes are much higher. In biology and healthcare, a 1% gain that doesn’t transfer is worse than useless—it’s misleading.
What’s missing isn’t more benchmarks. It’s hard benchmarks:
• Prospective, not retrospective
• Perturbation-based, not static
• Multi-site, not single-lab
• Failure-aware, not leaderboard-optimized
If your model only works on the dataset that created it, it’s not a foundation model—it’s a dataset artifact.
In 2026, we need fewer flashy plots and more humility, rigor, and negative results.
3. “Reasoning” in biology is not chain-of-thought.
There’s a growing tendency to apply the word “reasoning” directly to biological LLMs.
Let’s be careful.
Biological reasoning isn’t verbal fluency, longer context windows, or prettier explanations. Those are surface-level improvements. Real reasoning in biology shows up elsewhere: in forming hypotheses, deciding which experiments to run, updating beliefs when perturbations fail, and constantly trading off cost, risk, and uncertainty.
A model that explains a pathway beautifully but can’t decide which experiment to run next is not reasoning; it’s narrating.
2025 convinced me that the future lies in agentic biological AI:
systems that couple foundation models with experimentation, simulation, and decision-making loops.
Closing thought:
AI & biology is not lagging behind AI for code or language. It’s just playing a harder game.
The constraints are real. The data is messy. The feedback loops are slow. The consequences matter.
If 2025 clarified anything for me, it’s this:
We won’t make progress by treating biology like text. We’ll make progress by building AI that behaves more like a scientist: skeptical, iterative, and willing to be wrong.
Onward to 2026.
Very cool! AI scientific review is progressing very fast and will change everything. @qedScience @bioRxiv is a great first step. AI review can be useful if it is transparent. More soon from @EMBOPress @ReviewCommons with @qedScience. Watch this space ;-)
IT'S HAPPENING! 💥 I'm psyched to launch the collaboration between @qedScience & @openrxiv @biorxivpreprint! Let's go back to concentrating on making discoveries, and not on “getting published”, because it’s not going to matter soon, when feedback & evaluation are detached from the journals. Preprint + q.e.d = your science is out there, and anyone can appreciate it 👇
Hierarchical Reasoning Model
This is one of the most interesting ideas on reasoning I've read in the past couple of months.
It uses a recurrent architecture for impressive hierarchical reasoning.
Here are my notes:
Interesting: https://t.co/zZ0AQlQtQ5 - low-quality papers linking seemingly random phenotypes, e.g. in UK Biobank, are "flooding" the literature. Exploring every possible combination of phenotypes sounds like a science business model now...
You can find EMBO at the other platform for short posts now. Become part of the #lifesciences community there, and follow us for the latest news and updates!
Wonderful news to end the year with: the agreement between the European Union and Switzerland, which includes Switzerland's association with Horizon Europe, has been signed! I look forward to welcoming our colleagues from Switzerland back into the ERC!
At its six sites, EMBL’s many groups and teams work across the life sciences in interdisciplinary and collaborative ways.
Meet and join this community on Bluesky, with EMBL starter pack 🦋
https://t.co/LmW5Of0wpl
The @ReviewCommons team is visiting Rome!
28 Nov: La Sapienza Università di Roma @SapienzaRoma
29 Nov: @EMBL Rome
Meet Project Leader @tlemberger & Managing Editor @monaco_rc to discuss preprint peer review and more!
Read the interview by Thomas Lemberger @tlemberger with Michael Bronstein @mmbronstein, #DeepMind professor of #AI @UniofOxford.
AI-driven biology: rethinking experiments and computation:
https://t.co/7avPzTkPty
And watch the related podcast here:
https://t.co/SH4JxkXam3
@blaiseaguera Our conversation is inspired by this new paper from Blaise and collaborators, demonstrating the spontaneous appearance of self-replicating computer code from an initially random assortment. With obvious implications for the origin of life.
https://t.co/fRceQ44E75
What if we could universally recombine, insert, delete, or invert any two pieces of DNA?
In back-to-back @Nature papers, we report the discovery of bridge RNAs and 3 atomic structures of the first natural RNA-guided recombinase - a new mechanism for programmable genome design
My thesis develops 3 capabilities necessary for AI-driven cell engineering:
Universal cell representation
Perturbation effect prediction
Closed-loop experiment design
The goal is a 'virtual cell' that augments hypothesis-driven lab experiments with in-silico experimentation.