New Paper!
Thanks to @KamounLab @tofazzalislam @NurUddinMahmud3
and congratulations to all the co-authors.
A pandemic clonal lineage of the wheat blast fungus https://t.co/48pCRkEuc4
Binder design has come of age thanks to generative models—but how can we access the wider array of dynamic, multistate protein functions, so elegantly employed by nature?
@mihirbafna14 and I are excited to share SwitchCraft, a framework for designing such functions. (1/7)
Are you analysing hydrophobicity in disordered proteins and still using something like the Kyte & Doolittle scale?
Maybe instead consider the "stickiness-scale" that Fan & Giulio derived using SAXS data for >100 IDPs
https://t.co/hN3KtuzsWh
Proteo-R1 (ICML 2026), the first reasoning protein foundation model for protein design, is out! 🚀🧬
Most protein design models generate structures without ever *reasoning* about which residues matter. We think that's backwards.
Human protein engineers 👩‍🔧 don't work this way. They identify critical interaction residues first (charged anchors, hydrophobic hotspots, specificity-determining motifs) and only then optimize geometry around those decisions.
━━━━━━━━━━━━━━━━
🔬 THE CORE IDEA
━━━━━━━━━━━━━━━━
A dual-expert architecture that explicitly decouples molecular understanding from geometric generation:
→ ⚡A multimodal LLM (understanding expert) analyzes protein sequences, structures, and text to identify key functional residues governing binding and specificity
→ ⚡A diffusion model (generation expert) then co-designs sequence + structure — but with those residues locked in as hard constraints
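To make "locked in as hard constraints" concrete, here is a toy sketch in Python. It is not Proteo-R1's code: a uniform random sampler stands in for the diffusion model, and the function name, example sequence, and hotspot positions are all hypothetical. It only illustrates the masking idea, where positions flagged by the understanding expert are copied through unchanged and every other position is free to be redesigned.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def redesign_with_locked_residues(sequence, locked_positions, rng):
    """Resample every position except the locked ones.

    `rng.choice` is a placeholder for a real generative model; the point
    is only that locked positions are never overwritten.
    """
    locked = set(locked_positions)
    return "".join(
        aa if i in locked else rng.choice(AMINO_ACIDS)
        for i, aa in enumerate(sequence)
    )


# Hypothetical loop sequence and hypothetical hotspot indices (0-based).
original = "ARDYYGSSYFDY"
hotspots = [0, 1, 11]
designed = redesign_with_locked_residues(original, hotspots, random.Random(0))
print(original)
print(designed)
```

A real system would apply the same mask at every denoising step and to structure as well as sequence; this sketch covers only the sequence side.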
━━━━━━━━━━━━━━━━
📐 HOW IT'S TRAINED
━━━━━━━━━━━━━━━━
Three-stage curriculum:
① Multimodal Alignment — freeze the LLM, train projections to bridge ESM-2 + AF3-style structural features into language space
② Structural Reasoning Mid-Training — unfreeze the LLM, teach it residue grounding → pairwise geometry → interface localization → hotspot prediction
③ Joint Reasoning-Guided Design — end-to-end on antibody-antigen complexes. Gradients from the diffusion objective flow back through the reasoning expert.
━━━━━━━━━━━━━━━━
📊 RESULTS
━━━━━━━━━━━━━━━━
Evaluated on simultaneous multi-CDR redesign and the RAbD CDR-H3 benchmark:
✅ Best RMSD & DockQ on RAbD — redesigned H3 loops are geometrically accurate *and* docked well
✅ Lowest backbone dihedral divergence (JSDbb) among all baselines
✅ Reduced intra- and inter-chain steric clashes
✅ Generated sequences score lower perplexity than native antibodies under IgLM & AbLang
✅ Plug-and-play: swapping the diffusion backend to UniMoMo still improves RMSD and IMP
━━━━━━━━━━━━━━━━
💡 WHY IT MATTERS
━━━━━━━━━━━━━━━━
Proteo-R1 isn't just a better antibody design model. It's a blueprint for coupling deliberative LLM reasoning with any physical generative process — interpretable, modular, and backend-agnostic.
📄 Paper: https://t.co/efquYg3O76
💻 Code: https://t.co/Qxm06IZ4xy
🌐 Demo: https://t.co/nkfEWY32OA
Many thanks to my wonderful collaborators Weihao Xuan, Heli Qi, @Hanqun_CAO, Heng-Jui Chang, @KKuanPang, @XiangruTang, Zehong Wang, @hcwww_, @KejunYing, @lupantech, Chiho Im, Seungju Han, @richardxp888, and @tikgiau. I also appreciate the guidance from advisors @YejinChoinka, @jure, @erranlli, Naoto Yokoya, and Masashi Sugiyama.
Our 9-hour flight from Nairobi to London isn’t that different from the flight of the painted lady butterfly you will see in your gardens this summer.
Read about it here: https://t.co/JSq1prcHNc
A Qubit costs ~$5,260.
I built one for $39.
Not a toy version. A fully working DNA fluorometer: the device you use to measure how much DNA there is in a sample.
This mattered: my first sequencing run underperformed partly because I didn’t know exactly how much DNA I was loading.
For nanopore sequencing, input DNA quantity matters a lot. Too little and the pores are underutilised. Too much and flow cell longevity is compromised.
The underlying device is not complicated.
A DNA fluorometer works by adding a dye that binds to DNA, shining light at the sample, and measuring the fluorescence.
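A rough sketch of how a raw fluorescence reading might be turned into a concentration. It assumes a linear two-point calibration (a dye-only blank plus one standard of known concentration), which may not match what this build actually does; the function name and all numbers are made up for illustration.

```python
def two_point_calibration(blank_signal, standard_signal, standard_conc_ng_per_ul):
    """Return a function mapping raw fluorescence to ng/uL.

    Assumes fluorescence rises linearly with DNA concentration between
    the blank (0 ng/uL) and the standard.
    """
    if standard_signal == blank_signal:
        raise ValueError("standard and blank signals must differ")
    slope = standard_conc_ng_per_ul / (standard_signal - blank_signal)

    def to_concentration(sample_signal):
        # Clamp at zero so noise below the blank does not give negative DNA.
        return max(0.0, (sample_signal - blank_signal) * slope)

    return to_concentration


# Hypothetical sensor counts, not measurements from the real device.
convert = two_point_calibration(
    blank_signal=120.0, standard_signal=2120.0, standard_conc_ng_per_ul=10.0
)
sample_conc = convert(1120.0)
print(f"{sample_conc:.2f} ng/uL")
```

Readings above the standard are extrapolated by this sketch, so in practice you would dilute the sample and measure again.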
The BOM is basically:
> $23 optics + sensor
> $8 Arduino/electronics
> $6 screws/nuts
> $2 enclosure plastic
Biotech especially is full of equipment with insane idiot indexes. With AI you don't really have an excuse not to 1) work out what the idiot index for a piece of equipment is and 2) build your own version if it's irrationally high.
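For the numbers in this post, the arithmetic looks like this. The sketch assumes "idiot index" means retail price divided by parts cost, which is how it is used here; other definitions compare against raw material cost instead. The dictionary keys are just labels for the BOM lines above.

```python
# Parts cost in USD, taken from the BOM above.
bom_usd = {
    "optics + sensor": 23,
    "Arduino/electronics": 8,
    "screws/nuts": 6,
    "enclosure plastic": 2,
}
retail_price_usd = 5260  # approximate Qubit price quoted above

build_cost_usd = sum(bom_usd.values())
idiot_index = retail_price_usd / build_cost_usd

print(f"build cost: ${build_cost_usd}")
print(f"idiot index: {idiot_index:.1f}x")
```

That comes to a $39 build cost and a ratio of roughly 135x.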
THINK BEFORE YOU BUY.
A couple of months ago, I announced that I was partway through implementing a simple, readable AlphaFold2 in pure PyTorch, inspired by @karpathy's minGPT.
Today, I'm happy to share minAlphaFold2 - the completion of that project.
Repo link: https://t.co/bU59VUm5sB
We introduce ConforNets, a mechanism for conformational control in AlphaFold3 models
- SoTA at producing diverse conformations on every multistate benchmark (N=104)
- Novel capability: transfer state from one protein to another
Outperforms BioEmu, ConforMix and AFsample3
🧵1/8
Stunning new landmark study from #HHMIInvestigator David Reich & team @harvardmed lets us watch human evolution in real time across 10K yrs, & what ~16K ancient genomes reveal is striking: Natural selection didn't stop when civilization began — it sped up: https://t.co/ebth0rD8N7
I just published: #CelebratingAlumni: The Lab Is a Launchpad, Not the Destination
These last weeks brought several alumni into sharp focus, and I couldn't resist writing it down. A celebration. With links.
https://t.co/1vJfGiIt8y
We have been cooking up this story for a while and we are excited to finally be able to share!
Read on if you're interested in whole plant regeneration WITHOUT the application of hormones!
I am Open-Sourcing PyMolAI!
Meet PyMolAI, an AI agent that can talk to your protein structures.
Built on top of PyMOL, PyMolAI lets you interact with your structures in plain language. Whether you're:
- Analyzing protein structures
- Aligning complexes
- Creating publication-ready figures
- Or running design workflows
PyMolAI interprets your request, executes the necessary PyMOL commands, and manages the workflow for you.
It integrates with @OpenBioAI APIs, giving you access to tools like Boltz, ProteinMPNN, and BoltzGen — directly from your PyMOL session.
It has local chat history with session syncing, so you can pick up exactly where you left off.
Postdoctoral Opportunity. We are seeking one or more NIH/NIGMS funded postdoctoral researchers to work on host–nematode interactions.
Deadline: Feb 28, 2026
Start: Flexible (Summer–Fall 2026)
More details here: https://t.co/YMA9nkRxai
Apply by emailing [email protected]
Introducing The Structural History of Eukarya (SHE): The first proteome-scale phylogeny constructed entirely from 3D structure.
We computed 300 trillion alignments across 1,542 species to map the tree of life. 🧵👇 (1/5)
We wrote a focussed review on recent structural advancements in NLRs. It was a lot to cover, but we tried our best. We hope it's enjoyable to experts in the field and useful to newcomers. Check it out: https://t.co/E8djfdthjm
🔬⚡ The MAPPI system’s orthogonal setup enables Ca²⁺ imaging across the entire shoot — from two angles at once.
@ScienceAdvances
https://t.co/4DHOXTKSeC
@ritamarquex @TheSainsburyLab @greenit_bio4sus @itqbnova Of these 9, 5 are nodule-specific, and 3 of the 5 show downregulation in roots. Within the 9 highly expressed NLRs in nodules, 7 belong to the TIR‑NLR class and 2 to the CC‑NLR class. Among the nodule‑specific NLRs, 4 are TIR‑NLRs and 1 is a CC‑NLR.
@ritamarquex @TheSainsburyLab @greenit_bio4sus @itqbnova Fascinating findings. Using RNA-seq data from Roux et al. (2014), I found that of the 106 root-specific NLR genes identified in Medicago truncatula, 9 exhibit high expression in nodules.