We are excited to partner with the OpenAI Foundation (@FoundationOAI) to use @ArcInstitute's AI lab-in-the-loop to tackle Alzheimer’s disease. We have established Arc as a full stack biology and AI research organization to go after complex human diseases, and Alzheimer’s in particular. These major killers have resisted traditional approaches because they aren’t caused by a single gene, single pathogen, or single cell type. Instead, they emerge from diverse genetic and environmental risk factors interacting across cell types over decades, creating enormous combinatorial possibilities.
To navigate this complexity, we’re building a comprehensive causal map of how genetic and environmental risk factors interact and reroute cellular signaling in the early stages of the disease. Our “lab in the loop” involves an active learning cycle where predictive AI models guide large-scale genetic/immune/metabolic perturbations guided by clues in patient data, and the experimental data from our human brain organoid models sharpens the causal ML that learns from each cycle to guide the next round.
This research is progressing alongside Arc’s machine learning efforts (e.g. Evo, Evo 2, State, Stack, Virtual Cell Challenge, and more to come!), and represents the other flagship side of what we do at the institute. Learn more about our Alzheimer’s Disease Initiative: https://t.co/uCOaR2wfwK
#Breaking: Today we announced positive additional Phase III clinical results of our investigational BTK inhibitor in people living with relapsing multiple sclerosis (RMS).
Learn more: https://t.co/CLEmQz603p
1/ Exciting news! We just released a preprint on a new single-cell technology, ME-seq, that maps DNA methylation, gene expression & chromatin accessibility—all in the same cell, at scale. And yes, it’s a LOT cheaper 🤑than current methods! 🧬 #SingleCell#Epigenetics
AI-designed proteins that survive 150 °C and nanonewton forces
Proteins are usually fragile machines. Heat them, pull on them, or send them through a high-temperature sterilization step (like those used in hospitals), and most will unfold and aggregate, losing their function. Yet many natural systems—like muscle titin or spider silk—hint that if you organize β-sheet hydrogen bonds in the right way, you can get remarkable mechanical strength and thermal resilience.
Bin Zheng and coauthors take that idea and push it to the extreme. Starting from the titin I27 domain, they use an AI+MD pipeline—RFdiffusion for backbone generation, ProteinMPNN for sequence design, ESMFold/AlphaFold2 for structure prediction, and steered/annealing MD for screening—to systematically elongate the force-bearing β strands and maximize backbone hydrogen bonds in a shearing geometry.
Across multiple design rounds, they grow the network from 4 to 33 backbone H-bonds, creating a “SuperMyo” series of proteins with unfolding forces above 1,000 pN—roughly 4× stronger than I27 under the same pulling conditions. Remarkably, these proteins not only refold after force, but also retain structure and function after exposure to 150 °C and repeated high-temperature sterilization cycles, and can be used as crosslinkers to make hydrogels that survive those treatments intact.
The message is powerful: by combining generative protein design with physics-based simulations, it’s now possible to turn a simple principle—pack as many shear-mode hydrogen bonds as possible into β sheets—into synthetic proteins and materials that rival or surpass nature’s own mechanostable systems, enabling protein-based hydrogels and biomaterials that remain functional under conditions that would normally destroy conventional proteins.
Paper: https://t.co/PMwfQpylqb
I made a Goodreads for academic papers!
(..and blog posts, substacks, lesswrong, etc)
Paper Trails [https://t.co/dPzhlBgo8E] is something I built because I wanted a place where engaging with research felt fun, beautiful, and personal to you
I hope you give it a try & love it!
Today in @Nature, in work led by @aditimerch, we report the ability to prompt Evo to generate functional de novo genes.
You shall know a gene by the company it keeps! 1/n
Today in @Nature we report a new prime editing strategy that can rescue a common cause of many genetic diseases in a disease-agnostic manner. This approach converts a redundant endogenous tRNA into an optimized suppressor tRNA, enabling a single prime edit to rescue premature stop codons across different diseases.
(1/15)
https://t.co/zs0qu5bhXx
Why I have been distracted: Vik Bajaj and I co-founded “Project Prometheus” (Vik built it), with Jeff Bezos—impressive resume and good references, and Rick Klausner, and an amazing team of founding AI researchers and engineers. More to come later. https://t.co/GX0jOOB3kJ
Today, we're announcing @episteme, a new type of R&D company that recruits exceptional scientists to pursue high-impact ideas.
Science isn’t bottlenecked by the availability of talent, but by places where they can do their best work.
Scientific progress has driven human flourishing: extending lifespans, lifting billions from poverty, and expanding our understanding of the universe.
But history is littered with transformational ideas that were overlooked in their time. That problem is still acute today: too much promising talent remains uncultivated, and remarkable ideas die in the lab or are filtered out by misaligned incentives.
Today, scientists face suboptimal paths for translating their research into impact: academia is famously risk-averse and incentivizes publications and winning grants vs. translational research. Industry is too often focused on short‑term incentives. And startups lack the substantial capital, expertise, and complex infrastructure needed to deliver long-term scientific progress.
On top of that, recent funding cuts in the US mean the overall supply of ideas is decreasing. Put together, the global scientific production system is operating at a fraction of its capacity.
How Episteme operates is different: we identify great scientists who can meaningfully benefit humanity, but who aren’t supported efficiently within traditional institutions today. Researcher by researcher, we work with them to determine the bespoke resources, operational support, and environmental conditions to execute on their research. We bring them together in-house, and provide those resources to ensure that their breakthroughs are deployed for real-world impact.
We’ve already assembled an amazing team of operators, ranging from the Gates Foundation, DeepMind, ARPAs, DoE – just to name a few – and researchers who are pursuing important problems across physics, biology, computing, and energy. Our team has spoken to hundreds of researchers across disciplines and geographies to understand the limitations they’re facing and what can be done better, and designed Episteme for them.
We’re backed by individuals like @sama, Masayoshi Son, and other long-term partners who share our mission of enabling ambitious science for tangible human impact.
About me: I started working as a researcher 9 years ago, on problems ranging from AI-driven drug discovery to developing brain-machine interfaces. It was that experience that led me to realize that so many scientists with great potential to change the world don’t have access to opportunities equal to their capacities.
@sama and I believe that much better science should happen for humanity, and that a new engine is needed to support that. We decided to cofound Episteme together, and I am incredibly grateful for Sam’s unwavering support as a thought partner and founding investor.
Our conviction is that by supporting the right people with the right incentives, we're set to generate breakthrough discoveries to benefit humanity. We cannot rely on the course of history to shape scientific progress; we need to proactively shape the system by supporting the most talented people with the right resources and incentives.
We’re thrilled to share that our MERFISH+ preprint is now live on bioRxiv!👉https://t.co/M1SIyfnzMa
In this work, the Bintu and Zhu labs (UCSD) developed MERFISH+, a next-generation spatial genomics platform that combines genome-wide RNA and epigenetic imaging over a large field of view. By introducing acrydite-modified probes covalently anchored to hydrogels, MERFISH+ achieves remarkable imaging stability and enables >1,800-gene, multi-modal, and multi-month experiments.
With this platform, they, together with the Chi lab at UCSD, profiled a whole developing human heart at 12 post-conception week with merely two slides, resulting in a total of 53 slides, 3.1 million single cells and more than 30 cell types. Building upon our previous 3D reconstruction and modeling framework, Spateo (https://t.co/a0BC0Cf3Ec), we reconstruct the 3D human heart that nicely captures the anatomical structure of the heart, including the intricate vasculature network. Sophisticated analyses provide a holistic view of an entire organ and enable systematic characterization of 3D cellular neighborhoods and transcriptional gradients of substructures such as the descending arteries. Furthermore, using a generative integration framework for spatial multimodal data (Spateo-VI), we harmonized these MERFISH+ transcriptomic and chromatin data to reconstruct a 3D spatially-resolved multi-omics atlas of the developing human heart, shared at https://t.co/Jby4Pppzbr and https://t.co/s33SS7jvYL. MERFISH+ thus sets a new standard for large-format, multi-omic spatial profiling, enabling holistic, 3D characterization of organs at subcellular resolution.
Huge congratulations to first authors Colin Kern, @qingquanZhang2, @YifanLu2024 , and Jacqueline Eschbach, and to all collaborators from the Bintu, Zhu, Chi, and Qiu labs for this amazing team effort. Thanks for your diligence, creativity, and hard work on this project. We’re grateful for support from @arcinstitute and our generous donors.
Our lab is expanding—if you’re excited about building the next generation of single-cell and spatial genomics techniques and predictive single cell and spatial foundation models, we’re hiring! If you are interested, please reach out to me via direct message or email at [email protected].
We are excited for any potential collaborations along this line of research in Stanford, UCSF and Berkeley and other labs as well.
This year’s medicine laureate Shimon Sakaguchi discovered a new class of T cells.
Sakaguchi was swimming against the tide in 1995, when he made a key discovery. At the time, many researchers were convinced that immune tolerance only developed due to potentially harmful immune cells being eliminated in the thymus, through a process called central tolerance.
Sakaguchi showed that the immune system is more complex and discovered a previously unknown class of immune cells, which protect the body from autoimmune diseases.
#NobelPrize
Excited to share that my PhD thesis work is out in @ScienceMagazine today. We demonstrate robust rearrangement of the human genome using bridge recombinases, performing programmable insertions, excisions, and inversions at megabase-scale.
Many of the most complex and useful functions in biology emerge at the scale of whole genomes.
Today, we share our preprint “Generative design of novel bacteriophages with genome language models”, where we validate the first, functional AI-generated genomes 🧵
third-party hosting and federated learning lets lilly share their models and biotech partners contribute without exposing anyone's data
exciting step forward in connecting a buildup of data silos
Can you rejuvenate an old brain by giving it young immune cells? 🧠
My lab @calico put it to the test. In our new study, we replaced the brain's immune cells in old mice with young ones.
The result? The old brain environment forced the young cells to age RAPIDLY. A 🧵👇
a core tension in applying "virtual cell" models to therapeutic development is the mismatch between the scale at which we measure biology and the scale at which we intervene.
we often measure at the cell level (eg single-cell RNA-seq), but we treat at the tissue or organ level (e.g., cardiac fibrosis, skin rash).
drugs act on tissues or organs, sometimes with cellular specificity, but often not. biologics and CGT are increasingly targeted, but their impact still depends on the broader tissue context. so the outcome of an intervention—efficacy, toxicity, side effects—emerges at the organ or patient level, not the single cell.
a cell-level model might predict that drug X will downregulate the TGF-β pathway in fibroblasts. but will that reverse lung fibrosis in vivo? that depends on:
- whether the drug reaches the relevant cell types,
- whether it affects the broader ECM remodeling loop,
- whether the immune system modulates or counteracts it, etc
this is why virtual cell predictions can be correct but irrelevant—they solve a problem at the wrong level of abstraction.
a couple promising strategies to address this:
- multiscale modeling, embedding cellular simulations inside larger tissue-level or organ-scale models
- spatial transcriptomics, which adds context to cell-level data by preserving spatial relationships—more closely mirrors tissue biology
- surrogate modeling, training higher-level predictors (e.g., for clinical biomarkers or histopathology) using outputs from cell models
🚨 Just dropped! GREmLN is a graph-based transformer model that uses the molecular logic of cells to enable insights into disease states. This nimble model accounts for causality + locality using gene regulatory networks📄 Read the preprint: https://t.co/qTDeV9msph
Why are some brain cells more sensitive to aging?
This light sheet microscopy video from Claire Gizowski in the @Oliver__Hahn Lab shows a cross section of a mouse brain after a process called tissue clearing, which makes it transparent so scientists can look deep inside. (1/3)
Excited to share #AlphaGenome, a start of our AlphaGenome named journey to decipher the regulatory genome! The model matches or exceeds top-performing external models on 24 out of 26 variant evaluations, across a wide range of biological modalities.1/6