"genes are code" is always vague
I like:
cell nucleus → storage device / storage controller
ribosome → JIT-compiler and runtime
features from a world model (use a SAE) → functions
Proteins → processes
signaling pathways → workflows
Phenotypes → behaviors / outputs
@biohub
Calling something a world model is popular these days.
In Biohub’s case, though, it feels warranted.
The new release consists of what Biohub calls a system with three artifacts: ESMC, ESMFold2, and ESM Atlas.
☑️ ESMC is the protein language model. It was trained on ~2.8 billion protein sequences to guess missing amino acids. That massive data set produced an internal map of protein biology - something like a world model.
☑️ ESMFold2 is the structure model built on top of that map. It predicts how proteins fold and how they interact with other molecules.
Biohub says the system scales in two ways. More training compute made ESMC’s protein representations better. And more compute during ESMFold2 predictions improved antibody-target modeling.
When the team used the system to design protein interfaces and tested them in the lab, they found binders that worked across five cancer and immunology targets.
And take a moment to absorb the importance of this quote:
“What we've shown is that these models have learned such a high-fidelity world model of biology that you can design protein interfaces computationally, take them into the laboratory, and they function as predicted,” says Alex Rives, Biohub’s Head of Science.
☑️ The release also includes ESM Atlas: 6.8 billion protein sequences and 1.1 billion predicted structures. (2/9)
For those interested in trying out ESMFold2, you can use the Biohub API or run the Hugging Face weights directly.
Here's an example of codex using ESMFold2-Fast weights on a neocloud (lambda ai) for predicting the structure of a small VIPR protein + RNA complex. Codex used ChimeraX for the visuals.
Congratulations to @Biohub on today’s release of ESMFold2, ESMC, and ESM Atlas, an ambitious step toward a world model of protein biology.
Biohub’s new AI system learns the underlying rules of protein biology from billions of protein sequences, enabling it to predict protein structures, map relationships across the protein universe, and design entirely new proteins computationally.
We’re excited to see SandboxAQ included among the platforms helping make these models more accessible to researchers and developers worldwide.
Open, scalable AI systems like these have the potential to accelerate how scientists explore protein function and translate biological insight into real-world impact. Can't wait to see what the research community builds next!
Learn more at https://t.co/JQbNx7Zer3
Very impressive work by @alexrives and the CZI team on building a world model of protein biology. I’m especially thrilled to see the models and data are fully open-sourced. These contributions pave the way towards a better understanding of human physiology, and plenty of new health-care discoveries. Exciting times!
today was a massive day for protein engineering.
esmfold2 dropped—next gen of the esm series, fully open on @huggingscience. 1.1 billion predicted structures, 6.8 billion sequences. 800m more entries than the alphafold db, and reportedly edging out alphafold3 on protein complexes, including antibody–antigen binding.
alongside it: the new esm atlas. a huge expansion of known protein space, heavy on metagenomic sequences from soil, ocean, and the parts of biology that have been least characterised (until now!!)
and if that weren't enough, litefold dropped the fineweb of proteins, so every major protein database (pdb included) aggregated, cleaned, and made plug-and-play in one place.
these are the releases that push the whole field forward, and the pace of open science right now is almost motion-sickness inducing
all of it on https://t.co/T4l4r1lDz0 (and ofc @huggingface)
Congrats to the @biohub team! Excited to embed this in the workflow of scientists everywhere.
@salcandido was kind enough to sit down with us and take us behind the scenes building ESMFold2: https://t.co/ALOPKaDiBw
Today’s release includes ESMFold2, ESMC, and ESM Atlas. These tools represent a major scientific advance in one of biochemistry’s hardest problems: designing proteins that bind to specific targets. They are freely available at https://t.co/TaS3XOdTqZ
Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology.
The new model delivers state of the art performance on protein interactions, especially antibodies, a critical modality for therapeutics.
We have designed and validated miniprotein binders and single chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity.
We’re also releasing an atlas of 6.8 billion proteins, and 1.1 billion predicted structures.
ESMFold2 is built on a state of the art language model that has been trained on billions of protein sequences.
A world model of protein biology emerges through language modeling.
We’ve used the techniques of mechanistic interpretability developed to understand large language models to understand the concepts ESM uses to represent proteins.
The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction, that reflects and mirrors the understanding of protein biology developed through a century of empirical science.
This understanding emerges without prior knowledge, just from language modeling of protein sequences.
Language models are becoming a powerful substrate to understand and program biology.
The design of protein interactions is one of the most fundamental problems in biophysics, and has critical implications for the discovery of new medicines. A simple gradient based search with the model was able to discover high-affinity protein binders.
I'm excited by the potential this has to accelerate basic science and the understanding of proteins. And especially for the new avenues it opens up for therapeutic design and medicine.
“未来的世界属于懂 Token 的人。” —— 出自一位10岁博主之口。
刚换了 Mac Studio 的他,不是为了打游戏,而是为了“养龙虾”(跑多个 AI Agent 协同工作)。他把复杂的 AI 产业链比作一个大蛋糕,从能源层到应用层,层层剖析。
别觉得小孩在过家家,他讲的“Token 是 AI 时代的硬通货”这个观点,可能比很多专家的报告都接近本质。
这届小孩的 AI 认知已经 Next Level 了,建议大人们反复观看,治治我们的“算力焦虑”。👇
NSF announces $1.5B NSF X-Labs initiative to pursue generational breakthrough science efforts. NSF X-Labs will scale a new generation of transformative independent research organizations to advance breakthrough science outside of traditional institutions. https://t.co/LljEcBoBFa
After the former YouTube CEO, Susan Wojcicki, died of lung cancer at 56, her husband and sisters are continuing her mission to improve how the disease is detected and treated. @ReeveWill has the story.
Anthropic Says Life Sciences Is Its Biggest Bet After Code.
Eric Kauderer-Abrams started @AnthropicAI 's life sciences division ten months ago. He took on the stage at @SynBioBeta with Marc Tessier-Lavigne from @Xaira_Thera , and what caught my attention was how plainly Eric stated the following:
"The greatest opportunity to have a beneficial, scaled impact with everything that's happening in frontier AI is in the life sciences."
After coding, it's their biggest investment area. They've been training Claude on bioinformatics, chemistry, molecule design, structural biology, clinical regulatory. Their models went from mediocre in life sciences to roughly PhD level across most domains in under a year. That's a steep curve.
But what I found more telling than the benchmarks was the infrastructure they're building around it. Wet labs for basic research so their own scientists hit the walls firsthand. An acquisition of Coefficient Bio (acquired by Anthropic) to teach @claudeai how to think like a biotech program manager, not just a bench scientist. The gap between "Claude can answer a biology question" and "Claude can help you run a drug program" is enormous, and they're clearly aware of it.
Marc mentioned that 90% of drugs fail in the clinic. Two-thirds of those failures aren't bad science, but patient matching. You have a good target, a good drug, and you can't find who will respond. That's the problem both of them kept circling back to, and it's where causal AI models trained on real perturbation data might actually move the needle.
Marc said nobody's pushing a button for a development candidate anytime soon. But Anthropic went from $1B to $30B in revenue in sixteen months. That kind of resource behind this kind of focus is new. It's fun to think of what R&D can look like in the next few months!
#SynBioBeta2026 #SyntheticBiology #Biotech #AIxBio
We are closer than ever to building AI models that can reason about living cells. But those models are only as good as the data they're trained on. Getting the data right is the critical step, and Biohub is leading the way.
Read more on the Virtual Biology Initiative:
Scaling laws are powering AI. It’s time to scale biology.
Today we’re launching the Virtual Biology Initiative to generate the data to unlock scaling laws in biology and build accurate predictive models of the cell.
Digital representations of proteins are already expanding our understanding of life at the molecular level, and accelerating the design of molecules and medicines. Accurate digital representations of the cell could reveal the mechanisms that are responsible for disease, and show how to reverse them.
The protein data bank, and worldwide repositories of protein sequence biodiversity were created through decades of work by the scientific community. The advances in artificial intelligence for proteins would not have been possible without them.
The cell is orders of magnitude more complex, and we will need to create the data in just a few years rather than decades.
This will require a coordinated global effort. We're partnering with Broad, Wellcome Sanger, Arc, Allen, Human Cell Atlas, Human Protein Atlas, NVIDIA, and Renaissance Philanthropy.
Biohub is contributing to this effort as both a funder and a builder. We are developing microscopy to observe millions of cells in living organisms, and cryo-ET to resolve the cell in atomic detail. We're building instruments that expand the range of modalities and parameters that can be simultaneously measured. We’re developing molecular, cellular, and tissue engineering to create models of disease and design interventions.
The data we generate will be available to the worldwide scientific community.
We’re also committing $100M over the next five years to support work beyond Biohub.
We invite other scientific teams and funders to join.
Link: https://t.co/93Nw1QT5iZ
New record🥇
The Artemis II astronauts are now farther from Earth than humans have ever been! At 1:57 p.m. EDT, they broke the record set by Apollo 13 in 1970.
Their journey around the far side of the Moon today will take them a maximum distance of 252,752 miles from Earth.
Christina Koch was a firefighter at the South Pole at -111°F before she ever applied to be an astronaut. That was maybe the fourth most interesting line on her resume. She grew up in North Carolina, got three degrees from NC State, and her first real job was building deep-space instruments at NASA.
Then she left for Antarctica. Spent three and a half years bouncing between the Arctic and Antarctic as a research scientist, including a full winter at the South Pole base. That means going months without sunlight or fresh food, with a crew of about 50 people and no way out until flights resume. While she was down there, she also joined the glacier search-and-rescue team.
After coming back, she went to Johns Hopkins and built instruments for two NASA missions (one of them is still orbiting Jupiter right now). She figured out how to start a tiny vacuum pump that NASA designed for a future Mars rover. Johns Hopkins nominated it for their Invention of the Year in 2009. Then she went back to the field. More time in Antarctica and a stretch up in Greenland. A government research station in northern Alaska, near the top of the world. Then she ran another one in American Samoa, near the equator.
In 2013, NASA selected her from 6,300 applicants. Eight people got in. Her first space mission was supposed to be a normal rotation on the International Space Station, but NASA extended it. She ended up staying 328 straight days and orbiting Earth 5,248 times, covering about 139 million miles (roughly 291 round trips to the Moon). Up there, she ran over 210 experiments, including tests of cancer drugs in zero gravity and 3D printers that can build structures close to human tissue. Six spacewalks, 42 hours floating outside the station. She learned Russian for the training. She flies supersonic jets.
Right now, Koch is on Artemis II, heading for a flyby behind the far side of the Moon. The crew launched on April 1 and is on track to travel about 252,000 miles from Earth, which would break the all-time human distance record of 248,655 miles set by Apollo 13 in 1970. That record has stood for 56 years, and it was set during a disaster that nearly killed the crew. Fred Haise, one of the Apollo 13 astronauts, is 92 now. He told Koch: "I heard you're going to break our record."
Nobody had left Earth's neighborhood since December 1972. Koch and her three crewmates are the first in 53 years, and they are coming home at about 25,000 mph. That is faster than any crewed spacecraft has ever come back through the atmosphere.