Neil Thomas @countablyfinite - Twitter Profile

Pinned Tweet

8 days ago

It's been exhilarating to watch this model get better and better, and I’m grateful to work with such an incredible, cross-disciplinary team across folding, binder design, and interpretability! This paper also sets a new scaling law for papers, compressing 3 papers into 1.

Alex Rives

@alexrives

8 days ago

Today we're announcing ESMFold2, an open scientific engine to power prediction, design, and discovery across protein biology.

The new model delivers state of the art performance on protein interactions, especially antibodies, a critical modality for therapeutics. We have designed and validated miniprotein binders and single chain antibodies across five therapeutic targets that are important in cancer and immunology. We are seeing very high success rates, and affinities at levels consistent with therapeutic activity.

We’re also releasing an atlas of 6.8 billion proteins, and 1.1 billion predicted structures.

ESMFold2 is built on a state of the art language model that has been trained on billions of protein sequences. A world model of protein biology emerges through language modeling.

We’ve used the techniques of mechanistic interpretability developed to understand large language models to understand the concepts ESM uses to represent proteins. The model’s representation space has a compositional organization of features across scales, levels of complexity, and abstraction, that reflects and mirrors the understanding of protein biology developed through a century of empirical science. This understanding emerges without prior knowledge, just from language modeling of protein sequences.

Language models are becoming a powerful substrate to understand and program biology. The design of protein interactions is one of the most fundamental problems in biophysics, and has critical implications for the discovery of new medicines. A simple gradient based search with the model was able to discover high-affinity protein binders.

I'm excited by the potential this has to accelerate basic science and the understanding of proteins. And especially for the new avenues it opens up for therapeutic design and medicine.
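To make "a simple gradient based search" concrete, here is a toy sketch of gradient ascent over relaxed sequence logits. It is not the ESMFold2 protocol: `design_sequence` is a hypothetical helper, and `score_matrix` is an invented per-position preference table standing in for the differentiable objective a real structure model would provide.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def softmax(x):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def design_sequence(score_matrix, steps=200, lr=0.5, seed=0):
    """Gradient ascent on relaxed (continuous) sequence logits.

    score_matrix has shape (length, 20). It is a TOY objective, assumed
    here only for illustration; a real pipeline would backpropagate
    through a model instead.
    """
    rng = np.random.default_rng(seed)
    logits = 0.01 * rng.standard_normal(score_matrix.shape)
    for _ in range(steps):
        probs = softmax(logits)
        # Objective J = sum(probs * score). For each row,
        # dJ/dlogit_k = probs_k * (score_k - sum_j probs_j * score_j).
        expected = (probs * score_matrix).sum(axis=-1, keepdims=True)
        grad = probs * (score_matrix - expected)
        logits += lr * grad
    # Discretize the relaxed sequence by taking the best residue per position.
    return "".join(AMINO_ACIDS[i] for i in logits.argmax(axis=-1))

# Toy objective that rewards one residue per position (hypothetical target).
target = "ACDW"
score = np.zeros((len(target), len(AMINO_ACIDS)))
for pos, aa in enumerate(target):
    score[pos, AMINO_ACIDS.index(aa)] = 1.0

designed = design_sequence(score)
print(designed)
```

The relaxation is the main design choice: optimizing continuous logits lets a gradient flow to every position at once, and the sequence is only made discrete at the end.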

countablyfinite retweeted

Zeming Lin

@ebetica

1 day ago

How to design your own PD-1 binder in 4 easy steps:

1. Download the tutorial notebook from the ESM team
2. Get a @modal API key to scale it up
3. Scaling it up, O($1000) will get you a 96 well plate of minibinders with >50% success rates on typical targets
4. Test it in the lab! https://t.co/PHwMwzyEuz
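The figures in step 3 work out roughly as follows. This is only back-of-envelope arithmetic on the numbers quoted in the tweet: "O($1000)" is an order of magnitude, the 50% hit rate is a lower bound, and the tweet does not say whether synthesis or assay costs are included. `plate_estimate` is a hypothetical helper name.

```python
# Numbers quoted in the tweet, treated as rough inputs.
BUDGET_USD = 1000.0   # "O($1000)", an order-of-magnitude figure
WELLS = 96            # one 96-well plate
MIN_HIT_RATE = 0.5    # ">50% success rates on typical targets" (lower bound)

def plate_estimate(budget_usd, wells, hit_rate):
    """Return (cost per design, expected hits, cost per expected hit)."""
    cost_per_design = budget_usd / wells
    expected_hits = wells * hit_rate
    cost_per_hit = budget_usd / expected_hits
    return cost_per_design, expected_hits, cost_per_hit

cost_per_design, expected_hits, cost_per_hit = plate_estimate(
    BUDGET_USD, WELLS, MIN_HIT_RATE
)
print(f"cost per design:       ${cost_per_design:.2f}")
print(f"expected hits (>=):    {expected_hits:.0f}")
print(f"cost per hit (<=):     ${cost_per_hit:.2f}")
```

Because the hit rate is quoted as a lower bound, the expected-hit count is a floor and the cost per hit is a ceiling under these assumptions.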

Neil Thomas

@countablyfinite

about 19 hours ago

Yes! This protocol is an open-source replication of the protocol we describe in our preprint. The hit-rates we report were validated in the lab across multiple targets via biolayer interferometry (BLI). You can read more about how we validated our binders in our preprint, including functional screens for PD-L1, epitope validation via ELISA, and even Cryo-EM structure determination! https://t.co/xoCRtAAm33

Neil Thomas

@countablyfinite

1 day ago

ESMFold2 can be inverted to design new protein binders including miniproteins and scFvs! Take our protocol for a spin on @modal! https://t.co/cJ2O5yYRox

Thomas Hayes

@THayes427

2 days ago

I’m so excited about the launch of ESMFold2, ESMC, and the new ESM Atlas. This was a massive team effort, and I’m grateful to have worked with such an incredible group @biohub. A headline result I’m especially excited about: ESMFold2 can design minibinders and antibodies with nanomolar affinity, target selectivity, and functional activity against therapeutically relevant targets. Today, we’re sharing the full binder design protocol.

countablyfinite retweeted

Polly Fordyce @fordycelab

1 day ago

Characterizing AI-designed proteins requires quantitative biochemistry at massive scale. Enter Amplicon/Protein Bead Display (APB-Display), a fully in vitro platform that quantifies Kd's for >100,000 variants in <3 days (preprint link below!) @Stanford_ChEMH @czbiohub (1/n) https://t.co/2nl6RooUtO

countablyfinite retweeted

Jonathan Whitaker

@johnowhitaker

4 days ago

A few edible plants have proteins that sit close to miraculin in the ESM Protein Atlas, so I thought I'd try extracting what protein I could from said plants and tasting it... Anyway, null result but an excuse to muck about :) Video lab notes: https://t.co/FwPzCU8R6O https://t.co/kDu0Igcl6b

Neil Thomas

@countablyfinite

4 days ago

@jonathanbylos the “roast me” is implicit and will take years of therapy to undo

Neil Thomas

@countablyfinite

5 days ago

claude just told me that my proposed algorithm was "numerically dead on arrival" hope your day is going better

Neil Thomas

@countablyfinite

5 days ago

for those wondering, he was correct

Neil Thomas

@countablyfinite

5 days ago

@ahramkim1128 If API credits are an issue, consider requesting additional compute credits - we support academic research! https://t.co/fvtT8kiOZB

countablyfinite retweeted

Brian Naughton @btnaughton

7 days ago

I added ESMFold 2 to https://t.co/lzZKSe3GKc -- and some other niceties like an optional reference pdb

countablyfinite retweeted

Romain Lopez

@_romain_lopez_

6 days ago

We built a joint experimental and computational platform for scalable multi-modal single-cell chemical screens — profiling RNA, protein (including phospho-signaling), and chromatin accessibility responses to thousands of small molecule perturbations in parallel. https://t.co/M5x4CNLCTA

Neil Thomas

@countablyfinite

6 days ago

.@proteinrosh out here recruiting research partners

Roshan Rao

@proteinrosh

6 days ago

We've done a million of these deep dives into interpreting and understanding ESMC features, it's just that we don't quite know how to write about them other than to say "here are a bunch of cool observations".

countablyfinite retweeted

MolBioMike

@MolBioMike

7 days ago

Finally got the BDBV Trimer folded properly using esmfold2!

countablyfinite retweeted

Oligo Research @OligoResearch

7 days ago

@ebetica @anshulkundaje It looks like that made a big difference. It found a much higher confidence pose (0.85 ipTM vs 0.81 with AF3 and 0.8 earlier with ESM2) that actually makes much more sense than the original pose and plausibly explains its MOA. Also no artifacts. Amazing work!

Neil Thomas

@countablyfinite

6 days ago

@alexechu_ @hla_michael We did extensive evaluation of the 6B SAEs but haven't been as thorough in the 300 and 600M models. But we released SAEs for all the ESMC models at all layers in the hopes that the community takes a look! https://t.co/RssvyHZEA5

countablyfinite retweeted

Ada Fang

@AdaFang_

7 days ago

One feature of the @biohub ESM C release that I think deserves more attention is the interpretability of its latent space.

There has been a lot of discussion about whether interpretability is useful for scientific ML models. I think it can become very useful, especially when AI agents can use a model’s internal representations to reason about biology.

Here is one example of an AI agent with access to ESM C SAE features correctly interprets the loss-of-function mechanism behind a variant.

There is still a lot to improve in how AI agents use model interpretability, but this is an exciting direction for AI agents that don’t just make predictions, but inspect learned representations to generate mechanistic hypotheses.

Read more in our blog: https://t.co/QmJlCzJVe4

We've also released the SAE-enabled skills for variant interpretation, loss-of-function analysis, structural annotation, functional mechanism interpretation, and evaluation against experimental datasets via ToolUniverse @ScientistTools

Thanks to the team behind this! @GaoShanghua @_yepeng @marinkazitnik @countablyfinite @HarvardDBMI @harvardmed @Harvard @KempnerInst

Neil Thomas

@countablyfinite

7 days ago

Cool work showing that shifts in the ESMC latent space can be interpreted via agentic workflows to give some mechanistic insight into variant effects. I think we're just at the beginning for this type of analysis.

Marinka Zitnik

@marinkazitnik

7 days ago

AI agents are learning to read @biohub protein models @GaoShanghua @AdaFang_ @_yepeng

https://t.co/hPR7IYr9f0

We explored how AI agents powered by ToolUniverse @ScientistTools can interact with new ESM models

🧬 Mutation and loss-of-function analysis

Agents compare reference and mutant proteins, identify SAE features most affected by a mutation, and connect those perturbations to structural and functional consequences. The agents then relate these changes to experimental evidence, including deep mutational scanning measurements, to explain potential loss-of-function mechanisms

🧪 Functional mechanism exploration

Agents analyze protein representations to identify functional tracks associated with specific molecular activities. By linking SAE features to protein regions, structures, and annotations, the agents can generate hypotheses about how proteins carry out their functions

Check out new SAE-enabled ToolUniverse skills for variant interpretation, loss-of-function analysis, structural annotation, functional mechanism interpretation, and evaluation against experimental datasets

@HarvardDBMI @harvardmed @Harvard @broadinstitute @KempnerInst
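The mutation-analysis step described above (compare reference and mutant, then find the SAE features most affected) can be sketched roughly as follows. This is a minimal illustration, not the ToolUniverse or ESM implementation: `top_shifted_features` is a hypothetical helper, and the activation vectors are synthetic stand-ins for what a real sparse autoencoder would produce.

```python
import numpy as np

def top_shifted_features(ref_acts, mut_acts, k=5):
    """Rank SAE features by absolute activation change (mutant minus reference).

    ref_acts and mut_acts are 1-D arrays of equal length, e.g. feature
    activations at the mutated position. Returns a list of
    (feature_index, delta) pairs, largest |delta| first.
    """
    ref = np.asarray(ref_acts, dtype=float)
    mut = np.asarray(mut_acts, dtype=float)
    if ref.shape != mut.shape:
        raise ValueError("reference and mutant activations must have the same shape")
    delta = mut - ref
    # Stable sort so ties keep their original feature order.
    order = np.argsort(-np.abs(delta), kind="stable")[:k]
    return [(int(i), float(delta[i])) for i in order]

# Synthetic activations for five made-up features (not real model output).
ref = np.array([0.0, 2.0, 0.5, 0.0, 1.0])
mut = np.array([0.0, 0.1, 0.5, 1.5, 1.2])

top = top_shifted_features(ref, mut, k=2)
for idx, d in top:
    direction = "up" if d > 0 else "down"
    print(f"feature {idx}: {direction} by {abs(d):.2f}")
```

Ranking by absolute change keeps both features that switch off and features that switch on, which is what one would want before linking them to annotations or experimental measurements.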

countablyfinite retweeted

Alex Derry @awfderry

8 days ago

Super excited to share what we've been working on! ESMC/ESMFold2 show that protein language modeling learns the principles of protein biology and can be used for state-of-the-art structure prediction and design. We also built an interactive atlas of over 6.8 billion proteins!
