Excited to share PoET-2, our next breakthrough in protein language modeling. It represents a fundamental shift in how AI learns from evolutionary sequences. 🧵 1/13
De novo binder design is now available on our GUI! Run full binder workflows with integrated backbone generation (RFdiffusion, BoltzGen), sequence design (PoET-2, ProteinMPNN), and structure validation, bringing end-to-end binder design into a single interactive workflow.
We’ve been selected as a performer on DARPA's NODES program to extend our state-of-the-art PoET-2 foundation model to capture structural dynamics and predict protein function in fundamentally new ways. 1/6
We’re excited to announce our expanded partnership with Boehringer Ingelheim. Together, we are building the future of AI‑driven antibody discovery and optimization.
https://t.co/ifAPfYqOYr
Design protein binders in minutes, not months.
New on https://t.co/NGI2kaXi7Z
→ Improved protein design GUI & refolding metrics for candidate filtering
→ New structure design models (RFdiffusion, BoltzGen)
Nanobody + miniprotein design walkthroughs now live! Links in thread
New job openings @openprotein across protein foundation model research, computational protein design, and cloud platform engineering
https://t.co/4dHDlbXfVX
Understanding Protein Function with a Multimodal Retrieval-Augmented Foundation Model
1. PoET-2, a new protein language model, achieves state-of-the-art performance in predicting the effects of mutations on protein function, especially for challenging cases like insertions/deletions and higher-order mutations. This model combines sequence, structure, and evolutionary information in a novel way to improve protein understanding and design capabilities.
2. The model incorporates a hierarchical transformer encoder and dual decoders with both causal and masked language modeling objectives. This dual training approach allows PoET-2 to excel in both generative tasks (like sequence generation) and bidirectional representation learning, making it versatile for various protein-related tasks.
3. PoET-2 leverages retrieval augmentation, which enables it to learn from context and incorporate new sequences not present in the original training data. This feature enhances its ability to adapt to different protein families and their specific evolutionary constraints, leading to more accurate predictions.
4. In zero-shot variant effect prediction, PoET-2 outperforms previous models significantly, especially on datasets involving multiple mutations and indels. It also shows superior performance in supervised settings with limited data, demonstrating excellent data efficiency and generalization ability.
5. The model's architecture includes a structure-based attention bias mechanism, which integrates structural information into the attention operations. This enhances the model's ability to capture 3D structural relationships, contributing to its improved performance in tasks related to protein structure and function.
6. PoET-2 is compact, with only 182 million parameters. Despite its size, it matches or exceeds the performance of much larger models, which makes it practical for real-world protein engineering and design.
7. The authors demonstrate PoET-2's effectiveness across various benchmarks, including deep mutational scanning and clinical datasets. The model's ability to predict the fitness effects of mutations accurately can accelerate the development of new therapeutics and enhance our understanding of disease mechanisms.
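To make the zero-shot variant scoring in point 4 concrete, here is a minimal Python sketch. It is not PoET-2 and does not use its API: a tiny smoothed bigram model, fitted on a few made-up homologs, stands in for a retrieval-conditioned autoregressive model. All sequences, the function names (`fit_bigram`, `log_likelihood`, `variant_score`), and the pseudocount are assumptions for illustration. It shows the common log-likelihood-ratio score: a variant is scored by how much more or less likely it is than the wild type given the context sequences. Because the stand-in model scores whole sequences left to right, insertions and deletions need no special handling.

```python
import math
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
START, END = "^", "$"  # hypothetical boundary tokens for the toy model


def fit_bigram(homologs, pseudocount=1.0):
    """Fit a smoothed bigram model P(next | previous) on context sequences.

    This is a toy stand-in for a protein language model conditioned on
    retrieved homologs; it is NOT how PoET-2 works internally.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in homologs:
        tokens = START + seq + END
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1.0
    vocab = AMINO_ACIDS + END  # tokens that can follow any token

    def log_prob(prev, nxt):
        total = sum(counts[prev].values()) + pseudocount * len(vocab)
        return math.log((counts[prev][nxt] + pseudocount) / total)

    return log_prob


def log_likelihood(seq, log_prob):
    """Sum of left-to-right conditional log-probabilities for one sequence."""
    tokens = START + seq + END
    return sum(log_prob(p, n) for p, n in zip(tokens, tokens[1:]))


def variant_score(variant, wild_type, log_prob):
    """Log-likelihood ratio: positive means more likely than the wild type."""
    return log_likelihood(variant, log_prob) - log_likelihood(wild_type, log_prob)


# Made-up homologs acting as the retrieved context (assumption).
homologs = ["MKTAYIAK", "MKTAYLAK", "MKSAYIAK", "MRTAYIAK"]
wild_type = "MKTAYIAK"
log_prob = fit_bigram(homologs)

variants = {
    "I6L (seen in homologs)": "MKTAYLAK",
    "A4W (never seen)": "MKTWYIAK",
    "deletion of I6": "MKTAYAK",
}
for name, seq in variants.items():
    print(f"{name}: {variant_score(seq, wild_type, log_prob):+.3f}")
```

In this toy setup the substitution observed among the homologs scores higher than the unseen one. Scores for sequences of different lengths should be read with care, since every extra residue adds another negative log-probability term to the sum.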
📜Paper: https://t.co/uHmQAg0VVm
#ProteinEngineering #AIinBiology #MachineLearning #ProteinFunction #MutationPrediction
PoET-2 (and more) is available now on the @openprotein cloud.
We also open-sourced PoET-2 on GitHub: https://t.co/k8wK0bS1Pw
Thanks to @timt1630 and the rest of the @openprotein team!
If you're interested in the PLM used here, check out PoET-2 from @openprotein! It allows controllable protein generation with sequence (homologue) and optional structure conditioning and is available right now. 🦾
At @OpenAI, we believe that AI can accelerate science and drug discovery. An exciting example is our work with @RetroBiosciences, where a custom model designed improved variants of the Nobel Prize-winning Yamanaka proteins. Today we published a closer look at the breakthrough. ⬇️