CEO @Ligandal 🧬: Tissue-specific targeting with AI + peptides (LigandForge). On a mission to enable precision personalized predictive medicine. Forbes 30U30.
Extremely excited to announce LigandForge 🧬⚡
Generate high-quality peptides at over 10,000x, and up to 1,000,000x, the speed of state-of-the-art methods like BindCraft and BoltzGen. Predict binding affinity with 83% correlation to experimental binding data. 150 protein targets benchmarked.
Making insane progress
Finally got Fable to tell me that the mitochondria is the powerhouse of the cell
Expect to solve all biology this time next month
How do @adaptyvbio Twin-Strep + split-GFP tags impact predicted binding of short (21 aa avg) peptides designed with optimal linker placement (N vs. C)?
ESMFold2 appears to be able to recover decent ipSAE scores with more seeds on a concave surface of CD8 dimers.
@RolandDunbrack Legend!
We use ipSAE scoring as our primary Boltz-2 metric alongside DeltaForge dG/Kd predictions on ligandai. You can also get ipSAE and other scores through our SDK. Grateful you created this scoring approach.
Model ensemble validations are now available via LigandAI SDKs. Compare Boltz-2, ESMFold2, OpenFold, and Protenix-2 along with all relevant scorings on your peptides and binders.
@GregPreibisch @proteinrosh We are saying the same thing, and DeltaForge post-folding IS calibrated with 5-fold OOD cross-validation for both dG prediction and binary classification.
@proteinrosh It doesn't seem to be an issue with cutoff. ESMFold2 works fine on 198 native peptides. It outperforms Boltz-2 across the board both for ESM-heldout and ESM+Boltz-heldout complexes. However, for LigandForge-generated peptides (no X-ray PDBs), the two models diverge significantly.
@ebetica @proteinrosh Here is a comparison with and without physics-based potentials. The thicker white line represents mean values for OFF (left) vs. ON (right). Also shown are aggregate stats.
C is NOT validated experimentally, to be clear. Our goal is to create reliable workflows for computational predictions, calibrated on as much experimental evidence as possible, to maximize hit rates. Addressing model divergence and reliability is a critical component for our and other people's production workflows. This is a work in progress. LigandForge generates up to ~1000 seqs per second and our aim here is to have a production workflow on https://t.co/S400YiKjuT that has the most reliable folding methodologies as oracles, before you click "synthesize" for BLI validation.
@GregPreibisch Yes, it compares 198 native experimentally validated crystallographic sequences with various training holdouts against Boltz-2 and ESMFold2, vs. LigandForge-generated peptides.
https://t.co/KQh4CUIeLq
@GregPreibisch @proteinrosh Well, yes, experimental success rate is the ultimate goal for everyone given unlimited $$$. However, if you can’t discriminate false negatives, or models disagree, that is important for the computational bio community to tease out so that our folding models get more predictive.
There’s a reason that DNA synthesizers are export-controlled. However, the most concerning applications and sequences are, ironically, not the most obvious. A sensible policy is to screen the users and orgs, more so than the specific sequences being made, as far as threat vectors go.