Greg Preibisch, MD

@GregPreibisch

Engineering a virus-free future with AI-driven vaccines. Doctor of Medicine & ML | Cofounder @Deepflare

San Francisco, CA

Joined July 2025

159 Following

97 Followers

115 Posts

Greg Preibisch, MD

@GregPreibisch

5 days ago

Yeah, exactly. I have been a fan of protein language models since 2021, then realized that protein language itself is not that interpretable, but once you have structure, it starts to be. Design based 100% on sequence is much harder than just looking at folds, computing RMSDs, and analyzing sequences. Structure is the semantics of protein language models.

Greg Preibisch, MD

@GregPreibisch

5 days ago

@nanogenomic @proteinrosh Good to see a mindset like this! I’m glad this discussion was productive and led to additional analysis! The only thing I still miss is the expected hit rate above/below some ipSAE/PAE threshold, so the result is more interpretable. I guess that’s the next step.
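A minimal sketch of the "hit rate above/below a threshold" summary requested in the post above. The scores, the hit labels, and the 0.6 cutoff are made-up illustrative values, and the function name is hypothetical. The sketch assumes a metric where higher is better (as with ipSAE); for PAE-style metrics, where lower is better, the comparison would need to be flipped.

```python
from typing import List, Tuple


def hit_rate_by_threshold(
    scores: List[float], hits: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (hit rate for score >= threshold, hit rate for score < threshold)."""
    above = [h for s, h in zip(scores, hits) if s >= threshold]
    below = [h for s, h in zip(scores, hits) if s < threshold]

    def rate(xs: List[bool]) -> float:
        # NaN signals an empty bin rather than a 0% hit rate.
        return sum(xs) / len(xs) if xs else float("nan")

    return rate(above), rate(below)


# Illustrative data only: one score and one experimental outcome per design.
scores = [0.82, 0.75, 0.40, 0.66, 0.30, 0.91, 0.55, 0.20]
hits = [True, True, False, False, False, True, True, False]

rate_above, rate_below = hit_rate_by_threshold(scores, hits, threshold=0.6)
print(f"hit rate above: {rate_above:.2f}, below: {rate_below:.2f}")
```

Reporting both numbers side by side shows how much the threshold actually enriches for experimental hits, which is harder to read off a correlation coefficient alone.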

Greg Preibisch, MD

@GregPreibisch

6 days ago

Beware of this kind of reasoning. Biology is extremely humbling. I once analyzed 2 different structural models. One of them was overconfident - every design failed to express solubly. The other model had "mid" metrics, but everything worked perfectly. Everybody is gangsta on in silico metrics until they have to be validated in the lab.

Greg Preibisch, MD

@GregPreibisch

6 days ago

Hmm, here are sources that can be useful for validating an "oracle" for de novo design. https://t.co/6LhNoNIdIY https://t.co/l88PttbJ71 Both datasets are pretty easy to analyze and very valuable. It gets trickier when it comes to validating the efficacy of design campaigns, but that's a completely different debate.

Greg Preibisch, MD

@GregPreibisch

6 days ago

Most researchers/founders don't want to ship actual drugs. They want to show their method/platform is crushing it. Those problems are good model problems because many people know them or have tried to solve them, so it's easy to compare different methods. If somebody cares about shipping things, they apply existing methods to low-hanging fruit. To be fair, I think a lot of the protein design race is an irrational chase for new de novo methods, while we can already solve an enormous number of problems by applying existing methods to "low-hanging fruit".

Greg Preibisch, MD

@GregPreibisch

6 days ago

@nanogenomic Now it’s much clearer! That clearly explains the left and middle panels! Can you explain how exactly C was validated? I don’t clearly see the connection between the experimental result and the prediction.

Greg Preibisch, MD

@GregPreibisch

6 days ago

Well, I strongly disagree with this because it leads to false conclusions, as it’s not grounded in the data. You have plenty of data from campaigns such as protein database or the David Baker lab paper. You can just use this data. It is perfect data for this purpose - actually that’s how @TimothyPJenkins' team validated ipSAE as the successor of iPAE.

GregPreibisch retweeted

Woj Kulikowski

@wojventures

6 days ago

First in person demo of endgoal groupchats at our Warsaw engineering dinner last night

Greg Preibisch, MD

@GregPreibisch

6 days ago

@AyushmanMallick happy to see the progress, just dm me to brainstorm again whenever you need!

Greg Preibisch, MD

@GregPreibisch

6 days ago

Hmm, I think I would rather look at the correlation with experimental success rate than at purely in silico metrics. For some targets I’ve tested 2 different folding tools in an RFdiffusion -> MPNN -> folding pipeline. First - the in silico hit rate did not correlate with experimental success. One model was just hallucinating more often. Second - I think folding tool performance depends on the target. For example, ESMFold2 crushes Chai1 on influenza-like structures, but is worse at RSV. Before I run any campaign I always look for the optimal model and hyperparams for a given structure family. Actually, I prefer having a few models: one for screening, a second for final validation.

Greg Preibisch, MD

@GregPreibisch

7 days ago

That's really interesting, but what may be surprising - I think the value is bigger in protein design than in pure structure prediction. Every time I pick final candidates for testing, I bootstrap confidence intervals of structural metrics. I don't pick only the best candidates - I pick those with low variance of scores.
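A rough sketch of the bootstrapping idea described in the post above, not the author's actual pipeline. It assumes each candidate has several replicate scores for one structural metric (for example from different seeds or models), computes a percentile bootstrap confidence interval of the mean, and keeps only candidates whose interval is narrow. The candidate names, scores, and the `max_width` cutoff are hypothetical.

```python
import random
import statistics
from typing import Dict, List, Tuple


def bootstrap_ci(
    values: List[float], n_boot: int = 2000, alpha: float = 0.05, seed: int = 0
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return means[lo_idx], means[hi_idx]


def stable_candidates(
    replicates: Dict[str, List[float]], max_width: float
) -> List[Tuple[str, float, float, float]]:
    """Keep candidates with a narrow CI, sorted by mean score (best first)."""
    kept = []
    for name, scores in replicates.items():
        lo, hi = bootstrap_ci(scores)
        if hi - lo <= max_width:
            kept.append((name, statistics.fmean(scores), lo, hi))
    return sorted(kept, key=lambda row: row[1], reverse=True)


# Illustrative replicate scores per design (made-up numbers).
replicates = {
    "design_A": [0.80, 0.81, 0.79, 0.80, 0.80],
    "design_B": [0.95, 0.60, 0.99, 0.55, 0.97],  # high peak, unstable
    "design_C": [0.70, 0.71, 0.69, 0.70, 0.72],
}

selected = stable_candidates(replicates, max_width=0.05)
for name, mean, lo, hi in selected:
    print(f"{name}: mean={mean:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

In this toy data, design_B has the best single scores but is filtered out because its interval is wide, which mirrors the "low variance over best score" preference in the post.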

Greg Preibisch, MD

@GregPreibisch

8 days ago

@AlexanderKalian Biology is not a problem to be solved, but a reality to be experienced.

Greg Preibisch, MD

@GregPreibisch

8 days ago

@proteinrosh @TomSercu @sokrypton @biohub @AyushmanMallick Update - here is the comparison with Boltz-2 from @boltz_bio. Seems that's an issue of ESMFold2, not Boltz-2.

[Attached image: https://t.co/M5sPIuA1Sj]

Greg Preibisch, MD

@GregPreibisch

9 days ago

Hey guys (cc: @proteinrosh @TomSercu @sokrypton @biohub @AyushmanMallick) - to dig into this I used a protein I’ve worked on, RSV prefusion F, which has well-characterised flexible loops (the site-Ø epitope). Hypothesis: the model is generally well-calibrated, but overconfident specifically on intrinsically flexible regions. I measured each residue’s intrinsic flexibility as the Cα-lDDT between 3 different PDB structures (and between chains within each PDB), then folded the same sequences with ESMFold2 (with MSA) and compared pLDDT vs actual lDDT (vs the experimental structures) - calibration error on the y-axis, intrinsic flexibility on the x-axis. Result: pLDDT tracks true accuracy well overall (r≈0.83), but the more flexible the region, the more overconfident the model gets - overconfidence averages +7.6 lDDT pts on flexible residues vs +1.4 on rigid ones. This is easy to miss in standard evals: well-ordered residues vastly outnumber flexible ones, so aggregate calibration looks fine while the flexible-region overconfidence gets averaged away - a flexibility-class imbalance.
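The per-class comparison described in the post above can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's analysis code: the per-residue values are invented, the function name is hypothetical, and the `flex_cutoff` that splits residues into "flexible" and "rigid" is an assumption, since the post does not say how the classes were defined. Flexibility is taken as the Cα-lDDT between experimental structures on a 0-100 scale, so lower values mean more flexible.

```python
from statistics import fmean
from typing import Dict, List


def overconfidence_by_class(
    plddt: List[float],
    true_lddt: List[float],
    flexibility: List[float],
    flex_cutoff: float = 80.0,  # assumed cutoff, not from the source
) -> Dict[str, float]:
    """Mean signed error (pLDDT - true lDDT) for flexible vs rigid residues.

    Positive values mean the model is overconfident on that class.
    """
    gaps: Dict[str, List[float]] = {"flexible": [], "rigid": []}
    for p, t, f in zip(plddt, true_lddt, flexibility):
        label = "flexible" if f < flex_cutoff else "rigid"
        gaps[label].append(p - t)
    return {k: fmean(v) if v else float("nan") for k, v in gaps.items()}


# Made-up per-residue values: four rigid residues, two flexible ones.
plddt = [92, 90, 88, 91, 85, 80]
true_lddt = [91, 89, 87, 89, 76, 73]
flexibility = [98, 97, 95, 96, 60, 55]

per_class = overconfidence_by_class(plddt, true_lddt, flexibility)
overall = fmean(p - t for p, t in zip(plddt, true_lddt))

print(f"overall gap: {overall:+.2f}")
print(f"rigid gap: {per_class['rigid']:+.2f}")
print(f"flexible gap: {per_class['flexible']:+.2f}")
```

Because rigid residues outnumber flexible ones, the overall gap sits much closer to the rigid value, which is the class-imbalance effect the post points out.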

Tom Sercu

@TomSercu

9 days ago

@GregPreibisch @AyushmanMallick @sokrypton @GoogleDeepMind @chaidiscovery @biohub @boltz_bio @proteinrosh What is the diagnosis? Have an MWE repro you can easily share? We'll get to the bottom of this.

Greg Preibisch, MD

@GregPreibisch

9 days ago

@proteinrosh @TomSercu @sokrypton @biohub @AyushmanMallick @AyushmanMallick thanks for pointing out a valuable protein to benchmark on. The pattern is the same.

[Attached image: https://t.co/0ocmXJntmm]

Greg Preibisch, MD

@GregPreibisch

9 days ago

@proteinrosh @DdelAlamo Oh nice, good to see that works! I'm doing something similar, but each set of critics was trained on another modality (for example solubility, thermostability). The key problem is how to tune the ensemble weights as models tend to have different scales.
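One common way to handle critics with different output scales, sketched here as an assumption rather than as the approach used in this thread, is to z-score each critic's outputs across the candidate pool before taking a weighted sum. The critic names, scores, and weights below are invented for illustration, and choosing the weights themselves remains the open question raised in the post.

```python
from statistics import fmean, pstdev
from typing import Dict, List


def zscore(xs: List[float]) -> List[float]:
    """Standardize scores to zero mean and unit variance across candidates."""
    mu, sd = fmean(xs), pstdev(xs)
    return [(x - mu) / sd if sd > 0 else 0.0 for x in xs]


def ensemble_scores(
    critic_scores: Dict[str, List[float]], weights: Dict[str, float]
) -> List[float]:
    """Weighted sum of per-critic z-scores, one value per candidate."""
    normed = {name: zscore(vals) for name, vals in critic_scores.items()}
    n = len(next(iter(normed.values())))
    return [
        sum(weights[name] * normed[name][i] for name in normed)
        for i in range(n)
    ]


# Hypothetical critics on very different scales: a 0-1 solubility score
# and a melting temperature in degrees Celsius.
critic_scores = {
    "solubility": [0.2, 0.5, 0.9],
    "thermostability": [70.0, 55.0, 60.0],
}
weights = {"solubility": 0.5, "thermostability": 0.5}

combined = ensemble_scores(critic_scores, weights)
best = max(range(len(combined)), key=combined.__getitem__)
print(f"combined scores: {[round(c, 3) for c in combined]}, best index: {best}")
```

Without the normalization step, the thermostability critic would dominate the sum simply because its raw values are about two orders of magnitude larger.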

Greg Preibisch, MD

@GregPreibisch

9 days ago

@proteinrosh @DdelAlamo What do you mean by multiple critics? Different model weights / random seeds?

Greg Preibisch, MD

@GregPreibisch

9 days ago

@DdelAlamo Yes, there is, but it's getting deprioritized over other metrics. You can see that by reading early David Baker work and comparing it with RFdiffusion. Here is a nice paper comparing multiple different metrics, you should find SASA there https://t.co/h1rJAUEgHu
