Josh Rackers

@JoshRackers

Always Dada, sometimes a scientist, occasionally funny.

Albuquerque, NM

Joined July 2020

569 Following

1K Followers

1.4K Posts

Josh Rackers @JoshRackers

11 months ago

https://t.co/HT67kgRO7X very nice @nc_frey

385

JoshRackers retweeted

Nathan C. Frey

@nc_frey

about 1 year ago

Introducing Open Molecules 25, a foundational quantum chemistry dataset including >100M DFT calculations across 83M unique molecules, built with 6B core hours of compute! What does this mean for drug discovery, biology, and BioML? 1/

https://t.co/XhTqWZgWMA

389

258

64K

JoshRackers retweeted

Muhammed Shuaibi @mshuaibii

about 1 year ago

Excited to share our latest releases to the FAIR Chemistry’s family of open datasets and models: OMol25 and UMA! @AIatMeta @OpenCatalyst OMol25: https://t.co/UhIDiFFzzN UMA: https://t.co/eQ2wPIxbxY Blog: https://t.co/VFaeynyZN7 Demo: https://t.co/Dj29ZfhBRO

157

32K

JoshRackers retweeted

Sam Blau @SamMBlau

about 1 year ago

The Open Molecules 2025 dataset is out! With >100M gold-standard ωB97M-V/def2-TZVPD calcs of biomolecules, electrolytes, metal complexes, and small molecules, OMol is by far the largest, most diverse, and highest quality molecular DFT dataset for training MLIPs ever made 1/N

https://t.co/61vUm0xSw5

424

103

242

44K

Josh Rackers @JoshRackers

about 1 year ago

Lots of credit to @nc_frey @bwood_m and, most importantly, Daniel Levine for a heroic effort on the biomolecules portion of the dataset

338

Josh Rackers @JoshRackers

about 1 year ago

A new flagship quantum chemistry dataset is out! OMol25 contains more than 100 million DFT calculations across an expanse of chemical space. Collaborating with the FAIR team on this has been super fun! https://t.co/kPJHqPdvwH

144

JoshRackers retweeted

Andrew White 🐦‍⬛

@andrewwhite01

about 1 year ago

In cheminformatics there is a method called STONED SELFIES. I think it's the most edgy method name in science. Wonder if there are any more like it

JoshRackers retweeted

Theofanis Karaletsos

@Tkaraletsos

over 1 year ago

Announcing https://t.co/LWMnRVg9uP Achira will usher in the next phase of AI for drug discovery building atomistic foundation models for biomolecular simulation to harness the explosive growth of available computation and the frontiers of physics-based synthetic data generation. Our models combine learning accurate AI-representations of physics with simulation, and embrace the paradigm of using inference-time computation to generalize beyond training. Achira’s models will rival experimental accuracy with unprecedented experimental data efficiency, and help us turn drug discovery into engineering. Excited to be part of the journey with my long-time collaborator @jchodera , @zavaindar and this dream team.

206

26K

Josh Rackers @JoshRackers

over 1 year ago

@nc_frey working on NNPs together was amazing! I will miss the regular excuses to hang out and talk big ideas.

580

Josh Rackers @JoshRackers

over 1 year ago

@pedroopinheiro working on VoxMol was a lot of fun! I can't wait to see where that goes!

605

Josh Rackers @JoshRackers

over 1 year ago

@VishnuSresht has been the most supportive manager you could imagine! Go work for him if you get the chance!

377

Josh Rackers @JoshRackers

over 1 year ago

It's bittersweet to be leaving Prescient Design, but I wanted to shout out the amazing people there that made the last two years so fun.

Josh Rackers @JoshRackers

over 1 year ago

@bodhivani is an amazing scientist, and working on JAMUN together with her has been a blast!

371

Josh Rackers @JoshRackers

over 1 year ago

https://t.co/ohtgdozs58

453

Josh Rackers @JoshRackers

over 1 year ago

Some news: I'm joining the founding team at Achira! We are building physics into AI to shape the future of drug discovery. If you're interested or just want to know more, hit me up! https://t.co/JjbMjXlo1R

135

10K

Josh Rackers @JoshRackers

over 1 year ago

finally moved to the other site

331

JoshRackers retweeted

Xiang Fu

@xiangfu_ml

over 1 year ago

Presenting this work at NeurIPS tomorrow morning! I will be at NeurIPS from 12/11 to 12/14, let me know if you’d like to chat about AI for electronic structures, molecular dynamics, materials design, or the FAIR chemistry team!

JoshRackers retweeted

Taco Cohen

@TacoCohen

over 1 year ago

Does equivariance matter at scale? ... When the twitter discourse gets so tiring that you actually go out and collect EVIDENCE :D
There has been a lot of discussion over the years about whether one should build symmetries into your architecture to get better data efficiency, or if it's better to just do data augmentation and learn the symmetries. In my own experiments (and in other papers that have looked at this), equivariance always outperformed data augmentation by a large margin (in problems with exact symmetries), and data augmentation never managed to accurately learn the symmetries. That is perhaps not surprising, given that in typical setups the number of epochs is limited and so each data point is only augmented a few times.
Still, many "scale is all you need" folks believe that one should prefer data augmentation (or no bias at all) because eventually, with enough compute / data scale, the more general and scalable method will win (The Bitter Lesson).
However, is data augmentation really more scalable? Scalability: how fast the method improves with data and compute scale, and for how long it keeps improving. This is exactly what equivariant nets are good at! We use transformers not N-grams for language, because they are more data efficient / scalable / better adapted to that problem domain. Paraphrasing Ilya Sutskever: scale is not all you need; it matters what you scale.
In this latest work we decided to study the scaling behavior of equivariant networks empirically. As Johann explains in the thread below, we confirmed that equivariant networks are more data efficient. Interestingly, we were also able to confirm the intuition that in principle, the network should be able to learn the symmetry as well! When data augmentation is applied at sufficient scale, you get the same sample efficiency benefits as equivariance.
HOWEVER: you need to do a huge number of epochs (which people don't do in practice), making equivariant networks more efficient / scalable in terms of training compute. So equivariant networks allow you to get the statistical benefits without paying the computational cost.
The takeaway for me is that if you are working on a problem with exact symmetries, and are working on it because it is intrinsically important (climate, materials science / chemistry, molecular biology, etc.) rather than as a stepping stone to a more general problem (where the inductive bias could fail), then equivariant nets are still a good candidate in the age of scaling laws.
Awesome work @johannbrehmer @pimdehaan Sönke Behrends!

342

137

44K

JoshRackers retweeted

Kresten Lindorff-Larsen @LindorffLarsen

over 1 year ago

Fourier transform of Google Trends of Fourier transform

610

564

325K

JoshRackers retweeted

Biology+AI Daily @BiologyAIDaily

over 1 year ago

JAMUN: Transferable Molecular Conformational Ensemble Generation with Walk-Jump Sampling @PrescientDesign
• JAMUN introduces a generative model based on Walk-Jump Sampling (WJS) to efficiently generate molecular conformational ensembles, outperforming traditional molecular dynamics (MD) in speed.
• The key innovation lies in using WJS with SE(3)-equivariant denoising networks, preserving the physical priors from MD data while improving transferability to unseen peptides.
• JAMUN’s approach addresses the limitations of short time-steps in MD by sampling from a smoothed manifold, offering faster convergence to stable conformations.
• Unlike other ML methods, JAMUN demonstrates transferability, generating stable conformations for peptides outside its training set with remarkable fidelity.
• Ablation studies highlight that JAMUN generates conformations at a fraction of the time required for MD, providing orders of magnitude speed improvements in ensemble generation.
• With applications in drug discovery, JAMUN offers new opportunities to explore cryptic pockets and antibody design through rapid generation of conformational ensembles.
@JoshRackers @bodhivani @BigAmeya 📜Paper: https://t.co/NzIIq7AZ5v

11K
