Aditi Krishnapriyan @ask1729 - Twitter Profile

ask1729 retweeted

2 months ago

Code for our new world model planner is live! https://t.co/PnCk0OzqTl Includes our implementation on dino-wm, as well as implementations on jepa-wm and le-wm, and minimal pseudocode for anyone to re-implement themselves.

6

144

21

119

43K

Aditi Krishnapriyan

@ask1729

6 months ago

3/ This project was led by Yiheng Du, with: Paper: https://t.co/3ElmBykzGN Code: https://t.co/820nBeW9tk

0

7

1

0

567

Aditi Krishnapriyan

@ask1729

6 months ago

1/ How can we improve our models of the physical world? We develop EddyFormer: integrating spectral methods w/ the Transformer architecture. We can accelerate 3D turbulence simulation by up to 30x compared to top numerical solvers, at the same level of accuracy! At #NeurIPS2025 on Dec 4 at 4:30 PM, Exhibit Hall C, D, E: #2316 Project page: https://t.co/J0phE7QG0P

7

209

39

88

19K

Aditi Krishnapriyan

@ask1729

6 months ago

2/ EddyFormer can resolve solutions to fluid dynamics problems that other ML models fail to converge on (taken from the Well). It can also generalize to larger spatiotemporal domains than what it was trained on. https://t.co/4zE0rKkRZd

1

8

1

2

772

Aditi Krishnapriyan

@ask1729

6 months ago

This was implied in noting “chasing small improvements / state-of-the-art here is scientifically meaningless.” QM9, etc. were great for the field. ML has changed a lot since then and we should be moving with the field accordingly to new problems and datasets.

0

3

0

92

Aditi Krishnapriyan

@ask1729

7 months ago

At this point the AI for Science community should stop focusing on achieving "state-of-the-art" on datasets like QM9 & MD17: chasing small improvements on these outdated datasets is scientifically meaningless. It's like telling vision researchers to ditch internet-scale and go back to benchmarking on MNIST/CIFAR10.

6

127

20

19

24K

Aditi Krishnapriyan

@ask1729

7 months ago

(Inspired by ongoing thoughts about the state of the field, including discussions w/ @Andrew_S_Rosen, @SamMBlau, @bwood_m + many others)

0

6

0

2

2K

Aditi Krishnapriyan

@ask1729

8 months ago

We ran experiments with different model sizes and numbers of epochs, fit a power-law curve to the results, and extrapolated the lines. No quadratic fitting was done explicitly; the extrapolation just happened to produce a quadratic shape. We can make this clearer. Let us know if you have any other questions!

0

41
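As a rough illustration of the power-law fit and extrapolation described in the reply above, here is a minimal sketch. The thread does not say how the fit was performed, so the log-log least-squares approach, the function names, and the synthetic data below are all assumptions made for illustration, not the authors' code or results.

```python
import numpy as np


def fit_power_law(x, y):
    """Fit y = a * x**b by linear least squares in log-log space.

    Assumed approach for illustration; returns the pair (a, b).
    """
    slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)
    return float(np.exp(intercept)), float(slope)


def extrapolate(a, b, x_new):
    """Evaluate the fitted power law at new x values."""
    return a * np.asarray(x_new, dtype=float) ** b


# Hypothetical, synthetic measurements (not data from the paper):
# loss falls as a power law in model size.
model_sizes = np.array([1e6, 1e7, 1e8, 1e9])
losses = 5.0 * model_sizes ** -0.2

a, b = fit_power_law(model_sizes, losses)
predicted = extrapolate(a, b, [1e10])  # extend the trend past the measured range
```

Fitting in log-log space turns the power law into a straight line, which is why extrapolated points can be read off a trend without running every experiment.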

Aditi Krishnapriyan

@ask1729

8 months ago

1/ Can molecular AI move past hard-coded Graph Neural Networks and embrace scalable Transformers that discover molecular structure on their own? We demonstrate that you can train a 1B parameter Transformer model without any graph priors or physical inductive biases. And surprisingly, not only can it maintain competitive performance under equal compute on the Open Molecules 2025 dataset… it’s faster than a 6M parameter equivariant GNN, and exhibits scaling laws that don’t saturate. We use this as a starting point to investigate emergent internal representations, and find that it adaptively discovers molecular structure! Check out the interactive demo on our website: https://t.co/fxgmjAHirU And our paper: https://t.co/diCRh5ywtF In collaboration with @tobykreiman, @YutongBAI1002, Fadi, Elizabeth, and @EricQuCal. Here’s a video showing how the Transformer learns distance-aware attention patterns (purple gradient) that adapt to atomic environments 👇

6

174

32

90

39K

Aditi Krishnapriyan

@ask1729

8 months ago

@atAndreasBurger That's right: the round datapoints are estimated from the trends (given compute budgets, we couldn't run all the experiments for each curve)

1

0

28

Aditi Krishnapriyan

@ask1729

8 months ago

Thanks for sending, and apologies that we forgot to cite (will add). This work also goes in the category that we discuss in our paper of GNNs that incorporate an attention-based mechanism: it's still operating on a predefined graph, and a new graph is still being constructed for every input using a radius cutoff (and then doing message passing). In this case study, we're using a completely unmodified Transformer with no graph-based features https://t.co/oNa20yMVOM

0

1

0

178
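To make the contrast in the reply above concrete, here is a minimal sketch of building a graph with a radius cutoff, the per-input step that graph-based models perform before message passing. The function name, the dense NumPy approach, and the toy coordinates are assumptions for illustration; they are not taken from any of the papers discussed. An unmodified Transformer skips this step entirely.

```python
import numpy as np


def radius_graph(positions, cutoff):
    """Return directed edges (i, j), i != j, with ||r_i - r_j|| <= cutoff.

    Output is an integer array of shape (2, num_edges).
    """
    positions = np.asarray(positions, dtype=float)
    # Pairwise displacement vectors, shape (n, n, 3).
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Keep pairs inside the cutoff and drop self-loops.
    mask = (dist <= cutoff) & ~np.eye(len(positions), dtype=bool)
    src, dst = np.nonzero(mask)
    return np.stack([src, dst], axis=0)


# Hypothetical toy coordinates: two nearby atoms and one far away.
coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
edges = radius_graph(coords, cutoff=1.5)
```

With this cutoff only the first two atoms are connected, so the third atom receives no messages; changing the cutoff changes the graph, which is the hard-coded choice the thread argues a plain Transformer can avoid.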

ask1729 retweeted

Yutong Bai @YutongBAI1002

8 months ago

I value approaches that work by subtraction: stripping away the unnecessary until the essential insight remains. Doing less often demands more: a deeper understanding of data and method to reveal the simplest formulation at the core. This is an attempt of ours. Hope it resonates.

1

88

8

39

17K

Aditi Krishnapriyan

@ask1729

8 months ago

@erikjbekkers Excited to see it 😃! The beauty of this approach (fully unmodified Transformer) is we can fairly easily scale to 1B+ params, and it's clear that the improvement trend is going to predictably continue to hold. So far, with constraints, it seems one reaches saturation much sooner

1

5

0

483

Aditi Krishnapriyan

@ask1729

8 months ago

@SamMBlau Besides the obvious new capabilities, there are so many fun things that we can explore with OMol25 now: it's exciting to think about all the things we might discover on both the ML and science side (and I think it's just getting started) 😀

0

2

0

137

Aditi Krishnapriyan

@ask1729

8 months ago

@Andrew_S_Rosen Amazing throwback! Honored to be in such company 😃

0

2

0

67

Aditi Krishnapriyan

@ask1729

8 months ago

6/ Our results demonstrate that many favorable properties of GNNs can emerge adaptively and more flexibly in Transformers, challenging the necessity of hard-coded graph inductive biases and pointing toward standardized, scalable architectures for molecular modeling. This has been a hot topic in the community, and we hope that this adds more to the discussion!

1

9

0

1

895

Aditi Krishnapriyan

@ask1729

8 months ago

5/ We really do mean an unmodified Transformer: no explicit calculation of pairwise distances, no graph-based features, no rotational equivariance, etc. Leveraging modern software and hardware, a 1B parameter Transformer trains and runs inference faster than a 6M parameter equivariant GNN.

1

11

0

1

1K

Aditi Krishnapriyan

@ask1729
