Yuhui Ding @yuhui_ding - Twitter Profile

Pinned Tweet

12 months ago

Is equivariance necessary for a good 3D molecule generative model? Check out our #icml2025 paper, which closes the performance gap between non-equivariant and equivariant diffusion models via rotational alignment, while also being more efficient (1/7): https://t.co/kFiZptFwsr

4

41

8

12

6K

Yuhui Ding @yuhui_ding

4 months ago

@zhuci19 You might also be interested in our previous work, which explored something similar :). We used an autoencoder to learn rotations that align 3D molecules, which we found improved non-equivariant DMs: https://t.co/kFiZptEYCT

0

1

0

66

yuhui_ding retweeted

Dimitri von Rütte

@dvruette

6 months ago

🚨 NEW PAPER! (this is a big one; 3B and 10B models included) Masked diffusion LLMs are getting a lot of attention. They outperform other diffusion types (such as uniform diffusion) at small scales. But what if I told you that uniform diffusion actually scales better? 🧵👇

8

218

42

146

56K

Yuhui Ding @yuhui_ding

6 months ago

@dvruette According to the trend, uniform (hybrid) diffusion will eventually surpass mask-only diffusion. Where do you predict the crossover point will be?

1

3

0

432

yuhui_ding retweeted

Dimitri von Rütte

@dvruette

11 months ago

I'll be at #ICML2025 in Vancouver next week presenting our GIDD (Generalized Interpolating Discrete Diffusion) paper together with @yuhui_ding and @orvieto_antonio!

Come chat to us at:
📅 Tue, July 15, 11:00–13:30
📍 Poster Session 1 East https://t.co/mVe8iGEkOm

1

10

1

0

818

Yuhui Ding @yuhui_ding

12 months ago

@dvruette Good point! I believe large non-equivariant models still have great potential, especially on larger biochemical datasets where equivariant models face computational bottlenecks. Indeed, we found that the gap between "not aligned" and "aligned" became smaller with larger model

0

4

0

232

Yuhui Ding @yuhui_ding

12 months ago

We qualitatively visualize the learned rotations and find that molecules tend to arrange common structural semantics (e.g., rings) into similar orientations after alignment. (7/7) https://t.co/Sib5JDVTM2

0

1

0

380

Yuhui Ding @yuhui_ding

12 months ago

Non-equivariant architectures significantly improve the sampling efficiency even at much larger model sizes: (6/7)

1

0

636

yuhui_ding retweeted

Weronika Ormaniec @wormaniec

about 1 year ago

Ever wondered how the loss landscape of Transformers differs from that of other architectures? Or which Transformer components make its loss landscape unique?

With @unregularized & @f_dangel, we explore this via the Hessian in our #ICLR2025 spotlight paper!

Key insights👇 1/8 https://t.co/0Hb06w9Kfb

1

26

8

10

3K

Yuhui Ding @yuhui_ding

over 1 year ago

Great work led by @dvruette!

Dimitri von Rütte

@dvruette

over 1 year ago

🚨 NEW PAPER DROP! Wouldn't it be nice if LLMs could spot and correct their own mistakes? And what if we could do so directly from pre-training, without any SFT or RL? We present a new class of discrete diffusion models, called GIDD, that are able to do just that: 🧵1/12

21

1K

153

926

144K

0

1

0

314

Yuhui Ding @yuhui_ding

almost 2 years ago

Tuesday 1:30pm-3pm, Hall C 4-9 #515. Drop by our poster if you are interested in SSMs for graphs👇! Code: https://t.co/IqtA81vhnl

4

11

3

1

2K

yuhui_ding retweeted

Google DeepMind @GoogleDeepMind

about 2 years ago

Announcing AlphaFold 3: our state-of-the-art AI model for predicting the structure and interactions of all life’s molecules. 🧬 Here’s how we built it with @IsomorphicLabs and what it means for biology. 🧵 https://t.co/gjw6Ip4F2M

206

6K

2K

1K

1M

Yuhui Ding @yuhui_ding

about 2 years ago

Excited that our paper on SSMs for graphs has been accepted at #ICML2024! See you in Vienna!

Yuhui Ding @yuhui_ding

over 2 years ago

Inspired by recent breakthroughs in SSMs, we propose a new architecture, Graph Recurrent Encoding by Distance (GRED), for long-range graph representation learning: https://t.co/59QoZsqefI with @orvieto_antonio, @bobby_he and Thomas Hofmann (1/4)

4

144

17

89

58K

0

10

0

2K

yuhui_ding retweeted

Gregor Bachmann @GregorBachmann1

over 2 years ago

I’ll be presenting "Scaling MLPs" at #NeurIPS2023, tomorrow (Wed) at 10:45am!
Hyped to discuss things like inductive bias, the bitter lesson, compute-optimality and scaling laws 👷⚖️📈 https://t.co/MoFfaUF3kH

1

58

12

5

4K

Yuhui Ding @yuhui_ding

over 2 years ago

@Genemmie @orvieto_antonio @bobby_he For datasets with edge attributes (like ZINC), we simply add edge attributes to the features of one-hop neighbors.

0

1

0

39

Yuhui Ding @yuhui_ding

over 2 years ago

Inspired by recent breakthroughs in SSMs, we propose a new architecture, Graph Recurrent Encoding by Distance (GRED), for long-range graph representation learning: https://t.co/59QoZsqefI with @orvieto_antonio, @bobby_he and Thomas Hofmann (1/4)

4

144

17

89

58K

Yuhui Ding @yuhui_ding

over 2 years ago

@Genemmie @orvieto_antonio @bobby_he Thank you for your interest! Since the invariant neural network (AGG) can express an injective function on multisets, our model can encode the number of neighbors (thus the number of edges) for each node

1

0

99
