Shengyu Feng @ShawnSYFeng - Twitter Profile

Shengyu Feng @ShawnSYFeng

5 days ago

@lltjuatja Congrats🚀!

0

1

0

131

ShawnSYFeng retweeted

Lindia Tjuatja ✈️ ACL @lltjuatja

5 days ago

In case you’ve been wondering what I’ve been up to these days… So excited to (re)join an amazing community of linguists and NLP researchers at UT :)

43

290

22

28

42K

ShawnSYFeng retweeted

Frank Xu @frankxu2004

21 days ago

Excited to share as many details as we can on what we @MicrosoftAI have been working on. Building an LLM from scratch is an awesome journey, with pain and suffering battling unknowns but also many cool moments seeing it (somehow) work out at every stage! https://t.co/WTRRRRwUGu

9

138

15

32

9K

Shengyu Feng @ShawnSYFeng

21 days ago

Thanks for the question! We didn’t try masked diffusion in our work, as it is not standard for combinatorial optimization, but it would be meaningful to try for diffusion language models in the future. Mode collapse did happen a lot, but it also depends heavily on the neural architecture (for example, normalization helps alleviate it a lot). One technique, from my prior work, is Regularized Langevin Dynamics, https://t.co/qoPZWSraT4, where forcing a certain level of exploration can help the model escape a local optimum.

0

2

0

52

Shengyu Feng @ShawnSYFeng

22 days ago

Introducing Combinatorial Adjoint Matching (CAM)🚀, a paradigm shift from Reinforcement Learning to Adjoint-Method for unsupervised discrete diffusion models!

Highlight🌟: training signals from

- a single trajectory
- the terminal gradient

No labels, no RL, no dense rewards.

4

82

16

77

7K

ShawnSYFeng retweeted

Seungone Kim

@seungonekim

22 days ago

🇰🇷Despite rapid progress in AI agent research, Korean agentic benchmarks remain largely absent!

To narrow this gap, we release K-BrowseComp, a benchmark that requires searching across Korean websites and Korean-language content.

https://t.co/kuHby48uif

5

110

27

31

19K

Shengyu Feng @ShawnSYFeng

22 days ago

Thanks for the question!

In fact, our original objective is already defined as an integral. Here c_t is a cost measuring the instantaneous KL divergence to a reference process. The objective I showed before is for practical implementation only.

For the navigation task, I would replace the second line with the direction that reduces the geodesic distance to the locally improved state at the terminal. The key spirit should be exactly the same!

0

148

Shengyu Feng @ShawnSYFeng

22 days ago

CAM is accepted to ICML26, see you in Seoul!

0

292

Shengyu Feng @ShawnSYFeng

22 days ago

Paper: https://t.co/x7LufQVFyL Code: https://t.co/ytm8qE6yHT

1

3

0

2

291

ShawnSYFeng retweeted

Yizhe Zhang @YizheZhangNLP

about 1 month ago

Looped Transformers, Diffusion Models, and Latent Reasoning all share a powerful core mechanism: global iterative refinement. But how do we make LTs lightweight to achieve performance AND high efficiency? Enter Linear Attention. Check out our latest work led by @ChunyuanDeng!

0

107

12

72

9K

ShawnSYFeng retweeted

Martin Ziqiao Ma

@ziqiao_ma

about 1 month ago · Ann Arbor

PhDone :)

1

182

3

2

14K

ShawnSYFeng retweeted

Weiwei Sun @sunweiwei12

about 2 months ago

🪭Excited to share that Context Folding has been accepted to #ICML2026! Congrats to all collaborators! https://t.co/ohVoMQQ0el

3

73

12

28

8K

ShawnSYFeng retweeted

Weihua Du

@StigLidu

about 2 months ago

Excited to introduce AdaExplore 🚀✨

AdaExplore teaches LLM agents to improve GPU kernel generation by learning from past execution failures (Adapt Stage) and searching over diverse optimization paths (Explore Stage).

With GPT-5-mini as the base model, AdaExplore achieves 3.12×/1.72× speedups on KernelBench Level-2/Level-3 within 100 evaluation steps ⚡ and outperforms existing baselines such as OpenEvolve.

Project Page & Demo: https://t.co/cGoUkg5JnV
Arxiv: https://t.co/CpyvPgFBC8
Code: https://t.co/dZFayAk3EY

More in the thread 👇

4

92

28

60

49K

ShawnSYFeng retweeted

Shanda Li 黎善达

@Shanda_Li_2000

about 2 months ago

New paper: Spend Less, Fit Better

Fitting scaling laws for LLMs can cost millions💰, but what if you can get the same insights with just ~10% of the budget? We frame scaling-law fitting as budget-aware experimental design and propose a method to pick the most valuable runs. #LLM

2

28

6

15

24K

ShawnSYFeng retweeted

Xuhui Zhou

@nlpxuhui

3 months ago

Creating user simulators is a key to evaluating and training models for user-facing agentic applications. But are stronger LLMs better user simulators?

TL;DR: not really.

We ran the largest sim2real study for AI agents to date: 31 LLM simulators vs. 451 real humans across 165 tasks.

Here's what we found (co-lead with @sunweiwei12).

8

286

67

194

33K

ShawnSYFeng retweeted

Sean Welleck

@wellecks

4 months ago

Excited to announce our workshop on flow-based generative models at CMU:

Frontiers of Flows for Generative AI
March 26-27, Pittsburgh PA

https://t.co/U52Mx5vIYf

We have an amazing lineup of featured talks, panel discussions, and lightning talks. Registration is now open!

4

159

25

95

28K
