Deqing Fu @DeqingFu - Twitter Profile

Excited to share that I've started my summer internship at SystemsResearch@Google in Sunnyvale, working on agentic environment generation! Always happy to chat about coding agents or LLM memory too. If you're around the Bay Area, would love to meet up.

DeqingFu retweeted

Tianyi Zhou

@tianyi_zhou12

15 days ago

Love this. The Fourier addition circuit isn't just for math. Llama reuses the same base-10 mechanism for cyclic concepts, then maps the sum back into concept space. Mechanism generalizes further than we showed in https://t.co/M9Q0hKNIEf. Congrats to the team! 🎉

Deqing Fu

@DeqingFu

15 days ago

@ShichengGLiu @GoogleResearch Of course. Let's goooooooo!

Deqing Fu

@DeqingFu

16 days ago

Excited to share that I’ve started at @GoogleResearch as a student researcher today. I'll be working on tabular foundation models. Come and chat if you are around at Google or in the Bay Area.

Deqing Fu

@DeqingFu

16 days ago

@EkdeepL @GoodfireAI Honestly feels way harder to me. Numbers have rich structures like orderings and cyclic groups under mod (Z/nZ). Higher-order cognitive concepts don’t have any obvious group or symmetry to start with. It’s not obvious how to decode the geometry. Curious if you have any intuitions.

DeqingFu retweeted

Deqing Fu

@DeqingFu

20 days ago

@GoodfireAI Cool work!!! It shows that Fourier features are also used for cyclic tasks beyond computing addition (something we studied in our NeurIPS ’24 paper, https://t.co/IOYEiCI0DO). In our recent preprint, we tried to understand how this behavior emerges (https://t.co/f7u7dsXVBV).

Deqing Fu

@DeqingFu

20 days ago

@EkdeepL @prompt_Tunes Yeah it’s great! I like this new work!

Deqing Fu

@DeqingFu

about 1 month ago

Glad to share that this paper is accepted to #ICML 2026 @icmlconf with an updated title "Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data". 🥳

Deqing Fu

@DeqingFu

7 months ago

Why do Transformers fail at algorithmic reasoning? We find it's not a lack of power, but a capacity mismatch. Our new preprint proves a tight, non-asymptotic bound: an L-layer model can only solve graph connectivity on graphs with a diameter up to exactly 3^L. https://t.co/JwKKl64709 🧵(1/N)

Deqing Fu

@DeqingFu

about 1 month ago

@qcyang20xx Wow, a forecasting hackathon — who could’ve forecasted that!

DeqingFu retweeted

Qingchuan (Tony) Yang

@qcyang20xx

about 1 month ago

EPSVec will see you at #ICML2026!!

DeqingFu retweeted

fly51fly @fly51fly

about 1 month ago

[CL] Convergent Evolution: How Different Language Models Learn Similar Number Representations
D Fu, T Zhou, M Belkin, V Sharan… [University of Southern California & UC San Diego] (2026)
https://t.co/zoXVMn8cYL https://t.co/hpFc2RSVA5

DeqingFu retweeted

Bhavya Vasudeva @bhavya_vasudeva

about 1 month ago

2⃣ On when and why Muon generalizes better than (S)GD
🕙 Poster session 6, 3:15pm-5:45pm
📍Pavilion 3 #809 https://t.co/ZV406xSX8w

DeqingFu retweeted

Bhavya Vasudeva @bhavya_vasudeva

about 1 month ago

I'm at #ICLR2026 and will present two posters on Day 3:

1⃣ On how Transformer-based language models disentangle and compose latent concepts for in-context learning
🕙 Poster session 5, 10:30am-12pm
📍Pavilion 4 #4008 https://t.co/IWQeddoOIt

Deqing Fu

@DeqingFu

about 1 month ago

Unfortunately I can't be at ICLR this year but my advisor Vatsal will present FoNE tomorrow #ICLR2026. 📍10:30 AM – 1:00 PM, Pavilion 3 P3-#918

Deqing Fu

@DeqingFu

about 1 month ago

> Where am I going? the use

Then the "actionable interpretability" question: how can a phenomenon lead to improvement? We show that representing numbers with Fourier features teaches models better numeracy (Zhou et al., ICLR 2026). https://t.co/CfnGTUJIQu

Deqing Fu

@DeqingFu

about 1 month ago

fantastic work! it's also great to see new multimodal models using our Zebra-CoT dataset for interleaved text and image generation.
speaking of which, Zebra-CoT will appear this Friday at @iclr_conf #ICLR2026 https://t.co/qTcSh4hwAk

Haoxing Chen @Chenhaoxing249

about 1 month ago

After two months of teamwork, we’re excited to share our team’s latest achievement — LLaDA2.0-Uni, InclusionAI’s first multimodal LLaDA.
A unified discrete diffusion LLM built for both understanding and generation across text and images.
Highlights:
● One paradigm for VQA, doc understanding, and image generation
● Efficient inference with a new decoding strategy + 8-step distilled decoder
● Interleaved text-image generation enabled by unified discrete representations
(SGLang support soon)
🤗 Hugging Face: https://t.co/bDiucSDEN7
📷 ModelScope: https://t.co/qnztdVyl7U

DeqingFu retweeted

Yuqing Yang @yyqcode

about 1 month ago

🧵 1/8
What should an LLM assistant remember across conversations?

Existing memory work studies this one task at a time. But real-world assistants see all kinds of conversations, and that changes the problem.

Introducing BEHEMOTH 🦣 + CluE 🌱: a benchmark & self-evolving method for heterogeneous memory extraction.

📄 Paper: https://t.co/szLIOdA4bm
