Rujun Han @hanrujun - Twitter Profile

HanRujun retweeted

9 days ago

If you are into OPD (on-policy distillation), please don't forget to check out our Speculative OPD (Speculative Knowledge Distillation). Idea: For each token, sample it from the student, but discard it if the teacher is unlikely to generate it (i.e., it is outside the teacher's top-K), and resample from the teacher. This dynamically switches between on-policy and supervised distillation.
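
The per-token rule described in this tweet can be sketched roughly as follows. This is a toy illustration only, not the authors' implementation: the function names (`speculative_sample`, `top_k_tokens`), the dictionary-based token distributions, and the choice to resample from the teacher's full distribution are all assumptions made for the example.

```python
import random
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy interface: a "model" maps a token prefix to a
# next-token distribution {token: probability}.
Dist = Dict[str, float]
Model = Callable[[List[str]], Dist]


def sample(dist: Dist, rng: random.Random) -> str:
    """Draw one token from a {token: probability} distribution."""
    tokens = list(dist)
    weights = [dist[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def top_k_tokens(dist: Dist, k: int) -> Set[str]:
    """Return the k most probable tokens under dist."""
    ranked = sorted(dist, key=lambda t: dist[t], reverse=True)
    return set(ranked[:k])


def speculative_sample(
    student: Model,
    teacher: Model,
    prefix: List[str],
    max_new_tokens: int,
    k: int,
    rng: random.Random,
    eos: str = "<eos>",
) -> Tuple[List[str], List[bool]]:
    """Generate tokens one by one; returns (tokens, replaced_flags).

    replaced_flags[i] is True when the i-th new token came from the
    teacher because the student's proposal was outside the teacher's top-k.
    """
    tokens = list(prefix)
    replaced: List[bool] = []
    for _ in range(max_new_tokens):
        proposal = sample(student(tokens), rng)   # on-policy proposal
        teacher_dist = teacher(tokens)
        if proposal in top_k_tokens(teacher_dist, k):
            tokens.append(proposal)               # keep the student token
            replaced.append(False)
        else:
            # Assumption: resample from the teacher's full distribution.
            tokens.append(sample(teacher_dist, rng))
            replaced.append(True)
        if tokens[-1] == eos:
            break
    return tokens, replaced
```

When every student proposal falls inside the teacher's top-k, the sequence is fully on-policy; when none do, every token comes from the teacher, which corresponds to the supervised end of the spectrum mentioned in the tweet.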

WendaXu2's tweet photo.

6

127

19

95

8K

HanRujun retweeted

Gaotang Li

@GaotangLi

27 days ago

🧩 How can we assign fine-grained credit over long tool-use trajectories and let agents learn from past attempts in agentic reinforcement learning when rewards are no longer verifiable? Excited to share RubricEM, an RL framework for long-form deep research agents that plan, search, use tools, and write reports without exact answer checks. 📖 Paper: https://t.co/t1tksq5g30 (1/n)

3

109

27

73

16K

Rujun Han @HanRujun

about 2 months ago

Thrilled to see our work TTD-DR (https://t.co/4bWp9XO4QE) empowers Gemini Deep Research Max to become the SOTA DR product!

Sundar Pichai

@sundarpichai

about 2 months ago

We are launching two powerful updates to Deep Research in the Gemini API, now with better quality, MCP support, and native chart/infographics generation. Use Deep Research when you want speed and efficiency, and use Max when you want the highest quality context gathering & synthesis using extended test-time compute — achieving 93.3% on DeepSearchQA and 54.6% on HLE.

231

5K

443

723

413K

0

7

0

1

477

Rujun Han @HanRujun

3 months ago

Attending #EACL2026 and will present our paper https://t.co/IgqBeVNsGl at Findings Poster Session 6 at 11am on March 27. Please come by and ask any questions if you are around, and happy to discuss internship and full-time opportunities at Google Cloud AI Research.

0

9

0

1

468

Rujun Han @HanRujun

5 months ago

Paper: https://t.co/IgqBeVO0vT Code/Data: https://t.co/GRS7Sfysys (details to be added soon!) Huge thanks to @brunchavecmoi and other co-authors (@anmourchen, @ZifengWang315, @IHung_Hsu, @jun_yannn, @eunsolc, @tomaspfister, @chl260) for the amazing work!

0

2

392

Rujun Han @HanRujun

5 months ago

🚀 New paper on synthetic data generation for deep search agents (accepted to #EACL2026 Findings)! We introduce SAGE, an agentic pipeline for generating high-quality, difficulty-controlled training data for deep search agents on a given corpus, using execution feedback.

1

26

2

17

2K

Rujun Han @HanRujun

5 months ago

Transfer: Even though SAGE generates data using a fixed corpus (Wikipedia), we show agents trained on it can adapt to Google Search at inference time without further training, though further research is needed in this direction.

1

0

331

HanRujun retweeted

Yihe Deng

@Yihe__Deng

5 months ago

Supervised RL accepted to ICLR 2026! Thanks to all my co-authors for the great work and especially @IHung_Hsu :)

3

281

15

123

27K

HanRujun retweeted

Tengxiao Liu

@TengxiaoLiu

7 months ago

🏧Giving your agent unlimited tool calls doesn't make it smarter. 💡Why? It lacks 'Budget Awareness'! Introducing Budget Tracker, a simple plug-in that enables more effective scaling behaviors: higher performance, lower cost. Paper: https://t.co/aKm2Tzt1wx

1

29

16

13

4K

HanRujun retweeted

I-Hung Hsu @IHung_Hsu

8 months ago

🧠🚀 Excited to introduce Supervised Reinforcement Learning—a framework that leverages expert trajectories to teach small LMs how to reason through hard problems without losing their minds. 🤯 Better than SFT && RLVR. Read more: https://t.co/taEL8Vk4X5 #llms #RL #reasoning

12

333

63

258

21K

HanRujun retweeted

Yossi Matias

@ymatias

9 months ago

New research introduces the Test-Time Diffusion Deep Researcher (TTD-DR) framework, advancing AI agents for complex research: https://t.co/R2PldcoZbO

0

5

4

2

1K

HanRujun retweeted

Cecile Tamura @ceciletamura

10 months ago

Featuring: Dr. @HanRujun of @Google in a deep dive w/ @ceciletamura of @ploutosai https://t.co/WSHqnW09gJ

1

5

2

1

672

HanRujun retweeted

Yumo Xu @yumo_xu

11 months ago

Excited to share our #ACL2025NLP paper, "𝐂𝐢𝐭𝐞𝐄𝐯𝐚𝐥: 𝐏𝐫𝐢𝐧𝐜𝐢𝐩𝐥𝐞-𝐃𝐫𝐢𝐯𝐞𝐧 𝐂𝐢𝐭𝐚𝐭𝐢𝐨𝐧 𝐄𝐯𝐚𝐥𝐮𝐚𝐭𝐢𝐨𝐧 𝐟𝐨𝐫 𝐒𝐨𝐮𝐫𝐜𝐞 𝐀𝐭𝐭𝐫𝐢𝐛𝐮𝐭𝐢𝐨𝐧"! 📜 If you’re working on RAG, Deep Research and Trustworthy AI, this is for you. Why? Citation quality is critical for trust, but current metrics are falling short. Let’s fix that! 🧵 [1/10]

2

46

8

19

5K

HanRujun retweeted

Shahriar Golchin @ShahriarGolchin

11 months ago

Can many-shot ICL be cached and still tailored per test sample? We make it possible. 💡 Excited to share that our paper, "Towards Compute-Optimal Many-Shot In-Context Learning," has been accepted to @COLM_conf! Paper: https://t.co/lBf4z5VCHN #COLM2025 #LLMs #AI #ICL

1

3

1

0

418

Rujun Han @HanRujun

11 months ago

We performed in-depth analysis to show that diversity of search directions and timely incorporation of retrieved information help research agents scale more effectively. Read our paper for more detailed analysis: https://t.co/tDXt8vxlsV

0

2

0

284

Rujun Han @HanRujun

11 months ago

Very excited to share the project I've been working on over the past several months! We proposed Deep Researcher with Test-Time Diffusion, a novel method to leverage iterative draft+revision to tackle complex questions demanding exhaustive search and reasoning.

3

29

10

14

7K

Rujun Han @HanRujun

11 months ago

Further enhanced by self-evolving algorithms, we achieved SOTA results over a variety of research benchmarks with broad coverage of industry domains such as finance, technology, biomedical and legal.

1

0

334
