Ziteng Sun @SZiteng - Twitter Profile

Ziteng Sun

@SZiteng

3 days ago

@thegautamkamath Congratulations, Gautam! Well deserved!

0

1

0

227

SZiteng retweeted

Jeff Dean

@JeffDean

3 months ago

⚡ Excited to announce Gemini 3.1 Flash-Lite! We’ve set a new standard for efficiency and capability to give developers our fastest, most cost-effective Gemini 3 model yet.

We engineered this model with thinking levels, allowing it to handle high-volume queries instantly, while scaling up its reasoning for complex edge cases.

By the numbers:
⏱️ 2.5X faster time-to-first-token than 2.5 Flash while being significantly higher quality
📉 $0.25 per 1M input tokens
📊 1432 Elo on LMArena & 86.9% on GPQA Diamond

Thrilled to see what developers build with this kind of speed and quality at scale. Available now in Google AI Studio and Vertex AI.

https://t.co/Weal73Juh8

69

1K

121

160

118K

SZiteng retweeted

Mert Cemri

@mertcemri

3 months ago

Introducing SPECS (SPECulative test time Scaling), a test-time scaling (TTS) algorithm with a Pareto-frontier latency/accuracy trade-off.

Scaling test-time compute improves LLM reasoning but imposes a latency overhead. Prior work optimizes TTS accuracy as a function of FLOPs; we propose to further reduce latency by addressing the memory bottleneck of LLM inference through speculative drafts. See a breakdown of the method below. (1/n) 🧵 👇

5

107

24

58

22K

Ziteng Sun

@SZiteng

3 months ago

Our team at Google Research is hiring a summer student researcher. DM Asher if you are interested.

Asher Trockman

@ashertrockman

3 months ago

I'm hiring a student researcher to work on RL and RLM-flavored things. DM me if interested

32

563

27

488

126K

2

184

8

146

29K

Ziteng Sun

@SZiteng

6 months ago

I will be at NeurIPS from today to Dec. 7th. Excited to meet old and new friends at the conference. Happy to chat about anything related to LLM efficiency, RL, and differential privacy. #NeurIPS2025

At the Wednesday noon session (11 AM – 2 PM), I will be presenting our spotlight work: Private Set Union with Multiple Contributions (#1314), where we establish fundamental limits on the utility of discovering set unions privately and show how to bypass these limits by leveraging a prediction.

Joint work with awesome collaborators at Google Research: Travis Dick, Haim Kaplan, Alex Kulesza, Uri Stemmer, and @th33rtha.

0

7

0

1

515

SZiteng retweeted

Google DeepMind

@GoogleDeepMind

7 months ago

This is Gemini 3: our most intelligent model that helps you learn, build and plan anything. It comes with state-of-the-art reasoning capabilities, world-leading multimodal understanding, and enables new agentic coding experiences. 🧵

211

6K

1K

2M

SZiteng retweeted

Ahmad Beirami

@abeirami

10 months ago

The main ingredient that led to GRPO's performance leap is the calibration of the reward/value via multiple rollouts per prompt.

Let me elaborate on what I mean by that and a cheaper way of doing it offline. https://t.co/RdfLEZBNER

11

657

53

893

118K

SZiteng retweeted

Ahmad Beirami

@abeirami

11 months ago

Happening now at poster E-2804.

Come talk to us about why reward calibration is key to alignment and how to do RLHF for test-time scaling. https://t.co/W4YoKZXpln

1

20

2

0

4K

Ziteng Sun

@SZiteng

11 months ago

Paper link: https://t.co/N3uMtvDKLX.

0

1

0

98

Ziteng Sun

@SZiteng

11 months ago

[Today 11 am poster E-2804 #ICML2025] Inference-time compute has been instrumental to the recent development of LLMs. Can we align our model to better suit a given inference-time procedure? Come check out our poster and discuss with @ananthbshankar, @abeirami, @jacobeisenstein, and me.

1

14

3

4

1K

Ziteng Sun

@SZiteng

11 months ago

Check out this thread for a short intro: https://t.co/hKUszWt4V7

Ziteng Sun

@SZiteng

over 1 year ago

Inference-time procedures (e.g., Best-of-N, CoT) have been instrumental to the recent development of LLMs. The standard RLHF framework focuses only on improving the trained model, which creates a train/inference mismatch. Can we align our model to better suit a given inference-time procedure? We answer this affirmatively; check out the thread below.

5

257

49

240

67K

1

2

0

191

Ziteng Sun

@SZiteng

11 months ago

@BanghuaZ Congrats, Banghua!

1

0

720

Ziteng Sun

@SZiteng

12 months ago

@GaoZhaolin Nice work! We used multiple offline roll-outs for reward calibration when studying inference-aware RLHF, and we observed that it helped vanilla RLHF as well. Might be of interest. https://t.co/hKUszWsx5z

Ziteng Sun

@SZiteng

over 1 year ago

Inference-time procedures (e.g., Best-of-N, CoT) have been instrumental to the recent development of LLMs. The standard RLHF framework focuses only on improving the trained model, which creates a train/inference mismatch. Can we align our model to better suit a given inference-time procedure? We answer this affirmatively; check out the thread below.

5

257

49

240

67K

0

2

0

2

298

Ziteng Sun

@SZiteng

about 1 year ago

@abeirami Congratulations on all the amazing achievements. I am super grateful for the opportunity to be part of the journey and learn from you. Looking forward to your amazing achievements to come.

1

6

0

1

749

SZiteng retweeted

Nived Rajaraman

@Nived_Rajaraman

about 1 year ago

Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025!

📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models!

🗓️ Deadline: May 19, 2025