Nadav Timor @NadavTimor - Twitter Profile

Pinned Tweet

Nadav Timor

@NadavTimor

11 months ago

Humbled that @TheRegister covered our oral presentation at #icml25! ⚡️

The Register @TheRegister

11 months ago

Boffins detail new algorithms to losslessly boost AI perf by up to 2.8x https://t.co/YQOfbbKAQJ

Nadav Timor

@NadavTimor

3 months ago

@feulf @tanishqkumar07 @tri_dao @avnermay Thanks for the ping, Fed! It’s definitely exploring a very similar space to our distributed speculative decoding paper from iclr25 (https://t.co/Hn1wRyD055). Great to see Tanishq, Tri, and Avner working on this too!

NadavTimor retweeted

Yilun Kuang @KuangYilun

4 months ago

How do we build sparsity into JEPA representations by design, while preserving task-relevant information? Introducing Rectified LpJEPA, a JEPA architecture that learns sparse, non-negative, informative representations through principled distributional regularization. 📐 📄 Paper: https://t.co/CZtKHVvKT4 💻 Code: https://t.co/xIJMYkK9fT 📝 Blog: https://t.co/OFCl3FAmik (1/n)

Nadav Timor

@NadavTimor

5 months ago

@miramurati @soumithchintala @soumithchintala @johnschulman2 what a great team you’re building there. congrats!

5 months ago

@SonglinYang4 @MITEECS @thinkymachines congrats!

Nadav Timor

@NadavTimor

5 months ago

@sgl_project released eagle3 checkpoints for sota models (incl. kimi-k2, gpt-oss, deepseek-v3.2) + the training recipe

LMSYS Org

@lmsysorg

5 months ago

Speculative decoding has shown a lot of promise, though broader adoption has taken time due to the complexity of building production-ready tooling and high-quality draft models. We’re releasing SpecBundle, a collection of large-scale EAGLE-3 draft models trained with SpecForge v0.2. This release brings major system improvements, including refactored training pipelines, multi-backend support with SGLang and @huggingface , and better usability at scale. We also built a performance dashboard to make real end-to-end speedups visible across models and settings. See the dashboard and blog in the thread 👇

Nadav Timor

@NadavTimor

5 months ago

@svlevine and how bad is the forgetting?

Nadav Timor

@NadavTimor

6 months ago

Even w/o training, you can still use speculative decoding. No need to train a speculator per model. Our spec decoding algos for heterogeneous vocabs (open-sourced in HF Transformers; not yet in vLLM) let any off-the-shelf model serve as the speculator. ♻️ That means day-0 support for new models, and spec decoding for anyone who can’t train.

Red Hat AI

@RedHat_AI

6 months ago

Speculative decoding is a powerful way to improve inference performance, but in practice it has been hard to adopt. Training a unique draft model per LLM is time-consuming, and production-ready training utilities that work cleanly with vLLM have been limited. Speculators v0.3.0 closes this gap with end-to-end training support for Eagle3 draft models that run seamlessly with vLLM. The release adds offline data generation using vLLM and training support for single and multi-layer draft models, across both MoE and non-MoE verifiers. Here's a 🧵 on speculative decoding and how to get started today in @vllm_project (1/8):

Nadav Timor

@NadavTimor

7 months ago

Tons of high-impact opportunities! And btw, our NYC open-space inference hub is still welcoming active vLLM/SGLang contributors

Greg Brockman

@gdb

7 months ago

inference is perhaps the most valuable emerging software category. as models get smarter and more economically valuable, compute will increasingly be spent drawing samples from the models. if you'd like to work on inference at openai, reach out — [email protected]. include a description of an exceptional team you've been a part of, and your contribution towards that team's goals. also indicate any experience in inference, large-scale system optimization, or other areas where you've built up domain expertise. lots of exciting problems to work on, ranging from deeply understanding the model forward pass (including simulating/finding creative opportunities for optimization); to system-level efficiencies such as speculative decoding or kv offloading or workload-aware load balancing; to managing and making observable a massive fleet at scale.

Nadav Timor

@NadavTimor

7 months ago

@PoratEitan @elonmusk @StefanoErmon @_inception_ai @itai_gat @helibenhamu @d_haziza @lipmanya 💡

Nadav Timor

@NadavTimor

7 months ago

@PoratEitan @elonmusk @StefanoErmon @_inception_ai @itai_gat @helibenhamu @d_haziza @lipmanya @PoratEitan, yeah, block diffusion works. But mitigating the ttft tradeoff might require prioritizing the first block

Nadav Timor

@NadavTimor

7 months ago

@elonmusk @StefanoErmon @_inception_ai @elonmusk, 1/ diffusing only the next-k tokens could mitigate this ttft tradeoff. 2/ transformers already handle next-k prediction efficiently (eg fair’s “set block decoding” by @itai_gat @helibenhamu @d_haziza @lipmanya and others)

Nadav Timor

@NadavTimor

8 months ago

@lianegalanti @Princeton @HazanPrinceton @tri_dao 🚀

Nadav Timor

@NadavTimor

8 months ago

@MattHartman @LeRobotHF @huggingface come get your hands dirty with us 🤓

Nadav Timor

@NadavTimor

8 months ago

let’s have some robot training fun 🤓

Matt Hartman

@MattHartman

8 months ago

.@NadavTimor and I are going to train a SO-ARM101 with @LeRobotHF at the @huggingface office next week. If you’re in NYC and have an ARM101 and want to join us let me know! BYO arm 🦾

Nadav Timor

@NadavTimor

8 months ago

@WajahatAli_231 Just drop links to your PRs here and we’ll add you to the next sprint 🙂

Nadav Timor

@NadavTimor

8 months ago

NYC open-source AI infra contributors — we’ve launched a community research hub above Grand Central where GPUs go brrr 🔥🗽 A place to hack, benchmark, and collaborate — vLLM, SGLang, kernels, inference optimizations all welcome. Open space. Open source. Weekends too. Huge thanks to @Company for supporting this initiative 🙌 𝐋𝐢𝐦𝐢𝐭𝐞𝐝 𝐬𝐞𝐚𝐭𝐬. 𝐃𝐫𝐨𝐩 𝐲𝐨𝐮𝐫 𝐏𝐑𝐬 𝐢𝐧 𝐭𝐡𝐞 𝐜𝐨𝐦𝐦𝐞𝐧𝐭𝐬 𝐭𝐨 𝐣𝐨𝐢𝐧 𝐭𝐡𝐞 𝐧𝐞𝐱𝐭 𝐬𝐩𝐫𝐢𝐧𝐭!
