Shuyao Tim Xu @TimXu222575 - Twitter Profile

Pinned Tweet

1 day ago

Life update 🚀 I started hill climbing my X follower count to receive @elonmusk's payout. I must say elon is very generous! Bear with me if I start posting crazy things. Opinions are my own

1

12

0

503

Shuyao Tim Xu

@TimXu222575

20 minutes ago

@sun_hanchi many are saying starting from 4.7 they used a smaller base

1

0

12

TimXu222575 retweeted

Dan Saunders

@djsaunde

about 5 hours ago

reading tech reports, I feel:
1. many strategies work for LM training
2. any particular strategy that worked feels like it may have come down to details of their setup / tuning
these are typically supplied with post-hoc justification of their choices

2

7

0

1

307

Shuyao Tim Xu

@TimXu222575

about 1 hour ago

@djsaunde 100% true

1

0

1

91

Shuyao Tim Xu

@TimXu222575

about 2 hours ago

@liruifengv next thing you know, elon will cite this as evidence again to push grok build

0

4

0

606

Shuyao Tim Xu

@TimXu222575

about 7 hours ago

@teortaxesTex @nrehiew_ is the MAI 22% MFU also relative to peak bf16 FLOPS of GB200? If yes, then it is indeed under-optimized

0

1

0

110

Shuyao Tim Xu

@TimXu222575

about 7 hours ago

@thsottiaux may the incidents happen more and more often so we get more resets

0

2

0

313

Shuyao Tim Xu

@TimXu222575

about 13 hours ago

@olive_trees234 @Kimi_Moonshot that said, working out of the box is actually pretty nice too

1

0

155

Shuyao Tim Xu

@TimXu222575

about 13 hours ago

@olive_trees234 @Kimi_Moonshot the skill got bundled in; it may be changed later

1

2

0

398

Shuyao Tim Xu

@TimXu222575

about 13 hours ago

@ClementDelangue There are many "open routers" in china doing this already haha

0

191

Shuyao Tim Xu

@TimXu222575

about 13 hours ago

Very clever, and excited to see it work out well. On-policy self-distillation is perhaps the most efficient way to learn hindsight in a multi-turn agent setup. Standard SFT distillation would require continuing from a snapshot of the trajectory and container, which adds a lot of complexity overall.

Sasha Rush

@srush_nlp

about 15 hours ago

On-Policy Distillation is the most active new research direction being explored in RL for LLMs. Had the chance to discuss how it works with Dwarkesh and why it fits so nicely into large-scale pipelines.

15

891

87

721

72K

1

25

0

13

3K

Shuyao Tim Xu

@TimXu222575

about 14 hours ago

@bochencs @elonmusk I haven't received the revenue sharing though. Need at least 2,000 followers.

0

16

Shuyao Tim Xu

@TimXu222575

about 23 hours ago

@julien_c only hear people using screen when they don't have sudo and can't figure out tmux compilation

1

0

492

Shuyao Tim Xu

@TimXu222575

1 day ago

@eliebakouch I now have a secondary connection with elon!

1

2

0

536

Shuyao Tim Xu

@TimXu222575

1 day ago

@lateinteraction Can we distribute GEPA under a skill? Maybe it can also invoke the dynamic workflow for latest Claude Code.

0

6

0

317

Shuyao Tim Xu

@TimXu222575

1 day ago

@shreshthsaini A mix of creativity and hallucination like deepseek r1 or k2-0704? Just guessing haha

0

2

0

146

Shuyao Tim Xu

@TimXu222575

2 days ago

From first look, it seems that the whole pipeline is very clean. It has very little bootstrapping from existing llms, which is very different from the nemotron approach. I bet this model will smell very unique and "raw", maybe something like DeepSeek R1

elie

@eliebakouch

2 days ago

WOW microsoft new "MAI Thinking 1" model comes with a 109 page tech report that looks REALLY detailed, this is amazing

24

975

120

672

191K

5

99

1

37

11K

Shuyao Tim Xu

@TimXu222575

2 days ago

This is just deploying hosted static sites though? Claude web, Google AI studio can already do this. Maybe the novelty comes from achieving this with desktop app rather than web, which makes sense. What's cooler? Kimi Agent mode can build and deploy full-fledged full-stack apps! https://t.co/Ye510WJFNj

OpenAI

@OpenAI

2 days ago

Building apps has never been easier. With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL. Rolling out to Business and Enterprise plans, before expanding more broadly.

883

19K

2K

10K

9M

2

14

1

4

3K

Shuyao Tim Xu

@TimXu222575

2 days ago

@willccbb but yeah, some papers should be honest and just show us how their opsd runs collapsed

1

2

0

203

Shuyao Tim Xu

@TimXu222575

2 days ago

The best motivation for opsd is perhaps pushing the boundary (otherwise, on-policy rl gives a fresher and more "on-policy" model). In that sense, a good opsd run should beat the peak rl checkpoint in a domain. It is acceptable to me if opsd pushes the boundary but collapses eventually. Many successful rl runs collapse in the end anyway

1

8

0

825
