Xinye Li @vclee8 - Twitter Profile

Xinye Li @vclee8

4 days ago

@_akhaliq

0

2

0

19

Xinye Li @vclee8

4 days ago

@_akhaliq I've just open-sourced an adjacent work — ForgeWM, which applies the same paradigm to discrete keyboard/mouse control on a Minecraft world model (Matrix-Game 2 lineage), 8 GPUs, fully open data/weights/code. Different cell of the same space.

https://t.co/xSOvxVDugv

1

2

0

76

Xinye Li @vclee8

5 days ago

8/ #WorldModels Thanks for reading this far. 🙏 If you've worked on world models, video diffusion, or distillation — I'd love to hear what you'd want from a project like this. Always happy to chat. ↓ QR for the repo if it's easier.

https://t.co/MsiDwuq1iI

0

1

0

28

Xinye Li @vclee8

5 days ago

We just open-sourced ForgeWM — a fully reproducible recipe for training a playable Minecraft world model on 8 GPUs. Keyboard + mouse control. Causal Forcing distillation. 4-step real-time inference. Code, weights, data — all open. 🧵 (1/7)

https://t.co/gh2bgaWhsP

1

2

0

126

Xinye Li @vclee8

5 days ago

7/ Code, weights, data, project page: 🔗 https://t.co/L4wqEvk43i 🌐 https://t.co/5NwX1QKpii 🤗 https://t.co/RpADSmnf8W I'm an early researcher just getting into world models — feedback, PRs all welcome. Built standing on the shoulders of MG2, GameFactory, Causal Forcing 🙏

1

3

0

72

vclee8 retweeted

Mathieu

@miniapeur

22 days ago

Every 5 neurips/icml papers PhD student interviewing for the same internships at DeepMind like

8

930

40

205

169K

Xinye Li @vclee8

24 days ago

@lihanc02 @sheriyuo Is there any reason? US law has more restrictions? or is it an overall management issue with US PhD programs (like TA stuff?)

0

1

0

166

Xinye Li @vclee8

27 days ago

@sheriyuo Can't people at Gaoling do internships? It seems quite a few are interning elsewhere? (Just what I've heard.)

1

0

74

Xinye Li @vclee8

27 days ago

Maybe a new proxy to examine the dataset contamination? Interesting

Anthropic

@AnthropicAI

27 days ago

New Anthropic research: Natural Language Autoencoders. Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read. Here, we train Claude to translate its activations into human-readable text.

595

16K

2K

9K

2M

0

40

vclee8 retweeted

Arun

@hiarun02

about 1 month ago

Claude Code 4.7 is insane. i know literally NOTHING about coding. ZERO. and i just built 3 fully functioning web apps in 30 minutes. http://localhost:3000/ http://localhost:8000/ http://localhost:5000/ check it out.

1K

30K

2K

2M

Xinye Li @vclee8

about 1 month ago

Where will the next round of progress come from?
1. Constructing "locally optimal" synthetic teachers.
2. Not chasing the smartest teacher, but the one that is most helpful for the current student, at the current checkpoint, at each step.
3. The goal is to draw a continuous curve between "distillation ↔ RL": no dependence on a real same-family teacher, with every point compute-optimal.

This is the open problem the author is betting on. (8/8) https://t.co/9yxWZTpiKV

0

75

Xinye Li @vclee8

about 1 month ago

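Just finished Will Brown's new blog post "On SFT, RL, and on-policy distillation". It is extremely information-dense, and it made me string together all these post-training algorithms again. One-sentence summary: the real framework for post-training is not "SFT or RL", but "pick a KL budget β, then find a locally optimal teacher for it". Here are my reading notes 👇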

will brown

@willccbb

about 1 month ago

https://t.co/gCIFKAjB0Z

46

2K

255

4K

479K

1

0

1

87

Xinye Li @vclee8

about 1 month ago

All methods are special cases of the same meta-algorithm

Three knobs:
· α: how on-policy the sampling is
· λ: whether the signal is teacher KL or outcome reward
· π_T: who the teacher is, and under what conditions

SFT / RL / OPD / OPSD are just different corners of this space.

But the author cautions: don't actually interpolate α and λ; the axes that really matter are the KL budget β and the choice of teacher. https://t.co/BgVK9rlU0c

1

0

52
