@_akhaliq I've just open-sourced a piece of adjacent work: ForgeWM, which applies the same paradigm to discrete keyboard/mouse control on a Minecraft world model (Matrix-Game 2 lineage), 8 GPUs, fully open data/weights/code. Different cell of the same space.
8/ #WorldModels
Thanks for reading this far. 🙏
If you've worked on world models, video diffusion, or distillation
— I'd love to hear what you'd want from a project like this.
Always happy to chat.
↓ QR for the repo if it's easier.
We just open-sourced ForgeWM — a fully reproducible recipe for training a playable Minecraft world model on 8 GPUs. Keyboard + mouse control. Causal Forcing distillation. 4-step real-time inference.
Code, weights, data — all open. 🧵 (1/7)
7/ Code, weights, data, project page:
🔗 https://t.co/L4wqEvk43i
🌐 https://t.co/5NwX1QKpii
🤗 https://t.co/RpADSmnf8W
I'm an early researcher just getting into world models —
feedback and PRs are all welcome.
Built standing on the shoulders of MG2, GameFactory,
Causal Forcing 🙏
New Anthropic research: Natural Language Autoencoders.
Models like Claude talk in words but think in numbers. The numbers—called activations—encode Claude’s thoughts, but not in a language we can read.
Here, we train Claude to translate its activations into human-readable text.
Claude Code 4.7 is insane.
i know literally NOTHING about coding. ZERO. and i just built 3 fully functioning web apps in 30 minutes.
http://localhost:3000/
http://localhost:8000/
http://localhost:5000/
check it out.
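Just finished Will Brown's new blog post "On SFT, RL, and on-policy distillation". It is extremely information-dense, and it got me to tie these post-training algorithms together again.
One-sentence summary:
The real framework for post-training is not "SFT or RL" but "pick a KL budget β, then find a locally optimal teacher for it."
Here are my reading notes👇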