The future has always belonged to people who understand and can clearly communicate both the shapes and values of their wants, and the wants of the people around them
from apps to material
software used to be something you opened
an app was a room with walls: calendar here, notes there, music there, work there. each one had its own logic, buttons, its own little kingdom. the user moved between kingdoms, carrying context in their head
but ai starts to break the walls
software becomes less like a destination and more like material. something you shape, combine, stretch, ask, remix, and leave behind as traces. a document can become an app. a conversation can become a workflow. a song can become a memory. a task can become an agent. the boundary between using and making gets blurry
the old model was: choose the right tool for each task
the new model is: express the shape of the thing you want, then refine it with the system you built
this changes the role of the interface. ui is no longer only fixed views for fixed functions. it becomes a surface where intent turns into structure. the best interfaces will feel less like menus and more like clay – responsive, persistent, inspectable, and alive
apps won’t disappear. rooms are still useful. but the deeper shift is that software stops being a set of sealed containers and becomes a medium people can think through
like paper, but executable
like language, but spatial
like memory, but programmable
software stops being something only programmers make
it becomes material anyone can shape
Today's Training Data episode takes us BTS on the infrastructure challenges required to do large RL runs at scale, featuring @ellev3n11 (Composer Lead at @cursor_ai) and @dzhulgakov (Co-Founder at @FireworksAI_HQ).
The Cursor team trained Composer 2 on Fireworks by starting with a strong base model (Kimi 2.5) and performing large-scale mid-training on code tokens and web data to learn common patterns and libraries, followed by a large-scale Reinforcement Learning run to learn how to navigate the Cursor harness, call tools, and write correct code.
Today's episode dives into the systems and infrastructure challenges of making that large RL run happen, and there were many (!!), from numerical mismatch to global distribution to synchronizing rollouts across asynchronous pipelines to keeping track of expert activation across runs and more.
Extremely nerdy in-the-weeds challenges that Federico and Dima were delighted to nerd out on together :)
Beyond RL infra, we also discussed Online vs Simulated rollouts, self-summarization for long-horizon agents, environment design ("the most powerful RL environment is the product itself"), and other technical nuggets.
PS: We filmed this episode before the SpaceX news, while the Cursor team was still compute-constrained. While Cursor now has *all* the flops, the takeaways and hurdles crossed ring true for any serious application-level company that is racing to post-train their own models.
I believe that more serious application companies will go the way of Cursor and post-train their own models.
00:00 Introduction
00:53 Why Cursor Trained Composer 2
04:55 Specialization vs Bitter Lesson
06:16 Composer 2 Training Recipe
16:32 Scaling RL Infrastructure Globally
23:32 Floating Point Drift
25:11 MoE Sensitivity Explained
26:25 Router Replay Fix
27:19 Real Time RL Loop
31:49 Long Horizon Agents
34:29 Why RL Everywhere
37:34 LLM as Judge Rewards
39:14 RL in Hard Domains
40:13 Build Your Own Environments
44:34 Closing Thoughts
We’re reimagining a 50-year-old interface - the mouse pointer - with AI. 🖱️
These experimental demos show how people can intuitively direct Gemini on their screens using motion, speech, and natural shorthand to get things done 🧵
The model can, in theory, also be used for real-time OS-level autocompletes or action suggestions, since inference happens in 200ms chunks. Can’t wait to play with it
People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way.
We share our approach, early results, and a quick look at our model in action.
https://t.co/AFJZ5kH7Ku
Introducing Zenbu.js - The framework for hackable software
I wanted the ability to edit the software I use with my coding agents, and from there Zenbu.js was born
Zenbu.js allows you to build desktop apps that can be modified by users after installation. This is made possible by:
- shipping the app's raw source code to the user
- a built-in plugin system for your apps
npx create-zenbu-app@latest
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
My dear front-end developers (and anyone who’s interested in the future of interfaces):
I have crawled through depths of hell to bring you, for the foreseeable years, one of the more important foundational pieces of UI engineering (if not in implementation then certainly at least in concept):
Fast, accurate and comprehensive userland text measurement algorithm in pure TypeScript, usable for laying out entire web pages without CSS, bypassing DOM measurements and reflow
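To make the idea concrete, here is a minimal sketch of what userland text layout can look like: measure words with a plain function, then break lines greedily, without touching the DOM. This is not the algorithm announced above. The names `Measure`, `monospaceMeasure`, `layoutLines`, and `measureHeight` are hypothetical, and the fixed 8px character width is an assumption standing in for real font metrics.

```typescript
// Hypothetical measurement function: returns the width of a string in pixels.
type Measure = (text: string) => number;

// Assumption: a monospace stand-in where every character is 8px wide.
// A real implementation would rely on actual font metrics instead.
const monospaceMeasure: Measure = (text) => text.length * 8;

// Greedy line breaking: keep adding words to the current line
// until the next word would push it past maxWidth.
function layoutLines(text: string, maxWidth: number, measure: Measure): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const lines: string[] = [];
  let current = "";
  for (const word of words) {
    const candidate = current === "" ? word : current + " " + word;
    if (current !== "" && measure(candidate) > maxWidth) {
      // The candidate overflows, so close the current line and start a new one.
      lines.push(current);
      current = word;
    } else {
      // Either the word fits, or it is the first word on the line
      // (a single overlong word is kept whole rather than split).
      current = candidate;
    }
  }
  if (current !== "") lines.push(current);
  return lines;
}

// Block height follows from the line count alone, with no DOM read or reflow.
function measureHeight(
  text: string,
  maxWidth: number,
  lineHeight: number,
  measure: Measure
): number {
  return layoutLines(text, maxWidth, measure).length * lineHeight;
}
```

Calling `layoutLines("the quick brown fox", 80, monospaceMeasure)` under these assumptions gives two lines, `"the quick"` and `"brown fox"`. Passing the measurement function in as a parameter keeps the line-breaking logic independent of how widths are obtained.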