The next stage of AI may not be about making machines better at talking.
It may be about making machines better at understanding the world.
Fei-Fei Li’s essay on World Models is important not because it introduces one specific model, but because it gives us a clearer framework for thinking about where AI is going next.
She breaks World Models into three capabilities:
1. Renderer
A renderer generates what we see: images, videos, visual scenes.
Most of today’s image and video generation models fall into this category.
But there is a key limitation:
Looking real is not the same as understanding reality.
A video can look beautiful while still getting space, physics, object relations, and causality wrong.
It can be a moving illusion.
2. Simulator
This is the most important part to me.
A simulator does not just generate pixels. It generates a world state that can be computed, interacted with, and tested.
A good simulator needs geometry, physics, dynamics, materials, and causality.
This is what makes it useful for:
• robot training;
• autonomous driving;
• architecture and design;
• industrial digital twins;
• complex decision-making;
• testing actions before taking them in the real world.
A renderer makes the world look real. A simulator makes the world behave realistically.
3. Planner
A planner decides what to do next.
Given a goal and an environment, it chooses actions.
If the renderer is about seeing, and the simulator is about understanding how the world works, the planner is about acting inside that world.
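To make the division of labor concrete, here is a minimal toy sketch in Python. It is not from Fei-Fei Li's essay: the grid, the function names (`render`, `simulate`, `plan`), and the choice of breadth-first search are all assumptions made for illustration. The renderer only draws the state, the simulator encodes the rules (walls block movement), and the planner searches over simulated outcomes to pick actions.

```python
from collections import deque

# Hypothetical toy world: 3 rows by 4 columns, '#' marks a wall.
GRID = [
    "....",
    ".##.",
    "....",
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def render(state):
    """Renderer: draw the agent 'A' on the grid. It shows the world but knows no rules."""
    return "\n".join(
        "".join("A" if (r, c) == state else cell for c, cell in enumerate(line))
        for r, line in enumerate(GRID)
    )


def simulate(state, action):
    """Simulator: return the next state. Walls and grid edges block movement."""
    row, col = state
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    inside = 0 <= new_row < len(GRID) and 0 <= new_col < len(GRID[0])
    if inside and GRID[new_row][new_col] != "#":
        return (new_row, new_col)
    return state  # blocked: the agent stays where it is


def plan(start, goal):
    """Planner: breadth-first search over simulated outcomes.

    Returns a shortest list of actions from start to goal, or None if unreachable.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action in MOVES:
            next_state = simulate(state, action)
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, actions + [action]))
    return None


actions = plan(start=(0, 0), goal=(2, 3))
print(render((0, 0)))
print(actions)
```

The point of the sketch is that the planner never touches the real world while it searches. It tests actions inside the simulator first, which is exactly what a renderer alone cannot support.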
Together, these three capabilities point toward a bigger shift:
from language intelligence to spatial intelligence.
Over the last few years, AI has learned to write, summarize, translate, chat, and code.
But the next frontier is not just generating better text or prettier videos.
It is building systems that can understand, simulate, and act in the physical world.
This matters for robotics, autonomous vehicles, AR/VR, games, cities, manufacturing, and many other domains.
But I think it also matters at a personal level.
Most personal AI workflows today are still about output:
• summarize this;
• write this;
• make a to-do list;
• draft this email;
• create this slide.
Useful, yes.
But limited.
The more interesting future is a personal world model.
An AI system that understands:
• your long-term goals;
• your active projects;
• what is stuck;
• what information matters;
• how tasks depend on each other;
• how today’s actions affect future outcomes;
• what tradeoffs you are really making.
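One small piece of this, how tasks depend on each other and what is stuck, can be sketched as a dependency graph. The Python example below is only an illustration under stated assumptions: the task names, the `depends_on` structure, and the helper functions `ready_tasks` and `downstream` are all hypothetical, and a real personal world model would need far more than this.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical data: each task maps to the set of tasks it depends on.
depends_on = {
    "submit paper": {"write draft", "run experiments"},
    "write draft": {"run experiments"},
    "run experiments": {"collect data"},
    "collect data": set(),
}
done = {"collect data"}


def ready_tasks(depends_on, done):
    """Tasks that are not done yet and whose dependencies are all done."""
    return sorted(
        task
        for task, deps in depends_on.items()
        if task not in done and deps <= done
    )


def downstream(depends_on, task):
    """Every task that directly or indirectly depends on `task`."""
    affected = set()
    frontier = [task]
    while frontier:
        current = frontier.pop()
        for other, deps in depends_on.items():
            if current in deps and other not in affected:
                affected.add(other)
                frontier.append(other)
    return affected


# A valid order in which the tasks could be worked through.
order = list(TopologicalSorter(depends_on).static_order())

print(ready_tasks(depends_on, done))            # what can be started today
print(downstream(depends_on, "run experiments"))  # what a delay here would hold up
print(order)
```

Even this tiny graph answers questions a to-do list cannot: which task is actually unblocked, and which future outcomes slip if today's task slips.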
In other words, AI should not only help us express thoughts faster.