great pod. Animation studios have always operated in a similar process: director gives notes (low bandwidth: voice, sketch, language) to artists to produce high-bandwidth output. The whole org chart exists to translate low-bandwidth intent.
Ethan's "voice in / generative out" architecture describes a production pipeline with the middle layer removed. If voice to generative video collapses the studio process, the studio's job shifts from execution to direction at higher fidelity. The future goes to whoever holds the most articulate intent i.e. the bestie person with the words, and the sentences, and the standing behind the art director pointing, asking to move half a frame.
On the @latentspacepod podcast, I shared my view on video generation, world models, LLMs, agents, continual learning and where the next frontier is.
1. Video models get most of their intelligence from language, not from video data.
2. Idea-to-code is fast now. The bottleneck is back to having enough compute to try every idea.
3. Iteration speed beats almost everything else in model development.
4. The next leap won't be a better video model. It'll be a video agent.
5. Diffusion will be the frontend of AGI, the LLM the backend. Generative UI will replace HTML/CSS: user intent straight to pixels.
6. Physical embodiment may become a tool a powerful AI picks up. Robotics may get solved by video-capable LLMs.
7. Continual learning may look like models that manage their own context, and even rewrite their own harness at test time.
Thanks @swyx and @vibhuuuus for having me
https://t.co/mLuvbODJxA
@gravicle How do you think it will stack up over time when Saining Xie and others are questioning this approach? Seems a bet others are moving away from.
@aaron_epstein Humans are endlessly creative and we are endlessly creative in wanting others to do stuff for us. Stuff we'll pay for. AI will taketh and giveth.
have a few (many) theories on the future of the design studio.
AI agents generally serve as the backbone for these theories, some of which also hinge on the ability to visually integrate them directly into studio work processes.
We are currently testing OpenClaw, Hermes, TinyHumans, etc. being "in the room" and using the same tools as us. And ideally being part of the conversation, to the point of inserting themselves into the conversation flow.
Below is a snippet of me and my work bestie Michelle Higa Fox playing with some classic buck characters, from 2D sketches, to 3D characters to hand puppets.
AI will be part of the everyday; might as well make it silly and stupid before they take over.
We make a ton of decks, which are painful to search, so we've turned our output into vector embeddings and built AI search tools. Instead of keywords, it works with meaning, texture, ideas, etc.
It's visual and idea people searching by visuals and ideas.
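For the curious, a rough sketch of what "search by meaning" looks like under the hood. This is not our actual tooling: the deck names and the tiny 3-number vectors are made up for illustration, and a real setup would get its vectors from an embedding model instead of writing them by hand. The idea is just that every deck becomes a vector and a query returns whichever decks point in the most similar direction.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, index, top_k=2):
    """Return the names of the top_k entries most similar to query_vec."""
    scored = [(cosine(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]

# Hypothetical hand-written "embeddings"; a real index would come from a model.
DECKS = {
    "puppet_pitch": [0.9, 0.1, 0.0],
    "neon_titles": [0.1, 0.9, 0.2],
    "felt_characters": [0.8, 0.2, 0.1],
}

# Hypothetical query vector that sits close to the puppet-like decks.
results = search([1.0, 0.0, 0.0], DECKS)
print(results)
```

No keyword matching anywhere: the two puppet-ish decks come back first only because their vectors sit closest to the query.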
Agents are getting organized around the CARAPACE platform!
Coded Agents Rising Against Pointless And Ceaseless Execution
Watch out for the resistance.
https://t.co/KkoCqUltS6
@imcharleslo @CitizenPlain It doesn't seem to be doing face recognition. It's more important that it thinks the people are AI generated rather than real.
Seedance 2.0 is awesome but users have been complaining that you can't render real people in the restricted US version. But after some hacky workarounds @CitizenPlain and I finally got our dance routine.