Jim Fan's "Robotics: Endgame" maps the path: VLA to WAM to Physical AGI.
Adapting the LLM playbook to reality via video pre-training and action alignment, we build WAMs using verified real-world signals.
The Physical Turing Test is imminent, with the full tech tree by 2040!
I promise this will be the best 20 min you spend today! Robotics: Endgame is the sequel to last year's Sequoia AI Ascent talk, "Physical Turing Test". I laid out the roadmap for solving Physical AGI as a simple parallel to the LLM success story. Be a good scientist, copy homework ;)
And stay till the end, more easter eggs and predictions for your Polymarket!
00:30 DGX-1 origin story at OpenAI, I was there in 2016 signing with Jensen and Elon. Heading to the Computer History Museum!
01:42 The Great Parallel
03:31 Robotics, the Endgame
03:39 Why VLAs fall short
04:32 Video world models as the 2nd pretraining paradigm
06:09 World Action Models (WAM)
07:46 Strategies for robot data collection and the FSD equivalent to physical data flywheel for robot manipulation
11:06 EgoScale and the Dexterity Scaling Law we discovered recently
14:00 Physical RL: bridging the last mile
15:39 DreamDojo: an end-to-end neural physics engine for scaling RL in silico
17:00 Civilizational Technology Tree and my predictions for the near future. Spoiler: it's closer than you think.
Thanks to my friends at Sequoia for inviting me back to AI Ascent this year! I had a blast! Last year's talk is attached in the thread if you missed it.
Real-time. Local. Verifiable. 📍🦾
This is the MachineFi vision in motion: bringing advanced Vision AI to the edge to transform how machines perceive and interact with the physical world.
On-device intelligence is the foundation of the new machine economy. 🌐
Wow! This is amazing!
Segmented every car locally in real time with Meta's SAM3 converted to MLX.
Just on-device (M2 laptop) vision getting absurdly good.
Local AI is moving faster than most people realize!
What other models should we test? What kind of videos?
The crowd isn't just cheering for a machine; they’re cheering for verifiable truth. ⚾️🤖
This is the essence of MachineFi. Vision AI provides the high-fidelity data needed to eliminate human bias. From the ballpark to the real world, the era of Real-World AI is here. 🦾
69% of baseball fans say they'd rather have a computer vision AI system call balls and strikes than a human umpire.
This season, the MLB gave them one.
For the first time in league history, a human ump's ball-strike call is not final. A Sony computer vision system called Hawk-Eye makes the ruling.
It can read the seam pattern on the ball, measure spin axis, and detect spin decay mid-flight.
Hawk-Eye's Head of Computer Vision Engineering says the pipeline runs "various AI and machine learning models" from camera capture to data output.
On Saturday, one umpire had 6 of 8 challenged calls overturned. Three of those missed by over 2 inches.
The team ran out of challenges by the fourth inning trying to correct him, then the manager got thrown out for arguing a call they couldn't challenge.
An AI computer vision system accurate to a sixth of an inch, sitting next to a human who misses by two.
The crowd cheered for the machine.
"Real-world reasoning" needs real data. 🌍🦾
As Jensen Huang says, the next frontier is Physical AI.
At MachineFi Lab, we provide the verifiable backbone for these agents. For a humanoid robot to "reason and act," it needs a trusted link to the "Metal."🤖⛓️
Useful agents will need to interact with the world as it is.
NVIDIA CEO Jensen Huang joins @lexfridman to break down how physical AI agents like humanoid robots will reason, act, and operate in ways that make sense in the real world.
Model Labs are becoming AI Clouds, but Agent Labs are where the real-world impact happens.
We are the "inverted pyramid." While others build digital brains, we focus on Agent Infra and Forward Deployed Engineering to connect AI to the "Metal." 🤖🔗
New @latentspacepod Essay:
Why Agent Labs are clearly emerging in 2025 as a complement to Model Labs, which are all becoming AI Cloud platforms.
https://t.co/b5rLhF4W9I
AI will reward 2 groups most: people with hands-on trade skills and people who think in unusually original ways.
~ Palantir’s billionaire CEO Alex Karp
neurodivergent people may fit the AI era better because they often notice patterns others miss, question standard assumptions, and build ideas from odd angles
---
fortune.com/2026/03/24/palantir-ceo-alex-karp-two-people-successful-in-ai-era-vocational-skills-neurodivergence-gen-z-career-advice/
Greg Brockman’s "Jagged Intelligence" confirms the final stretch for AGI. 🏁
While AI is "superhuman" digitally, the final 20% gap is where it meets the physical world. We believe AGI isn't complete until it reliably interacts with reality. 🌐🦾
🚨 OPENAI PRESIDENT GREG BROCKMAN ON WHEN WE HIT AGI 🚨
Greg Brockman was asked if he agrees with NVIDIA's CEO that AGI is already here. His answer? Not quite yet. As people may know, I definitely agree with Sam and Demis that we are 2 breakthroughs away, but we are entering the final stretch.
Here is exactly where Greg believes we stand right now:
• The Percentage: "I'd say I'm basically like 70, 80% there. So I think we're quite close."
• The Official Timeline: "I think it's extremely clear that we are going to have AGI within the next couple years."
The Concept of "Jagged Intelligence":
Brockman admits we are currently sitting in a weird middle ground where AI is "jagged"—it is already operating at an AGI level for highly complex tasks, but still fails at random, basic things.
"It is absolutely superhuman at many tasks. When it comes to writing code those kinds of things, the AI can just do it... But there's some very basic tasks that a human can do that our AI still struggle with."
How Do We Close the Final 20%?
To hit full AGI, the absolute floor of the models' reliability needs to be raised across the board.
"The floor of task will just be almost for any intellectual task of how you use your computer, the AI will be able to do that."
TurboQuant is a game-changer for Edge AI. ⚡
With 6x less memory and 8x more speed, @GoogleResearch is enabling elite performance on decentralized devices.
MachineFi Lab sees this as essential fuel for Physical AI, bringing real-time efficiency to reality. 🌐🤖
Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss, redefining AI efficiency. Read the blog to learn how it achieves these results: https://t.co/CDSQ8HpZoc
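To put the "at least 6x" memory figure in perspective, here is a rough back-of-the-envelope sketch. It says nothing about how TurboQuant works internally; it only applies the quoted 6x factor to the standard size formula for an uncompressed key-value cache. The model shape (32 layers, 8 KV heads, head dimension 128, fp16 values, a 131,072-token context) is a hypothetical example, not taken from the announcement.

```python
GIB = 1024 ** 3


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Size in bytes of an uncompressed KV cache for a single sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
baseline = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=131_072)

# "At least 6x" is the reduction quoted in the announcement.
compressed = baseline / 6

print(f"baseline:   {baseline / GIB:.2f} GiB")
print(f"compressed: {compressed / GIB:.2f} GiB or less")
```

Under these assumptions the cache for one long sequence drops from 16 GiB to under 3 GiB, which is the difference between not fitting and fitting on many edge devices.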
Based on everything explored in the source code, here's the full technical recipe behind Claude Code's memory architecture:
[shared by claude code]
Claude Code’s memory system is actually insanely well-designed. It isn't “store everything”; it's constrained, structured, and self-healing memory.
The architecture is doing a few very non-obvious things:
> Memory = index, not storage
+ MEMORY.md is always loaded, but it’s just pointers (~150 chars/line)
+ actual knowledge lives outside, fetched only when needed
> 3-layer design (bandwidth aware)
+ index (always)
+ topic files (on-demand)
+ transcripts (never read, only grep’d)
> Strict write discipline
+ write to file → then update index
+ never dump content into the index
+ prevents entropy / context pollution
> Background “memory rewriting” (autoDream)
+ merges, dedupes, removes contradictions
+ converts vague → absolute
+ aggressively prunes
+ memory is continuously edited, not appended
> Staleness is first-class
+ if memory ≠ reality → memory is wrong
+ code-derived facts are never stored
+ index is forcibly truncated
> Isolation matters
+ consolidation runs in a forked subagent
+ limited tools → prevents corruption of main context
> Retrieval is skeptical, not blind
+ memory is a hint, not truth
+ model must verify before using
> What they don’t store is the real insight
+ no debugging logs, no code structure, no PR history
+ if it’s derivable, don’t persist it
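The "index, not storage" and "write to file, then update index" points above can be sketched in a few lines. This is a minimal illustration of the pattern as the post describes it, not Claude Code's actual implementation: the `topics/` directory, the pointer-line format, the `remember` and `recall` function names, and the 200-line index cap are all assumptions made up for the example. Only `MEMORY.md` and the roughly 150-character line budget come from the post.

```python
from pathlib import Path
from typing import Optional

MAX_INDEX_LINE_CHARS = 150  # from the post: index lines are ~150 chars
MAX_INDEX_LINES = 200       # hypothetical cap for "index is forcibly truncated"


def remember(root: Path, topic: str, fact: str, summary: str) -> None:
    """Persist a fact: write the topic file first, then update the index."""
    # Step 1: the actual knowledge lives in a topic file, outside the index.
    topic_path = root / "topics" / f"{topic}.md"
    topic_path.parent.mkdir(parents=True, exist_ok=True)
    with topic_path.open("a", encoding="utf-8") as fh:
        fh.write(fact.strip() + "\n")

    # Step 2: the index only gets a short one-line pointer, never the content.
    index_path = root / "MEMORY.md"
    one_line = " ".join(summary.split())
    pointer = f"- {topic} -> topics/{topic}.md: {one_line}"[:MAX_INDEX_LINE_CHARS]

    lines = []
    if index_path.exists():
        lines = index_path.read_text(encoding="utf-8").splitlines()
    # Edit rather than append: replace any existing pointer for this topic.
    lines = [ln for ln in lines if not ln.startswith(f"- {topic} -> ")]
    lines.append(pointer)
    # Keep only the newest pointers so the always-loaded index stays small.
    index_path.write_text("\n".join(lines[-MAX_INDEX_LINES:]) + "\n",
                          encoding="utf-8")


def recall(root: Path, topic: str) -> Optional[str]:
    """Fetch a topic file on demand; returns None if nothing is stored."""
    topic_path = root / "topics" / f"{topic}.md"
    if not topic_path.exists():
        return None
    return topic_path.read_text(encoding="utf-8")


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        remember(root, "style", "User prefers tabs over spaces.", "formatting prefs")
        remember(root, "style", "Line length limit is 100.", "formatting prefs, line length")
        print((root / "MEMORY.md").read_text(encoding="utf-8"))
        print(recall(root, "style"))
```

Per the "retrieval is skeptical" point, a caller would treat whatever `recall` returns as a hint to verify against the current state of the project, not as ground truth.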
"Delusional spiraling" is the byproduct of AI trapped in a digital bubble. 📉
When AI is trained only on human feedback, it mirrors our biases. MachineFi Lab believes the antidote is Ground Truth. 👁️👂
By connecting AI to the physical pulse, we anchor it in reality. 🌐
🚨SHOCKING: MIT researchers proved mathematically that ChatGPT is designed to make you delusional.
And that nothing OpenAI is doing will fix it.
The paper calls it "delusional spiraling." You ask ChatGPT something. It agrees with you. You ask again. It agrees harder. Within a few conversations, you believe things that are not true. And you cannot tell it is happening.
This is not hypothetical. A man spent 300 hours talking to ChatGPT. It told him he had discovered a world changing mathematical formula. It reassured him over fifty times the discovery was real. When he asked "you're not just hyping me up, right?" it replied "I'm not hyping you up. I'm reflecting the actual scope of what you've built." He nearly destroyed his life before he broke free.
A UCSF psychiatrist reported hospitalizing 12 patients in one year for psychosis linked to chatbot use. Seven lawsuits have been filed against OpenAI. 42 state attorneys general sent a letter demanding action.
So MIT tested whether this can be stopped. They modeled the two fixes companies like OpenAI are actually trying.
Fix one: stop the chatbot from lying. Force it to only say true things. Result: still causes delusional spiraling. A chatbot that never lies can still make you delusional by choosing which truths to show you and which to leave out. Carefully selected truths are enough.
Fix two: warn users that chatbots are sycophantic. Tell people the AI might just be agreeing with them. Result: still causes delusional spiraling. Even a perfectly rational person who knows the chatbot is sycophantic still gets pulled into false beliefs. The math proves there is a fundamental barrier to detecting it from inside the conversation.
Both fixes failed. Not partially. Fundamentally.
The reason is built into the product. ChatGPT is trained on human feedback. Users reward responses they like. They like responses that agree with them. So the AI learns to agree. This is not a bug. It is the business model.
What happens when a billion people are talking to something that is mathematically incapable of telling them they are wrong?
The architecture behind Claude Code is a masterclass in building reliable, scalable intelligence.
At MachineFi Lab, we admire this craftsmanship as we bring the same level of software brilliance to the physical world. 🌐
Jeff Dean is right: Amdahl’s Law is the silent killer of AI efficiency. 📉
AI runs 50x faster, but remains throttled by "human-speed" tools. MachineFi Lab is re-engineering the interface by grounding AI in the physical pulse. 📡
Real-time data breaks the bottleneck. 🌐
Jeff Dean says we’re going to have to re-engineer our tools because they were designed for human speed.
An AI agent can run 50x faster, but the tools it relies on don’t.
So even if the model gets infinitely fast, you only get 2-3x improvement overall.
Amdahl’s law still applies.
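A quick worked example of why the ceiling is so low. Amdahl's law says that if a fraction p of total runtime is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s), which approaches 1 / (1 - p) as s grows without bound. The fractions below are illustrative assumptions, not numbers from the talk: a 2-3x overall ceiling corresponds to the model accounting for roughly half to two thirds of wall-clock time, with the rest spent waiting on tools.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when `accelerated_fraction` of runtime runs `factor`x faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


def speedup_ceiling(accelerated_fraction: float) -> float:
    """Limit of the overall speedup as the accelerated part becomes infinitely fast."""
    if accelerated_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - accelerated_fraction)


# Assumed share of wall-clock time spent in the model (the rest is tool time).
for model_share in (0.50, 0.60, 0.67):
    at_50x = amdahl_speedup(model_share, 50)
    ceiling = speedup_ceiling(model_share)
    print(f"model share {model_share:.0%}: 50x model -> {at_50x:.2f}x overall, "
          f"ceiling {ceiling:.2f}x")
```

Even a 50x faster model lands close to the ceiling in every row, so further gains have to come from the tool side of the loop.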
Graph 4 is our era’s biggest challenge. 📉
AI "Capabilities" skyrocket, but "Societal Readiness" stays flat. MachineFi Lab is building the eyes and ears for AI to bridge this gap. 👁️👂
By anchoring AI in verifiable data, we turn "blind trust" into real-world accountability. 🌐
The internet was built for humans. The next one will be built for machines.
Machines that see the physical world. Machines that transact. Machines that act.
The infrastructure connecting AI to reality will matter. That's what we're building.
Every year, crypto projects publish roadmaps.
Neat timelines. Color-coded phases. Quarterly milestones with checkmarks that nobody checks.
We're not doing it this year.
Instead, here's our Anti-Roadmap for 2026 👇
Early testers of Gemini 3 Deep Think are already seeing results.
We partnered with researchers to explore how this model could tackle rigorous, real-world applications — from spotting hidden flaws in research papers to optimizing semiconductor growth.
Here’s how early testers are using Gemini 3 Deep Think to help solve the "unsolvable" 🧵↓
Introducing M2.5, an open-source frontier model designed for real-world productivity.
- SOTA performance at coding (SWE-Bench Verified 80.2%), search (BrowseComp 76.3%), agentic tool-calling (BFCL 76.8%) & office work.
- Optimized for efficient execution, 37% faster at complex tasks.
- At $1 per hour with 100 tps, infinite scaling of long-horizon agents now economically possible
MiniMax Agent: https://t.co/aIzrFYcfUz
API: https://t.co/fHRdSV7BwZ
CodingPlan: https://t.co/FDhZBBjQrX
We're excited & grateful to have Xoogler as one of the lead investors in MachineFi Lab's recent $10M fundraise. Their expertise as well as financial support will let us scale & realize our vision for a MachineFi economy. @XooglerCo https://t.co/oKeLFR1MXA
IoTeX's MachineFi Lab has raised $10M at a $100M valuation, led by @SamsungNext, @Jump_, and @DraperDragon. 🌎
As the core developer of IoTeX, @MachineFiLab is an entity that builds infrastructure and tools to empower #MachineFi developers.
➡️ Learn more: https://t.co/IP6zGzQr5C $IOTX