Most egocentric datasets train robots on 5s tabletop tasks.
Real work doesn’t look like that.
We built EgoView: industrial egocentric data with long-horizon, multi-step workflows (20s+) and strict filtering for training-grade quality.
If you’re building physical AI, this matters ↓
https://t.co/ipWnsMci99
#PhysicalAI #Robotics #RobotLearning #AIData #TrainingData #EmbodiedAI #DeepTech #AIInfrastructure
Attended Ken Goldberg’s ICRA keynote. His “two cultures” framing (after C.P. Snow) is the cleanest articulation I’ve seen of where robotics sits in 2026.
Model-based (classical control, Kalman/PID/motion planning—modular, verifiable) vs. model-free (VLA: pixels in, actions out, end-to-end).
He believes in the bitter lesson, but reframes the question from whether scale wins to when. Three constraints he laid out:
(1) Data gap is structural. Robot state spaces hit 50+ DoF (bimanual humanoids); training data trails LLMs by ~5 orders of magnitude. And data ≠ everything—Waymo > Tesla on operational metrics with far less of it.
(2) VLA “pseudo-generality.” Open X-Embodiment’s 160K tasks are mostly pick-and-place under the hood. Brittleness is the tell: 100% on LIBERO → ~17% when the target shifts 2cm. OOD = failure.
(3) The synthesis: GAP. Agents auto-generate decomposable, verifiable ROS-style computation graphs, self-improve via execute→evaluate→revise, then distill to lightweight deployable graphs. 95–98% on cable insertion, pot-on-stove, robust to variation—well past VLA brittleness on the same benchmarks.
The concept I find most useful: “variational automation.” Between fixed automation and full generality—same task, varying initial conditions, structurally similar. The regime where the data flywheel is actually economically viable.
Takeaway: VLA hasn’t reached robotics’ ChatGPT moment. The near-term path is agent-generated verifiable structure—classical reliability × large-model generality—not pure scale.
@tom_jiahao Specs don’t solve manipulation.
You can have payload, DoF, and sensing, and still fail in real workflows.
What’s missing is training-grade interaction data,
especially for long-horizon, multi-step tasks.
That’s where most systems break today.
Agree?
Had a blast at the World Models Summit by @TELUS & @PlugandPlayTC!
Key takeaways:
Yann LeCun on the future of World Models.
Insights on data value from the panel w/ @mylesliu, Tory Smith, Ben Levin & Rajesh Radhakrishnan.
Great to catch up with Yow-Hann Lee to explore data eval collabs, and Ben V. on Gaussian Splatting vs. Lidar+Vision. 🤖✨
Thanks @scepike & Denny Hao for the invite!
#WorldModels #EmbodiedAI #Robotics #AI #DeepReachAI #PnP
Inspired by @YannLecun on world models and embodied intelligence.
Thanks TELUS Digital for the invitation. Excited to discuss how DeepReachAI is building the perception data layer.
#PhysicalAI #WorldModel #EmbodiedAI #Robotics https://t.co/6iTJI3D6HD