How should an RL agent leverage expert data to improve sample efficiency?
Imitation losses can overly constrain an RL policy.
In RL via Implicit Imitation Guidance, we show how to use expert data to guide more efficient *exploration*, avoiding the pitfalls of imitation-augmented RL.
We’re rolling out an upgrade designed to help robots reason about the physical world. 🤖
Gemini Robotics-ER 1.6 has significantly better visual and spatial understanding in order to plan and complete more useful tasks. Here’s why this is important 🧵
And it's even better in-person! Got to see Memo live a few weeks ago and it's such a great design :) Love the gloves, seems to enable a scalable path to high quality data. Huge congrats to the team, especially @tonyzzhao so impressed by your resilience over the past 2 years!
Today, we present a step-change in robotic AI @sundayrobotics.
Introducing ACT-1: A frontier robot foundation model trained on zero robot data.
- Ultra long-horizon tasks
- Zero-shot generalization
- Advanced dexterity
🧵->
@ShuozheL Thank you! The Cholesky decomp gives an easy way to ensure a symmetric positive definite covariance matrix, but other modeling choices for representing the noise distribution would be interesting to try
@KyleMorgenstein Thanks Kyle! :) We chose the Cholesky decomp just because it gave an easy way to ensure a symmetric positive definite covariance matrix, but it would be interesting to experiment with other modeling choices for representing the noise distribution!
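A minimal sketch of the Cholesky idea mentioned in these replies. This is not the paper's implementation. The function names, the softplus on the diagonal, and the `eps` floor are all assumptions made for illustration. An unconstrained vector fills a lower-triangular matrix L whose diagonal is forced positive, so L Lᵀ is symmetric positive definite by construction.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); non-negative for any finite x.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def cholesky_factor(raw, dim, eps=1e-6):
    """Fill a lower-triangular matrix row by row from unconstrained values.

    Hypothetical helper: `raw` stands in for whatever a network would output.
    """
    assert len(raw) == dim * (dim + 1) // 2
    L = [[0.0] * dim for _ in range(dim)]
    k = 0
    for i in range(dim):
        for j in range(i + 1):
            # Diagonal entries are made strictly positive (softplus + eps);
            # off-diagonal entries are left unconstrained.
            L[i][j] = softplus(raw[k]) + eps if i == j else raw[k]
            k += 1
    return L

def covariance(L):
    """Sigma = L L^T, symmetric positive definite when diag(L) > 0."""
    dim = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def sample_noise(L, rng):
    """Draw zero-mean noise with covariance L L^T from standard normal draws."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(len(L))) for i in range(len(L))]

raw = [0.3, -1.2, 0.5, 2.0, 0.1, -0.7]  # arbitrary example values
L = cholesky_factor(raw, 3)
sigma = covariance(L)
```

Sampling only needs L (for example `sample_noise(L, random.Random(0))`), so the full covariance never has to be formed or inverted during rollouts.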
Paper link: https://t.co/r2CAtcbP6D
Super fun project co-led with @perryadong @AlecLessing and with @chelseabfinn.
Excited to further push sample efficiency for RL with better exploration priors!
🚨 We’re thrilled to announce our ICCV 2025 Workshop: MMRAgI – Multi-Modal Reasoning for Agentic Intelligence! 🚨
🌐 Homepage: https://t.co/fJLQrRJH1D
📥 Submit: https://t.co/1aprj1K4gj
🗓️ Submission Deadline (Proceeding Track): June 24th 2025 23:59 AoE
🗓️ Submission Deadline (Non-Proceeding Track): July 24th 2025 23:59 AoE
AI Agents are evolving fast — but true intelligence needs reasoning across modalities. Vision, language, audio… it’s time to unify them.
From digital and virtual agents to wearable and physical embodiments, agentic intelligence is reshaping how AI interacts with the world. As agents increasingly engage in 3D perception and geo-centric reasoning, bridging modalities with spatial understanding is more critical than ever.
💡 Join us to explore the frontiers of multi-modal agents:
• Reasoning with MFM-powered agents
• Applications in OS copilots, Scientific Agents, Digital Agents, Virtual Agents, Wearable Agents and Embodied Agents!
• Challenges in alignment, evaluation, efficiency, and robustness
📝 Call for Papers is now OPEN!
📅 Workshop: Oct 19–20 2025
Whether you work on models, methods, or applications — we want to hear from you!
#ICCV2025 #MMRAgI #MultimodalAI #AIagents #LLM #MFM #EmbodiedAI #3DVision
How can robots problem solve in novel environments?
We combine high-level VLM reasoning with low-level controllers to enable test-time problem solving.
Paper & code: https://t.co/rWUsQ3NuhJ
How can robots autonomously handle ambiguous situations that require commonsense reasoning?
*VLM-PC* provides adaptive high-level planning, so robots can get unstuck by exploring multiple strategies.
Paper: https://t.co/UmR6raIPiW
Leveraging VLMs in this way allows a robot to handle (zero-shot!) a wide range of complex real-world situations that would otherwise require environment-specific engineering or human guidance.
Data curation is crucial for post-training recipes.
But how do we curate?
Curation is usually manual & tedious. And, it's hard to tell if a strategy in the data will be reliable!
We introduce an automatic way to curate, informed by the robot's policy learning.