🚀 I'm excited to share our new paper: SegDAC: Segmentation-Driven Actor-Critic for Visual Reinforcement Learning
🧠 SegDAC combines large vision models with online RL to reason about its environment at the object and sub-object level, avoiding noisy pixel-level reasoning.
🛠️ Using YOLO-World and SAM, SegDAC breaks the scene into semantically meaningful segments and learns to attend to a variable number of segments and proprioceptive signals, focusing on the most relevant information to complete the task.
⚡ Trained purely with online RL, without human labels or demonstrations.
🏆 Outperforms previous online RL state-of-the-art methods across all difficulty levels on our challenging visual generalization benchmark, with up to 2x better visual generalization in the hardest setting.
📄 Paper: https://t.co/hGfOlVf6O0
🌐 Project Page: https://t.co/edcgnRPrYw
Work done with @GlenBerseth at @Mila_Quebec
#ReinforcementLearning #RobotLearning #ArtificialIntelligence #Robotics
Whole body controllers - effective with contact rich behaviors - are the unsung HEROs of robotics🦸♂️
Without them all we will have - is a bunch of over powerful pincers picking & placing tiny objects on the table.
(a bit harsh but true)
4 years ago, we finished dead-last in the NHL
Today, we are headed to the Final 4, after beating two other top-5 teams, all while being the second youngest team in the entire league.
This is the BEGINNING of the #GoHabsGo era. We’re nowhere near our prime.
@GlenBerseth My intuition is that with some regularization in place to prevent useless continuous rotations, no joint limit might actually help getting out of weird and awkward joint states (just continue the rotation instead of needing to undo a motion).
The bitter lesson in 26 words:
Don’t be distracted by human knowledge, as AI has been historically.
Instead focus on methods for creating knowledge that scale with computation, like search and learning.
World models are becoming a powerful approach for making the most of available data, but how do we create them to help build better agents? Come check out this workshop at @RL_Conference and submit related ideas!
Launching Agentick 🤖🧠
A unified benchmark for training and evaluating general sequential decision-making agents.
RL agents, LLMs, VLMs, hybrids, bots, and humans can all be evaluated on:
same tasks. same seeds. same score.
First result: no single agent dominates.
🧵
It's clever, saying you're restricting access to your model because it's dangerous, when the real reason is you don't have the compute. Makes you look more powerful instead of less.