Do RL solutions share a common structure? We show that all solutions of Reinforcement Learning lie on a hyperplane. Our work, Proto Successor Measure, learns this abstraction of the MDP to do zero-shot RL for any reward function. (1/n)
Two BIG updates for the RLBrew Workshop at #RLC2026!
1️⃣ Dual submissions are welcome
2️⃣ We'll be awarding a Best Paper RLBrew Award
You have 2 DAYS LEFT to submit. Deadline: May 29!
Details: https://t.co/segLTne6Tp
Reminder! The RLBrew deadline is coming up in 7 days! Submit your work soon 👩‍💻
Reminder that we accept papers under review! This is a good place to discuss your ideas and get feedback from the community.
Introducing RLDP: a simple, scalable approach for building strong Behavioral Foundation Models. #ICLR2026
Robust objective: avoids brittle unsupervised RL objectives while staying simple and scalable.
No data-coverage limitations: works across a wide variety of datasets.
One latent space for all tasks: a unified representation space.
Zero-shot control: specify a reward and directly obtain a policy/behavior, with no additional training required.
I'll be attending #NeurIPS2025 and presenting our work, "RLZero: Direct Policy Inference from Language Without In-Domain Supervision." Excited to chat about unsupervised RL, reasoning, and RL more broadly. I'm also exploring industry opportunities, so feel free to reach out!
Intelligent humanoids should have the ability to quickly adapt to new tasks by observing humans
Why is such adaptability important?
Real-world diversity is hard to fully capture in advance
Adaptability is central to natural intelligence
We present MimicDroid
https://t.co/J8XpND9j1j
I'll be at #ICML2025 presenting our paper, "Proto Successor Measure: Representing the Behavior Space of an RL Agent". Excited to connect with others working on unsupervised RL and RL more broadly. I'm also on the lookout for research collaborations and industry opportunities.
Most assistive robots live in labs.
We want to change that.
FEAST enables care recipients to personalize mealtime assistance in-the-wild, with minimal researcher intervention across diverse in-home scenarios.
Outstanding Paper & Systems Paper Finalist @RoboticsSciSys
🧵1/8
Say ahoy to SAILOR⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! SAILOR ⛵ outperforms Diffusion Policies trained via behavioral cloning on 5-10x the data!
⚠️ Reminder! Submissions for @RL_Conference's RL beyond Reward Workshop are due May 30 (AoE)!
We are brewing an interesting program and seeking innovative research work in reward-free RL. All papers are welcome, from exploratory abstracts to complete research papers.
It's exciting to think about the capabilities of zero-shot RL as foundation models. Using our work RLZero, you can specify your task as natural language prompts and expect zero-shot policy generation for embodied agents. (1/n)
Introducing RL Zero: a new approach to transform language into behavior zero-shot for embodied agents without labeled datasets! RL Zero enables prompt-to-policy generation, and we believe this unlocks new capabilities in scaling up language-conditioned RL, providing an interpretable link between RL agents and humans and achieving true cross-embodiment transfer.
RLZero imagines a trajectory for the language prompt, which is then used to produce a policy through zero-shot imitation learning. This opens interesting avenues for applying our recent work on zero-shot RL, PSM (https://t.co/YoRzN1UcGc) (2/n)