How should an RL agent leverage expert data to improve sample efficiency?
Imitation losses can overly constrain an RL policy.
In RL via Implicit Imitation Guidance, we show how to use expert data to guide more efficient *exploration*, avoiding pitfalls of imitation-augmented RL.
Not all human-collected demos are created equal:
✔️ All are successful
❌ But some strategies are unreliable or brittle
This can hurt final performance.
Demo-SCORE self-curates reliable training data using online experience.
Paper and videos: https://t.co/FQbQqv10rU
Data curation is crucial for post-training recipes.
But how do we curate?
Curation is usually manual & tedious. And it's hard to tell whether a strategy in the data will be reliable!
We introduce an automatic way to curate, informed by the robot's policy learning.
Very excited to introduce ROAM, our new work that allows a robot to *adapt on-the-go* as it faces OOD situations during deployment, drawing on pre-trained behaviors.
Watch as ROAM enables our Go1 to roller skate zero-shot 🤖🐕🛼 (without any lessons!)
🧵(1/9)
Reviewing #ZooGems @NASAHubble data, I was struck once again by the sense of depth in this view. The lower band of stars is the edge-on galaxy UGC 12281 at 110 million light-years, the barred spiral is at 1.5 Gly, and the edge-on spiral is farther still. Galaxies beyond galaxies beyond galaxies.
Henri Poincaré, summarizing how I feel about mathematics: "It is by logic that we prove, but by intuition that we discover."
("C'est par la logique qu'on démontre, c'est par l'intuition qu'on invente.")
My heart has always been with the intuition.
We're all spun up! Get up (or stay up) and join the #MarsHelicopter team live in mission control to find out if the first powered flight attempt on another planet was successful.
Mission Control Live
April 19 at 3:15am PDT (6:15am EDT/1015 GMT)
https://t.co/SCVeYsIfBh
Josh Hawley is modern conservatism. He commits voter fraud, but mendaciously accuses Democrats of committing voter fraud. He votes for large corporate tax cuts, yet claims to represent the working class. He rhapsodizes about political virtue, but slavishly supports Donald Trump.