I think we could use this tech to speed up the training of our Agents, but we'd need to mix things up with a wider range of scenarios. The way things are going now, AI Agents are learning at a snail's pace.
Amazing! “a GPU-powered simulation suite that accelerates physics by 10,000x faster than real time. To put the number in perspective, the robots undergo 1 year of intense training in a virtual “dojo”, but take only ~50 minutes of wall clock time on one GPU card.”
Not every foundation model needs to be gigantic. We trained a 1.5M-parameter neural network to control the body of a humanoid robot. It takes a lot of subconscious processing for us humans to walk, maintain balance, and maneuver our arms and legs into desired positions. We capture this “subconsciousness” in HOVER, a single model that learns how to coordinate the motors of a humanoid robot to support locomotion and manipulation.
We trained HOVER in NVIDIA Isaac, a GPU-powered simulation suite that runs physics 10,000x faster than real time. To put the number in perspective, the robots undergo 1 year of intense training in a virtual “dojo”, but it takes only ~50 minutes of wall clock time on one GPU card. The neural net then transfers zero-shot to the real world without finetuning.
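A quick sanity check of the "1 year in ~50 minutes" figure, using only the 10,000x speedup quoted above. The function name is made up for illustration, and a 365-day year is assumed:

```python
# Back-of-the-envelope check: how long does 1 simulated year take
# at a 10,000x real-time speedup? (Assumes a 365-day year.)
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 simulated minutes


def wall_clock_minutes(sim_minutes: float, speedup: float) -> float:
    """Wall-clock time needed to simulate `sim_minutes` at the given speedup."""
    return sim_minutes / speedup


minutes = wall_clock_minutes(MINUTES_PER_YEAR, 10_000)
print(f"{minutes:.1f} minutes")  # prints "52.6 minutes"
```

So the "~50 minutes" is the rounded-down version of roughly 52.6 minutes.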
HOVER can be *prompted* for various types of high-level motion instructions that we call “control modes”. To name a few:
- Head and hand poses: can be captured by XR devices like Apple Vision Pro.
- Whole-body poses: via MoCap or RGB camera.
- Whole-body joint angles: Exoskeleton.
- Root velocity command: Joysticks.
What HOVER enables:
- A unified interface for us to control the robot using whichever input devices are convenient at hand.
- An easier way to collect whole-body teleoperation data for training.
- An upstream Vision-Language-Action model to provide motion instructions, which HOVER translates to low-level motor signals at high frequency.
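To make the "unified interface" idea concrete, here is a rough sketch of how a single command object could carry any subset of the control modes listed above. This is not HOVER's actual API: the class name, field names, and vector shapes are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class MotionCommand:
    """Hypothetical command container: any subset of fields may be set."""

    head_hand_poses: Optional[Sequence[float]] = None  # e.g. from an XR device
    body_poses: Optional[Sequence[float]] = None       # e.g. from MoCap or an RGB camera
    joint_angles: Optional[Sequence[float]] = None     # e.g. from an exoskeleton
    root_velocity: Optional[Sequence[float]] = None    # e.g. from a joystick

    def active_modes(self) -> list[str]:
        """Names of the control modes that were actually provided."""
        return [name for name, value in vars(self).items() if value is not None]


# A joystick-only command and a mixed XR + joystick command use the same type.
joystick_cmd = MotionCommand(root_velocity=[0.5, 0.0, 0.0])
mixed_cmd = MotionCommand(head_hand_poses=[0.0] * 9, root_velocity=[0.2, 0.0, 0.1])
print(joystick_cmd.active_modes())  # ['root_velocity']
print(mixed_cmd.active_modes())     # ['head_hand_poses', 'root_velocity']
```

The point of the sketch is only that one policy can accept whichever inputs happen to be present, whether they come from a teleoperation device or from an upstream model.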
HOVER supports any humanoid that can be simulated in Isaac. Bring your own robot, and watch it come to life!
It's a big team effort from NVIDIA GEAR Lab and collaborators: 🧵
arXiv -> alphaXiv
Students at Stanford have built alphaXiv, an open discussion forum for arXiv papers. @askalphaxiv
You can post questions and comments directly on top of any arXiv paper by changing arXiv to alphaXiv in any URL!
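The URL swap is simple enough to script. A minimal sketch, assuming the forum lives at the host `alphaxiv.org` and mirrors arXiv's path layout (the helper name `to_alphaxiv` is made up here):

```python
from urllib.parse import urlsplit, urlunsplit


def to_alphaxiv(url: str) -> str:
    """Rewrite an arXiv URL to the matching alphaXiv URL (host swap only)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host not in ("arxiv.org", "www.arxiv.org"):
        raise ValueError(f"not an arXiv URL: {url}")
    # Assumption: alphaXiv keeps the same path, e.g. /abs/<paper id>.
    new_host = host.replace("arxiv.org", "alphaxiv.org")
    return urlunsplit(parts._replace(netloc=new_host))


print(to_alphaxiv("https://arxiv.org/abs/1706.03762"))
# https://alphaxiv.org/abs/1706.03762
```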
🗣️I've been thinking about data quality & the human factor in the process a lot lately, so I wrote a short post on the topic: https://t.co/onWGvdeGVP
More: If you are into the topic, my team is hiring a Research Engineer for a new sub-team, Human-AI Interaction: https://t.co/jwqp3YT4rR
Noise is a secret destroyer of productivity.
It is secret because it impacts cognition, not effort, so we don’t notice, but a 10 dB noise increase (from a dishwasher to a vacuum) lowers productivity by 5%. Noise is also greater in poorer neighborhoods... https://t.co/RaIAsSudyY
Here are 7 challenges that AI engineers must solve in order to build large-scale intelligent agents (“LLM OSes”):
1️⃣ Accuracy: making sure agents can solve hard tasks well
2️⃣ Moving beyond serial execution: identifying parallelizable tasks and running them in parallel
3️⃣ Reliability: making sure agents don’t get lost in the middle of intermediate steps
4️⃣ Testability: creating unit tests for specific code paths
5️⃣ Long-term planning: making LLMs plan not just 1-5, but 10 or even 100 steps into the future
6️⃣ Debuggability: moving beyond reading the agent's printed thoughts/reasoning
7️⃣ Fault tolerance: recovering from wrong decisions and adapting
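As a toy illustration of challenge 2, here is one way an agent runtime might find tasks whose dependencies are already satisfied and run them concurrently. This is a generic sketch, not any specific framework's implementation; the task graph and the `run_tool` stand-in are hypothetical.

```python
import asyncio

# Hypothetical task graph: task name -> names of the tasks it depends on.
TASKS = {
    "search_a": [],
    "search_b": [],
    "compare": ["search_a", "search_b"],
}


async def run_tool(name: str, inputs: dict) -> str:
    """Stand-in for a real tool call; just echoes the inputs it received."""
    await asyncio.sleep(0.01)
    return f"{name}({', '.join(sorted(inputs))})"


async def run_graph(tasks: dict) -> tuple[dict, list]:
    """Run tasks in waves; every task in a wave executes concurrently."""
    results, waves = {}, []
    pending = dict(tasks)
    while pending:
        # A task is ready once all of its dependencies have results.
        ready = [t for t, deps in pending.items() if all(d in results for d in deps)]
        if not ready:
            raise ValueError("cycle or missing dependency in task graph")
        outputs = await asyncio.gather(
            *(run_tool(t, {d: results[d] for d in pending[t]}) for t in ready)
        )
        results.update(zip(ready, outputs))
        waves.append(ready)
        for t in ready:
            del pending[t]
    return results, waves


results, waves = asyncio.run(run_graph(TASKS))
print(waves)  # [['search_a', 'search_b'], ['compare']]
```

The two searches share a wave because neither depends on the other; `compare` waits for both.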
We were honored to host @sehoonkim418 and @amir__gholami to present their work on LLMCompiler: an agent compiler for parallel multi-function planning/execution, and also chat about the future of agents at large.
Check it out, it's now live on YouTube! The authors have also graciously shared the slides so you can peruse them at your leisure 👇
Video: https://t.co/OIMMeMuY30
Slides: https://t.co/pyJJ8dKm51
Paper: https://t.co/wIBD2BHAxq
on november 6, we’ll have some great stuff to show developers! (no gpt-5 or 4.5 or anything like that, calm down, but still i think people will be very happy…)
https://t.co/QH1mpXzoqp
1/4 Regarding super individuals, we hope to complete the closed loop from consultation ➡ incubation ➡ investment within two years. Expert models, exclusive data, and multi-agent systems will all enable individuals to offer scalable business services to the entire network.
3/4 The emergence of concepts such as "super individuals," "digital new youth," "super producers," and so on actually represents a clear trend: a step change in individual productivity and in the relations of production.
I believe that there is no all-knowing and all-powerful god in the universe. Human progress is always accompanied by biases, and history has never defined the unknown based on known local information. All distant truths are deductions of logic and assumptions based on facts.
The collective consensus may be the bias of the times. We should be cautious of how obvious common sense can mislead our judgment of the truth. We should maintain an independent and self-critical spirit and approach the concept of the metaverse with an open and inclusive mindset.