Andor @PakotAndor - Twitter Profile

Andor @PakotAndor

1 day ago

0

1

PakotAndor retweeted

Jaskirat Singh

@jaskirat

14 days ago

This is GVHMR, a method for recovering world grounded human motion from monocular video.

7

458

47

358

35K

PakotAndor retweeted

Jiashun Wang

@JiashunWang

15 days ago

Reference motions are often used as trajectories to track or teachers to distill. We explore a different way of learning from them. I am excited to share our work, Generalizing from References (GfR), to appear at RSS 2026, as a follow-up to our previous HIL work. Using a unified multi-task RL framework, we jointly train reference-guided imitation and goal-driven RL within a single end-to-end policy. No distillation. No RL fine-tuning. Just one policy, trained end-to-end, that learns from references and generalizes beyond them. Rather than treating reference motions as trajectories to track, distill, or follow, we use them to shape behavior while allowing RL to explore and adapt beyond the references. In the following example, without human joystick control, the robot can autonomously compose learned skills using only task goals. 🌐 https://t.co/pdMWBWgtCY 🤖 Things beyond locomotion coming soon.

3

145

27

82

9K

PakotAndor retweeted

Louis Le Lay @LouisLeLay4

about 1 month ago

boston dynamics uses mjlab btw :)

9

581

50

300

76K

PakotAndor retweeted

about 2 months ago

Sports like tennis are great examples of the sort of dynamic whole-body interaction that’s possible with humanoid robots. But capturing examples of fast, dynamic interactions from humans is really difficult. Enter LATENT, which uses lower-quality human data plus reinforcement learning to teach a robot to play tennis well enough to complete back-and-forth volleys at a human level. LATENT has three steps: (1) collecting imperfect human data like a backswing, (2) using this data to learn a latent action space, and (3) training a high-level policy in simulation that can compose these actions and execute tennis skills on a robot. @josh00_lu and @LianYunrui join us to tell us about their method. Watch Episode #80 of RoboPapers, with @chris_j_paxton and @DJiafei, now to learn more!

0

59

14

56

30K

Andor @PakotAndor

about 2 months ago

0

2

PakotAndor retweeted

Tom Dörr

@tom_doerr

3 months ago

Control CAD software with AI via Claude Desktop https://t.co/dtGku2BYS6

6

442

33

443

23K

PakotAndor retweeted

Yunsong Zhou

@Yunsong_Zhou

3 months ago

🧐 Simulation has long promised robot pretraining, but breaks at the moment of real-world deployment. 🚀 Today, we introduce SIM1: the first real-to-sim-to-real paradigm where the generative world becomes the same one as reality. SIM1 produces simulation data whose execution is directly valid in the physical world, enabling policies trained entirely in simulation to transfer zero-shot, at scale. 📈 This unlocks a new scaling law for robotics: we scale intelligence without scaling real-world data. ✨ Few demonstrations in, real-world policies out. Simulation is no longer a proxy; it is supervision itself. https://t.co/Kp1YBe5Gmf https://t.co/GG2SBQfPpG

25

406

63

320

68K

PakotAndor retweeted

Kanika

@KanikaBK

3 months ago

🚨 JUST IN: MICROSOFT just open sourced a VOICE AI THAT TRANSCRIBES 60 MINUTES OF AUDIO in a single pass. 100% FREE.

It knows who spoke. It knows when they spoke. It knows exactly what they said. All in one shot. No chunking. No context loss.

It's called VibeVoice. Not a transcription tool. Not a basic speech to text wrapper. A frontier voice AI family with ASR, TTS, and real time streaming. All open source. All free.

Here's what it actually does 👇

VibeVoice ASR - Speech Recognition:
→ Processes 60 minutes of continuous audio in a single pass
→ Never slices audio into chunks so global context is never lost
→ Identifies WHO spoke, WHEN they spoke and WHAT they said simultaneously
→ Supports customized hotwords for domain specific accuracy
→ Works in 50+ languages natively
→ Already adopted by Hugging Face Transformers library
→ Already being built on by the open source community BY PEOPLE WHO HAD NO IDEA THIS LEVEL OF ACCURACY WAS ALREADY FREE.

VibeVoice TTS - Text to Speech:
→ Generates up to 90 minutes of speech in a single pass
→ Supports up to 4 distinct speakers in one conversation
→ Natural turn taking and speaker consistency throughout
→ Expressive speech that captures emotional nuances
→ Supports English, Chinese and multiple other languages

VibeVoice Realtime - Streaming TTS:
→ Only 300 millisecond first audible latency
→ Streams text input in real time
→ 0.5B parameters so it actually deploys anywhere
→ Robust long form generation up to 10 minutes
→ Lightweight enough for production use today

The core innovation nobody is talking about: Most voice AI models slice long audio into short chunks. Every time they slice, they lose context. Speaker tracking breaks. Semantic coherence breaks. Accuracy drops. VibeVoice uses continuous speech tokenizers running at an ultra low frame rate of 7.5 Hz. This preserves audio fidelity while dramatically boosting computational efficiency. The entire 60 minutes stays in context. Nothing gets lost. Nobody gets misidentified.

The numbers:
→ VibeVoice ASR 7B - available now on Hugging Face
→ VibeVoice Realtime 0.5B - try it on Colab right now
→ 50+ supported languages
→ 11 distinct English voice styles
→ 9 multilingual speaker voices
→ Already integrated into Hugging Face Transformers
→ Finetuning code now available

The wildest part? A voice powered input method called Vibing just built itself on top of VibeVoice ASR. Available on macOS and Windows right now. The open source community is already shipping products on top of this.

100% Open Source. Free to use. Free to fine tune. Free to build on.

🔖 Save this before your competitors find it first. 👇

48

3K

366

4K

221K
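To put the 7.5 Hz figure quoted in the post above in perspective, here is a rough back-of-the-envelope sketch of how many tokenizer frames a long recording would produce. It assumes one frame per tokenizer step and nothing else; the function name and the 50 Hz comparison rate are hypothetical illustrations, not figures from the post or from VibeVoice itself.

```python
def frames_in_context(minutes: float, frame_rate_hz: float = 7.5) -> int:
    """Number of tokenizer frames needed to cover `minutes` of audio.

    Assumes exactly one frame per tokenizer step (an illustrative
    simplification, not a statement about the real model).
    """
    return round(minutes * 60 * frame_rate_hz)


# Figures from the post: 60-minute ASR pass and 90-minute TTS pass at 7.5 Hz.
asr_frames = frames_in_context(60)
tts_frames = frames_in_context(90)

# Hypothetical comparison rate of 50 Hz, chosen only for illustration.
comparison_frames = frames_in_context(60, frame_rate_hz=50.0)

print(asr_frames, tts_frames, comparison_frames)
```

Under these assumptions an hour of audio comes to 27,000 frames at 7.5 Hz, which suggests why a low frame rate could make it feasible to keep a whole recording in a single context.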

PakotAndor retweeted

mitch @mitchaiet

4 months ago

Introducing G1 Moves! 60 open-source motion capture clips + trained RL policies for the Unitree G1 humanoid robot. Come see live robot mocap and interactive roasts at the Dell booth at #GTC this week! https://t.co/JU33zsTfWH #DellProPrecision #DellTech #NVIDIA #Robotics

7

199

31

106

24K

PakotAndor retweeted

Harrison Kinsley

@Sentdex

3 months ago

Introducing KimoLab: Kimodo + mjlab for prompt to physics-based motion matching for the unitree g1. I accidentally had claude make this and it's actually pretty cool.

10

341

36

161

18K

PakotAndor retweeted

Tom Dörr

@tom_doerr

3 months ago

Web-based robot viewer and simulator https://t.co/oCAmGFoCU1

5

665

90

529

26K

PakotAndor retweeted

BXI Robotics

@bxirobotics

4 months ago

beautiful sparks 🎆

6

119

13

15

11K

PakotAndor retweeted

Xueyan Zou

@xyz2maureen

4 months ago

Our Latent Encoder-Decoder code base is fully open sourced, you can train and visualize the latent space: Code⚙️: https://t.co/wrYfuOAi5F ArXiv 📚: https://t.co/tu8DjOyJBr #CVPR2026

1

125

35

103

19K

PakotAndor retweeted

Tom Dörr

@tom_doerr

4 months ago

LiDAR SLAM system with 400 FPS performance https://t.co/wpg8GDfSnY

2

446

42

354

21K

PakotAndor retweeted

Tom Dörr

@tom_doerr

4 months ago

Robot simulations with policy control in a browser https://t.co/mg8h5I25bK

5

184

36

140

10K

PakotAndor retweeted

Jinrui Han @jinrui82

5 months ago

Introducing our new project, HUSKY: Humanoid Skateboarding System via Physics-Aware Whole-Body Control! 🎉🎉 Powered by the mjlab framework, humanoid skateboarding is becoming a reality right now! 🙌 View our work at https://t.co/tlHETwj1Jr. Paper available at https://t.co/SO29rGw3g0.

20

414

70

123

67K

PakotAndor retweeted

Jinrui Han @jinrui82

4 months ago

Our HUSKY code is now open source! Humanoid skateboarding powered by mjlab 🛹 Check it out: https://t.co/eqcH9czBvG

9

258

53

114

22K

PakotAndor retweeted

stash

@stash_pomichter

4 months ago

Openclaw can now understand physical space and temporality. Integrate with any lidar, stereo, rgb camera. Fully open source. Video below is our openclaw on a Unitree G1 humanoid. We integrate with most drones, quadrupeds as well.

141

4K

555

3K

438K

PakotAndor retweeted

Ilir Aliu

@IlirAliu_

4 months ago

Robots can now reconstruct 3D scenes in real time from a single RGB camera. [📍 Projects page + paper]

No depth sensor. No retraining. 30 FPS.

Researchers at Imperial College London introduced KV-Tracker, a training-free method that makes heavy models like π³ and Depth Anything 3 fast enough for real-time tracking.

The idea is simple. These models use global self-attention, which is powerful but computationally expensive. KV-Tracker caches the key and value pairs from selected keyframes and reuses them for new frames. That cache becomes an implicit scene representation.

Result:
• Up to 30 FPS
• 10 to 15x speedup
• Accurate 6-DoF tracking on benchmarks like TUM RGB-D and 7-Scenes
• Works with monocular RGB only

It also supports object-level tracking with masks and allows saving the KV-cache for later reuse. For robotics, this reduces hardware constraints and moves real-time 3D perception closer to practical deployment.

Credit to Marwan Taher (@marwan_ptr) at Imperial’s Dyson Robotics Lab and many others who contributed to this!

📍 Save projects page + paper for later: https://t.co/IxgTl19zZ7
Video: https://t.co/wVxLhd6VpP

If it matters in AI or Robotics you'll read it here first: https://t.co/9Nm01QUcw3

15

853

112

799

51K
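The caching idea described in the post above (store the key and value pairs from keyframes, then let new frames attend to them) can be illustrated with a toy single-head attention sketch. This is not the KV-Tracker implementation: the class `KeyframeKVCache`, the function `attend`, and the hand-written 2-D vectors are all hypothetical, and a real model would obtain keys, values and queries from learned projections of image tokens.

```python
import math


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * v[i] for w, v in zip(weights, values)) / total
        for i in range(dim)
    ]


class KeyframeKVCache:
    """Toy cache: keyframe keys/values are stored once and reused later."""

    def __init__(self):
        self.keys = []
        self.values = []

    def add_keyframe(self, keys, values):
        # Computed once per keyframe, never recomputed for later frames.
        self.keys.extend(keys)
        self.values.extend(values)

    def query(self, query):
        # A new frame only supplies its query; the cache supplies K and V.
        return attend(query, self.keys, self.values)


cache = KeyframeKVCache()
cache.add_keyframe(keys=[[1.0, 0.0], [0.0, 1.0]],
                   values=[[10.0, 0.0], [0.0, 10.0]])

# A query aligned with the first key mostly retrieves the first value.
out = cache.query([5.0, 0.0])
print(out)
```

In this sketch the saving comes from `add_keyframe` running once per keyframe while `query` runs per incoming frame; how keyframes are selected and how the cache is bounded are left out entirely.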
