Intern Robotics @internrobotics - Twitter Profile

Pinned Tweet

11 months ago

InternRobotics is open-source! 🚀 A Sim-Data-Train/Eval inclusive engine for Embodied AI: ⚙️ 1-line sim deploy 📦 Massive hybrid datasets 🧠 One-click training & eval across 50+ models 🔗 Click to explore: https://t.co/iTV3e12r8U

1

12

6

1

2K

InternRobotics retweeted

Yunsong Zhou

@Yunsong_Zhou

3 months ago

🧐 Simulation has long promised robot pretraining, but breaks at the moment of real-world deployment. 🚀 Today, we introduce SIM1: the first real-to-sim-to-real paradigm where the generative world becomes the same one as reality. SIM1 produces simulation data whose execution is directly valid in the physical world, enabling policies trained entirely in simulation to transfer zero-shot, at scale. 📈 This unlocks a new scaling law for robotics: we scale intelligence without scaling real-world data. ✨ Few demonstrations in, real-world policies out. Simulation is no longer a proxy; it is supervision itself. https://t.co/Kp1YBe5Gmf https://t.co/GG2SBQfPpG

25

406

63

320

68K

Intern Robotics

@InternRobotics

6 months ago

@ModelScope2022 The paper is coming: https://t.co/e6l1B904s8

0

2

0

55

InternRobotics retweeted

ModelScope

@ModelScope2022

6 months ago

🤖 Introducing InternVLA-A1, now fully open-sourced! Many VLA models follow instructions well in static scenes but struggle in dynamic environments (conveyor belts, rotating platforms, multi-robot setups). Why? They see the present but can’t imagine the future. The InternVLA-A1 solution: unify perception, imagination, and action in one model: ✅ Scene understanding: Image + text → task parsing ✅ Task imagination: Predict future frames → reason about dynamics ✅ Guided control: Execute actions steered by visual foresight. Powered by InternData-A1, a large-scale, high-quality simulated dataset, InternVLA-A1 stays robust under complex backgrounds, lighting, and distractions. 🔥 See it in action: 1️⃣ High-speed conveyor: track, predict, and stably grasp or flip packages 2️⃣ Rotating platform: task-aware recognition & precise pick-up of diverse items 📊 Outperforms π0 and GR00T N1.5 on general manipulation benchmarks! ✨ Model, data, and code are all open! Models: https://t.co/5PYmtZNUoO Datasets: https://t.co/Ipi9XF6sw7 GitHub: https://t.co/AddhXUyugu

14

547

88

424

38K

Intern Robotics

@InternRobotics

6 months ago

Links & Resources: model: https://t.co/38mWjYiVC9 🧩 code: https://t.co/JlpH9DKbWY 📦 data: https://t.co/IfqCBY1SSn 📄 paper: https://t.co/e6l1B904s8

0

5

0

100

Intern Robotics

@InternRobotics

6 months ago

Meet InternVLA-A1 🤖✨ It unifies scene understanding, visual foresight, and action execution into a single framework. 🧠 The Core: Synergizes MLLM's semantic understanding with world-model-style dynamic prediction, to "imagine" the future and guide adaptive actions. 🚀 The Fuel: Empowered by high-fidelity synthetic data (InternData-A1). The result? A VLA model that tackles high-dynamic scenarios with effortless mastery.

1

4

1

0

167

Intern Robotics

@InternRobotics

6 months ago

Just a camera. No IMU. Low-cost setup — and it’s open-sourced. 🚀 👉 Project (more demos): https://t.co/XTgmG3ICEX 📄 Paper (please upvote): https://t.co/N4N0QeT9dt 💻 Code + deployment (please star): https://t.co/6EhTj5gcCt

Wenzhe Cai @WenzheC7616

6 months ago

🤖Can robots achieve accurate navigation without any external localization feedback? 📸We present #LoGoPlanner, which handles perception, localization, and planning in one go! Check our results on LeKiWi, G1, and Go2 robots. 🌐Project: https://t.co/mWIMzIfuqT

2

214

51

116

42K

0

4

0

1

170

Intern Robotics

@InternRobotics

7 months ago

code: https://t.co/mm45I8zEAD paper: https://t.co/WCQGH5iBMi webpage: https://t.co/l6JtNZVbar model: https://t.co/fVBhQsXDkl huggingface paper: https://t.co/jZsO1E2inA

Wenbo Hu

@gordonhu608

7 months ago

🚀 Introducing G^2VLM: Geometry Grounded Vision Language Model with Unified 3D Reconstruction and Spatial Reasoning. G^2VLM natively predicts 3D attributes (depth, camera pose, pointmaps) and uses them for spatial understanding via interleaved reasoning. 🔧 End-to-End Unified Model ✅ Monocular & Video Depth Estimation ✅ Pose Estimation ✅ 3D Point Reconstruction ✅ Spatial Understanding and Reasoning 🏆 This design helps G^2VLM achieve robust performance in both 3D Reconstruction and Spatial Reasoning! #VLM #spatial #3D #LLM 💻 Code at: https://t.co/WxvTgeG9Jw 👇[1/n]

2

67

14

22

13K

0

2

0

220

Intern Robotics

@InternRobotics

8 months ago

Great release! Gallant demonstrates a clean voxel-grid pipeline for perceptive humanoid locomotion — unified policy, strong generalization across stairs, gaps, stepping stones, and cluttered spaces.

Elgce @BenQingwei

8 months ago

Introducing Gallant: Voxel Grid-based Humanoid Locomotion and Local-navigation across 3D Constrained Terrains 🤖 Project page: https://t.co/9hua2TXfKj Arxiv: https://t.co/BadszLcPut Gallant is, to our knowledge, the first system to run a single policy that handles full-space constraints — including ground-level barriers, lateral clutter, and overhead obstacles on a humanoid robot. Instead of elevation maps or depth cameras, Gallant uses a voxel grid built directly from raw LiDAR as its perception representation, giving it inherent 3D coverage of the scene. With our custom LiDAR simulation toolkit (https://t.co/TsezpNF6cu), we model realistic scans, including returns from the robot’s own moving links, which is crucial for sim-to-real transfer. On the control side, we use a target-based training scheme rather than standard velocity tracking. The robot is given a goal and learns to discover its own in-path velocities and trajectories, so no external high-frequency command stream is needed during deployment. The policy itself is intentionally lightweight: just a 3-layer CNN + 3-layer MLP (~0.3M params), running onboard on the Unitree G1’s Orin NX at 50 Hz with no extra compute. Training takes about 6 hours on 8× NVIDIA RTX 4090 GPUs. The resulting policy transfers directly to the real robot and achieves >90% success rate on most tested terrain types. Gallant is our “half-way” step toward robust perceptive locomotion — a problem we believe remains fundamental for humanoid robots. We’re now working toward closing the gap to near-100% reliability and expanding the pipeline further. Code will be fully released soon. Discussion, feedback, and collaboration are very welcome! 🙌

3

206

37

117

54K

0

147
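The voxel-grid perception described in the Gallant announcement above can be illustrated with a minimal sketch. It assumes a robot-centred box and a fixed cell size; the function name `voxelize`, the bounds, and the resolution are hypothetical choices for illustration and are not taken from the Gallant code, which the post says was not yet released.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]


def voxelize(points: Iterable[Point],
             lower: Point = (-1.0, -1.0, -1.0),
             upper: Point = (1.0, 1.0, 1.0),
             cell: float = 0.5) -> List[List[List[int]]]:
    """Return a binary occupancy grid indexed as grid[ix][iy][iz]."""
    # Cells per axis; bounds and cell size are assumed values for illustration.
    dims = [int(round((hi - lo) / cell)) for lo, hi in zip(lower, upper)]
    grid = [[[0] * dims[2] for _ in range(dims[1])] for _ in range(dims[0])]
    for p in points:
        # Map each coordinate to the index of the cell that contains it.
        idx = [int((c - lo) // cell) for c, lo in zip(p, lower)]
        # Points outside the box (e.g. distant LiDAR returns) are skipped.
        if all(0 <= i < n for i, n in zip(idx, dims)):
            grid[idx[0]][idx[1]][idx[2]] = 1
    return grid
```

A grid like this covers obstacles at ground level, to the side, and overhead in one structure, which is the property the post contrasts with elevation maps.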


Intern Robotics

@InternRobotics

8 months ago

X-VLA is fully open-source! The first complete open pipeline for long-horizon cloth folding. #VLA #embodiedai #Robotics https://t.co/O9i4q2K76t

18

2

1

120

Intern Robotics

@InternRobotics

9 months ago

🎉IROS 2025 Workshop & Challenge Highlights On Oct 20, the Workshop on Multimodal Robot Learning in Physical Worlds, hosted by Shanghai AI Lab, successfully concluded at #IROS2025. 💡 The event gathered experts from UC Berkeley, MIT, Stanford, Tsinghua, Zhejiang University, and ShanghaiTech to explore interactive and generalizable multimodal robot learning bridging simulation and the real world. 📺 Full talk replays are now live on the official website — check them out! 🔗https://t.co/tnBtuXLMd7 #embodiedai #AIResearch

0

3

0

922

Intern Robotics

@InternRobotics

9 months ago

🤖 Go from a task like "set the table" to a complete 3D tabletop scene, ready for robot simulation. Meet MesaTask 🚀 [NeurIPS 2025 Spotlight] ✨ 10K+ physics-verified tabletop scenes ✨ 12K+ curated 3D assets ✨ Outperforms baselines in alignment, realism & physicality. All data & code are OPEN, try it now ⚡ 🔗 Dataset: https://t.co/hdRVJC3GMw 🌐 Project: https://t.co/FFUtyFokUh 💻 Code: https://t.co/sDzvjQHPFc 📄 Paper: https://t.co/wOtsh0oZco #NeurIPS2025 #AI #Robotics #EmbodiedAI #Dataset #OpenSource

0

9

2

386

Intern Robotics

@InternRobotics

9 months ago

🚨 Important Notice 🚨 Challenge update: 📅 Test server closing extended → Oct 7 ⏰ Final sprint — registered teams, make sure to submit on time! Don’t miss your chance!🔥

Intern Robotics

@InternRobotics

11 months ago

🔥 Join our Challenge on Multimodal Robot Learning in InternUtopia and Real World! 🎮 Tasks: Manipulation & Navigation 🗺️ Each track includes an online qualifier and on-site finals 🧰 Starter kits open now 🥇 Winner prize: $10K 🔗 https://t.co/xEIqoH2Z1w #IROS2025 #Robotics

0

3

1

773

0

1

0

193

Intern Robotics

@InternRobotics

10 months ago

🤖 Behavior Foundation Model (BFM) for Humanoid Robots #robotics #embodiedai We are excited to re-introduce our Behavior Foundation Model for Humanoid Robots, built upon a unified perspective of diverse WBC tasks, a promising step toward a foundation model for general humanoid control. 🌐 Website: https://t.co/upnU6FwX7z 📄 Paper: https://t.co/UHSD4y9T13

0

4

1

2

177

Intern Robotics

@InternRobotics

10 months ago

We are trending fast on @huggingface ! 🔥🔥🔥 Thanks for all the love on our models, datasets, and papers 🥰 Fuel us with more LIKES & STARS so we can push even harder 🚀https://t.co/gayHWuvXu0

0

1

0

132

Intern Robotics

@InternRobotics

10 months ago

Shanghai AI Laboratory has launched InternVLA·A1, the first integrated “embodied manipulation model” capable of understanding, imagining, and executing ❗️ Real-world evaluations show it significantly outperforms π0 and GR00T N1.5, demonstrating strong adaptability in highly dynamic scenarios 🌟. The model has been adapted for multiple robotic platforms, including Ark Infinity, Guodi Qinglong Humanoid Robot, Zhiyuan Genie, 松灵, and Franka🦾, so users can quickly adapt it to new environments and tasks. With the open-source release of InternVLA·A1, the lab has shared the complete technical framework for embodied intelligence's "thinking-acting-self-learning" closed loop: InternVLA·M1 serves as the "brain," responsible for spatial reasoning and task planning; InternVLA·A1 acts as the "cerebellum," enabling agile and precise motion execution; and the general reward model VLAC improves reinforcement learning efficiency in real-world applications. At 7:30 PM on September 19 (this Friday), Shanghai AI Laboratory will join multiple industry experts to host the second live session of Open Source Week, with an in-depth analysis of the related technologies. Reserve your spot and join the broadcast.

0

2

0

1

195
