Huge respect to the team behind ABC — open data, training, and infra in one release is exactly what the field needs. 🙌 Great to see 400+ hours of sim teleop data across 10 tasks coming soon.
Excited to share that Axis's teleoperation data in simulation has now surpassed 1,000 hours across 400+ tasks.
Our capacity today: ~5,000 valid trajectories collected in a single hour, equal to 10–15 hours of usable data.
Axis is building the ultimate compounding data engine for Physical AI, at a scale you've never seen.
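As a rough sanity check on the capacity figure above, the sketch below derives the average trajectory length those numbers imply. The inputs are the two figures quoted in the post. The function name, and the assumption that "usable data" means the summed duration of valid trajectories, are ours.

```python
def implied_seconds_per_trajectory(usable_hours: float, trajectories: int) -> float:
    """Average trajectory length in seconds.

    Assumes usable hours equal the summed duration of all valid trajectories.
    """
    return usable_hours * 3600 / trajectories


TRAJECTORIES_PER_HOUR = 5_000   # "~5,000 valid trajectories" from the post
USABLE_HOURS_RANGE = (10, 15)   # "10–15 hours of usable data" from the post

low, high = (
    implied_seconds_per_trajectory(hours, TRAJECTORIES_PER_HOUR)
    for hours in USABLE_HOURS_RANGE
)
print(f"Implied average trajectory length: {low:.1f} to {high:.1f} s")
```

Under that assumption, each valid trajectory averages roughly 7 to 11 seconds.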
Axis Weekly
This week was about making the AXIS loop more scalable end to end: automating data-to-model workflows, testing recovery-driven training, expanding TaskGen coverage, and preparing the dataset and model stack for release.
Key updates:
- Data-to-model automation: We used scripts to speed up and standardize several repetitive but critical workflows.
- Continuous-growth training: We completed multi-data-scale training and success-rate comparisons across several failure tasks.
- Failure task expansion: A new batch of failure tasks has been pushed to test, expanding the evaluation range for ablations across data scale, data quality, and randomization.
- TaskGen: Articulated-object generation is now merged into the automatic generation pipeline.
- Model and release prep: We finished the first round of fine-tuning, evaluation, and benchmarking, completed the dataset’s conference submission, and are now improving experimental results for release.
Details below 🧵
GAxis (✱,✱)☀️
New week. New challenges. New access.
We’re giving away 10 BitRobot access codes over the next 72 hours.
Winners will get access to SN/04 and start earning rewards from both Axis and BitRobot.
To join:
1. Follow @axisrobotics & @BitRobotNetwork
2. Like + repost this post
3. Comment with a screenshot or photo of where you’re training right now
Grinding tasks? Climbing the leaderboard?
Show us your journey.
In our conference submission, we evaluate AXIS as a growable data engine for robot manipulation through three questions:
1. Does AXIS pretraining improve π0.5 on downstream LIBERO-Plus robustness tasks, beyond a matched-volume baseline?
2. Does the gain scale with AXIS data volume, from 25% to 50% to 100% of the dataset?
3. Which perturbation axes benefit the most, and do they match the diversity targeted by our augmentation pipeline?
Here, “AXIS” refers to our growable manipulation dataset snapshot built around a Franka Research 3 robot: 207 tabletop tasks across 7 scene categories, 50k+ human demonstrations, and 60k+ task/scene variants produced through cleaning and semantic-preserving augmentation.
Findings below 🧵
We start from the released π0.5 checkpoint, optionally pretrain on AXIS or a matched-volume RoboCasa control, then fine-tune on LIBERO and evaluate on LIBERO-Plus.
With AXIS-100% (full dataset), π0.5 improves from 66.5 to 79.4 overall on LIBERO-Plus, a +12.9 point gain.
The matched-volume RoboCasa control uses the same trajectory count as AXIS-100%, but reaches only 57.5 overall, below the vanilla π0.5 baseline.
This suggests that the gain is not driven by data volume alone, but by the AXIS pipeline: broader task coverage, diverse human demonstrations, data cleaning, and large-scale semantic-preserving randomization.
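The comparison above can be laid out as a small table of experimental arms. This is a minimal sketch that uses only the overall LIBERO-Plus scores quoted in this thread. The arm names and the helper function are illustrative labels, not identifiers from the AXIS codebase.

```python
# Overall LIBERO-Plus success rates (percent) as reported in this thread.
REPORTED_OVERALL = {
    "vanilla_pi05": 66.5,       # released π0.5 checkpoint, fine-tuned on LIBERO
    "axis_100": 79.4,           # pretrained on the full AXIS dataset first
    "robocasa_matched": 57.5,   # matched-volume RoboCasa control
}


def delta_vs_baseline(scores: dict[str, float], baseline: str) -> dict[str, float]:
    """Point difference of every arm against the baseline arm."""
    base = scores[baseline]
    return {
        name: round(value - base, 1)
        for name, value in scores.items()
        if name != baseline
    }


deltas = delta_vs_baseline(REPORTED_OVERALL, "vanilla_pi05")
for arm, delta in deltas.items():
    print(f"{arm}: {delta:+.1f} points vs vanilla")
```

With the same trajectory count, the two pretraining corpora move the baseline in opposite directions: +12.9 points for AXIS-100% and -9.0 points for the RoboCasa control.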
The AXIS dataset shows a clear scaling trend.
We evaluate three AXIS snapshots under the same architecture, training budget, rollout protocol, and success checkers. The only major variable is the pretraining corpus.
Overall LIBERO-Plus success increases monotonically as AXIS grows, and the gap from AXIS-25% to AXIS-100% is still +7.4 points, suggesting the benefit has not saturated at the current scale.
This is exactly what we want from a growable data engine: adding validated data continues to produce measurable policy improvement.
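The thread gives the AXIS-100% score and the gap to AXIS-25%, but not the AXIS-25% score itself. The sketch below backs it out from those two numbers. The derived value is an inference from the reported figures, not a separately reported result, and the AXIS-50% score is left out because it is not stated here.

```python
VANILLA = 66.5        # vanilla π0.5 overall, from the thread
AXIS_100 = 79.4       # AXIS-100% overall, from the thread
GAP_25_TO_100 = 7.4   # reported gap between AXIS-25% and AXIS-100%

# Implied by the two reported numbers above, not quoted directly.
axis_25 = round(AXIS_100 - GAP_25_TO_100, 1)

gain_at_25 = round(axis_25 - VANILLA, 1)
gain_at_100 = round(AXIS_100 - VANILLA, 1)
share_at_25 = gain_at_25 / gain_at_100  # fraction of the full gain reached at 25%

print(f"Implied AXIS-25% overall: {axis_25}")
print(f"Gain over vanilla: +{gain_at_25} at 25%, +{gain_at_100} at 100%")
print(f"Share of the full gain already reached at 25%: {share_at_25:.0%}")
```

Read this way, a quarter of the data delivers less than half of the total gain, which is consistent with the claim that the benefit has not saturated.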
The largest gain comes from Layout, the axis where structured scene diversity matters most.
LIBERO-Plus evaluates robustness across seven perturbation axes: Camera, Light, Sensor Noise, Background, Layout, Language, and Robot.
The largest absolute gain from AXIS-100% appears on Layout: +23.2 points over vanilla π0.5.
This directly validates layout randomization, one of the core AXIS data augmentation methods.
AXIS improves more than visual robustness.
Beyond Layout, AXIS-100% improves several other LIBERO-Plus axes, each measuring a different kind of distribution shift.
The results show gains in robustness to viewpoint shifts, degraded visual observations, robot initial-state changes, and task wording variation.
For example, the images below show a real on-site deployment scenario at one of our manufacturing customers.
If we want to help them achieve a scalable, end-to-end “pick-anything” solution, the process is non-trivial. We need to collect large-scale data across their diverse SKUs and part geometries, train a model that can generalize across variations, and then deploy it onto the real production line.
From there, we continue collecting real-world feedback data to fine-tune the policy in production. The goal is to progressively improve stability, cycle time, and accuracy — ultimately delivering a system that is faster, more reliable, and production-ready.
1/6
This marks a shift in what we collect.
Compared with earlier manipulation tasks, long-horizon and bimanual tasks involve more stages, stronger temporal structure, and higher demands on planning, coordination, and recovery.
This is not only an expansion in data volume, but also a meaningful increase in data complexity and learning value.
5/6
From a product and strategy perspective, Axis’s data capability is expanding in two directions at the same time.
On one hand, we continue to improve data diversity by covering more scenes, assets, layouts, and visual variations.
On the other hand, we are also increasing task complexity, pushing the data toward longer-horizon, higher-level behaviors that require more reasoning, coordination, and recovery.
6/6
In short, we are building a more valuable data distribution: from single-step actions to multi-stage tasks, from single-arm manipulation to bimanual coordination, and from a single robot embodiment to cross-embodiment adaptation.
This direction brings us closer to a data infrastructure capable of supporting more complex, realistic, and generalizable robotic intelligence.
We recently launched a new set of robotic data collection tasks, with a focus on long-horizon tasks (LH) and cross-embodiment tasks (Multi Embodiment). These include bimanual teleoperation and task adaptation across different robot morphologies.
Why this matters:
1. Axis is moving toward more complex, real-world robotic tasks.
2. Long-horizon tasks make complex data collection more scalable in simulation.
3. Staged checkers turn long tasks into clearer training signals.
4. Cross-embodiment tasks help Axis support multiple robot forms and control modes.
5. Axis is improving both the diversity and complexity of data.
6. The goal is not just more data, but more valuable data.
Details below. 🧵
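To illustrate the staged-checker idea from point 3, here is a minimal sketch of how a long task could be scored stage by stage instead of with a single pass/fail flag. This post does not describe how Axis implements its checkers. The `Stage` class, the state flags, and the drawer task are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

State = dict[str, bool]  # hypothetical: flags extracted from the simulator state


@dataclass
class Stage:
    name: str
    check: Callable[[State], bool]  # returns True once the stage's goal holds


def stages_completed(stages: list[Stage], trajectory: list[State]) -> int:
    """Count how many stages were completed, in order, over a trajectory."""
    done = 0
    for state in trajectory:
        # A single state may satisfy several consecutive stages.
        while done < len(stages) and stages[done].check(state):
            done += 1
    return done


# Hypothetical three-stage task: open drawer, place cup inside, close drawer.
stages = [
    Stage("open_drawer", lambda s: s["drawer_open"]),
    Stage("cup_in_drawer", lambda s: s["cup_in_drawer"]),
    Stage("close_drawer", lambda s: s["cup_in_drawer"] and not s["drawer_open"]),
]

trajectory = [
    {"drawer_open": False, "cup_in_drawer": False},
    {"drawer_open": True, "cup_in_drawer": False},
    {"drawer_open": True, "cup_in_drawer": True},  # episode ends before closing
]

progress = stages_completed(stages, trajectory)
print(f"{progress}/{len(stages)} stages completed")
```

A single end-of-task checker would mark this episode as a plain failure. The staged version records that two of three stages succeeded, which is the kind of denser signal point 3 refers to.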
My experience on Axis Hub has been engaging and rewarding. Contributing data has helped me understand how real-world interactions improve AI and robotics systems.
The platform is well designed and encourages meaningful participation while letting me earn valuable points.
@axisrobotics