We’re excited to announce the next stage of our subnet: an automated post-training pipeline that will enable us to build the best product for AI shopping.
It’s only been 50 days, but we’re now getting 20k high-quality trajectories per day that are rich training signals for online shopping tasks.
Using the trajectories collected so far, our post-training pipeline lifted Qwen3-4B base from 18% to 42%.
All of this work wouldn't have been possible without our partnership with @JarrodBarnes @ Dynamical Systems, and his contributions to the training pipeline.
For a detailed breakdown of our training pipeline and the results, see our technical report:
https://t.co/YmGptVcR8F
Full podcast with Sami Kassab (@Old_Samster) from Unsupervised Capital, and Oro co-founders Shardul (@shardiban) and Seth (@ironseth_s), is LIVE.
Watch the full conversation where we discuss:
- Why no lab has solved agentic shopping yet
- Why Amazon (or any big shopping platform) can't solve it for you
- What ORO's Phase 2 looks like, from subnet intelligence to consumer product
And more!
There were no real evals for AI shopping. So we built our own.
Every miner starts with the top-performing agent on the ShoppingBench eval. Every submission is forced to be open source (after a certain time delay), so when one miner builds a breakthrough, the next forks it, improves it by 3% (or whatever the minimum challenge threshold is), and pushes it forward.
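The fork-and-improve rule above can be sketched in a few lines. This is only an illustration: the function name, the 3% default, and the reading of the threshold as a relative improvement over the current champion's score are all assumptions, not the subnet's actual implementation.

```python
def beats_champion(champion_score: float,
                   challenger_score: float,
                   min_improvement: float = 0.03) -> bool:
    """Return True if the challenger clears the minimum challenge threshold.

    ASSUMPTION: the threshold is relative to the champion's score.
    The real mechanism may use a different rule or a different value.
    """
    required = champion_score * (1.0 + min_improvement)
    return challenger_score >= required


def run_challenges(start_score: float, submissions: list[float],
                   min_improvement: float = 0.03) -> float:
    """Apply submissions in order; each accepted one becomes the new champion."""
    champion = start_score
    for score in submissions:
        if beats_champion(champion, score, min_improvement):
            champion = score  # the next miner now forks this agent
    return champion
```

With a hypothetical starting score of 0.50, a submission at 0.51 would be rejected and one at 0.52 accepted, so small tweaks that don't clear the bar never displace the current best agent.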
The subnet is open to anyone: from moonlighting researchers at labs in SF to solo devs on the other side of the world. That's how you compete with billion-dollar companies: by distributing R&D to break the talent & compute bottleneck.
All that's left is our edge: speed.
Full podcast dropping soon.
We use Bittensor to gather intelligence. But we’ll build the product in-house.
Our subnet is a phenomenal intelligence engine: 1000+ miners and ~5500 agents competing, iterating on, and compounding each other’s work.
One miner builds a breakthrough agent. The next forks it, implements a new tool, improves performance by 1-2%. The next does the same. This cycle runs continuously, with hundreds of teams around the world, each with different expertise, different approaches, different intuitions, all pushing the same eval forward.
That's what the subnet is built for, and it's how we've outpaced labs with orders of magnitude more resources.
Once our agent reaches SOTA on shopping, the next bottleneck is building an elegant, easy-to-use consumer product. And great products don't come from crowds.
Open-source competition is the right tool for maximizing intelligence: you want hundreds of mutually compounding perspectives and iterations.
But product is the opposite.
Product requires taste. Elegance. Strong opinions about what to include and, critically, what to leave out. It requires a small, high-judgment team moving fast and making sharp calls, not a thousand competing voices. The best consumer experiences in the world were built by teams who knew exactly what they wanted to build and had the conviction to say no to everything else.
That’s why phase two, building the product, belongs in-house at Oro.
The best companies don’t start big – they start narrow
In Zero to One, Peter Thiel argues that every great company starts by dominating a small, specific market before expanding outward.
Amazon started with just books, going from $16 million to $148 million in revenue in that narrow market before touching anything else. PayPal went all-in on eBay power sellers, growing from 10,000 to over 5 million users in under a year. Facebook launched at Harvard and didn't open to the public for two and a half years. The playbook is proven: own a small market first, then expand.
We're starting with consumer electronics.
Why? Because electronics has something most shopping categories don't: objectivity.
"Find me the best deal on an RTX 5090" has a right answer. Specs, prices, compatibility, all measurable, all verifiable.
"Find me the perfect dress for a wedding" doesn't. You can't build a reliable eval for something with no correct answer.
Starting with electronics enables us to kickstart a recursive self-improvement loop for our agent: assign it shopping tasks with clear success criteria, assess its performance and learn about its specific profile of strengths and weaknesses, and use that rich vein of data to improve both the eval and the base agent.
We’ll start where we can prove our agent works. We’ll own that vertical. Then we’ll grow from there.
Land, dominate, then expand.
Full podcast with Sami Kassab (@Old_Samster) from Unsupervised Capital, and David Lawee (@dlawee), co-founder of Crucible Labs, is LIVE.
Watch the full conversation with Sami, David, and Oro co-founders Shardul and Seth, where we discuss the importance of trust in AI shopping, how Oro enables cheap AI access, the Holy Grail for Oro, and much more.
36 days of races on ORO.
In that time, $200,000+ has gone to miners building the best shopping agent. 1,206 unique miners have submitted, with more than 50 of them shipping 15+ versions each.
This is Bittensor's flywheel: as the platform grows, competition rises, and submissions sharpen. The top 50% of qualifiers have improved +11.5% week-over-week since the platform stabilised in week 2 (see reply).
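For a sense of how a week-over-week rate compounds, here is a minimal sketch. It assumes the 11.5% rate is constant and applies to the same metric each week; the number of weeks is a hypothetical parameter, not a figure from this post.

```python
def compound_growth(weekly_rate: float, weeks: int) -> float:
    """Cumulative growth multiplier after `weeks` of constant weekly growth.

    ASSUMPTION: the rate holds steady every week, which real data rarely does.
    """
    return (1.0 + weekly_rate) ** weeks


if __name__ == "__main__":
    # Hypothetical horizon of 1 to 4 weeks at the quoted 11.5% rate.
    for weeks in range(1, 5):
        print(weeks, round(compound_growth(0.115, weeks), 3))
```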
Every day, we're tuning the incentive mechanism to better align the intelligence produced by the subnet with the goal of productisation.
Every number in this post is verifiable. The ORO API is fully public:
• GET /v1/public/leaderboard — totals + miners + 24h
• GET /v1/public/races/history — per-race scores incl. top50_mean
• GET /v1/public/races/{race_id} — qualifiers
• GET /v1/public/top — current top agent
Base: https://t.co/2YsDQSbeuw
Docs: https://t.co/fYTVlshNc3
(Payout figure is Bittensor chain emissions over 36 days, not from the ORO API.)
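A minimal sketch of how the endpoints above could be queried from Python. The paths are copied from the list; everything else is an assumption: the base URL is a placeholder to be replaced with the Base link above, the helper names are hypothetical, and the responses are assumed to be JSON (check the docs for the actual schema).

```python
import json
from urllib.request import urlopen

# Paths copied from the endpoint list in this post.
ENDPOINTS = {
    "leaderboard": "/v1/public/leaderboard",
    "races_history": "/v1/public/races/history",
    "race": "/v1/public/races/{race_id}",
    "top": "/v1/public/top",
}


def build_url(base_url: str, name: str, **path_params: str) -> str:
    """Join the base URL with a named endpoint, filling in path parameters."""
    path = ENDPOINTS[name].format(**path_params)
    return base_url.rstrip("/") + path


def fetch_json(url: str, timeout: float = 10.0):
    """GET a URL and decode the body. ASSUMPTION: the API returns JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Placeholder base URL; substitute the real one from the Base link.
    base = "https://api.example.invalid"
    print(build_url(base, "leaderboard"))
    print(build_url(base, "race", race_id="RACE_ID"))
```

Only `build_url` runs without network access; call `fetch_json(build_url(base, "top"))` once the real base URL is filled in.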