Reliability is built into @droid. Factory Router has auto-failover across model providers, so your sessions keep running even when one of the providers goes down.
We also provide dedicated compute capacity for enterprises, and we reserve a guaranteed TPM (tokens per minute) allocation.
We believe that diversifying LLMs in your agent stack is the future, and the way to achieve the highest reliability in production.
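For anyone curious what failover across providers can look like, here is a minimal sketch. It is not Factory Router's actual implementation: `ProviderError`, `complete_with_failover`, and the two stand-in providers are hypothetical names made up for illustration, and a real router would also handle timeouts, retries, and health checks.

```python
from typing import Callable, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


def complete_with_failover(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, response) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            # Remember why this provider failed, then move on to the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-in providers (assumptions, not real clients).
def flaky_provider(prompt: str) -> str:
    raise ProviderError("503 service unavailable")


def healthy_provider(prompt: str) -> str:
    return f"echo: {prompt}"


if __name__ == "__main__":
    used, reply = complete_with_failover(
        "hello", [("primary", flaky_provider), ("backup", healthy_provider)]
    )
    print(used, reply)
```

The session only sees an error if every provider in the list fails, which is the property the post above is describing.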
I am very excited to officially welcome @tereza_tizkova to Factory.
Tereza is joining us to work on all things growth. A few days in and she's already crushing it -- she owned the launch of Factory Router on Tuesday.
An ambitious scope for a truly 1 of 1 individual. Welcome!
Did you know you can use the best CLI agent for free? (:
Droid lets you BYOK: with cliproxyapi/vibeproxy you can plug any subs, an OpenRouter key, or even local inference into Droid.
Try it out. With the new router + queue mode you can run Droid 24/7 autonomously.
My 2c - this approach is the endgame. We're getting (poorly) conditioned to throw all queries at a single model, even within the same provider.
Reality is the models have different strengths and weaknesses across vendors.
Agnostic / routed is the way to go. Stoked for this.
@GergelyOrosz IMO it’s less about owning the cheapest model, more so about giving your customers continuous access to the entire Pareto frontier of cost and quality.
They can then decide how to allocate more rationally (task based, budget based, etc.)
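To make "Pareto frontier of cost and quality" concrete, here is a rough sketch. The model names, prices, and quality scores are invented for illustration, and `pareto_frontier` / `best_within_budget` are hypothetical helpers, not anything from Factory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    name: str
    cost: float     # hypothetical $ per 1M tokens
    quality: float  # hypothetical eval score in [0, 1]


def pareto_frontier(models: List[Model]) -> List[Model]:
    """Keep models that no other model beats on both cost and quality."""
    frontier = []
    for m in models:
        dominated = any(
            o.cost <= m.cost
            and o.quality >= m.quality
            and (o.cost < m.cost or o.quality > m.quality)
            for o in models
        )
        if not dominated:
            frontier.append(m)
    return sorted(frontier, key=lambda m: m.cost)


def best_within_budget(models: List[Model], budget: float) -> Optional[Model]:
    """Budget-based allocation: highest-quality frontier model under the cost cap."""
    affordable = [m for m in pareto_frontier(models) if m.cost <= budget]
    return max(affordable, key=lambda m: m.quality) if affordable else None


# Made-up catalog, for illustration only.
CATALOG = [
    Model("small", 0.5, 0.60),
    Model("mid", 3.0, 0.80),
    Model("pricey-weak", 5.0, 0.75),  # dominated by "mid": costs more, scores lower
    Model("frontier", 15.0, 0.92),
]

if __name__ == "__main__":
    print([m.name for m in pareto_frontier(CATALOG)])
    print(best_within_budget(CATALOG, budget=4.0))
```

A task-based policy would work the same way, just with a separate quality score per task type instead of one number per model.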
The CTO of a top PE firm told me months ago that he loved @FactoryAI's approach to model routing, and that it was becoming necessary bc of coding model costs.
Awesome work @matanSF and team :)
Loving all the research coming out of @harvey, @FactoryAI, etc. right now
The earliest set of application companies are undergoing the evolution from pure app layer to serious research organizations.
A sign of things to come.
As token budgets take on a larger part of operating expenses over time, model routing is the inevitable conclusion. This is also one of the biggest areas of differentiation for the applied AI layer.
By understanding the different work patterns in your domain, and having strong evals for that domain, you’ll be able to optimize cost/performance effectively.
We’re still likely at the point where most use-cases will need frontier performance for the foreseeable future, but soon you will be able to peel off individual use-cases and send them to lower cost models once the quality is sufficient for the task.
It likely won't be possible for individual enterprises to figure this out themselves at scale, so the products that can intelligently route these workflows to the right tier of model will be in a strong position to aggregate more demand.
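A toy version of "peel off a use-case once quality is sufficient" might look like the sketch below. Everything in it is an assumption for illustration: the task types, eval scores, and relative costs are made up, and `route` is a hypothetical function, not a description of how any real product routes.

```python
from typing import Dict


def route(
    task_type: str,
    eval_scores: Dict[str, Dict[str, float]],
    costs: Dict[str, float],
    quality_bar: float,
    fallback: str,
) -> str:
    """Pick the cheapest model whose eval score for this task type clears the bar.

    If no model has a qualifying score (or the task type has no evals yet),
    stay on the fallback model, typically the frontier one.
    """
    scores = eval_scores.get(task_type, {})
    candidates = [model for model, score in scores.items() if score >= quality_bar]
    if not candidates:
        return fallback
    return min(candidates, key=lambda model: costs[model])


# Made-up per-task eval results and relative costs, for illustration only.
EVAL_SCORES = {
    "rename_symbol": {"small": 0.97, "frontier": 0.99},
    "design_refactor": {"small": 0.61, "frontier": 0.93},
}
COSTS = {"small": 1.0, "frontier": 10.0}

if __name__ == "__main__":
    for task in ("rename_symbol", "design_refactor", "unknown_task"):
        print(task, "->", route(task, EVAL_SCORES, COSTS, 0.9, fallback="frontier"))
```

The point of defaulting to the fallback is that a use-case only moves to a cheaper tier after domain evals say the quality holds up.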
This is actually one of the main advantages startups have over frontier labs, as long as there's a healthy spectrum of open-weight to closed-weight models on the cost-performance curve.
Building a model harness is fundamentally different from building the underlying model. The last year has shown that an increase in model capabilities has led to an increase in sophistication around agent harnesses as well (e.g. just look at Claude Code ultracode).
Every startup has a meaningful advantage here - from Factory to Cognition to honestly us - if they're able to optimize and finetune the model set for the precise amount of intelligence needed to do a task. Same or higher accuracy at 2-10x lower cost and latency is a real advantage.
Something Codex and Claude Code can't do, and don't want you knowing about.
Being model-agnostic lets us route you to the right model for the job: cheaper and faster without sacrificing intelligence.
Already tested by some large enterprises that are cutting costs without a drop in quality.
Routing the right model to the right task is exactly the kind of unglamorous, high-leverage work that compounds. Factory is helping make this the default, not a tradeoff. Great to see @matanSF and the Factory team help builders optimize for both quality and cost.