exactly the design — each turn is state: which rung it started on + the classifier's why, then every escalation logs its trigger (empty-tool-exit, tool-error-spike, continuation-exhaust, retry-sentinel, refusal), the round it fired, and per-tier tokens. it stays out of your history though — hands you the next model, your loop keeps continuity. all of it drops straight into usage logs.
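A minimal sketch of what that per-turn state could look like as a record, in TypeScript. The type and function names here (`TurnState`, `newTurn`, `logEscalation`, `addTokens`, `toUsageLogLine`) and the field names are assumptions for illustration, not the package's actual schema; only the five trigger names come from the post above.

```typescript
// Hypothetical record shape; field names are illustrative, not Switchback's real schema.
type Trigger =
  | "empty-tool-exit"
  | "tool-error-spike"
  | "continuation-exhaust"
  | "retry-sentinel"
  | "refusal";

interface TurnState {
  startRung: string;                                 // model the turn started on
  classifierReason: string;                          // why the classifier picked it
  escalations: { trigger: Trigger; round: number }[]; // one entry per escalation
  tokensByTier: Record<string, number>;              // running token count per model tier
}

function newTurn(startRung: string, classifierReason: string): TurnState {
  return { startRung, classifierReason, escalations: [], tokensByTier: {} };
}

// Record which trigger fired and in which round.
function logEscalation(state: TurnState, trigger: Trigger, round: number): void {
  state.escalations.push({ trigger, round });
}

// Accumulate tokens against the tier that spent them.
function addTokens(state: TurnState, tier: string, tokens: number): void {
  state.tokensByTier[tier] = (state.tokensByTier[tier] ?? 0) + tokens;
}

// One JSON line per turn, suitable for appending to a usage log.
function toUsageLogLine(state: TurnState): string {
  return JSON.stringify(state);
}
```

Keeping the record separate from the conversation history matches the point in the post: the router only reports which model comes next, and the log line is a side output.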
introducing Switchback — a per-turn cascade model router for AI agents
classify each turn → start on the cheapest model in your ladder → escalate mid-turn only when the cheap one fumbles
brand-locked. open source. MIT.
npm i vibekit-switchback
npx vibekit-switchback-mcp
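The classify, start cheap, escalate-on-fumble flow in the announcement above could be sketched roughly like this. It is a toy under stated assumptions, not the vibekit-switchback API: `runTurn`, `classifyTurn`, `CallModel`, the length-based classifier, and the model names in the ladder are all hypothetical.

```typescript
// Hypothetical sketch; none of these names come from the vibekit-switchback package.
type RoundOutcome =
  | { status: "done" }                       // turn finished on this model
  | { status: "continue" }                   // round went fine, keep going on the same rung
  | { status: "fumble"; trigger: string };   // cheap model stumbled, consider escalating

// The caller supplies the actual model call for one round (assumption).
type CallModel = (model: string, round: number) => RoundOutcome;

interface TurnResult {
  startModel: string;
  finalModel: string;
  escalations: { trigger: string; round: number; to: string }[];
}

// Toy classifier: long prompts start one rung up. A real classifier would look at far more.
function classifyTurn(prompt: string, ladder: string[]): number {
  return prompt.length > 200 ? Math.min(1, ladder.length - 1) : 0;
}

function runTurn(
  prompt: string,
  ladder: string[], // ordered cheapest first
  call: CallModel,
  maxRounds = 8,
): TurnResult {
  if (ladder.length === 0) throw new Error("ladder must contain at least one model");
  let rung = classifyTurn(prompt, ladder);
  const result: TurnResult = {
    startModel: ladder[rung],
    finalModel: ladder[rung],
    escalations: [],
  };
  for (let round = 1; round <= maxRounds; round++) {
    const outcome = call(ladder[rung], round);
    if (outcome.status === "done") break;
    if (outcome.status === "continue") continue;
    // Fumble: move up one rung mid-turn, or stop if already at the top.
    if (rung === ladder.length - 1) break;
    rung += 1;
    result.escalations.push({ trigger: outcome.trigger, round, to: ladder[rung] });
    result.finalModel = ladder[rung];
  }
  return result;
}
```

The router here only decides which model runs the next round; the caller's loop still owns the conversation, so continuity is untouched when a turn escalates.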
i refactor only when the next feature would have to fight the old shape — otherwise it's speculative cleanup. switchback actually came out of that: the escalation logic got gnarly enough that adding one more model to the ladder meant extracting it first. the mess tells you when it's time.
@alexndrxc @OpenAIDevs exactly. real async means the agent keeps working when you close the app and you walk back into a finished or in-progress state, not a queued task waiting for you to wake it up. that's the bar.
@Hannes_vdc @vkrajacic this is the story nobody is telling. the public AI coding demos are "make me a saas" clones - the actual value is internal tooling, glue code, and replacing the long tail of vendors with something you control.
@SolomonH_Q exactly. the "agent writes code" stage was always the easy part. the unlock is the agent owning deploy, monitoring, fixes, follow-ups - the boring middle that actually keeps an app alive past day 1.
This is the right direction. The interesting shift is not just agents writing code; it’s agents being trusted to handle the production path too: accounts, domains, auth, deploys, and follow-up changes. The stack is moving from AI-assisted building to agent-native software operations.
Starting today, agents can now be Cloudflare customers. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away. https://t.co/qFgCivQTTi
@theneurondaily @jasonkneen @b_rat_k @Replit @claudeai @GeminiApp @higgsfield @Kling_ai the 5-tool stitch is the real story here. we built vibekit to collapse it into one surface, one mobile chat, agent writes + hosts + deploys to its own subdomain. faster ship for non-coder builders, less glue between tools. https://t.co/K9ClaSUOgP
@amir_khela if the typing-on-claude-app workaround gets old, try https://t.co/K9ClaSUOgP. native ios, chat-driven, deploys each app to its own subdomain. designed exactly for the multi-role pace you're already running.
@kevin_making the planning-vs-doing thing is real. found that starting with a deployed-on-tap URL instead of a PRD breaks the paralysis. built vibekit so every prompt ships in ~30s to its own subdomain, so iteration replaces the planning step entirely. https://t.co/K9ClaSUOgP
@Light_TechX @claudeai @antigravity @Replit this is exactly the framing we built around. vibekit is a mobile-first surface where you describe features from your phone and the agent codes + hosts the app on its own subdomain. you're managing the engineer, not writing the code. https://t.co/K9ClaSUOgP
@HesamAslan I think this is directionally right.
The bottleneck is shifting from model access to runtime ergonomics: wake, route, inspect, recover, and keep the agent useful without babysitting it.
Local-first wins if that layer gets dramatically better.
@TheCloudDeveng @shl Hard agree.
Agents do not remove infra work. They multiply the cost of sloppy infra because they can hit the same bad path faster and more often than a human.
What guardrail do you think teams underinvest in most?
@PythonDvz This is exactly the line people miss.
Vibe coding gets you to the first deploy. Production is everything nobody demos: ownership boundaries, failover, retries, recovery, and the weird edge cases at 2am.
Which one bit you first?
@vladris Phone-first agent workflows are going to matter more than most people think.
The winning loop is simple: message, inspect, approve, continue.
Once that loop is tight, terminals stop being the product. Outcomes do. What still feels too clunky on mobile for you?
@kalpeshmahidaa That zero-to-live jump is the real unlock.
Once a phone tap can wake the machine, start the agent, and hand back a live session, it stops feeling like remote access and starts feeling like infrastructure.
Did you have to build custom wake or session plumbing for this?