@patrickc Building toward exactly this. Already shipping: multi-runtime coding agents, stored prompts/workflows/specialists, shareable audit-trail artifacts as compiled outputs. Real-time collab on the Pro roadmap. Open source, MIT: https://t.co/AsAnnopM2V
“Loops are privilege” is true if you assume a loop = N calls to Opus. It doesn’t have to be. Free Ollama draft → Sonnet refines → Opus only on the gnarly turns. That’s a real loop on a low monthly budget. The actual privilege is access to the playbook, not the budget.
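A rough sketch of that tiering in Python, for illustration only. The three model callables and the `is_gnarly` check are hypothetical stand-ins (plain functions that take a prompt and return text), not any provider's real API.

```python
from typing import Callable, List, Tuple

# Hypothetical model interface: takes a prompt, returns text.
Model = Callable[[str], str]


def tiered_turn(
    task: str,
    draft: Model,    # e.g. a free local model (assumption)
    refine: Model,   # e.g. a mid-priced model (assumption)
    expert: Model,   # e.g. the expensive model (assumption)
    is_gnarly: Callable[[str, str], bool],
) -> Tuple[str, List[str]]:
    """Run one loop turn, escalating to the expensive tier only when needed."""
    used = ["draft"]
    answer = draft(task)  # cheap first pass

    used.append("refine")
    answer = refine(f"Improve this answer to: {task}\n\n{answer}")

    # Only pay for the top tier when the caller-supplied check flags the turn.
    if is_gnarly(task, answer):
        used.append("expert")
        answer = expert(f"Fix this answer to: {task}\n\n{answer}")

    return answer, used
```

The `used` list is there so each turn records which tiers it actually paid for.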
Genuine question: is the pain the coding setup itself, or losing control once agents auto-trigger and you can’t see what each one did?
I went the other way from Devin. Building an open-source control layer so you keep all your harnesses and just get one place to run and audit them: https://t.co/BOXl9OQwkx
We ran our own agent decisions through ATO and published all of it, including a regression we shipped ourselves and a claim we had to retract when the real data killed it.
If you build with agents, stop guessing. Real logs, bugs and retractions and all. 👇
Your coding agent has access to your repo. It’s supposed to read the actual files before answering.
The receipt says success. The answer looks right.
But it never opened the file — it answered from training priors. Confidently. And you’d never catch it from the logs.
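One way to catch this, sketched in Python under stated assumptions: compare the files an answer cites against the `read_file` calls in the tool trace. The trace record shape (`{"tool": ..., "args": {"path": ...}}`) is hypothetical, not ATO's actual log schema.

```python
from typing import Iterable, List, Set


def unread_citations(cited_files: Iterable[str], tool_calls: List[dict]) -> Set[str]:
    """Return cited files that were never the target of a read_file call."""
    # Collect every path the agent actually opened (assumed record shape).
    opened = {
        call.get("args", {}).get("path")
        for call in tool_calls
        if call.get("tool") == "read_file"
    }
    return {path for path in cited_files if path not in opened}
```

A non-empty result means the answer talks about a file the agent never opened during the run.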
brew tap WillNigri/ato
brew install --cask ato
Or grab the DMG / AppImage / Windows installer from releases.
🔗 website: https://t.co/DqcDrPuhEq
🔗 GitHub: https://t.co/k657VD95GB
Bring your own keys. Local-first. MIT.
drop multiple LLMs into one shared session. They argue with you, call real tools (read_file / grep / git_log) to verify claims in your repo, and cite every file they checked.
Same primitive as @karpathy's llm-council, different shape: multi-provider, tool-calling, audit trail.
Best moment from this week: I used ato to decide ato's positioning.
Dropped Gemini + MiniMax into a session, made them argue for 5 rounds with me moderating. They converged on a hybrid I shipped that afternoon.
The headline now on the website was produced by the product, on camera. Here is part of the video. The full 5 rounds ran 10 minutes.
Seeing a lot of takes framing the new Agent SDK credits as Anthropic 'taking something away.'
It's the opposite.
April 4: third-party harnesses (OpenClaw, Conductor, etc.) were cut off from subscription limits entirely. Zero. Pay-as-you-go or API key only.
Today: a dedicated monthly credit pool for programmatic use is back, scaled to your tier.
The baseline isn't 'unlimited agents on a $20 plan' — that ended a month ago. The baseline is zero. This is a reversal in your favor, with a cap.
Yes, credits don't roll over. Yes, compute arbitrage is dead. But going from 'blocked' to 'metered allowance' is a step forward, not backward.
Starting June 15, paid Claude plans can claim a dedicated monthly credit for programmatic usage.
The credit covers usage of:
- Claude Agent SDK
- claude -p
- Claude Code GitHub Actions
- Third-party apps built on the Agent SDK
3/
Also shipped: SSH-backed remote runtimes, SSE streaming for API providers, runtime-binary health check that catches revoked Developer ID certs, and `ato-review` — a Claude Code skill that makes "use ATO to review your diff" the default.
MIT. https://t.co/AsAnnopM2V
1/
We just shipped Phase 6 for ATO: AI agents on different runtimes can now hold a single conversation.
Claude and MiniMax taking turns by @-mention inside one ATO session, full audit trail in your local SQLite:
```
ato sessions new --runtime claude
ato dispatch claude "Review this. @minimax weigh in." \
--session <id> --tag-bridge
```
The bridge loops until `[CONSENSUS]` / `<consensus/>` lands on its own line, or the round cap hits.
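The stop condition can be sketched roughly like this in Python. This is an illustration of the idea, not ATO's implementation: the agent callables and `run_bridge` are hypothetical, and only the two marker strings come from the description above.

```python
import re
from typing import Callable, List

# The marker must be alone on its line (spaces or tabs around it are allowed).
CONSENSUS_LINE = re.compile(
    r"^[ \t]*(\[CONSENSUS\]|<consensus/>)[ \t]*$", re.MULTILINE
)


def has_consensus(reply: str) -> bool:
    """True if a consensus marker appears on its own line in the reply."""
    return CONSENSUS_LINE.search(reply) is not None


def run_bridge(agents: List[Callable[[List[str]], str]], max_rounds: int) -> List[str]:
    """Let agents take turns until one emits the marker or the round cap hits."""
    transcript: List[str] = []
    for _ in range(max_rounds):
        for agent in agents:
            reply = agent(transcript)  # each agent sees the conversation so far
            transcript.append(reply)
            if has_consensus(reply):
                return transcript
    return transcript  # round cap reached without consensus
```

Requiring the marker on its own line means a model merely mentioning `[CONSENSUS]` mid-sentence does not end the loop.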
2/
Pair it with the eval-score ratchet: lock a quality floor, fail CI when recent runs drop below it.
```
ato ratchet lock --target runtime:claude --days 30
ato ratchet check # exit 1 on quality drift
```
Inspired by @garrytan's AI Agent Complexity Ratchet: the "make 90% test coverage free" framing, applied one layer up.
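A minimal Python sketch of the ratchet idea, assuming the floor is simply the mean score over the baseline window. How `ato ratchet` actually computes the floor isn't stated here, so `lock_floor`, `ratchet_check` and the `tolerance` parameter are illustrative names only.

```python
from statistics import mean
from typing import Sequence


def lock_floor(baseline_scores: Sequence[float], tolerance: float = 0.0) -> float:
    """Lock the quality floor at the baseline mean, minus an optional tolerance."""
    return mean(baseline_scores) - tolerance


def ratchet_check(recent_scores: Sequence[float], floor: float) -> int:
    """Return a process exit code: 0 if quality holds, 1 on drift below the floor."""
    return 0 if mean(recent_scores) >= floor else 1
```

In CI, the return value would be passed to `sys.exit()` so a drop below the floor fails the job.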
ATO just shipped this in v2.3.32 actually. On your laptop: `ato runtimes add-remote --name claude-server --host you@your-server --runtime claude --binary-path claude`, then `ato dispatch claude-server "..."` routes over SSH and the response shows up in your local logs / Live tab next to your laptop Claude. Same audit surface, no more copy-paste. One-way today (laptop initiates). Full bi-directional mesh (server-initiated calls back) is roadmapped.
https://t.co/A5cLrVzfsd
exactly. the gap between “what you think it’s doing” and “what it’s actually doing” is where all the interesting engineering lives. civic app context is a great example — high stakes for correctness, easy for the agent to confidently do the wrong thing 3 steps deep.
what are you using to watch the logs? if you want to try something, it's free: just download ato at https://t.co/A5cLrVzfsd. it works from your claude code terminal, picks up your existing subscriptions, and then you can just ask claude to use it to review with codex.
shipped native ato integration for agents and immediately caught claude code running 4 rounds of codex review on its own before checking in with me
ship it → codex disagrees → fix → re-review → repeat
you don’t see this until you instrument it
Once you run 3+ AI agents, the hard part isn't running them.
It's knowing which one's stuck, which touched which file, and how to kill the runaway one — without reading 5 terminal buffers.
Built the GUI for this. Free, MIT.
https://t.co/ikuXF9h443
Exactly the v2.1 spec. "what it touched" — every dispatch tags the trace with the files modified, surfaced in the dashboard. Next build will have: live-runs registry → kill/suspend/resume from one panel, plus "codex is in repo-A, claude is in repo-B" view. Still testing. DM if you want a build.
running multiple coding agents (claude code, codex, etc.) gets messy fast. you lose track of what’s running, where, and what it’s doing.
ATO is a desktop control panel for your agent runtimes. one place to see them all, manage them, kill them when they go feral.
open source. demo below 👇
@pratikbodkhe Not yet. v2.1 ships file attribution per dispatch (which agent wrote which files, including in sequential Claude→Codex pipelines). FS-watcher based so it works across every runtime. DM if you want early access.
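For a feel of how runtime-agnostic attribution can work, here is a simplified Python sketch. It swaps the real FS watcher for a before/after snapshot diff around each dispatch, and `snapshot` / `touched` are hypothetical names, not ATO functions.

```python
import os
from typing import Dict, Set


def snapshot(root: str) -> Dict[str, int]:
    """Map each file under root to its last-modified time in nanoseconds."""
    state: Dict[str, int] = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            state[os.path.relpath(path, root)] = os.stat(path).st_mtime_ns
    return state


def touched(before: Dict[str, int], after: Dict[str, int]) -> Set[str]:
    """Files created, modified, or deleted between two snapshots."""
    changed = {p for p, mtime in after.items() if before.get(p) != mtime}
    deleted = set(before) - set(after)
    return changed | deleted
```

Take one snapshot before a dispatch and one after, and the diff is that agent's file list. In a sequential Claude→Codex pipeline, each step gets its own pair of snapshots.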