We’ve automated every single thing we can @every with AI agents.
And yet there’s way more human work to do than ever. We’ve gone from 4 -> 30 human employees since GPT-3.
I wrote a report on the structural reasons: how AI makes expert competence cheap, why that drives up demand for experts, and why the dynamic only intensifies as we approach AGI.
After Automation: https://t.co/Lb7SUCduAg
Live from Code with Claude London: we're launching self-hosted sandboxes (public beta) and MCP tunnels (research preview) in Claude Managed Agents.
Run agents inside your own perimeter, with your security controls applied by default.
Introducing https://t.co/ZbU21ECKPE, the largest open-source catalog of skills to reliably perform any task on the internet.
We've researched hundreds of sites to give your agents the playbook they need to navigate the web.
📣 What if every open issue had a Codex agent?
That’s the idea behind Symphony, an open-source agent orchestrator for Codex that turns task trackers into always-on systems for agentic work, letting humans focus on review and direction.
The math on a single PM mockup just dropped from $1,500-6,000 down to $2-7. Most PMs haven't repriced their workflow yet.
Old path: PM writes a brief, waits 3-7 days for a designer slot, designer spends 6-15 hours building it. Loaded cost lands at $1,500-6,000 depending on team.
New path: PM opens https://t.co/1l65nzaBLo, attaches a screenshot, types a prompt, clicks generate. 12 minutes. $2-7 in tokens. Hands off to Claude Code with design intent embedded.
That's roughly 500x cost compression and 50x speed compression. Run the same math on decks. An investor-grade deck from a design agency runs $5,000-15,000 over 2-3 weeks. Claude Design produces it in 8 minutes for $5-10. Brilliant cut complex pages from 20+ prompts in competing tools to 2 prompts in Claude Design. Datadog reports going from rough idea to working prototype before anyone leaves the meeting room.
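The ratios can be sanity-checked from the quoted ranges alone. A minimal sketch in Python: the variable names are illustrative, and comparing only designer build hours against the 12 minutes (ignoring the 3-7 day wait) is an assumption, not the author's stated method.

```python
# Dollar and time ranges as quoted in the post.
old_cost_usd = (1500, 6000)
new_cost_usd = (2, 7)
old_build_minutes = (6 * 60, 15 * 60)   # designer hours only; the 3-7 day wait is ignored
new_build_minutes = 12

# Most conservative ratio: cheapest old path vs. priciest new path.
# Most aggressive ratio: priciest old path vs. cheapest new path.
cost_low = old_cost_usd[0] / new_cost_usd[1]
cost_high = old_cost_usd[1] / new_cost_usd[0]
speed_low = old_build_minutes[0] / new_build_minutes
speed_high = old_build_minutes[1] / new_build_minutes

print(f"cost compression:  {cost_low:.0f}x to {cost_high:.0f}x")
print(f"speed compression: {speed_low:.0f}x to {speed_high:.0f}x")
```

The quoted 500x and 50x both sit inside these ranges (about 214x to 3000x on cost, 30x to 75x on build time). Counting the wait for a designer slot would push the speed ratio higher.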
Two SaaS categories just collapsed into one workflow. AI prototyping (Figma Make, Lovable, v0, Bolt, Magic Patterns) and presentations (Figma Slides, Gamma, https://t.co/JTgqzJ7mhC) both got repriced in one product launch, with brand applied automatically from your codebase. Figma keeps the design system. Claude takes the first-draft work.
Aakash's piece walks the exact setup, including the one-hour design system config that compounds across every prototype after.
The PMs running this workflow this week walk into Q4 with six months of brand-consistent prototypes compounding behind them. Everyone else is still drafting Slack messages to a designer they cannot reach.
That gap widens every Monday.
I still feel like Compound Engineering is the most underhyped / biggest secret / hack in my toolkit and I never shut up about it. How does it only have 15k stars?! Congrats on the big launch @trevin !
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
Introducing Claude Design by Anthropic Labs: make prototypes, slides, and one-pagers by talking to Claude.
Powered by Claude Opus 4.7, our most capable vision model. Available in research preview on the Pro, Max, Team, and Enterprise plans, rolling out throughout the day.
The math on building a Team OS in Claude Code is the part most PMs won't believe until they see it.
One PM. 30 days of setup. Then that PM supports 20+ people without being the bottleneck on any of them.
The way most teams use AI coding agents right now: one person installs Claude Code, gets faster at their own tasks, and everyone else still waits in the same queue they were in before. The PM still manually pulls metrics. The data scientist still asks where the SQL lives. The strategist still pings three people to find the right PRD.
Hannah Stulberg spent 1,500+ hours in Claude Code at DoorDash and figured out the leverage wasn't personal productivity. The leverage was a shared repo that every function could traverse on their own.
Step 5 is the one that changes everything. Checking analytics into the repo means metric definitions, SQL queries, table schemas, and dashboard links all live in one place. Any coding agent on the team can self-serve. You stop being the human router between "where does this data live" and "here's the dashboard link."
Step 8 is the enforcement mechanism that makes the whole system durable. No feature launches until metrics, queries, schemas, dashboards, and playbooks are checked in. That sounds bureaucratic. In practice it means the repo never rots because every launch forces an update.
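One way to picture the Step 8 gate is a small check that runs before launch. This is a sketch under assumptions: the one-folder-per-feature layout and the five file names are hypothetical, not taken from the post.

```python
from pathlib import Path

# Hypothetical layout: one folder per feature holding these five artifacts.
REQUIRED_ARTIFACTS = [
    "metrics.md",
    "queries.sql",
    "schemas.md",
    "dashboards.md",
    "playbook.md",
]

def missing_artifacts(feature_dir):
    """Return the required files that are absent or empty in feature_dir."""
    root = Path(feature_dir)
    missing = []
    for name in REQUIRED_ARTIFACTS:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing

def launch_allowed(feature_dir):
    """A feature may launch only when every required artifact is checked in."""
    return not missing_artifacts(feature_dir)
```

Wired into CI, a non-empty `missing_artifacts` result would block the launch and name exactly what still needs to be checked in.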
The compound effect is the part that sneaks up on you. Each sprint, you automate one task, free up time, use that time to automate the next task. By sprint six you're running a fundamentally different operation than the PM two desks over who's still context-switching between Slack threads.
30 days of infrastructure investment for a 10x leverage multiplier. The barrier is psychological, not technical. The terminal is less intimidating than the Jira backlog you're already drowning in.
Today we're releasing Personal Computer.
Personal Computer integrates with the Perplexity Mac App for secure orchestration across your local files, native apps, and browser.
We’re rolling this out to all Perplexity Max subscribers and everyone on the waitlist starting today.
Introducing Claude Opus 4.7, our most capable Opus model yet.
It handles long-running tasks with more rigor, follows instructions more precisely, and verifies its own outputs before reporting back.
You can hand off your hardest work with less supervision.
Opus 4.7 also has substantially better vision. It can see images at more than three times the resolution and produces higher-quality interfaces, slides, and docs as a result.
"Routines" is the most undersold product name in AI right now.
What Anthropic actually shipped: autonomous AI agents that run on Anthropic's infrastructure, triggered by your existing DevOps events. No laptop. No human initiation.
Look at the trigger system. PR opened → Claude fetches the diff, checks linked Linear issues, verifies test coverage, posts a review. Issue opened → triage. Release published → changelog. Merge → dependency audit. All server-side, all 24/7.
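The trigger list reads like a dispatch table. A rough sketch of that shape in Python: the event names and step labels are hypothetical paraphrases of the post, not Anthropic's actual configuration format.

```python
# Hypothetical event names and step labels, paraphrased from the post.
ROUTINES = {
    "pull_request.opened": [
        "fetch diff",
        "check linked issues",
        "verify test coverage",
        "post review",
    ],
    "issue.opened": ["triage"],
    "release.published": ["write changelog"],
    "pull_request.merged": ["audit dependencies"],
}

def plan_for(event):
    """Return the ordered steps for an event, or an empty list if none is configured."""
    return list(ROUTINES.get(event, []))

print(plan_for("pull_request.opened"))
```

The point of the table form is that no person appears anywhere in it: the event alone decides what runs.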
The constraint that held back every AI coding agent until now was the laptop lid. Your AI could write code, refactor functions, review PRs. But someone had to be sitting at a machine with a terminal open. Routines break that entirely. Claude Code now runs where CI/CD runs: event-driven, on someone else's servers, while you sleep.
The webhook trigger is the one worth paying attention to. "Trigger from your own code via the /fire endpoint" means any external system can invoke Claude Code programmatically. PagerDuty alert fires at 3am → Claude reads the logs, identifies the regression, opens a PR with the fix. No human wakes up.
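To make "trigger from your own code" concrete, here is a sketch that builds (but does not send) an HTTP request. Only the `/fire` path comes from the post. The host, the bearer-token header, and the JSON payload shape are all assumptions for illustration, so check the real documentation before relying on any of them.

```python
import json
import urllib.request

def build_fire_request(base_url, token, payload):
    """Build, without sending, a POST to a /fire endpoint.

    base_url, the Bearer auth scheme, and the payload fields are hypothetical.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/fire",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )

# Placeholder host and alert text; nothing is sent over the network here.
req = build_fire_request(
    "https://routines.example.invalid",
    "TOKEN",
    {"alert": "p99 latency regression"},
)
print(req.get_method(), req.full_url)
```

An alerting system would send a request like this when an alert fires, which is what lets the 3am page reach the agent instead of a person.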
Claude Code hit a $2.5B run rate in February. Routines explain where the next $2.5B comes from. When the AI runs on your events instead of your commands, you stop paying for a tool. You start paying for a worker.
Today we're open sourcing https://t.co/p76KVdY7dG, a reference platform for cloud coding agents.
You've heard that companies like Stripe (Minions), Ramp (Inspect), Spotify (Honk), Block (Goose), and others are building their own "AI software factories". Why?
1️⃣ On a technical level, off-the-shelf coding agents don't perform well with huge monorepos and don't have your institutional knowledge, integrations, or custom workflows.
2️⃣ On a business level, the moat of software companies will shift from 'the code they wrote', to the 'means of production' of that code. The alpha is in your factory.
Open Agents deploys to our agentic infrastructure: Fluid for running the agent's brain, Workflow for its long-running durability, Sandbox for secure code execution, AI Gateway for multi-model tokens.
(Because of our focus on Open SDKs and runtimes, this codebase is a gem even if you're not hosting on Vercel.)
TL;DR: if you're building an internal or user-facing agentic coding platform, deploy this:
https://t.co/xdsc42nbDN
Another week on the road meeting with a couple dozen IT and AI leaders from large enterprises across banking, media, retail, healthcare, consulting, tech, and sports, to discuss agents in the enterprise.
Some quick takeaways:
* Clear that we’re moving from the chat era of AI to agents that use tools, process data, and start to execute real work in the enterprise. Complementing this, enterprises are often evolving from a “let a thousand flowers bloom” approach to adoption toward targeted automation efforts applied to specific areas of work and workflow.
* Change management will remain one of the biggest topics for enterprises. Most workflows aren’t set up to just drop agents directly in, and enterprises will need a ton of help to drive these efforts (both internally and from partners). One company has a head of AI in every business unit that rolls up to a central team, just to keep all the functions coordinated.
* Tokenmaxxing! Most companies operate with very strict OpEx budgets that get locked in for the year ahead, so they’re going through very real trade-off discussions right now on how to budget for tokens. One company recently had an idea for a “shark tank” style way of pitching for compute budget. Others are trying to figure out how to ration compute to the best use-cases internally through some hierarchy of needs (my words not theirs).
* Fixing fragmented and legacy systems remains a huge priority right now. Most enterprises are dealing with decades of either on-prem systems or systems they moved to the cloud but that still haven’t been modernized in any meaningful way. This means agents can’t easily tap into these data sources in a unified way yet, so companies are focused on how they modernize these.
* Most companies are *not* talking about replacing jobs due to agents. The major use-cases for agents are things that the company wasn’t able to do before or couldn’t prioritize. Software upgrades, automating back office processes that were constraining other workflows, processing large amounts of documents to get new business or client insights, and so on. More emphasis on ways to make money vs. cut costs.
* Headless software dominated my conversations. Enterprises need to be able to ensure all of their software works across any set of agents they choose. They will kick out vendors that don’t make this technically or economically easy.
* Clear sense that it can be hard to standardize on anything right now given how fast things are moving. Blessing and a curse of the innovation curve right now - no one wants to get stuck in a paradigm that locks them into the wrong architecture. One other result of this is that companies realize they’re in a multi-agent world, which means that interoperability becomes paramount across systems.
* Unanimous sense that everyone is working more than ever before. AI is not causing anyone to do less work right now and, similar to Silicon Valley, people feel their teams are the busiest they’ve ever been.
One final meta observation not called out explicitly. It seems that despite Silicon Valley’s sense that AI has made hard things easy, the most powerful ways to use agents are more “technical” than in prior eras of software. Skills, MCP, CLIs, etc. may be simple concepts for tech, but in the real world these are all esoteric concepts that will require technical people to help bring to life in the enterprise.
This means both that diffusion will take real work and time, and that everyone’s estimation of engineering jobs is totally off. Engineers may not be “writing” software, but they will certainly be the ones to set up and operate the systems that actually automate most work in the enterprise.
you'll need to explicitly prompt Claude Code to use it, but the Monitor Tool is super powerful
e.g. "start my dev server and use the MonitorTool to observe for errors"
There are now two ways to give your AI agent a browser. One runs on your laptop. The other runs in the cloud. The difference matters more than you think.
Vercel's agent-browser (free, open source):
→ Installs on your machine
→ Uses your local Chrome
→ Your agent clicks, fills, screenshots - on YOUR computer
→ Laptop closed = agent stops
→ Best for: dev workflows, testing your own apps, quick automation
Cloudflare Browser Rendering + CDP ($5/mo):
→ Runs on Cloudflare's edge servers
→ Fresh sandboxed browser every time - no cookies, no saved logins, no personal data
→ Works 24/7 even when your laptop is off
→ Best for: production agents, monitoring, overnight tasks
Here's where it gets interesting for Claude Code / OpenClaw users:
1. Add Browser Rendering as an MCP server in your config
2. Your agent gets tools - navigate, click, type, screenshot
3. Ask your agent: "go to competitor.com/pricing, screenshot it, compare with last week"
4. It runs on Cloudflare's servers. Not your machine. Not your browser. Not your identity.
Where you'd actually use this:
→ Monitor a competitor's pricing page every morning
→ Screenshot your own product across 5 screen sizes before launch
→ Fill government or vendor forms your API can't reach
→ Pull data from dashboards that don't have an API
→ Test your signup flow end-to-end automatically
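For the pricing-page case, the "compare with last week" step is ordinary text diffing once the agent has captured the page. A minimal sketch using Python's standard `difflib`: the function name and sample strings are made up, and fetching the page through the browser is out of scope here.

```python
import difflib

def changed_lines(last_week, today):
    """Return lines removed ('- ') or added ('+ ') between two text snapshots."""
    diff = difflib.ndiff(last_week.splitlines(), today.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; keep only real changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]

# Made-up snapshots standing in for two captures of the same pricing page.
old_snapshot = "Pro: $20/mo\nTeam: $50/mo"
new_snapshot = "Pro: $20/mo\nTeam: $60/mo"

for line in changed_lines(old_snapshot, new_snapshot):
    print(line)
```

An empty result means nothing changed, so the agent only needs to report when the list is non-empty.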
The real unlock: until now, AI agents could read the web. Now they can use the web - click, scroll, fill, submit - in a sandboxed browser that doesn't know who you are.
Vercel's is great for local dev. Cloudflare's is great for production agents that run while you sleep.
Pick based on where your agent lives.
Thrilled to announce the Monitor tool which lets Claude create background scripts that wake the agent up when needed.
Big token saver and great way to move away from polling in the agent loop
Claude can now:
* Follow logs for errors
* Poll PRs via script
* and more!
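As an illustration of the "follow logs for errors" case, here is the kind of filter a background script might run. It is a sketch only: the error markers are assumptions, and how the Monitor tool actually consumes a script's output is not described in the announcement.

```python
import re

# Assumed error markers; adjust to whatever your logs actually emit.
ERROR_PATTERN = re.compile(r"\b(ERROR|FATAL|Traceback)\b")

def error_lines(lines):
    """Yield only the log lines that look like errors, without trailing newlines."""
    for line in lines:
        if ERROR_PATTERN.search(line):
            yield line.rstrip("\n")

sample_log = [
    "INFO server started\n",
    "ERROR failed to connect to db\n",
    "INFO retrying\n",
]
for hit in error_lines(sample_log):
    print(hit)
```

In a real script `lines` could be `sys.stdin` fed by a log follower, so the agent is only woken when a matching line appears instead of polling the whole log.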