Good, thought-provoking post from Anthropic. I think this paragraph points to the key element of the optimistic scenario for AI:
“There has been an explosion of new ideas, initiatives, tools, and simulations, as a result of Anthropic employees working with highly capable models—far more than we have the capacity to pursue. The rate at which organizations can spot and fix these bottlenecks may be a skill that improves over time, and it may become the most important skill for any organization.”
AI dramatically lowers the barrier to doing more. As a result, we have far more ideas than we can pursue, and for the ones we do want to pursue we’re ultimately limited by our ability to take on the surrounding work needed to execute them. There’s almost no amount of AI progress that makes that go away.
AI is going to let us build much more software, launch more marketing campaigns, research more drugs, and so on. All of this work, even when augmented by agents, still ultimately requires people to manage.
anyone thinking about, learning, or already working with agentic systems, you should know this.
the first few steps of your setup matter more than any model or framework you pick later. get them right and you never lose your flow.
the foundation nobody posts about:
> 1. tailscale. a private mesh network across every machine you own. laptop, desktop, rented node, all on one secure tailnet, reachable from anywhere. nothing else works well until this does.
> 2. termius, over that tailnet. one SSH client that reaches every node, phone included. you are never away from your stack.
> 3. tmux. persistent sessions. disconnect, close the laptop, come back, every session exactly where you left it. agentic work runs long, your terminal has to survive that.
> 4. a private git repo. the one i am most glad i found. it is the memory layer across all my agents, they pull, they work, they merge back, the codebase stays alive between sessions. context that would die in a chat window lives in the repo instead.
> 5. script everything from day one. ssh aliases for every node, setup scripts, the boring boilerplate automated. if you will do a thing more than twice, it is a script.
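a rough sketch of what item 5 can look like, not a prescribed setup: a small Python script that renders one ssh alias per node. the node names, tailnet hostnames, and users are all made up for illustration; `Host`, `HostName`, `User`, and `ServerAliveInterval` are standard ssh_config keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    alias: str      # short name you type after `ssh`
    hostname: str   # tailnet hostname or IP (hypothetical values below)
    user: str       # login user on that machine


def render_ssh_config(nodes: list[Node]) -> str:
    """Render one `Host` stanza per node using standard ssh_config keywords."""
    stanzas = []
    for node in nodes:
        stanzas.append(
            f"Host {node.alias}\n"
            f"    HostName {node.hostname}\n"
            f"    User {node.user}\n"
            f"    ServerAliveInterval 30\n"
        )
    return "\n".join(stanzas)


# Hypothetical nodes; replace with the machines on your own tailnet.
NODES = [
    Node("desk", "desktop.example-tailnet.ts.net", "me"),
    Node("gpu", "rented-node.example-tailnet.ts.net", "ubuntu"),
]

if __name__ == "__main__":
    # Review the output before appending it to ~/.ssh/config.
    print(render_ssh_config(NODES))
```

with stanzas like these in place, `ssh desk` or `ssh gpu` is enough from any machine that shares the config.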
everything past these five is decorative. know these cold.
and the habit that ties it together: ask the AI itself. for the config, for the error, for any of it, let the agent do the lifting, then double check what it hands you.
lock the five, build the habit, and you make it. skip it, anon, and you ngmi.
If you're rethinking your agent setup after the Claude subscription changes — we're opening @Hyperagent today.
First 1,000 signups get $1,000 in bonus inference credits + a 2.5x cost subsidy on frontier models including Opus 4.6 (up to $15K/mo) for a year.
If OpenClaw is Linux, Hyperagent is the Mac.
Same autonomous power — cloud-native, isolated sandbox per session, real browser, shell, code execution, hundreds of integrations. No local setup, no managing a physical device, no infrastructure babysitting.
Beautiful GUI built for how you actually want to use agents: rich visual output, skill learning that compounds over time, deploy agents into your company via Slack with one click, and a command center to manage your entire fleet.
https://t.co/YMuihOpOxS
Every operations leader I talk to has the same problem. They bought the CRM. They bought the ticket system. They bought the data tool.
And their team still runs on post-it notes and Excel.
The instinct is always the same: get disciplined, pick one system, force adoption, build dashboards. It never works. Not because the tools are bad, but because knowledge workers don't live inside any single tool. They live in Slack. They live in their inbox. They live wherever the conversation is happening.
Instead of dragging people into the system of record, build an agent that goes to every system of record on their behalf. It pulls open tickets from your support platform. It pulls account ownership and last-touch dates from your CRM. It pulls revenue data from your analytics layer. Then it synthesizes all of that into a prioritized daily view, delivered where people already work.
The key is that the prioritization logic isn't a black box. It's inspectable code. Anyone on the team can ask the agent to explain how it ranked their accounts. Leadership can review the algorithm and adjust the rules. When someone on the team has an idea for how priorities should change, that becomes a concrete improvement to the agent's logic, not a suggestion lost in a meeting.
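As a minimal sketch of what "inspectable" prioritization could mean in practice, the ranking below is a plain weighted sum that returns its reasons alongside the score. The `Account` fields, the weights, and the function names are all assumptions made for illustration, not anything a specific CRM or support platform provides.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Account:
    name: str
    open_tickets: int        # pulled from the support platform
    last_touch: date         # pulled from the CRM
    monthly_revenue: float   # pulled from the analytics layer


# Hypothetical weights: these are the rules leadership would review and adjust.
WEIGHTS = {"open_tickets": 5.0, "days_since_touch": 1.0, "revenue_per_1k": 2.0}


def score(account: Account, today: date) -> tuple[float, list[str]]:
    """Return a priority score plus the readable reasons behind it."""
    days_idle = (today - account.last_touch).days
    parts = {
        "open_tickets": account.open_tickets * WEIGHTS["open_tickets"],
        "days_since_touch": days_idle * WEIGHTS["days_since_touch"],
        "revenue_per_1k": account.monthly_revenue / 1000 * WEIGHTS["revenue_per_1k"],
    }
    reasons = [f"{name}: +{value:.1f}" for name, value in parts.items()]
    return sum(parts.values()), reasons


def daily_view(accounts: list[Account], today: date) -> list[tuple[str, float, list[str]]]:
    """Rank accounts from highest to lowest priority, keeping the explanation."""
    scored = [(account.name, *score(account, today)) for account in accounts]
    return sorted(scored, key=lambda row: row[1], reverse=True)
```

Because every term of the score is named, "why is this account ranked first?" can be answered by printing the reasons list, and a change in priorities becomes a one-line edit to `WEIGHTS`.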
This is what enterprise AI operationalization actually looks like. It's not replacing your systems. It's not another dashboard. It's an agent that reads five tools at 7am so your team knows exactly where to focus at 8am. The data stays where it lives. The work gets done where people already are.
The organizations that figure this out won't have better tools. They'll have the same tools, with an agent layer that finally makes them useful.
After reading the @AnthropicAI blog on agentic AI, I spent some time creating a mental model for designing and explaining agentic AI architecture.
Define a task/goal: what do you want the agent to achieve?
1. Orchestration layer: your control panel
3. Agents layer: this layer is made up of agents (multiple or specialised)
4. Tools: this layer is made up of your tools (web search, DB, APIs, etc.)
5. Memory: the brain that stores information, long or short term
6. Monitoring: the most crucial layer, tracking each and every step
7. Reliability & failure management: identify errors, retry, fall back, involve a human
8. Governance and security: compliance, audit, auth, etc.
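To make the mental model a bit more concrete, here is a small sketch of how a few of these layers might fit together in code. It is only an illustration under my own assumptions: the `Orchestrator` class and its method names are hypothetical, the agents and governance layers are left out for brevity, and real frameworks structure this differently.

```python
from typing import Callable

Tool = Callable[[str], str]


class Orchestrator:
    """Hypothetical control panel tying tools, memory, monitoring, and retries together."""

    def __init__(self, tools: dict[str, Tool], max_retries: int = 2):
        self.tools = tools              # tools layer: name -> callable
        self.memory: list[str] = []     # memory layer (short-term only in this sketch)
        self.events: list[str] = []     # monitoring layer: one entry per step
        self.max_retries = max_retries  # reliability layer: retries after the first try

    def run_step(self, tool_name: str, argument: str) -> str:
        # One initial attempt plus `max_retries` retries.
        for attempt in range(1, self.max_retries + 2):
            try:
                result = self.tools[tool_name](argument)
            except Exception as error:
                self.events.append(f"{tool_name} attempt {attempt} failed: {error}")
                continue
            self.events.append(f"{tool_name} attempt {attempt} ok")
            self.memory.append(result)
            return result
        # Fallback once retries are exhausted: hand the step to a human.
        self.events.append(f"{tool_name} escalated to human")
        return "NEEDS_HUMAN"
```

Every step leaves a trace in `events`, which is the point of treating monitoring as its own layer: a failed or escalated step can be found without re-running anything.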
I still feel like Compound Engineering is the most underhyped / biggest secret / hack in my toolkit and I never shut up about it. How does it only have 15k stars?! Congrats on the big launch @trevin !
The more enterprises I talk to about AI agent transformation, the more it’s clear that there is going to be a new type of role in most enterprises going forward. The job is to be the agent deployer and manager in teams. Here’s the rough JD:
This person will need to figure out what the highest-leverage workflows on a team are (either existing or new ones), where agents can actually drive significantly more value for the team and company.
In general, it’s going to be in areas where if you threw compute (in the form of agents) at a task you could either execute it 100X faster or do it 100X more times than before. Examples would be processing orders of magnitude more leads to hand them off to reps with extra customer signal, automating a contracting review and intake process, streamlining a client onboarding process to reduce as many steps as possible, setting up knowledge bases that the whole company taps into, and so on.
This person’s job is to figure out what the future state workflow needs to look like to drive this new form of automation, and how to connect up the various existing or new systems in such a way that this can be fulfilled. The gnarly part of the work is mapping structured and unstructured data flows, figuring out the ideal workflow, getting the agent the context it needs to do the work properly, figuring out where the human interfaces with the agent and at what steps, managing evals and reviews after any major model or data change, running and managing the agents on an ongoing basis while tracking KPIs, and so on.
The person must be good at mapping the process and understanding where the value could be unlocked, be relatively technical, and have full autonomy to connect up business systems and drive automation. This means they’re comfortable with skills, MCP, CLIs, and so on, and the company believes it’s safe for them to do so. They also need to be great operationally and at business.
It may be an existing person repositioned, or a totally net new person in the company. There will likely need to be one or more of these people on every team, so it’s not a centralized role per se. It may roll up into IT or an AI team, or live in the function and just have checkpoints with a central function.
This would also be a fantastic job for next-gen hires who are technical and leaning into AI. And for anyone concerned about engineers in the future, this will be an obvious area for those skills as well.
a great way to get started is to use our skill, built into Claude Code. get the latest Claude Code release:
$ claude update
then start Claude Code and run our subcommand:
$ claude
/claude-api managed-agents-onboarding
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale.
It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days.
Now in public beta on the Claude Platform.
some more ramblings from working at @AnthropicAI.
I've been asked a few times what the single most important thing a growth marketer should be doing with AI that most aren't.
surprise, it's not just a single specific task. after running dozens of growth workflows through Claude, I think the useful stuff worth doing falls along four dimensions 🧵
Competition is largely an illusion. 95% of people don't even try to do great things. 0.1% of the people are loud, so you overestimate how many people there are. The rest get stuck worrying about competition and quitting after 2 weeks.
Had meetings and a dinner with 20+ enterprise AI and IT leaders today. Lots of interesting conversations around the state of AI in large enterprises, especially regulated businesses.
Here are some of the general trends:
* Agents are clearly the big thing. Enterprises are moving from talking about chatbots to agents, though we’re still very early. Coding is still the dominant agentic use-case being adopted thus far, with other categories across knowledge work starting to emerge. Lots of agentic work is moving from pilots and PoCs into production, and some enterprises had lots of active live use-cases.
* Agentic use-cases span every part of a business, from back office operations to client facing experiences, from sales to customer onboarding workflows. General feeling is that agentic workflows will hit every part of an organization, often with the biggest focus on delivering better for customers, getting better insights and intelligence from data and documents, speeding up high ROI workflows with agents, and so on. Very limited discussion on pure cost cutting.
* Data and AI governance still remain core challenges. Getting data and content into a spot that agents can securely and easily operate on remains a huge task for most organizations. Years of data management fragmentation that wasn’t a problem before is now an issue for enterprises looking to adopt agents. And governing what agents can do with data in a workflow is still a major topic.
* Identity is emerging as a big topic. Can the agent have access to everything you have? In a world of dozens of agents working on your behalf, there is potentially too much data exposure and scope for the agents. How do we manage agents with partitioned levels of access to your information?
* Lots of emerging questions on how we will budget for tokens across use-cases and teams. Companies don’t want to constrain use-cases, but equally need to be mindful of ultimate token budgets. This is going to become a bigger part of OpEx over time, and probably won’t make sense to be considered an IT budget anymore. Likely needs to be factored into the rest of operating expenses.
* Interoperability is key. Every enterprise is deploying multiple AI systems right now, and it’s unlikely that there’s going to be a single platform to rule them all. Customers are getting savvier on how to handle agent interoperability, and this will be one of the biggest drivers of an AI stack going forward.
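To illustrate the identity and token-budget points above, here is one rough sketch of a per-agent grant that carries both a partitioned set of scopes and a token allowance. Everything in it is hypothetical: the `AgentGrant` class, the scope strings, and the budget numbers are my own illustration, not a description of how any enterprise or vendor handles this.

```python
from dataclasses import dataclass


@dataclass
class AgentGrant:
    """Hypothetical grant: what one agent may touch and how many tokens it may spend."""

    agent_id: str
    scopes: frozenset[str]   # e.g. {"crm:read"}; scope names are made up
    token_budget: int        # tokens the owning team allots to this agent
    tokens_used: int = 0

    def can(self, scope: str) -> bool:
        # The agent gets only the scopes listed, not everything its owner can access.
        return scope in self.scopes

    def charge(self, tokens: int) -> bool:
        # Refuse the call rather than silently exceeding the team's budget.
        if self.tokens_used + tokens > self.token_budget:
            return False
        self.tokens_used += tokens
        return True


grant = AgentGrant("onboarding-agent", frozenset({"crm:read"}), token_budget=1000)
print(grant.can("crm:read"), grant.can("crm:write"))
```

Tracking usage per grant also gives finance a natural unit for attributing token spend to a team or workflow rather than to a single IT line item.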
Lots more takeaways than just these, but needless to say, momentum is building, and enterprises are equally aware of the change management and work ahead. Lots of opportunity right now.
We've been building an internal Claude Code plugin system at Intercom with 13 plugins, 100+ skills, and hooks that turn Claude into a full-stack engineering platform. Lots done, more to do. Here's a thread of some highlights.
JUST SHIPPED: Developer's Guide to AI Agent Protocols.
Make sense of MCP, A2A, UCP, AP2, A2UI, AG-UI.
My new blog is now live on Google for Developers.