This is true of all agents, not just coding agents. Probably the biggest challenge that most companies run into in their agent strategy is getting agents the right constrained context to work with for a task.
Too much information or conflicting sources, and the agent can easily draw from the wrong data and produce the wrong result. Think conflicting sources of truth for documents, data sources that haven’t been kept up to date, knowledge management systems that rely on tribal knowledge to navigate, and so on.
On the other end, of course, too little information and the upside of agents is highly limited in the first place. Thus, a lot of challenges with AI strategies are actually data strategy challenges in disguise.
This is why there’s such a significant premium on getting structured and unstructured data environments set up properly so agents can work with information effectively. This is critical for any large enterprise adopting agents, and also a clear benefit in some cases to startups that can be designed this way from scratch.
Another week on the road meeting with a couple dozen IT and AI leaders from large enterprises across banking, media, retail, healthcare, consulting, tech, and sports, to discuss agents in the enterprise.
Some quick takeaways:
* Clear that we’re moving from the chat era of AI to agents that use tools, process data, and start to execute real work in the enterprise. Complementing this, enterprises are often evolving from a “let a thousand flowers bloom” approach to adoption to targeted automation efforts applied to specific areas of work and workflow.
* Change management will remain one of the biggest topics for enterprises. Most workflows aren’t set up to just drop agents directly in, and enterprises will need a ton of help to drive these efforts (both internally and from partners). One company has a head of AI in every business unit that rolls up to a central team, just to keep all the functions coordinated.
* Tokenmaxxing! Most companies operate with very strict OpEx budgets that get locked in for the year ahead, so they’re going through very real trade-off discussions right now on how to budget for tokens. One company recently had an idea for a “shark tank” style way of pitching for compute budget. Others are trying to figure out how to ration compute to the best use-cases internally through some hierarchy of needs (my words not theirs).
* Fixing fragmented and legacy systems remains a huge priority right now. Most enterprises are dealing with decades of either on-prem systems or systems they moved to the cloud but that still haven’t been modernized in any meaningful way. This means agents can’t easily tap into these data sources in a unified way yet, so companies are focused on how they modernize these.
* Most companies are *not* talking about replacing jobs due to agents. The major use-cases for agents are things that the company wasn’t able to do before or couldn’t prioritize. Software upgrades, automating back office processes that were constraining other workflows, processing large amounts of documents to get new business or client insights, and so on. More emphasis on ways to make money vs. cut costs.
* Headless software dominated my conversations. Enterprises need to be able to ensure all of their software works across any set of agents they choose. They will kick out vendors that don’t make this technically or economically easy.
* Clear sense that it can be hard to standardize on anything right now given how fast things are moving. Blessing and a curse of the innovation curve right now - no one wants to get stuck in a paradigm that locks them into the wrong architecture. One other result of this is that companies realize they’re in a multi-agent world, which means that interoperability becomes paramount across systems.
* Unanimous sense that everyone is working more than ever before. AI is not causing anyone to do less work right now, and similar to Silicon Valley, people feel their teams are the busiest they’ve ever been.
One final meta observation not called out explicitly. It seems that despite Silicon Valley’s sense that AI has made hard things easy, the most powerful ways to use agents are more “technical” than in prior eras of software. Skills, MCP, CLIs, etc. may be simple concepts for tech, but in the real world these are all esoteric concepts that will require technical people to help bring to life in the enterprise.
This means both that diffusion will take real work and time, and that everyone’s estimation of engineering jobs is totally off. Engineers may not be “writing” software, but they will certainly be the ones to set up and operate the systems that actually automate most work in the enterprise.
Multiple times per day now I have the feeling of "I cannot believe how amazing Claude is". But it doesn't make me feel *worried* for my job, it makes me feel *enabled*. I cannot believe how fast I can move now.
And it's not in the places you expect. Claude doesn't make me *much* faster on the work that I am an expert on. Maybe 15-20% depending on the day.
It's the work that I don't know how to do and would have to research. Or the grunge work I don't even want to do. On this it is hard to even put a number. Many of the projects I do with Claude day to day I just wouldn't have done at all pre-Claude.
Infinity% improvement in productivity on those.
AI makes everyone better, but not everyone will benefit equally.
AI doesn't add; it multiplies.
I simplify this idea in my mind like this:
Let's say your baseline is level 5. If AI is a 10x multiplier, you'll reach level 50 (5 x 10 = 50). Someone starting at level 10 will reach level 100 (10 x 10 = 100).
Compare that with a simple addition:
The person starting at 5 will end up at 15 (5 + 10 = 15). The person starting at 10 will end up at 20 (10 + 10 = 20).
There are two takeaways from this:
1. AI can help you improve in a dramatic way (not just a little bit). It's a force multiplier.
2. The better prepared you are, the farther AI will take you. Don't see AI as "water that lifts all boats", because that's an oversimplification that misses the point.
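The same arithmetic as a quick Python sketch. The function names and the `AI_FACTOR` constant are mine, purely for illustration; the 10 is the figure from the example above, not a measured number.

```python
# Illustrative only: compare a multiplicative boost with an additive one.
AI_FACTOR = 10  # the "10x" / "+10" figure from the example above (an assumption)


def multiplied(baseline: int, factor: int = AI_FACTOR) -> int:
    """Level reached if AI multiplies your baseline."""
    return baseline * factor


def added(baseline: int, bonus: int = AI_FACTOR) -> int:
    """Level reached if AI only adds a fixed amount to your baseline."""
    return baseline + bonus


for baseline in (5, 10):
    print(f"baseline {baseline}: x{AI_FACTOR} -> {multiplied(baseline)}, "
          f"+{AI_FACTOR} -> {added(baseline)}")

# The gap between the two people grows under multiplication
# and stays the same under addition.
gap_multiplied = multiplied(10) - multiplied(5)
gap_added = added(10) - added(5)
print(f"gap when multiplying: {gap_multiplied}, gap when adding: {gap_added}")
```

Under multiplication the starting gap of 5 becomes a gap of 50; under addition it stays at 5. That is the whole point of the second takeaway.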
@svpino Also tell the agent to write the plan down in a markdown file so you can have it read the file again after a context compact or long coding task that pushes your previous planning out of the context window.
@JoshuaWorth Claude Sonnet 4.5 in Claude Code might be the tipping point for me - when it executes the code while writing it and then adds thorough passing tests I'm beginning to trust it to the point that I can review more loosely than before
I'm beginning to suspect that a key skill in working effectively with coding agents is developing an intuition for when you don't need to closely review every line of code they produce. This feels deeply uncomfortable!
@simonw @JoshuaWorth Sonnet 4.5 is so much more consistent and better behaved. If I can manually test the end result I’m finding myself less concerned with the details of the implementation. It may be slightly different than what I would do but much less likely to be wildly wrong than previous versions.
want to build your own github pr review bot?
@WayneEaker built a mastra template for that in our templates hackathon!
here is how you can grab the template...
In an open letter to EU legislators this week, @drupal, @joomla, @typo3, and @WordPress raised concerns about the proposed Cyber Resilience Act.
Learn how this policy could impact the #opensource CMS community in Europe and across the globe: https://t.co/yWmS1QQ4sY
@drupaltruth If Drupal is to be for ambitious site builders, then those with the conviction to build their site twice, on two different stacks, must be our focus.
We’re looking for a freelance WordPress developer to help us with overflow work, would be on a retainer basis.
(sorry, freelancers only, not looking to outsource to another company)
More details here https://t.co/1COSJ2zoeW