Habits in legacy systems die hard because they were never just habits.
Excel, ERP, CRM: these are the substrate of how work actually gets done at most companies. The data lives there. The processes were built around them. The institutional knowledge is encoded in the formulas, the custom fields, the workarounds that everyone knows and nobody documented.
Replacing those systems to add AI is a multi-year change management project before the AI does anything useful. Building agents on top of them (reading from the same data, writing back to the same fields, operating inside the same interfaces) is how you get AI into production without triggering the rejection response that kills every new system rollout.
The organizations still asking employees to leave their existing tools to interact with AI are measuring adoption in percentages. The ones building on top of existing systems are measuring it in outcomes.
Traditional consulting firms produce the strategy deck and leave before the agent breaks in production.
Pure engineering shops ship the agent and leave before anyone understands how to operate it.
The deployments that actually stick require someone who can map the process from the inside, write code into a codebase they’ve never seen, and explain the ROI to a CFO who needs to defend the spend to a board. That’s not a consultant. It’s not a coder. It’s a forward-deployed engineer embedded in the operation for the duration.
This is why the inbound to specialized builders is accelerating. Not because the market discovered a new category, but because every other model has already failed in a way that’s now visible on the P&L.
Human teams win from real diversity in priors, experiences, and thinking, not just role prompts. Overlapping training data means correlated errors and shared blind spots, even across models. You’re mostly getting personas, not independent minds.
Everyone's building AI to replace employees but missing something: companies succeed because different employees with different thinking debate and merge into better outcomes. If all your agents run on the same LLM, you've killed that variability. Even across different LLMs, training data overlap is massive. Even after giving each one its own markdown file with a fake personality, you're getting the illusion of diversity. This will become the cognitive monoculture problem in AI.
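To put rough numbers on the correlated-errors point, here is a toy calculation, not a measurement of any real model. It assumes every agent is right with the same probability, and that with probability `shared` all agents hit the same blind spot and return one identical answer; otherwise they answer independently and a strict majority vote decides. The function names and the `shared` parameter are hypothetical, chosen only for illustration.

```python
from math import comb


def majority_accuracy(n_agents: int, p_correct: float) -> float:
    """Chance that a strict majority of independent agents is correct.

    Assumes an odd n_agents so the vote cannot tie.
    """
    needed = n_agents // 2 + 1
    return sum(
        comb(n_agents, k) * p_correct**k * (1 - p_correct) ** (n_agents - k)
        for k in range(needed, n_agents + 1)
    )


def correlated_majority_accuracy(
    n_agents: int, p_correct: float, shared: float
) -> float:
    """Majority-vote accuracy under the toy shared-blind-spot assumption.

    With probability `shared`, all agents give one identical answer, so the
    vote is only as good as a single agent. Otherwise they vote independently.
    """
    independent = majority_accuracy(n_agents, p_correct)
    return shared * p_correct + (1 - shared) * independent


if __name__ == "__main__":
    # Five agents, each right 70% of the time, at increasing levels of overlap.
    for shared in (0.0, 0.5, 0.9, 1.0):
        acc = correlated_majority_accuracy(5, 0.7, shared)
        print(f"shared={shared:.1f} -> majority accuracy {acc:.3f}")
```

Under these assumptions, the benefit of adding agents shrinks in direct proportion to how much they overlap. At full overlap, five agents are exactly as accurate as one.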