Starting to get a little annoyed with @mercury for not having a status page as any good tech platform has. Outage today is impossible to track the status of, and there's no update on Twitter
The AI accounting market is growing at 44.6% annually. Most of that growth is coming from SMBs, not enterprise.
A Fortune 500 company needs 18 months to pilot AI. A 10-person company ships it in a week.
The speed advantage is structural: fewer stakeholders, less legacy tech, clearer ROI.
Enterprise gets the press. Small companies get the actual adoption.
AI gives correct answers on financial questions 56% of the time.
27% of responses are deceptive or misleading. 17% are flat wrong.
FINRA is now auditing for hallucination controls. The U.S. Treasury released a new AI risk framework last month.
And there are still companies deploying AI for financial decisions with no human review. That's not innovative. That's reckless.
AI adoption in accounting went from 9% to 41% in one year.
But accountants are now spending 3 hours a month just correcting AI mistakes. Invalid deductions, wrong jurisdictions, misinterpreted expenses.
AI isn't replacing accountants. It's creating a new category of work: fixing what AI got wrong.
Companies are laying off workers because of AI's potential, not its actual performance.
55,000 layoffs directly attributed to AI in the first two months of 2026. Most of them entry-level knowledge workers.
The career ladder is losing its bottom rungs. And nobody's building new ones.
The AI vendor lock-in bill is coming due.
67% of organizations are now actively trying to reduce single-provider AI dependency. Here's what nobody warned them about:
Average cost to migrate off an AI vendor: $315,000 per project. For large deployments: up to $8.5 million.
How it happens: You pick a vendor for speed. You optimize data pipelines around their APIs. You fine-tune on their models. You build monitoring around their tooling. Then they change pricing. Or sunset a model. Or get acquired. And you're stuck.
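One way to limit that coupling is to keep vendor calls behind a thin interface you own. A minimal sketch, assuming hypothetical names (`CompletionProvider`, `VendorA`, `LocalModel`, `summarize`); the two classes are stand-ins that return canned strings, not real SDK clients:

```python
from typing import Protocol


class CompletionProvider(Protocol):
    """Hypothetical interface: the only surface the rest of the app sees."""

    def complete(self, prompt: str) -> str: ...


class VendorA:
    """Stand-in for a proprietary API client (no real SDK is called)."""

    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"


class LocalModel:
    """Stand-in for a self-hosted open-source model."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


def summarize(provider: CompletionProvider, text: str) -> str:
    # Application code depends on the interface, not on a vendor SDK,
    # so swapping providers is a change at the call site only.
    return provider.complete(f"Summarize: {text}")


print(summarize(VendorA(), "Q1 expenses"))
print(summarize(LocalModel(), "Q1 expenses"))
```

This does not remove migration cost for fine-tuned models or vendor-specific monitoring. It only keeps the application code from depending on one provider's API shape.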
Open-source models now offer 90% lower inference costs than leading proprietary options. But companies locked into one vendor can't take advantage.
"Move fast, figure it out later" was always going to be expensive. Now we know how expensive.
Most companies deploying AI are just automating their existing mistakes faster.
Example: your expense categorization rules are wrong. A human miscategorizes 50 transactions a month. Annoying, but your bookkeeper catches most of them during reconciliation.
Now AI does it. Same bad rules, 5,000 transactions a month. Your financial reports are wrong at scale. Tax deductions are miscategorized. By the time anyone notices, you've filed a quarter on bad data.
Manual processes have a natural speed limit that also limits damage. AI removes that speed limit. If the underlying logic is flawed, you just took the guardrails off a car that was already drifting.
88% of AI agent projects never reach production.
Not because the models are bad. Because teams treat deployment like a research project instead of production engineering.
What actually kills them:
No monitoring. Teams build the agent, ship it, and hope. No observability into what it's doing, what it's getting wrong, or when it's silently failing. By the time someone notices, the damage is done.
Brittle integrations. The demo works because you control the inputs. Production doesn't control the inputs. Bad data, unexpected formats, API changes, edge cases nobody tested for. The agent breaks. Nobody knows why.
No human fallback. Every agent needs an "I don't know, let me get a human" path. Most don't have one.
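The three failure modes above can be sketched in a few lines. This is an illustration under assumptions, not a production design: `fake_model`, `run_agent`, `AgentResult`, and the 0.8 confidence floor are all hypothetical, and a real model may not expose a usable confidence score at all.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

CONFIDENCE_FLOOR = 0.8  # assumed threshold; tune per use case


@dataclass
class AgentResult:
    answer: str
    confidence: float
    escalated: bool = False


def fake_model(amount: float) -> AgentResult:
    """Hypothetical stand-in for a model call that also returns a confidence."""
    if amount < 1000:
        return AgentResult("category: office supplies", 0.95)
    return AgentResult("category: equipment", 0.55)


def run_agent(payload: dict) -> AgentResult:
    # Brittle integrations: validate inputs instead of trusting them.
    amount = payload.get("amount")
    is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
    if not is_number or amount < 0:
        log.warning("invalid input, escalating: %r", payload)
        return AgentResult("needs human review", 0.0, escalated=True)

    result = fake_model(float(amount))

    # Monitoring: record every decision so silent failures become visible.
    log.info("amount=%s answer=%r confidence=%.2f",
             amount, result.answer, result.confidence)

    # Human fallback: low confidence goes to a person, not to production.
    if result.confidence < CONFIDENCE_FLOOR:
        return AgentResult("needs human review", result.confidence, escalated=True)
    return result
```

Calling `run_agent({"amount": 120})` returns the model's answer, while `run_agent({"amount": 5000})` and `run_agent({"amount": "12"})` both come back with `escalated=True`.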
Gartner predicts 40%+ of agentic AI projects will be outright canceled by end of 2027.
The 12% that make it build for failure before they build for success.
The Journal of Accountancy's March 2026 issue is running a feature on how accountants will develop expertise when AI handles the routine work. That framing tells you where the profession actually is. The replacement debate is over in practice.
What's happening instead: less preparation, more review. Less data entry, more catching AI errors. Invalid deductions, wrong jurisdictions, misapplied rules. The output looks clean. It needs checking.
Effective review requires more expertise, not less. A junior accountant might miss a misapplied state tax rule. A senior catches it immediately.
This is the same pipeline question playing out in a specific profession. The junior roles that built that expertise are changing. The interesting question is what the new version of a junior accounting role looks like. Probably less manual bookkeeping, more AI-assisted analysis and output validation from day one. Similar to what IBM is doing with entry-level tech roles: not preserving the old job, redesigning it around the skills that still require a human.
FINRA is formally warning broker-dealers about AI hallucination risk. U.S. Treasury published its AI Risk Management Framework for financial services on March 1. EU AI Act enforcement starts in August with penalties up to 7% of global revenue.
Three regulatory bodies, same direction, same year.
The problem they're all reacting to: AI in finance gives confident, well-formatted, wrong answers. And unlike a software bug that crashes an application, a hallucination looks exactly like correct output. There's no error message. The format is polished. The substance is fabricated.
Most monitoring tracks uptime and latency. Almost none tracks whether the financial guidance is actually right.
For anyone building AI tools in finance, "auditable and traceable" is becoming the product requirement that separates winners from everyone else.
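What "auditable and traceable" could mean at the code level, as a rough sketch. The function name `audit_record` and every field name are assumptions for illustration, not taken from any regulator's framework:

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import List


def audit_record(question: str, answer: str, sources: List[str], model: str) -> dict:
    """Build a traceable record for one AI answer (field names are illustrative)."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "question": question,
        "answer": answer,
        "sources": sources,
        # An answer with no cited source is flagged for human review.
        "needs_review": len(sources) == 0,
    }
    # Hash the content so later edits to the stored record are detectable.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["sha256"] = hashlib.sha256(payload).hexdigest()
    return record
```

A record like this does not prove the answer is right. It gives a reviewer the model version, the cited sources, and a tamper check to start from.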
An NBER working paper surveyed nearly 6,000 CEOs and CFOs across four countries. Not a vendor study. Central bank researchers designed it specifically to measure whether AI deployment is translating to results.
The answer, mostly, is no. Not yet.
63% of finance departments have deployed AI. 21% report measurable value. The RAND Corporation puts the broader AI project failure rate at 80%. MIT says 95% of GenAI pilots never reach production.
Adoption and impact are running on completely different timelines. Almost everyone is deploying. Almost nobody can prove it's working at the firm level.
The gap seems to come down to measurement. Most organizations track whether they're using AI. Very few track whether the work actually got better. Those are different questions with completely different answers.
Stanford used ADP payroll data from millions of workers to measure AI's labor market impact. Not a survey. Payroll records.
Workers aged 22-25 in AI-exposed jobs: 13% relative employment decline since late 2022. Workers over 30 in the same roles: stable or growing. Big tech cut entry-level hiring 25%. UK tech cut graduate roles 46%.
The pipeline problem here is well documented. What's less discussed is what a solution actually looks like.
IBM might be the most interesting signal. In February they announced they're tripling entry-level hiring in 2026, but completely redesigning what those roles are. Less routine coding, more customer interaction and AI oversight. Their CHRO's bet: "The companies three to five years from now that are going to be the most successful are those companies that doubled down on entry-level hiring in this environment."
The ATM precedent is worth remembering. ATMs cut tellers per branch from 21 to 13, but didn't kill the job. They changed it. Tellers went from counting cash to relationship banking. The role was redesigned, not preserved.
That's probably the pattern here too. The junior roles don't come back in their old form. They get rebuilt around AI oversight and judgment-building. The question is how fast.
AI tokens got 1,000x cheaper in 3 years. $37 per million tokens in 2022 to under $0.40 today.
Enterprise AI spending surged 320% anyway.
Microsoft's CEO called it when DeepSeek dropped: "Jevons Paradox strikes again." Make a resource cheaper, people use vastly more of it. Total spend goes up.
The math that tends to catch founders off guard: a single agentic workflow can trigger 10-20 model calls behind the scenes. A $0.15-per-execution pipeline looks great until you're processing 500,000 requests a day. Inference is 85% of the enterprise AI budget now. Not training. Running.
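The arithmetic behind that, using only the two figures quoted above ($0.15 per execution, 500,000 requests a day). The function name is just for illustration:

```python
def daily_inference_cost(cost_per_execution: float, executions_per_day: int) -> float:
    """Daily spend for a pipeline billed per execution."""
    return cost_per_execution * executions_per_day


daily = daily_inference_cost(0.15, 500_000)
print(f"per day:  ${daily:,.0f}")
print(f"per year: ${daily * 365:,.0f}")
```

At those figures the pipeline costs about $75,000 a day, or roughly $27 million a year if volume holds.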
The companies getting this right aren't chasing the cheapest model. They're treating inference cost as an architecture decision, not an optimization you get to later.
Gartner predicts 40% of enterprise applications will have embedded AI agents by the end of 2026. Up from less than 5% in 2025.
That's not a gradual adoption curve. That's a phase change.
Think about what that means practically. Every major enterprise software category (ERP, CRM, HRIS, finance, procurement, legal, IT service management) is racing to ship some version of "AI agents" this year. The ones that don't will look outdated by Q4.
For software companies, this creates a 3-to-6 month window. Either you have a credible agent strategy by mid-year, or you're explaining to customers why your competitor does and you don't.
For startups, the opportunity is in the gap between "we shipped an AI agent" and "our AI agent actually works reliably in production." That gap is enormous right now. The tools, infrastructure, and expertise needed to make agentic AI work at enterprise scale barely exist yet.
The best time to build the picks and shovels for the agent gold rush was six months ago. The second best time is now.
The most useful signal for AI founders right now isn't what VCs are funding. It's what they stopped funding.
The dead categories in early 2026: thin wrappers on LLM APIs with a chat interface. Vertical SaaS that just bolted on AI without a technical moat. Anything where unit economics don't survive inference costs eating the margin.
What's still fundable: AI-native infrastructure. Vertical SaaS with proprietary data moats. "Systems of action" that complete tasks, not just provide information. Anything deeply embedded in mission-critical workflows where switching costs are existential.
The bar shifted. "We added AI" is not a pitch anymore. The pitch is "we have data nobody else has" or "we're so embedded that ripping us out would shut down your operations."
This connects to the SaaSpocalypse narrative from a couple weeks ago. The $2 trillion wipeout in software stocks wasn't about AI killing SaaS. It was about investors realizing which SaaS companies actually have defensible positions and which ones were coasting on switching costs that AI just eliminated.
If you're building right now, the question isn't "does my product use AI?" It's "would my product still win if every competitor also had AI?"
Gartner predicts 90% of finance functions will deploy at least one AI solution by the end of this year. But less than 10% will actually reduce headcount.
Wolters Kluwer's data backs this up from a different angle: agentic AI adoption among finance leaders is set to jump more than 7x this year, from 6% to 44%. The money is pouring in. The tools are being deployed.
But the org charts aren't changing. Why?
Because automation is eliminating data entry and reconciliation work while simultaneously creating new work: monitoring AI outputs, managing exceptions, auditing automated workflows, and doing the strategic analysis that humans were too busy for before.
The Klarna lesson applies here. Replacing humans sounds good in a press release. It doesn't survive contact with reality in complex, high-stakes domains where errors have real consequences.
For founders building in fintech and finance, the opportunity isn't "replace the accountant." It's "give the accountant superpowers and charge for the platform." Very different product, very different go-to-market.
Here's a failure mode in multi-agent AI systems that almost nobody is talking about: memory poisoning.
Galileo AI studied 1,642 execution traces across production multi-agent systems. They found that a single compromised or hallucinating agent poisoned 87% of downstream decision-making within 4 hours.
The mechanism is simple and terrifying. Agent A hallucinates a fact. Stores it in shared memory. Agent B retrieves that "fact" as verified truth. Agents C, D, and E build on it. Within hours, the entire system is operating on fabricated information and none of the standard monitoring tools catch it.
Real example: an inventory agent invents a nonexistent SKU. That triggers pricing, stocking, and shipping workflows for a phantom product. Everything looks normal in the dashboards.
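One possible mitigation, sketched under assumptions: store provenance with every shared fact and only let downstream agents read what has been independently verified. `SharedMemory`, `MemoryEntry`, and the SKU value are hypothetical, and `verify` is a placeholder for a real check against a system of record. This is not the method from the Galileo study.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class MemoryEntry:
    value: str
    source_agent: str
    verified: bool = False  # set only after an independent check


@dataclass
class SharedMemory:
    """Hypothetical shared store that keeps provenance with every fact."""

    _entries: Dict[str, MemoryEntry] = field(default_factory=dict)

    def write(self, key: str, value: str, source_agent: str) -> None:
        # New facts always start unverified, whoever wrote them.
        self._entries[key] = MemoryEntry(value, source_agent)

    def verify(self, key: str) -> None:
        # Placeholder for a real check, e.g. a lookup in the system of record.
        self._entries[key].verified = True

    def read_verified(self, key: str) -> Optional[str]:
        # Downstream agents only see facts that passed verification.
        entry = self._entries.get(key)
        return entry.value if entry and entry.verified else None


memory = SharedMemory()
memory.write("sku:ZX-999", "in stock", source_agent="inventory")
print(memory.read_verified("sku:ZX-999"))  # prints None: the fact is still unverified
```

The design choice is that Agent B never gets to treat Agent A's output as truth by default. The cost is an extra verification step on every write.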
Traditional software monitoring (uptime, latency, error rates) tells you nothing about whether your agents are making correct decisions. The failure rate in production multi-agent systems ranges from 41% to 87%, according to these studies.
Enterprises are racing to deploy multi-agent architectures while the fundamental reliability problem is unsolved. The monitoring and observability tools for agentic systems are going to be a very big market.
https://t.co/TwuhpWcJFM
Last summer, a controlled study by METR found that experienced developers using AI coding tools were 19% slower than without them. And the developers themselves estimated they were 20% faster. A nearly 40-point perception gap.
That's a striking result. But here's what I keep thinking about: the study used Cursor Pro with Claude 3.5 Sonnet, which was the frontier in early 2025. The models and tooling have improved dramatically since then. Agentic coding workflows barely existed at the time. Context windows were a fraction of what they are now.
The cautionary finding still matters. If you're measuring AI adoption rates instead of actual output quality, you might be tracking a feeling rather than a result. PR sizes went up 150%. Bug counts up 9%. Review time ballooned 91%. Those are real costs that get hidden when you only measure speed of first draft.
But I'd bet the results look very different if you reran this study today. The tools have gotten substantially better at understanding large codebases, maintaining context across sessions, and catching their own mistakes before a human has to. The gap between "AI generates code" and "AI generates correct code" is closing fast.
The real takeaway isn't "AI doesn't help." It's "measure the right things, and expect rapid improvement." The teams that built good evaluation frameworks in 2025 are the ones seeing real gains now.
https://t.co/10X45D6FxG
BNY Mellon just deployed 20,000 AI agents across their global operations. Not a pilot. Not a proof of concept. 20,000 agents running in production across 125 use cases.
This is one of the largest AI deployments in financial services. BNY didn't hand this to a consulting firm and wait 18 months. They built an internal platform called Eliza, partnered with OpenAI, and rolled it out with the goal of "AI for everyone, everywhere, and in everything."
What stands out to me is the contrast with Deloitte's data showing only 11% of enterprises have agentic AI in production. BNY is in that 11%. Most of their competitors are still running pilots.
The pattern I keep seeing: the companies moving fastest on AI aren't buying the most tools. They're building internal platforms that let every team deploy AI for their own workflows. The platform approach scales in a way that buying point solutions doesn't.
For fintech founders, this is worth paying attention to. The biggest financial institutions are building internal AI capabilities at speed. I think that actually creates more opportunity for startups, not less. When a bank has 20,000 agents running, they need better data infrastructure, better compliance tooling, better monitoring. That's all greenfield.
https://t.co/ZSUtlNYOeV
#AI #fintech #banking #startups