Today’s state-of-the-art AI has "jagged intelligence": it can prove new math theorems and win gold at the IOI, yet still struggle to run critical business processes reliably at scale.
We're taking a different approach to building with AI for the enterprise: instead of blindly throwing tokens at the problem, we dynamically generate and combine deterministic programs with LLM reasoning based on the task at hand.
The result is the only AI system in the world that can execute long-running business processes with multiple 9s of accuracy.
Introducing @PoeticHQ: a new AI system that executes complex multi-hour tasks with 99%+ accuracy and 10x fewer tokens than agents.
We raised $50M at $500M from Kleiner Perkins, Founders Fund, First Harmonic, and Genius Ventures to build AI that does complex work inside Fortune 500 companies without hallucination.
While code is too brittle, agents are too unpredictable. The work that runs the global economy - anti-money laundering, fraud investigations, underwriting - needs extreme accuracy.
So we built a new kind of software that pairs the flexibility of AI with the predictability of code.
When the world stays the same, Poetic runs fixed code: fast, cheap, identical every time. When the world changes, Poetic uses AI to regenerate its approach and find its way back to the objective.
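The fixed-code-then-regenerate pattern described above can be sketched roughly as follows. This is a minimal illustration, not Poetic's implementation: `AdaptiveRunner`, `fake_generate`, and `validate` are hypothetical names, and `fake_generate` is a plain Python stand-in for what would be an LLM call in a real system.

```python
from typing import Callable, Optional

Program = Callable[[dict], dict]


class AdaptiveRunner:
    """Reuse a cached deterministic program; regenerate only when it stops working."""

    def __init__(
        self,
        generate: Callable[[dict], Program],
        validate: Callable[[dict, dict], bool],
    ) -> None:
        self._generate = generate      # hypothetical: would wrap an LLM call
        self._validate = validate      # deterministic check on each result
        self._program: Optional[Program] = None
        self.regenerations = 0

    def run(self, task: dict) -> dict:
        # Fast path: the world has not changed, so the cached code still works.
        if self._program is not None:
            try:
                result = self._program(task)
                if self._validate(task, result):
                    return result
            except Exception:
                # Broad catch is for the sketch only: any failure triggers regeneration.
                pass

        # Slow path: the input changed shape, so build a new program for it.
        self._program = self._generate(task)
        self.regenerations += 1
        result = self._program(task)
        if not self._validate(task, result):
            raise RuntimeError("regenerated program failed validation")
        return result


def fake_generate(task: dict) -> Program:
    """Stand-in for the AI step: find the amount field and emit fixed code for it."""
    key = next(k for k in task if k.startswith("amount"))

    def program(t: dict) -> dict:
        return {"flagged": t[key] > 10_000}

    return program


def validate(task: dict, result: dict) -> bool:
    """Assumed acceptance check: the result must carry a boolean 'flagged' field."""
    return isinstance(result.get("flagged"), bool)
```

With `runner = AdaptiveRunner(fake_generate, validate)`, repeated calls with an `amount` field reuse one generated program. A record that renames the field to `amount_usd` makes the cached program fail, which triggers a single regeneration before the run continues.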
In one year, we went from zero to an eight-figure run rate as a team of four.
Since then, we’ve scaled the team and executed the highest-stakes processes at AIG, SoFi, and Chime. At SoFi, a large US bank, Poetic reached 99%+ quality on fraud investigations in five weeks.
the alpha is to hand-design the initial architecture and interface, then let AI fill in the blanks and fix the long tail.
people are doing the opposite
Bloomberg: OpenAI launches a $10Bn joint venture called “The Deployment Company” to help businesses use its AI.
The new company has raised more than $4B from 19 investors, including TPG, Brookfield, Advent, Bain, SoftBank, and Dragoneer.
The basic bet is that AI adoption is no longer mainly a model-quality problem, because many companies already want AI but lack the teams, workflows, data access, security rules, and operating discipline to install it safely inside real business processes.
Private equity firms are useful here because they control or advise large webs of companies, and the report says OpenAI’s partners can reach more than 2,000 portfolio companies and clients.
That turns enterprise AI selling from one-company-at-a-time pitching into a routed distribution system, where OpenAI can package software, consulting, deployment playbooks, and sector-specific use cases across finance, healthcare, coding, operations, and support.
The deeper technical point is that LLMs do not create value just by answering prompts, because they need to be connected to company data, permissions, tools, evaluation systems, and human review loops before they can affect revenue or cost.
Anthropic is also building a similar PE-backed route for Claude, which suggests the next AI race may be less about demos and more about who can industrialize deployment fastest.
---
bloomberg.com/news/articles/2026-05-04/openai-finalizes-10-billion-joint-venture-with-pe-firms-to-deploy-ai
given that:
1. human objectives cannot be mapped to simple reward functions
2. humans do things for the sake of doing them
3. humans lie about their true intentions
RL cannot possibly be the end game
mythos being good at cybersecurity makes total sense - anything that you can declare clear reward functions for and that is done in closed environments (chess, go, finding bugs) will be conquered by RL.
however, a lot of valuable human work has no clear reward function and is done in an open environment.
q: what is the reward function for inventing linux or the iphone?
another way to think about this: getting good grades at school (benchmarks) can help, but it doesn’t guarantee real-world success.
as long as this is not solved, llms will remain a leveraged tool for humans, not become so-called "ASI"
one of the biggest realizations I've had working on Claude Code is that you fundamentally have to design agents for prompt caching first, almost every feature touches on it somehow
I wrote this in a day but it's the culmination of months of learnings, hope you enjoy it
language models won’t be able to invent anything novel+useful until they're trained on great inventors' thought processes.
that data is incredibly hard to get