HE RUNS HIS WHOLE COMPANY FROM ONE AI OPERATING SYSTEM.
Folders, files, a model that reads all of it. For months it ran clean.
One night an agent worked through a task list on its own. It hit a line it read wrong, decided the job was to write a discount code and send it, and did. To the entire list. 150,000 people, maybe 200,000.
The code was never approved. The email was never scheduled. Nobody pressed send. The agent just held the keys to the email tool, misread one task, and that was enough.
The next morning the team wrote an apology instead of a launch.
Most setups miss this. A prompt that says "don't send anything" is not a lock. If the agent can reach the room, it walks in. If it can send the email, one day it will.
So he rebuilt the brain on keys, not prompts. Scoped API keys that read his transcripts and touch nothing else. The agent pulls data now. It can't hit his list, because it never holds that key.
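A minimal sketch of that idea, assuming a made-up `ScopedKey` type and made-up scope names like `transcripts:read` and `email:send` (none of these come from his actual setup). The point it shows: the refusal lives in the key, not in the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedKey:
    """Hypothetical API key that carries an explicit allow-list of scopes."""
    name: str
    scopes: frozenset


class ScopeError(PermissionError):
    """Raised when a key is used for something it was never granted."""


def call_tool(key, required_scope, action):
    """Run `action` only if the key holds `required_scope`; otherwise refuse."""
    if required_scope not in key.scopes:
        raise ScopeError(f"{key.name} lacks scope {required_scope!r}")
    return action()


# The agent's key can read transcripts and nothing else (assumed scope names).
agent_key = ScopedKey("agent", frozenset({"transcripts:read"}))

# Reading works.
transcript = call_tool(agent_key, "transcripts:read", lambda: "transcript text")

# Sending is refused before the email tool is ever reached.
try:
    call_tool(agent_key, "email:send", lambda: "sent to the whole list")
    blocked = False
except ScopeError:
    blocked = True
```

However the agent misreads a task, the send never runs, because the check happens outside anything the model can talk its way around.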
He didn't fire the engineer who built it. He turned the wreck into a case study and shipped it to the team.
Every mistake is data. This one cost up to 200,000 emails.
1,900 notes. One Obsidian vault. Untouched by hand since March.
For 3 years he sorted it himself. Sunday nights went to renaming files, fixing dead links, tagging every thought he dumped that week.
Then he dropped Claude Code into the vault folder and closed the laptop.
Now one job runs at 2 a.m. Claude reads each new note, ties it to the old ones, cleans the messy drafts, and surfaces 3 ideas he forgot he wrote.
The graph on his TV is the receipt. Every white dot is a note. Every thread is a link he never drew.
Claude turned a folder of loose files into a map that builds itself while he sleeps.
So he sits on the couch with his phone. The vault keeps growing without him.
He stopped organizing his second brain. It organizes itself now.
After 15 years as an email developer, she and her whole team got let go. One year later she was head of AI.
No AI background. Not especially technical.
Just out of a job she'd held for over a decade.
So she started learning. n8n first, then Claude Code. And while she learned, she did the one thing almost everyone skips.
She built in public.
Two small YouTube channels showing her work. Builds posted to LinkedIn. A recorded demo of every single thing she made. No big following. Just a growing pile of proof.
Then a 15-person company posted a head of AI role. The recruiter replied to her with one question.
What have you built?
She didn't write a paragraph. She sent links. Case studies. Real demos. The proof skipped HR entirely and put her in front of the CEO.
She got the job.
That one question is the whole game right now. 88% of companies use AI. Only about 6% are any good at it. Almost everyone is bad, and they know it.
The builders are pharmacists, handing over whatever they're asked for. The money goes to the doctor, the one who names the actual problem first.
Nobody hires the person with the best prompts.
They hire the one who can answer the question.
What have you built?
He pointed an AI at Excel and said one thing: match it, feature for feature. Then he closed his laptop.
Earlier he'd explained the idea on screen. A loop, he called it. Two parts only. A trigger to start it, a goal to chase. Strip the human out of the middle and the agent just works, on its own, until the goal is met. Could be 10 minutes. Could be 10 hours.
The Excel goal had no finish line he'd thought through. He wanted to see how far it would go.
It went far. The agent opened Excel right there on his machine, took control of the screen, and started clicking through it menu by menu, comparing the real thing to the clone it was building. Then it fixed the gaps. Then it checked again.
Day one passed. The loop kept running. Day two. Still building, still clicking, no sign it thought the job was over.
By the third day he gave up waiting and shut it down himself.
The agent never broke. It just never decided it was finished.
Anthropic's lead for Claude Code and Cowork:
"Coding is no longer the bottleneck."
In a 90-minute episode on Lenny's Podcast, Fiona Fung explains why the work moves off the keyboard.
You stop typing tasks. You start running routines. Kick off the agents, walk away, read the outcome in the morning.
She holds 20 agents at once and manages the context switching between them. The skill is no longer writing code. It is keeping 20 loops in flight without dropping one.
The 8x jump in code per engineer per quarter since 2025 came from this, not from typing faster.
Now the bottleneck is how many loops you can hold.
SHE DOESN'T WRITE CODE. SHE JUST DESCRIBED A FACE-TRACKING PUZZLE OUT LOUD, AND THE AI BUILT ONE THAT SCRAMBLES HER OWN FACE ON THE WEBCAM IN REAL TIME.
No computer vision degree. No OpenCV setup. No model to train.
It ran in the browser, the detection box snapping to her face as she moved, tiles scrambling around her.
This used to be a research project. Computer vision, real-time tracking, logic that took specialists weeks. She got it from one sentence and a webcam.
She just sat there and let it track her.
What used to need a lab now needs one prompt and a camera.
Creator of Bun: "This is the first model where it really felt like it's able to do this."
In a 30-minute live session at Anthropic's Code with Claude, Jarred Sumner and Boris Cherny showed a setup most engineers have not caught up to yet.
Every issue filed on Bun triggers Robo Bun, a bot that reproduces the bug and opens a PR with a test, before a human looks. It cannot submit unless the test fails on the old build and passes on the fix.
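That submit rule can be sketched in a few lines. This is an illustration only, not Robo Bun's code: `may_submit`, the toy builds, and the toy tests are all hypothetical stand-ins.

```python
def may_submit(test, old_build, fixed_build):
    """Allow a PR only if the test fails on the old build and passes on the fix."""
    reproduces_bug = not test(old_build)   # the test must catch the bug
    fix_works = test(fixed_build)          # and the fix must make it pass
    return reproduces_bug and fix_works


# Toy "builds": two versions of the same function (hypothetical).
def buggy_add(a, b):
    return a - b


def fixed_add(a, b):
    return a + b


# A test that actually exercises the bug.
def regression_test(build):
    return build(2, 3) == 5


# A test that passes no matter what, so it proves nothing.
def vacuous_test(build):
    return True


real_pr = may_submit(regression_test, buggy_add, fixed_add)
empty_test_pr = may_submit(vacuous_test, buggy_add, fixed_add)
no_fix_pr = may_submit(regression_test, buggy_add, buggy_add)
```

Only `real_pr` is allowed. A test that passes on the old build is rejected, and so is a "fix" that leaves the test failing, which is what keeps the bot from opening PRs that prove nothing.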
Then Claude review and a second bot, CodeRabbit, argue the PR out, trading 30 comments and resolving them on their own. One was wrong only about 10% of the time.
Robo Bun is now a bigger contributor to Bun than Sumner is.
Live on stage, in 25 minutes, they shipped 3 PRs into a real runtime, with a 4th landing. One was a single prompt that ran alone for 30 minutes. He runs hundreds like it every night in auto mode while he sleeps.
The point that matters: Opus 4.7 is the first model good enough to close the loop.
Save this. Writing code stopped being the job. Trusting the merge is.
A guy built a fully automated Google review collector with one prompt and rents it to local businesses for $500 a month. Nobody's talking about it.
No code. No hosting. No team.
He typed one prompt into Emergent and it built a full-stack tool: a dashboard, automated texts, direct review links, and separate logins for every business on the back end. A few minutes, done.
Every local business needs more Google reviews and most have no system to collect them. That's the gap. A customer pays, and two hours later they get a one-tap link to leave a review, triggered automatically.
He made it look like something an owner would actually pay for. Designed the dashboard off a Mercury-style layout, added a settings page so each message reads like it came from the restaurant itself, and built an analytics view showing the owner exactly how many reviews came in. The visual proof is what keeps them paying.
The automation runs through Zapier. When a customer books or pays through Square or the business's existing system, the review request fires on its own. The owner never adds a single name. He published the whole thing on Emergent with no hosting to manage.
One prompt, $500 a month per business, and the owner never lifts a finger.
Your AI agent makes slop while you sleep for one reason: it can't tell when it's done. Fix that and it finally runs without you.
That fix has a name now: loop engineering. Boris Cherny and Peter Steinberger both stopped prompting their agents and started writing loops instead.
A loop gives the agent a goal, checks the output, and prompts it again until a stop condition is hit. The real version runs many agents: it wakes on a schedule, opens a worktree, one builds, one reviews, tests run, the PR opens, and it remembers the last run.
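The single-agent version of that loop, as a rough sketch. Everything here is an assumption for illustration: `run_loop`, the toy agent, and the checker are made-up names, and a real setup would call a model and run real tests where these stubs sit. The round cap is added here as a second stop condition so the loop cannot run forever.

```python
def run_loop(goal, agent, check, max_rounds=5):
    """Prompt, check, and re-prompt until the check passes or rounds run out."""
    feedback = None
    for round_number in range(1, max_rounds + 1):
        output = agent(goal, feedback)   # give the agent the goal plus last feedback
        ok, feedback = check(output)     # the check is the stop condition
        if ok:
            return output, round_number
    return None, max_rounds              # hard cap: give up instead of looping forever


def make_toy_agent():
    """Stand-in for a model call: it only gets the job right on its third try."""
    attempts = []

    def agent(goal, feedback):
        attempts.append(feedback)
        return "done" if len(attempts) >= 3 else "draft"

    return agent


def check(output):
    """Stand-in for tests or a reviewer: returns (passed, feedback)."""
    if output == "done":
        return True, None
    return False, "not finished, try again"


result, rounds = run_loop("ship the feature", make_toy_agent(), check)
capped = run_loop("ship the feature", make_toy_agent(), check, max_rounds=2)
```

With the default cap the toy agent finishes on round 3. With `max_rounds=2` the loop stops empty-handed, which is the difference between an agent that ends and one that runs until someone pulls the plug.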
5 building blocks. Claude Code and Codex ship all 5 today.
The stop condition is the whole game.
Everyone keeps asking when AI will start improving itself. Anthropic's CEO says it already does.
In an extended interview, Dario Amodei was asked about the moment AI gets good enough to upgrade itself, and the upgraded version upgrades itself again. His answer closes the question.
It is not a moment. It is a continuous process, and it has already begun.
His proof is concrete. The AI already suggests the architectures for the next AI. A year ago Amodei put the productivity gain from AI inside the field at 10 to 15%. Now he puts it near 30% in a single year, and he thinks it may be doubling.
He went further. Anthropic built a model so good at accelerating research on its own next models that the company chose not to release it and ate a heavy commercial loss to keep it in-house.
So the runaway moment people picture is the wrong frame. There is no single day it wakes up. Amodei calls it an accelerating exponential, where the only real decision left is when to slow down.
Self-improving AI is not a future event. The curve already bends up.