Claude Code creator:
"100% of our pull requests at Anthropic are run by Claude Code. 80–90% of code review too.
The feature I’m using the most today is /loops. I’m not prompting Claude anymore - I’m building loops"
In a 1-hour interview, Boris reveals the setup that helps him build the #1 coding tool of this year.
Worth more than a $500 vibe-coding course.
Karpathy said something you'll regret ignoring:
"Remove yourself as the bottleneck. Maximize your leverage. Put in very few tokens, and a huge amount of stuff happens on your behalf."
Loop engineering is the exact thing that does that.
In a hand-run session, the operator handles two things:
- deciding what the agent runs next
- and checking its output before the next step
Both are manual, and both limit how far the agent gets without the operator.
Loop engineering moves both steps into the system.
A core operating structure surrounds the loop, and the diagram below depicts it.
- A schedule decides what to run
- The loop is the maker that produces the work
- A separate checker agent grades the output
- A file on disk holds the state they both read
The loop runs until one of three exits fires: the work is done, it reaches max iterations, or the budget is exhausted.
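The maker, checker, and three exits above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how /loops is implemented: `run_loop`, `LoopResult`, `maker`, and `checker` are hypothetical names, the two agents are plain callables standing in for model calls, and the on-disk state file is left out to keep it short.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LoopResult:
    output: str
    iterations: int
    tokens_spent: int
    stop_reason: str  # "done", "max_iterations", or "budget_exhausted"


def run_loop(
    task: str,
    maker: Callable[[str], Tuple[str, int]],  # instruction -> (output, tokens used)
    checker: Callable[[str], List[str]],      # output -> list of findings
    max_iterations: int,
    token_budget: int,
) -> LoopResult:
    """Run maker and checker in turn until one of the three exits fires."""
    instruction = task
    output = ""
    spent = 0
    for i in range(1, max_iterations + 1):
        output, used = maker(instruction)
        spent += used
        findings = checker(output)  # a separate checker, not the maker grading itself
        if not findings:
            return LoopResult(output, i, spent, "done")
        if spent >= token_budget:
            return LoopResult(output, i, spent, "budget_exhausted")
        # The checker's findings become the maker's next instruction.
        instruction = task + "\nFix these findings:\n" + "\n".join(
            f"- {f}" for f in findings
        )
    return LoopResult(output, max_iterations, spent, "max_iterations")
```

Whatever calls `run_loop` can branch on `stop_reason`: only "done" means the checker found nothing left to fix.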
Here are some practical engineering considerations:
1) A model grading its own output justifies what it already did instead of catching where it failed.
That's why a separate checker's findings return to the maker as the next instruction. And the cycle repeats until the checker finds nothing left to fix.
2) A loop with no stop condition burns tokens, and the cost climbs fast once sub-agents and long runs add up.
That's why the exit must be set before the loop runs, not while it is running.
A simple exit could be:
↳ fix only the major issues, run one final pass, and stop after two loops, with "all tests pass and lint clean" as the rule that ends it.
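One way to pin that exit down before the loop starts is a frozen config object. This is a hedged sketch: `ExitRule`, `Finding`, `findings_to_fix`, and `should_stop` are hypothetical names, and the severity labels are assumed for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)  # frozen: the rule cannot be edited while the loop runs
class ExitRule:
    max_loops: int = 2
    severities_to_fix: Tuple[str, ...] = ("major",)


@dataclass
class Finding:
    severity: str  # assumed labels, e.g. "major" or "minor"
    message: str


def findings_to_fix(rule: ExitRule, findings: List[Finding]) -> List[Finding]:
    """Keep only the findings the rule says are worth another pass."""
    return [f for f in findings if f.severity in rule.severities_to_fix]


def should_stop(rule: ExitRule, loops_done: int, tests_pass: bool, lint_clean: bool) -> bool:
    """Stop when tests pass and lint is clean, or when the loop cap is reached."""
    if tests_pass and lint_clean:
        return True
    return loops_done >= rule.max_loops
```

Because the dataclass is frozen, an attempt to raise `max_loops` mid-run raises `dataclasses.FrozenInstanceError` instead of quietly extending the loop.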
3) State has to live on disk, not in context.
The model forgets everything between runs, so an MD file or a knowledge graph holds what is done and what is still open.
Each run reads it and writes back to it, which lets a loop pick up again after days.
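A minimal sketch of that read-and-write-back cycle, assuming the state lives in a Markdown checklist. The `- [x]` / `- [ ]` format and the helper names `load_state`, `save_state`, and `next_open_item` are assumptions for illustration, not a prescribed layout.

```python
from pathlib import Path
from typing import Dict, Optional


def load_state(path: Path) -> Dict[str, bool]:
    """Parse '- [x] item' / '- [ ] item' lines into {item: done}."""
    state: Dict[str, bool] = {}
    if not path.exists():
        return state  # first run: nothing done yet
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line.startswith("- [x] "):
            state[line[6:]] = True
        elif line.startswith("- [ ] "):
            state[line[6:]] = False
    return state


def save_state(path: Path, state: Dict[str, bool]) -> None:
    """Write the checklist back so the next run can pick up where this one ended."""
    lines = [f"- [{'x' if done else ' '}] {item}" for item, done in state.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def next_open_item(state: Dict[str, bool]) -> Optional[str]:
    """Return the first item still open, or None when everything is done."""
    for item, done in state.items():
        if not done:
            return item
    return None
```

Each run calls `load_state`, works on `next_open_item`, marks it done, and calls `save_state`. Nothing depends on what the model remembers.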
4) The lower the verification bar, the safer the loop.
Boring, repetitive checks like a stale version string or a missing test are trivial to verify, so a loop runs them with little risk while the operator is away.
Judgment-heavy work is loopable too, but only as far as the checker can confirm the result.
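To make "trivial to verify" concrete, here is a sketch of a stale-version check a checker could run with no judgment involved. `find_stale_versions` is a hypothetical helper, and it assumes versions look like `MAJOR.MINOR.PATCH` with an optional leading `v`.

```python
import re
from typing import List

# Assumed format: three dot-separated numbers, optionally prefixed with "v".
VERSION_PATTERN = re.compile(r"\bv?(\d+\.\d+\.\d+)\b")


def find_stale_versions(text: str, current_version: str) -> List[str]:
    """Return every version string in the text that is not the current one.

    The result is deduplicated and sorted as plain strings.
    """
    found = {m.group(1) for m in VERSION_PATTERN.finditer(text)}
    return sorted(v for v in found if v != current_version)
```

An empty list means the check passes. Anything else goes back to the maker as a finding.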
Let's look at the two ways an unattended loop fails.
1) It reports done when nothing is actually verified.
The separate checker exists to prevent that, but the loop merges code faster than anyone reads it, so over weeks the team stops understanding its own codebase while every check stays green.
Green tests say the code passed the tests, not that anyone knows what shipped. Someone still has to read what the loop merges.
2) The checker keeps a running loop honest, but it only catches failures inside a run.
The harness around the loop (the prompts, tools, and checks wrapped around the model) still drifts and breaks in production as models change.
That repair loop is usually run by hand based on observability traces.
My co-founder wrote a detailed walkthrough (with code) on making that harness repair itself, where a failing trace gets diagnosed, the fix is verified against the exact input that failed, and the failure is locked as a regression test so it cannot recur.
Read it below.
Anthropic Managed Agents team:
"Fable 5 is our best model for running self-improving agent systems.
Add /loops, dynamic workflows, dreaming and you are unstoppable"
In 13 minutes, the Anthropic team shows how to build self-improving agent systems with Fable 5 from scratch.
Worth more than a $500 agent building course.
Live from the last Anthropic stage in Japan. Unpublished.
The “unfair tricks” YC tells every founder on day 1 to get their first 100 customers ASAP:
1/ When someone says "sure, I'll try it," average founders say "great, I'll send you a link." Stripe's founders said "give me your laptop" and set it up on the spot (the infamous Collison installation)
2/ If someone won’t adopt your tool, be the product manually (even fake it). When merchants wouldn't build their own stores, @paulg (Viaweb) built them by hand using their own software.
3/ Charging less will often lose you the deal. Early adopters aren't price sensitive - they care more about beating competitors. Price too low and they assume something's wrong with you
4/ Price off value, not cost. The gap between your price and the value delivered is literally the customer's incentive to buy. Widen it on purpose
5/ Every discount you give trains the customer to devalue you. One-off "just this once" pricing becomes the permanent expectation
6/ A fast "no" is almost as valuable as a "yes." Early on, optimize for speed of decision, not size of logo. The prospect dragging you through 4 calls costs you the bandwidth to find 4 real buyers
7/ You're not targeting your entire market. You're hunting the top ~1% of companies that are "innovators." Your job is to sift fast enough to find them. You can use tools like Origami or Clay to build hyper-targeted lists of these 1% adopters
8/ Your only 2 unfair advantages as a founder-seller are passion and domain expertise, not technique. You will never out-technique a real salesperson, so lean entirely on the two things they can't fake
People on X love to dunk on these "hacks"
But when anyone can build anything and you've got 100+ competitors, getting off the ground takes every trick in the book
Claude Fable 5 is by far the most ridiculous model I've seen, and it makes me genuinely afraid for the future of software engineering.
I compiled the top 10 most unbelievable things I've seen Claude Fable 5 do today:
- Migrate a 50M-line codebase from Stripe in a day (humans take 2 months)
- Draw amazing 3D graphics: a) Boeing 747 b) space simulations with >5000 objects c) Minecraft roller coasters d) full photorealistic forest scenes e) NYC skyline f) stormy clouds
- One-shot Pokemon FireRed the game
- Optimize a real-world proprietary interaction net evaluator 10x more than the next best model, GPT 5.5
AND it's about the same price as GPT 5.5: Fable 5 costs $10/M input and $50/M output vs. $10/M input and $45/M output for GPT 5.5. It's also 6x cheaper than GPT 5.5 Pro.
How do you get Claude Code to check its own work before handing it back?
Watch how you can encode your manual checks so Claude closes its own feedback loop:
Andrej Karpathy just revealed that vibe coding – the term he coined – is no longer enough
he says what comes next is "agentic engineering"
"vibe coding is incredible. but agentic engineering is about preserving the quality bar of professional software"
he built an entire app to add photos to restaurant menus. OCR, image generation, Vercel deployment – the whole stack
then someone did the same thing with one Gemini prompt. his entire app became obsolete in seconds
29 minutes. free. at Sequoia AI Ascent
bookmark & watch ↓
Hiring a Malaysian software engineering intern for A47 media. Not a remote role. If this sounds like you, DM me with a project you’ve built that you’re proudest of.
Compensation: RM2k/month + your choice of Codex Pro or Claude Max
Comment the smartest young engineer you know here.
Claude Code creator:
"I don’t prompt Claude anymore. What I mostly use now is loops. I create loops - they do the rest of my job."
In 24 minutes, Boris reveals his real daily Claude Code setup:
Claude + loops + routines + dynamic workflows
Worth more than a $500 vibe-coding course
I built a content machine.
It turned me into a one-person media company, has driven tens of millions in pipeline for @tenex_labs, and is allergic to AI-slop.
It has also turned all of my employees into content creators.
I may open-source the whole thing, but for now, I'm going to share how I built it & how it works.
Feel free to copy & paste the steps to Claude/Codex if you want to build your own content machine.
Step 1: Map out the process
In order to make any of your work AI-native, you need to understand the way in which it's been done historically. This is why business context & domain expertise REALLY matters, even in a post-AI world.
Content has been my bread & butter for the last decade, so I started by pulling out an 8.5x11 sheet of printer paper and drawing the traditional process.
1) Look for inspiration
2) Pick a 10x content idea
3) Research the idea
4) Brain dump all of my thoughts about the idea
5) Decide the post format I want to create
6) Create a draft of the post
7) Edit the post
8) Create derivative versions of the post
9) Go live
10) Track performance
Step 2: Where am I needed vs. not needed?
I am needed for the first & final mile:
First mile: picking the idea/direction & providing all of the necessary context
Final mile: going through the final draft with a fine-tooth comb & giving final sign-off.
AI can handle the rest:
Looking for inspiration, researching the idea, pulling my thoughts out, writing the post, doing a first edit, creating derivative content, and tracking performance.
Step 3: Build the Content Machine
The machine is one pipeline, run end-to-end or step-by-step. It is a directory of skills that mimic the steps in the content process that I've delegated.
1) The Oracle [AI]
Mines my Slack, Notion, call transcripts and Gmail for spikes (moments where I naturally said something worth expanding), while the Internet Reader curates an external feed of X accounts & websites I've selected.
Qualifying ideas (≥6/10) are written to The Vault (a Notion database of content ideas).
2) Select the idea from The Vault [Human]
3) The Researcher [AI]
Builds a sourced research-report.md before any interview: TL;DR, key facts with links, current developments, what's already been said, contrarian angles, and open questions for the interview. Claims are adversarially checked; fact is separated from opinion.
4) Interview Panel [AI + Human]
Six world-class interviewers (Joe Rogan, Howard Stern, Michael Barbaro, etc) ask 12–15 questions, one at a time, each pushing a different dimension...and never satisfied with vague answers. Won't advance without 2–3 specific stories, real numbers, and emotional specificity.
5) Production [AI]
The interview becomes a raw .md file: transcript, key stories, core insights, quotable moments, emotional anchor, surprising reveals, and the "so what." This raw file is sacred: my exact words, never paraphrased away.
6) Refinement [AI + Human]
I tell the machine what content type I want to create. It reads my custom style guide + past feedback lessons + content-type spec, then drafts in my voice...pulling real stories and quotes from the raw file. The #1 rule: write like you're texting a friend. Supports long posts, LinkedIn, X threads, and more.
7) Writer's Council [AI]
Six expert reviewers (Shaan Puri, Morgan Housel, David Perell, etc) score the draft through their own lens, splitting fixes into editorial (the machine can rewrite) and information gaps (only the creator can answer...these route back to the interview panel).
8) Revision Loop [AI]
Iterate until council scores 9/10.
9) Repurposing Engine [AI]
One anchor → 10+ natively-formatted derivatives, each re-hooked for its platform and each held to the same full Council → revision bar of 9/10. This is how two people produce like a hundred.
10) Final revision [Human]
11) Learning Loop [AI]
After approval, the machine compares first draft vs. final, extracts confirmed lessons, and saves them to that creator's content-lessons.md. Every future first draft starts smarter. Lessons override the style guide when they conflict.
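Steps 7 and 8 (council scoring plus the revision loop) could look roughly like this in Python. This is a hedged sketch, not the actual machine: `Review`, `RevisionResult`, and `revise_until_approved` are hypothetical names, reviewers and the rewriter are plain callables standing in for AI skills, averaging the council's scores is an assumption (the steps above don't say how scores combine), and `max_rounds` is an added safety cap.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    score: float  # out of 10
    editorial_fixes: List[str] = field(default_factory=list)   # the machine can rewrite these
    information_gaps: List[str] = field(default_factory=list)  # only the creator can answer these


@dataclass
class RevisionResult:
    draft: str
    score: float
    rounds: int
    information_gaps: List[str]  # still open; route back to the interview panel


def revise_until_approved(
    draft: str,
    council: List[Callable[[str], Review]],
    rewrite: Callable[[str, List[str]], str],
    target: float = 9.0,
    max_rounds: int = 5,  # assumed cap so the loop cannot run forever
) -> RevisionResult:
    if not council:
        raise ValueError("council must have at least one reviewer")
    rounds = 0
    while True:
        reviews = [reviewer(draft) for reviewer in council]
        score = sum(r.score for r in reviews) / len(reviews)  # assumption: average score
        gaps = [g for r in reviews for g in r.information_gaps]
        if score >= target or rounds >= max_rounds:
            return RevisionResult(draft, score, rounds, gaps)
        fixes = [f for r in reviews for f in r.editorial_fixes]
        draft = rewrite(draft, fixes)
        rounds += 1
```

Keeping `information_gaps` separate from `editorial_fixes` mirrors the split in step 7: the rewriter only ever sees fixes it can make without new input from the creator.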
Feel free to steal the machine & ask me any questions about how it works!
Workflows are the biggest upgrade to Claude Code’s capabilities since skills and subagents.
I dove deep into them with @sidbid to figure out best practices, examples and more. I’m particularly excited about the non-technical tasks they enable for Claude Code.