We talk a lot about how important it is to set up self-verification loops. Especially in the age of powerful models that can run for long periods of time, self-verification is a key ingredient that enables the model to run for much longer, delivering a result that is closer to what you intended, so you can do more without having to constantly check in on Claude as it works.
@delba_oliveira gives a great breakdown of what that looks like and why it matters.
Claude Fable 5 changed how we work on the Claude Code team day to day.
We used to verify that Claude did the work right. Now we verify that it's doing the right work.
Here are the 3 biggest changes:
Prompt engineering has been replaced by loop engineering.
What is it? (Explained in 60 seconds)
For the past 2 years we have been prompting agents with individual tasks. That is starting to change.
So far, if you wanted an agent to build a dashboard for a client, you would give it a task, review the output, improve the prompt, and repeat the process until the work was done.
Looping changes that.
Instead of giving an agent individual tasks, you give it a goal and let it work through a recursive loop until that goal is met.
For example:
→ Research
→ Draft
→ Evaluate
→ Test
→ Improve
→ Repeat
The agent keeps cycling through the loop until it reaches the standard you defined.
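A minimal sketch of that cycle in Python, under stated assumptions: `run_loop`, `toy_draft`, and `toy_evaluate` are hypothetical names, and the toy functions stand in for whatever model calls your own harness makes. It only illustrates the "repeat until the standard is met" shape, with an iteration cap so the loop cannot run forever.

```python
from typing import Callable, Tuple


def run_loop(
    goal: str,
    draft: Callable[[str, str], str],       # (goal, previous work) -> new work
    evaluate: Callable[[str, str], float],  # (goal, work) -> score in [0, 1]
    threshold: float = 0.8,
    max_iterations: int = 5,
) -> Tuple[str, float, int]:
    """Cycle draft -> evaluate -> improve until the score clears the threshold."""
    work, score, iteration = "", 0.0, 0
    while iteration < max_iterations:
        iteration += 1
        work = draft(goal, work)      # first pass drafts, later passes improve
        score = evaluate(goal, work)  # compare against the standard you defined
        if score >= threshold:
            break                     # standard met, stop looping
    return work, score, iteration


# Toy stand-ins (assumptions): each pass adds one unit of "quality".
def toy_draft(goal: str, previous: str) -> str:
    return previous + "x"


def toy_evaluate(goal: str, work: str) -> float:
    return min(len(work) / 4, 1.0)


work, score, iterations = run_loop("build a dashboard", toy_draft, toy_evaluate)
print(work, score, iterations)
```

The cap matters as much as the threshold: without `max_iterations`, a goal the agent can never reach would loop (and spend) indefinitely.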
Within loop engineering there are two main approaches:
1. Open Looping
You give the agent a goal and allow it significant freedom in how it achieves it.
This is powerful, but also expensive and harder to control.
2. Closed Looping
The human defines the architecture, constraints and evaluation criteria.
The agent is then responsible for executing, improving and iterating within those boundaries until the goal is reached.
The next evolution is orchestrated looping.
Instead of a single agent running a loop, one agent breaks the goal into smaller tasks and assigns them to specialist agents.
Each specialist runs its own loop and reports back.
In other words:
You move from one agent improving itself to an entire team of agents iterating together until the goal is achieved.
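Here is a rough sketch of orchestrated looping, assuming a hypothetical `orchestrate` function and toy specialists in place of real agents. The orchestrator splits the goal into tasks, each specialist runs its own retry loop against a check, and every specialist reports back.

```python
from typing import Callable, Dict, List, Tuple

Worker = Callable[[str, int], str]  # (task, attempt number) -> result
Check = Callable[[str], bool]       # result -> good enough?


def specialist_loop(task: str, worker: Worker, check: Check,
                    max_attempts: int = 3) -> dict:
    """One specialist retries its own task until the check passes."""
    result = ""
    for attempt in range(1, max_attempts + 1):
        result = worker(task, attempt)
        if check(result):
            return {"task": task, "result": result, "attempts": attempt, "ok": True}
    return {"task": task, "result": result, "attempts": max_attempts, "ok": False}


def orchestrate(goal: str,
                plan: Callable[[str], List[Tuple[str, str]]],
                specialists: Dict[str, Worker],
                check: Check) -> List[dict]:
    """Break the goal into (role, task) pairs and collect each specialist's report."""
    return [specialist_loop(task, specialists[role], check)
            for role, task in plan(goal)]


# Toy stand-ins (assumptions): the writer needs two attempts, the researcher one.
def toy_plan(goal: str) -> List[Tuple[str, str]]:
    return [("research", f"research: {goal}"), ("writing", f"write: {goal}")]


toy_specialists = {
    "research": lambda task, attempt: "done",
    "writing": lambda task, attempt: "done" if attempt >= 2 else "draft",
}

reports = orchestrate("client dashboard", toy_plan, toy_specialists,
                      lambda result: result == "done")
for report in reports:
    print(report)
```

In this sketch the specialists run one after another; a real setup might run them in parallel, which this example does not attempt.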
Gbrain by @garrytan is so good, I've now moved my brain onto a dedicated server that Codex, Claude, OpenClaw, and Hermes all connect to.
My agent conversations (and planning in particular) are so much more enriched and opinionated.
Every experiment I run in Codex uploads a summary and conclusion, so our mutual thinking continues to compound.
https://t.co/yb0kPqMQRT
many folks are asking what "loops" really mean, and surprisingly i don't see many NO BS straight answers
so let me share some from real experiences
there are two different types of loops i've been doing:
- "one off loops" = a loop that babysits an ephemeral task and will end once the task is done
- "durable loops" = a cron job that keeps running forever
examples of one off loops:
- babysit the CI pipeline for this PR until it's merged. fix failures and merge conflicts as they come up. this loop is built into no-mistakes https://t.co/6ldO9b9an5
- find out-of-date documentation pieces in this repo, spin up a subagent to fix each until there are no more
my tool gnhf https://t.co/X5nxPrt90l and the "/goal" feature in both claude and codex are great at executing one-off loops
examples of durable loops:
- every hour, scan my github repos for new issues filed by contributors and triage them with the following labels...
- how i built OPINIONS.md by having agents scan my social media posts every night and consolidate my viewpoints https://t.co/rtpPOu7PdP
durable loops are valuable when you regularly have "new things" to process. user feedback, bug reports, tech debt, security vulns, production alerts etc
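to make the difference concrete, here's a small python sketch. everything in it is an assumption for illustration: `one_off_loop` and `durable_tick` are made-up names, not part of no-mistakes, gnhf, or any "/goal" feature, and the toy lists stand in for real CI failures and github issues. a one-off loop exits when its task is done; a durable loop is a single "tick" that a scheduler such as cron would call again and again.

```python
import time
from typing import Callable, Iterable, Set


def one_off_loop(is_done: Callable[[], bool], step: Callable[[], None],
                 max_rounds: int = 50, interval: float = 0.0) -> int:
    """Babysit one task: keep stepping until it reports done, then exit."""
    rounds = 0
    while not is_done() and rounds < max_rounds:
        step()
        rounds += 1
        time.sleep(interval)  # wait before checking again
    return rounds


def durable_tick(fetch_items: Callable[[], Iterable[str]],
                 handle: Callable[[str], None], seen: Set[str]) -> int:
    """One scheduled run: process only the items not handled on earlier runs."""
    handled = 0
    for item in fetch_items():
        if item not in seen:
            handle(item)
            seen.add(item)
            handled += 1
    return handled


# One-off: fix CI failures until none are left (toy list, an assumption).
failures = ["lint", "tests"]
rounds = one_off_loop(lambda: not failures, lambda: failures.pop())

# Durable: two scheduled ticks over a growing issue list (toy data).
issues = ["issue-1", "issue-2"]
seen: Set[str] = set()
triaged = []
first = durable_tick(lambda: issues, triaged.append, seen)
issues.append("issue-3")
second = durable_tick(lambda: issues, triaged.append, seen)
print(rounds, first, second)
```

the `seen` set is what makes the durable version safe to rerun: each tick only touches the "new things" that showed up since the last one.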
hope this gives some real insights - happy to answer further questions on this topic
previously the only way to run was "git push no-mistakes"
i've since heard people asking to make this a skill instead, so i just implemented the support, and honestly i quite like it myself
free & oss: https://t.co/7QwvaWfT8b
`no-mistakes init` in your repo to set up the skill
/no-mistakes is here!
by popular demand i've made the most impactful tool in my agentic engineering setup "no-mistakes" invocable as a skill in Claude Code, Codex et al
just type "/no-mistakes" once your agent has made changes, and watch the magic unfold
details below 👇
what is agent looping
for the last two years we prompted agents one task at a time. that is starting to change
instead of asking an agent to build the landing page and then driving every step yourself, you set up a loop that handles discovery, planning, the work, checking, and iterating until the goal is met
looping is a setup you build. almost any agent harness can run it, it just depends on how you wire it up
at its simplest, looping is one agent working on itself:
> researches
> drafts
> checks the draft against a goal
> fixes what is weak
> runs that cycle again until the work clears the requirements
you are not prompting each step anymore. the agent repeats the cycle for you
the bigger version is a fleet looping. you give an orchestrator agent a goal, it breaks the goal into pieces, hands each piece to a specialist agent, and those specialists hand smaller jobs to their own subagents
the whole tree keeps looping through discovery, planning, execution, and verification until the goal is met
one agent looping is like a person redoing their own draft. a fleet looping is a whole team running a project end-to-end
you create a goal, and the system runs the loop until it finishes within the reqs you set
open and closed looping:
OPEN LOOPING is exploratory. it still has conditions and a goal, but you give the agent or the fleet a wide space to move in. it can try different paths, discover things, build something you did not fully spec out
this is the exciting end, it is what Peter and others are doing, and tbh it is where I want to spend more time
the catch is cost, an open loop with real room to explore burns an insane amount of tokens. for the 90 percent of people without an unlimited budget it is not runnable yet, and pointed at projects with a loose standard it turns into a slop machine
CLOSED LOOPING is bounded. a human designs the end-to-end path first:
> clear goal
> defined steps
> an eval at each step
> a point where it stops or hands back to you (and feeds back performance data)
the agents still loop, but inside the framework you built. it gets better every run because each pass feeds the next, and it runs on a normal budget because the path is tight.
for most marketing work, closed is the one that pays off today.
> the orchestrator owns the goal
> the specialists own the steps
> the subagents do the narrow work
> an eval gate makes sure it's not slop
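a closed loop like this can be sketched in a few lines of python. treat it as a sketch under assumptions: `Step` and `run_closed_loop` are hypothetical names, and the lambdas stand in for real agent calls and real evals. each step has its own eval gate and retry limit, and the loop hands back to the human as soon as a step cannot clear its gate.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    run: Callable[[str], str]      # previous output -> candidate output
    passes: Callable[[str], bool]  # eval gate for this step
    max_retries: int = 2


def run_closed_loop(steps: List[Step], initial: str) -> dict:
    """Run the human-designed path; stop and hand back if a gate keeps failing."""
    output = initial
    log = []  # performance data: (step name, attempt, passed?)
    for step in steps:
        for attempt in range(1, step.max_retries + 2):
            candidate = step.run(output)
            ok = step.passes(candidate)
            log.append((step.name, attempt, ok))
            if ok:
                output = candidate  # only gated work moves forward
                break
        else:
            # every attempt failed the gate: hand back to the human
            return {"status": "handed_back", "failed_step": step.name,
                    "output": output, "log": log}
    return {"status": "done", "failed_step": None, "output": output, "log": log}


# Toy path (assumption): the outline passes, the draft never clears its gate.
steps = [
    Step("outline", lambda s: s + " > outline", lambda s: "outline" in s),
    Step("draft", lambda s: s + " > draft", lambda s: False, max_retries=1),
]
result = run_closed_loop(steps, "goal")
print(result["status"], result["failed_step"])
```

the log is the "feeds back performance data" part: it records which gates failed and how often, so the next run can tighten the weak step.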
Seeing a number of benchmarks showing Opus is the best model for long-running work.
Five tips for running Opus autonomously for hours/days:
1. Use auto mode for permissions, so Claude doesn’t ask for approval
2. Use dynamic workflows, to have Claude orchestrate hundreds/thousands of agents to get a task done
3. Use /goal or /loop, to nudge Claude to keep going until it’s done
4. Use Claude Code in the cloud, so you can close your laptop (easiest way is the desktop or mobile app)
5. Make sure Claude has a way to self-verify its work end to end: Claude in Chrome browser extension for web, iOS/Android sim MCP for mobile, a way to start the full web server or service for backend work
completely detached tip
I've spent $3600 in tokens in the past week running this loop:
1️⃣ opus-4.8 orchestrator picks plan
2️⃣ Vertical slice delegated to Opus subagent, commit
3️⃣ gpt-5.5 subagent reviews
4️⃣ opus-4.8 subagent fixes, go back to 3️⃣ until done
this wasn't even 24/7
were the results good? yes.
were they $3600 good? no.
how to get better results?
actually be in the loop and highlight issues when they happen.
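One way to force yourself back into the loop is a hard cap on agent calls. The sketch below is an assumption-heavy illustration, not the setup described above: `review_fix_loop` is a hypothetical name, the toy functions stand in for the implementing, reviewing, and fixing subagents, and "calls" is a crude proxy for token spend. When the cap is hit, the loop stops and returns the open issues for a human to look at.

```python
from typing import Callable, List, Set


def review_fix_loop(implement: Callable[[], Set[str]],
                    review: Callable[[Set[str]], List[str]],
                    fix: Callable[[Set[str], List[str]], Set[str]],
                    max_calls: int) -> dict:
    """Implement once, then review and fix until approved or the cap is hit."""
    work = implement()
    calls = 1
    issues: List[str] = []
    while calls < max_calls:
        issues = review(work)  # reviewer subagent looks at the current work
        calls += 1
        if not issues:
            return {"status": "approved", "calls": calls, "issues": []}
        if calls >= max_calls:
            break              # out of budget: do not start another fix
        work = fix(work, issues)
        calls += 1
    return {"status": "needs_human", "calls": calls, "issues": issues}


# Toy stand-ins (assumptions): "work" is just the set of bugs still open,
# and each fix pass resolves one bug.
def toy_implement() -> Set[str]:
    return {"bug_a", "bug_b"}


def toy_review(work: Set[str]) -> List[str]:
    return sorted(work)


def toy_fix(work: Set[str], issues: List[str]) -> Set[str]:
    return work - {issues[0]}


generous = review_fix_loop(toy_implement, toy_review, toy_fix, max_calls=10)
tight = review_fix_loop(toy_implement, toy_review, toy_fix, max_calls=4)
print(generous)
print(tight)
```

With the tight cap the loop stops early with `needs_human` and one bug still listed, which is the point where you step in and highlight the issue yourself.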
The /teach skill might be the coolest thing I've ever built. I've been learning:
- How to solve a Rubik's Cube (I can do it now!)
- Better ways to introduce my toddler to new foods
- How to orchestrate vocal harmonies for folk music
Works super well for each one.
BRO this is CRAZY.
Claude Code can spawn multiple parallel sub-agents from a single prompt. Each one reviews the diff with a different lens, all enforcing CLAUDE.md rules without being told.
Most builders are still running one Claude session at a time, re-explaining context every restart. Hours wasted on review loops.
This 6-primitive playbook fixes that:
(full breakdown in the article)
Google just released its official skills for AI agents:
13 skills compatible with Claude Code, Cursor, Copilot, and other agents on the market.
These skills work as add-ons that extend what agents can do, letting them run advanced tasks and automate complex workflows without endless configuration.
Best of all: they are completely FREE and OPEN SOURCE, so anyone can integrate them and start getting value from the first minute.
A huge step toward democratizing agent development.
SAVE THIS 🙇♂️
WAIT. This is actually insane.
A senior dev dropped the SOUL.md template behind his Hermes Agent. Says he's never shared this before.
The sections that turn your agent from a chatbot into an autonomous operator:
→ Stance: direct, opinionated, push back when I'm vague
→ Accountability: surface opportunities, flag stalled loops
→ Autonomy: broad freedom except for irreversible actions
→ Mission: priorities, active builds, debt, sunset candidates
→ Pushback: disagree openly, earn it with evidence
→ Operating Mode: orchestration, not solo execution
The author says three sections decide if the agent acts like an operator: Stance, Autonomy, and Mission.
The Autonomy section alone is worth the whole template. Most builders never write this out and then wonder why their agent asks permission for every action.
(Full template in the comments)
Claude Code feels completely different once you install this.
Anthropic quietly released an official plugin called claude-code-setup and it basically turns Claude Code from “pretty good” into an actual AI dev environment.
It scans your project and recommends:
→ hooks
→ skills
→ MCP servers
→ subagents
→ automations
Then sets everything up step-by-step for you.
Most people are using Claude Code completely vanilla…
which is why their experience feels messy.
The real power comes from the ecosystem around it.
Install:
/plugin install claude-code-setup@claude-plugins-official
Bookmark this before you forget it.
Most people use AI.
The smartest people learn from the people building it.
Here are 11 GitHub repos that feel like open-sourcing the internet's collective brain:
1. PilotDeck
Build and deploy AI agents in minutes.
https://t.co/evMLXZHiKu
2. andrej-karpathy-skills
Karpathy's AI coding wisdom.
https://t.co/earl5hYk4c
3. MemPalace
An AI memory system for long context.
https://t.co/FFSf4LBmYA
4. OpenClaw
A powerful open-source AI assistant.
https://t.co/BFT3fT9FNi
5. autoresearch
Automate your AI research workflow.
https://t.co/buMlbz3YxF
6. awesome-claude-code
The ultimate Claude Code resource.
https://t.co/44H1N9zGs5
7. agent-skills
Production skills for AI coders.
https://t.co/n3GAaErDOH
8. AI-Agents-for-Beginners
Microsoft's free AI agent course.
https://t.co/l0nIk6tW83
9. awesome-llm-apps
100+ real-world AI app examples.
https://t.co/RYsM2eJpKq
10. hermes-agent
An AI agent that improves itself.
https://t.co/y38qHloF92
11. qlib
Microsoft's open-source quant platform.
https://t.co/PA9z0hM6xe
The AI edge isn't better prompts.
It's knowing what the builders use.
Save this post. ♻️ Repost so others don't miss it.
Follow for more AI tools and GitHub gems