Google CEO, Sundar Pichai:
"If you don't learn how to orchestrate agents now, you'll spend 2027 catching up to people who started today"
In 30 minutes, he explains why the best engineers are moving from writing code to running agents
One agent researches
One writes
One tests
One reviews
One fixes
The human becomes the operator, not the bottleneck
Bookmark and watch the interview
Karpathy said something you'll regret ignoring:
"Remove yourself as the bottleneck. Maximize your leverage. Put in very few tokens, and a huge amount of stuff happens on your behalf."
Loop engineering is the exact thing that does that.
In a hand-run session, the operator handles two things:
- deciding what the agent runs next
- and checking its output before the next step
Both are manual, and both cap how far the agent gets without the operator.
Loop engineering moves both steps into the system.
A core operating structure surrounds the loop, and the diagram below depicts it.
- A schedule decides what to run
- The loop is the maker that produces the work
- A separate checker agent grades the output
- A file on disk holds the state they both read
The loop runs until one of three things happens: the work is done, it hits max iterations, or the budget is exhausted.
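Here is a minimal sketch of that maker/checker cycle with all three exits. The names `run_loop`, `LoopResult`, `toy_maker`, and `toy_checker` are made up for illustration, and the flat `cost_per_call` is an assumption standing in for real token accounting. In a real setup the maker and checker would be separate model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    work: str
    iterations: int
    tokens_spent: int
    stop_reason: str  # "done", "max_iterations", or "budget_exhausted"


def run_loop(
    maker: Callable[[str], str],
    checker: Callable[[str], List[str]],
    task: str,
    max_iterations: int = 2,
    token_budget: int = 10_000,
    cost_per_call: int = 1_000,  # assumption: every call costs the same
) -> LoopResult:
    instruction = task
    work = ""
    spent = 0
    for iteration in range(1, max_iterations + 1):
        # Stop before starting a pass (maker + checker) the budget cannot cover.
        if spent + 2 * cost_per_call > token_budget:
            return LoopResult(work, iteration - 1, spent, "budget_exhausted")
        work = maker(instruction)
        findings = checker(work)  # a separate grader, not the maker itself
        spent += 2 * cost_per_call
        if not findings:
            return LoopResult(work, iteration, spent, "done")
        # The checker's findings become the maker's next instruction.
        instruction = task + "\nFix these findings:\n" + "\n".join(findings)
    return LoopResult(work, max_iterations, spent, "max_iterations")


# Toy stand-ins for the two agents (hypothetical, no model calls).
def toy_maker(instruction: str) -> str:
    return "code+tests" if "missing tests" in instruction else "code"


def toy_checker(work: str) -> List[str]:
    return [] if "tests" in work else ["missing tests"]


result = run_loop(toy_maker, toy_checker, "implement feature")
print(result.stop_reason, result.iterations, result.tokens_spent)
```

All three limits are arguments passed in before the first pass, so nothing inside the loop can move them while it runs.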
Here are some practical engineering considerations:
1) A model grading its own output justifies what it already did instead of catching where it failed.
That's why a separate checker's findings return to the maker as the next instruction. And the cycle repeats until the checker finds nothing left to fix.
2) A loop with no stop condition burns tokens, and the cost climbs fast once sub-agents and long runs add up.
That's why the exit must be set before the loop runs, not while it is running.
A simple exit could be:
↳ fix only the major issues, run one final pass, and stop after two loops, with "all tests pass and lint clean" as the rule that ends it.
3) State has to live on disk, not in context.
The model forgets everything between runs, so an MD file or a knowledge graph holds what is done and what is still open.
Each run reads it and writes back to it, which lets a loop pick up again after days.
4) The easier the work is to verify, the safer the loop.
Boring, repetitive checks like a stale version string or a missing test are trivial to verify, so a loop runs them with little risk while the operator is away.
Judgment-heavy work is loopable too, but only as far as the checker can confirm the result.
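Points 3 and 4 can be sketched together: state lives in a Markdown checklist on disk, and a task is only ticked off when a cheap check confirms it. This is one possible layout, not a standard. The file name `STATE.md`, the functions `load_state`, `save_state`, and `run_once`, and the `verify` callback are all hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterable


def load_state(path: Path) -> Dict[str, bool]:
    """Parse '- [x] task' / '- [ ] task' lines into {task: done}."""
    state: Dict[str, bool] = {}
    if not path.exists():
        return state
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line.startswith("- [x] "):
            state[line[6:]] = True
        elif line.startswith("- [ ] "):
            state[line[6:]] = False
    return state


def save_state(path: Path, state: Dict[str, bool]) -> None:
    lines = [f"- [{'x' if done else ' '}] {task}" for task, done in state.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def run_once(
    path: Path, new_tasks: Iterable[str], verify: Callable[[str], bool]
) -> Dict[str, bool]:
    """One run: read state from disk, tick off what verifies, write it back."""
    state = load_state(path)
    for task in new_tasks:
        state.setdefault(task, False)
    for task in list(state):
        if not state[task] and verify(task):
            state[task] = True
    save_state(path, state)
    return state


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "STATE.md"
    # First run: only the version bump can be confirmed.
    first = run_once(
        state_file,
        ["bump version string", "add missing test"],
        lambda task: task.startswith("bump"),
    )
    # A later run starts with no memory and picks up from the file alone.
    second = run_once(state_file, [], lambda task: True)
    print(first)
    print(second)
```

The second call receives no task list at all. Everything it knows about open work comes from the file, which is what lets a loop resume after days.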
An unattended loop can fail in two ways.
1) It reports done when nothing is actually verified.
The separate checker exists to prevent that, but the loop merges code faster than anyone reads it. Over weeks, the team stops understanding its own codebase while every check stays green.
Green tests say the code passed the tests, not that anyone knows what shipped. Someone still has to read what the loop merges.
2) The checker keeps a running loop honest, but it only catches failures inside a run.
The harness around the loop, like the prompts, tools, and checks wrapped around the model, still drifts and breaks in production as models change.
That repair loop is usually run by hand based on observability traces.
My co-founder wrote a detailed walkthrough (with code) on making that harness repair itself, where a failing trace gets diagnosed, the fix is verified against the exact input that failed, and the failure is locked as a regression test so it cannot recur.
Read it below.
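To be clear, the sketch below is not the code from that walkthrough. It is a rough illustration of the last two steps only: verify a fix against the exact input that failed, then lock that input as a regression case. Diagnosis is left out. The names `lock_regression`, `failing_cases`, `repair`, and the `regressions.json` file are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def lock_regression(suite: Path, failing_input: str, expected: str) -> None:
    """Append the failing case to the on-disk suite (once)."""
    cases = json.loads(suite.read_text()) if suite.exists() else []
    case = {"input": failing_input, "expected": expected}
    if case not in cases:
        cases.append(case)
    suite.write_text(json.dumps(cases, indent=2))


def failing_cases(suite: Path, harness: Callable[[str], str]) -> List[Dict[str, str]]:
    """Return every locked case the given harness still gets wrong."""
    cases = json.loads(suite.read_text()) if suite.exists() else []
    return [c for c in cases if harness(c["input"]) != c["expected"]]


def repair(
    suite: Path, failing_input: str, expected: str, candidate: Callable[[str], str]
) -> bool:
    # Step 1: the fix must pass on the exact input that failed.
    if candidate(failing_input) != expected:
        return False
    # Step 2: lock the failure so it is checked on every future change.
    lock_regression(suite, failing_input, expected)
    # Step 3: the fix must also pass every previously locked case.
    return not failing_cases(suite, candidate)


with tempfile.TemporaryDirectory() as tmp:
    suite = Path(tmp) / "regressions.json"
    # Toy harness bug: input should be trimmed AND lowercased.
    rejected = repair(suite, " Hello ", "hello", lambda s: s.strip())
    accepted = repair(suite, " Hello ", "hello", lambda s: s.strip().lower())
    still_broken = failing_cases(suite, lambda s: s.strip())
    print(rejected, accepted, len(still_broken))
```

Once a case is locked, any later harness change that reintroduces the bug shows up in `failing_cases`.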
Anthropic CEO:
"We use Claude Code across all stages of product development cycle. That's how we ship so fast."
In a 1-hour interview, Dario Amodei reveals how the fastest-growing AI company runs on its own model.
"Claude + loops + routines + dynamic workflows" - that's the secret.
Watch this interview or read the article on the same setup below.
The top Hermes integrations to give your agent superpowers:
1. Obsidian
The Karpathy-style second brain, but one that talks back.
Every note, page, and backlink in the vault becomes live context. The agent doesn't just store knowledge, it reasons over it across everything that's been written and saved.
2. Reddit
Unfiltered opinions from real users on any product, niche, or problem.
No SEO fluff, no corporate blogs. Just raw signal from people who actually use the thing. One of the best research integrations for market validation.
3. InsForge
A full agentic backend behind one semantic layer.
Auth, database, storage, edge functions, all accessible without wiring five services together. The agent reasons about backend primitives directly instead of calling disconnected APIs.
Closest analogy: a PaaS built for agents.
GitHub: https://t.co/OoF773WWZn
(don't forget to star 🌟)
4. GitHub
Code, issues, PRs. Turns Hermes into an engineering teammate that can actually read the repo.
Essential for anyone shipping software.
5. Firecrawl
Web search designed specifically for agents.
Returns clean structured data instead of raw HTML, which means faster responses and fewer tokens burned per query. Worth keeping on by default.
GitHub: https://t.co/PO6zQXdcLg
(don't forget to star 🌟)
6. YouTube transcripts
Converts any video into searchable text. Hour-long podcasts, tutorials, conference talks, all become indexed notes in seconds.
Easily the most underrated research integration in the stack.
7. Google Workspace
Gmail, Calendar, Drive, Docs, and Sheets through one connector.
An agent that can't check the inbox, read the calendar, or write to shared docs is basically decorative. This should probably be the first integration anyone enables.
8. Discord
Ideal for channel-based automation.
Hermes can be plugged into specific channels with dedicated workflows in each. Support tickets from email can be scanned, categorized, and dropped into an organized channel every morning without anyone lifting a finger.
9. Stripe
Revenue, refunds, subscription changes, failed charges, all surfaced through a single question instead of clicking through dashboards.
"How many trials converted last week" or "which customers downgraded this month" gets a direct answer. Turns Stripe from a payment processor into a queryable business intelligence layer.
10. Bland (or Twilio)
Gives Hermes a voice for real phone calls. Booking reservations, confirming appointments, following up on invoices.
The call recordings are worth listening to just for entertainment.
11. Graphiti (by Zep)
Real-time knowledge graphs that build structured relationships from conversations and documents.
Instead of flat vector similarity, the agent traverses typed connections between entities. The difference between "find similar text" and "understand how things actually relate."
GitHub: https://t.co/zCj2zBKC7K
(don't forget to star 🌟)
12. Fireflies
Every meeting transcript, fully searchable. "What did that client say about pricing last month" gets answered instantly instead of scrubbing through a 45-minute recording.
That said, if you’re looking to set up Hermes, I wrote a full deep dive covering the Hermes agent’s architecture, memory system, self-evolving skills, GEPA optimization, and how to set up multiple specialized agents.
The article is quoted below.
the founder of a $20b ai company breaks down how a swarm of ai agents can replace an entire company.
in one minute. for free.
doesn't matter if you've never touched an agent or you've been living in claude for a year. you'll follow it.
i pulled the key ideas into a practical guide for building with kimi.
it's below ↓