Jaychu.base.eth @jaychu247 - Twitter Profile

4 days ago

about loop engineering.

everyone's saying the same thing this week. you don't prompt agents anymore, you design loops that prompt them.

here's the job that loop hands right back to you.

a loop running unattended is also a loop failing unattended.

loop engineering takes you off prompting. it takes you off curating context. it takes you off babysitting a single run. it does not take you off debugging. it just moves the debugging somewhere worse, into runs you were never watching, with far too much of it to read through by hand.

even the loop engineering posts admit this themselves, usually somewhere near the end. you can only walk away from a loop if you trust the thing checking it. a checker you don't trust drops you right back into reading every output by hand, which is the exact work the loop was supposed to take off you.

so stack the layers up, prompt, context, harness, loop, and one job survives all of them. closing the loop on failure. the leverage point moved. debugging stayed exactly where it was.

i was writing about this exact gap yesterday, before the loop talk picked up today. the idea was simple. make debugging its own loop. a failure leads to a root cause, a proposed fix, a rerun against the exact inputs that broke, and a test that locks it out for good. the checker gets built from your real failures instead of guessed at up front.

Opik, the tool i was writing about, does exactly this. a built-in agent reads the trace, finds the root cause, proposes a diff, you approve it, and that failure becomes a permanent regression test. every break you debug makes the loop a little harder to break next time, which is the kind of checker the loop engineering crowd keeps saying you need before you walk away.

if you're designing loops you actually plan to walk away from, it's worth a look.

Opik is 100% open-source under Apache-2.0 license.

GitHub repo: https://t.co/MEC26owCdo

(don't forget to star 🌟)

loop engineering moved the leverage point. it didn't remove the engineer who still has to close the loop when something breaks.

the full article, Your Agent Harness Should Repair Itself, is quoted below.
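The debugging loop described in this post (failure, proposed fix, approval, rerun on the exact inputs that broke, permanent regression test) can be sketched in a few lines of Python. This is a minimal illustration only and not Opik's API: `Failure`, `RegressionSuite`, `close_the_loop`, and the toy agents are hypothetical names, and the "agent" is assumed to be a plain function from input text to output text.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Assumption: an "agent" is modelled as a plain text-in, text-out function.
Agent = Callable[[str], str]


@dataclass
class Failure:
    inputs: str    # the exact inputs that broke the run
    expected: str  # what a passing output looks like


@dataclass
class RegressionSuite:
    cases: List[Failure] = field(default_factory=list)

    def still_failing(self, agent: Agent) -> List[Failure]:
        """Rerun every locked-in failure and return the ones that still break."""
        return [c for c in self.cases if agent(c.inputs) != c.expected]


def close_the_loop(
    failure: Failure,
    propose_fix: Callable[[Failure], Agent],
    approve: Callable[[Agent], bool],
    suite: RegressionSuite,
) -> Optional[Agent]:
    """Proposed fix -> human approval -> rerun on the broken inputs -> regression test."""
    candidate = propose_fix(failure)
    if not approve(candidate):
        return None  # a human rejected the proposed fix
    if candidate(failure.inputs) != failure.expected:
        return None  # the fix does not repair the original failure
    if suite.still_failing(candidate):
        return None  # the fix reintroduces an older failure
    suite.cases.append(failure)  # lock this failure out for good
    return candidate


# Toy agents: the broken one forgets to lowercase its output.
def broken_agent(text: str) -> str:
    return text.strip()


def fixed_agent(text: str) -> str:
    return text.strip().lower()


suite = RegressionSuite()
failure = Failure(inputs="  Hello ", expected="hello")
accepted = close_the_loop(
    failure,
    propose_fix=lambda f: fixed_agent,  # stand-in for an automated root-cause step
    approve=lambda agent: True,         # stand-in for the human approval step
    suite=suite,
)
```

After this runs, `suite` holds the original failure as a test case, so any later candidate that reintroduces it is rejected by `still_failing` before it is accepted.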

63

676

110

940

85K

jaychu247 retweeted

Matt Van Horn

@mvanhorn

6 days ago

https://t.co/DM0CAuyprS

203

5K

453

15K

3M

jaychu247 retweeted

Shann³

@shannholmberg

5 days ago

what is agent looping

for the last two years we prompted agents one task at a time. that is starting to change

instead of asking an agent to build the landing page and then driving every step yourself, you set up a loop that handles discovery, planning, the work, checking, and iterating until the goal is met

looping is a setup you build. almost any agent harness can run it, it just depends on how you wire it up

at its simplest, looping is one agent working on itself:

> researches
> drafts
> checks the draft against a goal
> fixes what is weak
> runs that cycle again until the work clears the requirements

you are not prompting each step anymore. the agent repeats the cycle for you

the bigger version is a fleet looping. you give an orchestrator agent a goal, it breaks the goal into pieces, hands each piece to a specialist agent, and those specialists hand smaller jobs to their own subagents

the whole tree keeps looping through discovery, planning, execution, and verification until the goal is met

one agent looping is like a person redoing their own draft. a fleet looping is a whole team running a project end-to-end

you create a goal, and the system runs the loop until it finishes within the reqs you set

open and closed looping:

OPEN LOOPING is exploratory. it still has conditions and a goal, but you give the agent or the fleet a wide space to move in. it can try different paths, discover things, build something you did not fully spec out

this is the exciting end, it is what Peter and others are doing, and tbh it is where I want to spend more time

the catch is cost. an open loop with real room to explore burns an insane amount of tokens. for the 90 percent of people without an unlimited budget it is not runnable yet, and pointed at projects with a loose standard it turns into a slop machine

CLOSED LOOPING is bounded. a human designs the end-to-end path first:

> clear goal
> defined steps
> an eval at each step
> a point where it stops or hands back to you (and feeds back performance data)

the agents still loop, but inside a framework you built. it gets better every run because each pass feeds the next, and it runs on a normal budget because the path is tight.

for most marketing work, closed is the one that pays off today.

> the orchestrator owns the goal
> the specialists own the steps
> the subagents do the narrow work
> an eval gate makes sure it's not slop
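The closed loop described above (a clear goal, defined steps, an eval at each step, and a point where it stops or hands back) might look roughly like the Python sketch below. Everything in it is an assumption for illustration: `run_closed_loop`, the `(name, work, passes)` step shape, and the toy string steps are hypothetical, and a real setup would call an agent harness where this calls plain functions.

```python
from typing import Callable, Dict, List, Tuple

# Assumption: a step is (name, work function, eval gate). The work function
# refines the current artifact; the eval gate decides whether it may pass.
Step = Tuple[str, Callable[[str], str], Callable[[str], bool]]


def run_closed_loop(goal: str, steps: List[Step], max_attempts: int = 3) -> Dict:
    """Run the steps in order; each step loops until its eval passes or attempts run out."""
    artifact = goal
    log = []  # (step name, attempt number, passed?) for every pass
    for name, work, passes in steps:
        current = artifact
        for attempt in range(1, max_attempts + 1):
            current = work(current)  # each pass feeds the next
            ok = passes(current)
            log.append((name, attempt, ok))
            if ok:
                artifact = current
                break
        else:
            # Stop point: out of attempts, so hand back to the human with the log.
            return {"status": "handed_back", "stopped_at": name,
                    "artifact": artifact, "log": log}
    return {"status": "done", "artifact": artifact, "log": log}


# Toy steps standing in for specialist agents.
steps: List[Step] = [
    ("draft", lambda s: s + " draft", lambda s: "draft" in s),
    ("polish", lambda s: s + "!", lambda s: s.endswith("!!")),
]
result = run_closed_loop("landing page", steps)

# A step whose eval never passes triggers the hand-back path.
stuck = run_closed_loop("landing page", [("never", lambda s: s, lambda s: False)],
                        max_attempts=2)
```

The attempt cap is what keeps the loop on a fixed budget: a step that cannot clear its eval gate returns control with a log instead of running indefinitely.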

194

6K

692

10K

727K

jaychu247 retweeted

Dan Shipper 📧

@danshipper

5 days ago

this is good

50

2K

79

4K

533K

jaychu247 retweeted

Meta Alchemist

@meta_alchemist

5 days ago

In this, you'll find a guide and an open-source repo to build your own self-improving agent loops

As well as a standardized system to build benchmarks and evals around the loop, for any workflow, so you can actually see the growth happening in real time

And put your agent to work even while you sleep.

The repo for building the system took months of work; it's not a weekend job, so save it and utilize it.

24

97

10

76

8K

jaychu247 retweeted

Addy Osmani

@addyosmani

5 days ago

https://t.co/hIe0UX7z6T

285

7K

1K

17K

2M

jaychu247 retweeted

Fokki

@0x_fokki

7 days ago

🚨a 22-year-old makes $8,217/month from an anime channel he built in one weekend

→ Claude: script and scene description. 10 minutes.
→ Midjourney: every frame. 20 minutes.
→ Runway: movement, breathing, camera. 15 minutes.
→ ElevenLabs: character voiced with emotional direction. 10 minutes.
→ Suno: score. 5 minutes.
→ Make: published Tuesday 9am. automatically.

$8,217 last month. 3 hours of work total.

the studios haven't figured out what to do about this.

full build with every prompt in the article above👇

58

1K

143

2K

156K

jaychu247 retweeted

Thariq

@trq212

11 days ago

Workflows are the biggest upgrade to Claude Code’s capabilities since skills and subagents. I dove deep into it with @sidbid to figure out best practices, examples and more. I’m particularly excited about the non-technical tasks it enables for Claude Code.

178

5K

364

8K

984K

Jaychu.base.eth @jaychu247

9 days ago

@mrtn Great to see your post I’ll give it a try 👍

0

2

0

42

jaychu247 retweeted

Rohit

@rohit4verse

13 days ago

2 months ago, I wrote "The Harness Is Everything" 1.3M views. Last week's Life-Harness paper: 116 of 126 model-environment setups improved by patching the harness alone. Model frozen. 88.5% mean lift across 18 backbones. ↓ how Claude Code and Codex actually work under the hood

64

1K

152

2K

141K

Jaychu.base.eth @jaychu247

12 days ago

Hermes

Machina

@EXM7777

13 days ago

this is the Hermes setup top 1% operators are using to get rid of AI slop...

46

3K

245

5K

302K

0

9

jaychu247 retweeted

Mr. Buzzoni

@polydao

14 days ago

KARPATHY WAS RIGHT. THIS 40-MINUTE Y COMBINATOR LECTURE PROVES IT

Karpathy said we're in the 1960s of AI - most people using Claude Opus 4.8 are still acting like it's just a search engine

> software 3.0 - LLMs as operating systems, not chatbots
> autonomous agents that run entire workflows without you watching

the 32 skills in this article are how you actually cross that line

bookmark this 👇

48

2K

300

5K

561K

jaychu247 retweeted

rari

@0xwhrrari

16 days ago

Claude opus 4.8 dropped and Anthropic released Boris Cherny's prompt workshop

One of the people actually building claude explains how to use it properly

Free
No signup
No paywall

The first 8 minutes are better than most $300 ai courses

Context
Task structure
Cleaner outputs
Fewer wasted prompts

Watch it and bookmark it before this gets sold back to you as a course

38

2K

239

5K

337K

jaychu247 retweeted

darkzodchi

@zodchiii

15 days ago

Anthropic CEO Dario Amodei: "The cheapest way to use Claude is also the smartest. Most devs do the exact opposite." In 36 minutes, he breaks down the real economics behind every Claude model, and why running them all the same way is a mistake. Watch the full interview, then save the config below 👇

72

4K

423

10K

994K

jaychu247 retweeted

Miguel Ángel

@Miguel07Code

15 days ago

I use this to do my biweekly engineering updates for @HeyGen. It pulls my github activity, uses my avatar 5 with the heygen CLI (it can be created too), and renders the video with @HyperFrames_ CLI. The whole thing runs from Claude Code/codex/hermes agent with one skill in the reply below 👇

14

110

12

90

15K

jaychu247 retweeted

Fastlane

@UseFastlane

17 days ago

🚨BREAKING: ChatGPT for marketing is here. In one prompt, Fastlane can deploy hundreds of social media accounts, create viral content, and post it all automatically. This is insane.

98

3K

270

5K

2M

jaychu247 retweeted

Rahul

@sairahul1

17 days ago

Anthropic just officially released the blueprint for creating a company with Claude Code and it's mind-blowing😭

CEO: 1 human (who sleeps)
Employees: several AIs
Activities: the AIs divide up the tasks and move forward on their own

Work is literally dying...

I've summarized the full guide below, read it when you've got 5 min ⤵️

If you want the AI to work while you sleep → save this as a bookmark 🔖