what is agent looping
for the last two years we prompted agents one task at a time. that is starting to change
instead of asking an agent to build the landing page and then driving every step yourself, you set up a loop that handles discovery, planning, the work, checking, and iterating until the goal is met
looping is a setup you build. almost any agent harness can run it, it just depends on how you wire it up
at its simplest, looping is one agent working on itself:
> researches
> drafts
> checks the draft against a goal
> fixes what is weak
> runs that cycle again until the work clears the requirements
you are not prompting each step anymore. the agent repeats the cycle for you
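a minimal sketch of that single-agent cycle, assuming the four stages are plain functions you supply. `run_single_agent_loop`, `LoopResult`, and `max_passes` are hypothetical names for illustration, not part of any particular harness, and the stubs in `demo` stand in for real model calls:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    draft: str                 # the latest draft when the loop stopped
    passes: int                # how many check passes were run
    cleared: bool              # True if the check found nothing weak
    history: List[str] = field(default_factory=list)


def run_single_agent_loop(
    goal: str,
    research: Callable[[str], str],
    draft: Callable[[str, str], str],
    check: Callable[[str, str], List[str]],
    fix: Callable[[str, List[str]], str],
    max_passes: int = 5,       # assumption: a cap so the loop always ends
) -> LoopResult:
    notes = research(goal)
    current = draft(goal, notes)
    history = [current]
    passes = 0
    while True:
        passes += 1
        weaknesses = check(current, goal)      # check the draft against the goal
        if not weaknesses:
            return LoopResult(current, passes, True, history)
        if passes >= max_passes:
            return LoopResult(current, passes, False, history)
        current = fix(current, weaknesses)     # fix what is weak, then go again
        history.append(current)


def demo() -> LoopResult:
    # toy stand-ins for model calls: the "requirements" are two words
    # that must appear in the draft
    required = ["headline", "cta"]
    return run_single_agent_loop(
        goal="landing page",
        research=lambda goal: f"notes on {goal}",
        draft=lambda goal, notes: f"draft for {goal}",
        check=lambda text, goal: [w for w in required if w not in text],
        fix=lambda text, weak: f"{text} +{weak[0]}",
    )


if __name__ == "__main__":
    print(demo())
```

the `max_passes` cap is something added here, not something the loop idea requires. without it, a check that never clears would run forever.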
the bigger version is a fleet looping. you give an orchestrator agent a goal, it breaks the goal into pieces, hands each piece to a specialist agent, and those specialists hand smaller jobs to their own subagents
the whole tree keeps looping through discovery, planning, execution, and verification until the goal is met
one agent looping is like a person redoing their own draft. a fleet looping is a whole team running a project end-to-end
you create a goal, and the system runs the loop until it finishes within the reqs you set
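one way to picture the fleet version is a tree where every node runs the same split, work, verify cycle. this is a rough sketch under that assumption. `Agent`, `split`, `work`, `verify`, and `build_demo_fleet` are hypothetical names, and the lambdas stand in for real agent calls:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


def _no_split(task: str) -> List[str]:
    # default: a leaf agent does the task itself
    return []


@dataclass
class Agent:
    name: str
    work: Callable[[str], str]              # do the task directly
    verify: Callable[[str, str], bool]      # (task, output) -> goal met?
    split: Callable[[str], List[str]] = _no_split
    children: List["Agent"] = field(default_factory=list)
    max_passes: int = 3                     # assumption: cap on retries per node

    def run(self, task: str) -> str:
        for _ in range(self.max_passes):
            pieces = self.split(task) if self.children else []
            if pieces:
                # hand each piece to a child, round-robin if pieces outnumber children
                results = [
                    self.children[i % len(self.children)].run(piece)
                    for i, piece in enumerate(pieces)
                ]
                output = " | ".join(results)
            else:
                output = self.work(task)
            if self.verify(task, output):
                return output
        raise RuntimeError(f"{self.name} did not clear verification for {task!r}")


def build_demo_fleet() -> Tuple[Agent, Dict[str, int]]:
    attempts = {"count": 0}

    def flaky_headline(task: str) -> str:
        # fails on the first try so the subagent has to loop once
        attempts["count"] += 1
        return "HEADLINE" if attempts["count"] > 1 else ""

    headline = Agent("headline-subagent", flaky_headline, lambda t, out: bool(out))
    copy = Agent(
        "copy-specialist",
        work=lambda t: "",
        verify=lambda t, out: bool(out),
        split=lambda t: ["headline"],
        children=[headline],
    )
    design = Agent("design-specialist", lambda t: "LAYOUT", lambda t, out: bool(out))
    orchestrator = Agent(
        "orchestrator",
        work=lambda t: "",
        verify=lambda t, out: "HEADLINE" in out and "LAYOUT" in out,
        split=lambda t: ["copy", "design"],
        children=[copy, design],
    )
    return orchestrator, attempts


if __name__ == "__main__":
    fleet, _ = build_demo_fleet()
    print(fleet.run("landing page"))
```

the orchestrator, the specialists, and the subagent all share one `run` method here, which is a simplification. a real fleet would likely give each level its own tools and prompts.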
open and closed looping:
OPEN LOOPING is exploratory. it still has conditions and a goal, but you give the agent or the fleet a wide space to move in. it can try different paths, discover things, build something you did not fully spec out
this is the exciting end, it is what Peter and others are doing, and tbh it is where I want to spend more time
the catch is cost. an open loop with real room to explore burns an insane amount of tokens. for the 90 percent of people without an unlimited budget it is not runnable yet, and pointed at projects with a loose standard it turns into a slop machine
CLOSED LOOPING is bounded. a human designs the end-to-end path first:
> clear goal
> defined steps
> an eval at each step
> a point where it stops or hands back to you (and feeds back performance data)
the agents still loop, but inside a framework you built. it gets better every run because each pass feeds the next, and it runs on a normal budget because the path is tight.
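a closed loop along those lines could be sketched like this, assuming each step is a function with its own eval score and pass threshold. `Step`, `run_closed_loop`, the 0.8 threshold, and the report format are all made up for illustration:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    name: str
    run: Callable[[str, List[str]], str]   # (input, feedback so far) -> output
    evaluate: Callable[[str], float]       # eval score, assumed to be in [0, 1]
    threshold: float = 0.8                 # assumption: score needed to pass the gate
    max_attempts: int = 3                  # assumption: retries before handing back


def run_closed_loop(goal: str, steps: List[Step]) -> Dict[str, Any]:
    artifact = goal
    report: List[Dict[str, Any]] = []      # performance data fed back to the human
    for step in steps:
        feedback: List[str] = []
        for attempt in range(1, step.max_attempts + 1):
            output = step.run(artifact, feedback)
            score = step.evaluate(output)
            report.append({"step": step.name, "attempt": attempt, "score": score})
            if score >= step.threshold:
                artifact = output          # gate passed, move to the next step
                break
            # each failed pass feeds the next one
            feedback.append(f"attempt {attempt} scored {score:.2f}")
        else:
            # gate never passed: stop and hand back to the human
            return {"status": "handed_back", "stopped_at": step.name,
                    "artifact": artifact, "report": report}
    return {"status": "done", "stopped_at": None,
            "artifact": artifact, "report": report}


def demo_closed_loop() -> Dict[str, Any]:
    # toy steps: the draft step only scores well once it has seen feedback
    outline = Step("outline", lambda text, fb: f"{text} > outline", lambda out: 1.0)
    draft = Step(
        "draft",
        lambda text, fb: f"{text} > draft v{len(fb) + 1}",
        lambda out: 0.5 if out.endswith("v1") else 0.9,
    )
    return run_closed_loop("landing page", [outline, draft])


if __name__ == "__main__":
    print(demo_closed_loop())
```

the `report` list is the piece that covers "feeds back performance data". whether the loop finishes or hands back, you get every step, attempt, and score to review.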
for most marketing work, closed is the one that pays off today.
> the orchestrator owns the goal
> the specialists own the steps
> the subagents do the narrow work
> an eval gate makes sure it's not slop
I'm hiring a PM for Claude Code, focused on model performance.
If you have experience writing agentic evals and want to integrate research ideas into our core products, I'd love to hear from you here: https://t.co/IKWlAr8tSb
GPT-5.6 is going to be very good at UI.
My evidence:
1. The Codex app actually looks good. Much better than anything GPT-5.5 has made for us. They must be using GPT-5.6 internally.
2. OpenAI just shipped “Sites.” You don’t ship a feature for publishing AI-generated UIs unless you’re pretty confident the model can make good UIs.
3. GPT-5.5 is already strong at almost everything except UI.
UI is the last obvious gap.
Some of you noticed limits drained faster in Codex. We root-caused it to an optimization, now rolled back, that hurt cache hit rates when compacting across long-running sessions.
We fixed this and have now reset usage limits for all accounts. Enjoy the weekend.
An OpenAI model has achieved a major breakthrough in mathematics, by disproving a central conjecture in discrete geometry that was first posed by Paul Erdős in 1946.
This is the first time AI has autonomously solved a prominent open problem central to a field of mathematics.
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
The vibes in SF feel pretty frenetic right now. The divide in outcomes is the worst I've ever seen.
Over the last 5yrs, a group of ~10k people - employees at Anthropic, OpenAI, xAI, Nvidia, Meta TBD, founders - have hit retirement wealth of well above $20M (back of the envelope AI estimation).
Everyone outside that group feels like they can work their well-paying (but <$500k) job for their whole life and never get there.
Worse yet, layoffs are in full swing. Many software engineers feel like their life's skill is no longer useful. The day to day role of most jobs has changed overnight with AI.
As a result,
1. The corporate ladder looks like the wrong building to climb.
Everyone's trying to align with a new set of career "paths": should I be a founder? Is it too late to join Anthropic / OpenAI? should I get into AI? what company stock will 10x next? People are demanding higher salaries and switching jobs more and more.
2. There’s a deep malaise about work (and its future).
Why even work at all for “peanuts”? Will my job even exist in a few years? Many feel helpless. You hear the “permanent underclass” conversation a lot, esp from young people. It's hard to focus on doing good work when you think "man, if I joined Anthropic 2yrs ago, I could retire"
3. The mid to late middle managers feel paralyzed.
Many have families and don't feel like they have the energy or network to just "start a company". They don't particularly have any AI skills. They see the writing on the wall: middle management is being hollowed out in many companies.
4. The rich aren’t particularly happy either.
No one is shedding tears for them (and rightfully so). But those who have "made it" experience a profound lack of purpose too. Some have gone from <$150k to >$50M in a few years with no ramp. It flips your life plans upside down. For some, comparison is the thief of joy. For some, they escape to NYC to "live life". For others still, they start companies "just cuz", often to win status points. They never imagined that by age 30, they'd be set. I once asked a post-economic founder friend why they didn't just sell the co and they said "and do what? right now, everyone wants to talk to me. if i sell, I will only have money."
I understand that many reading this scoff at the champagne problems of the valley. Society is warped in this tech bubble. What is often well-off anywhere else in the world is bang average here.
Unlike many other places, tenure, intelligence and hard work can be loosely correlated with outcomes in the Bay. Living through a societally transformative gold rush in that environment can be paralyzing. "Am I in the right place? Should I move? Is there time still left? Am I gonna make it?" It psychologically torments many who have moved here in search of "success".
Ironically, a frequent side effect of this torment is to spin up the very products making everyone rich in hopes that you too can vibecode your path to economic enlightenment.
We’ve also been tightening Codex performance across the app, especially for large repos and active coding sessions.
• ~75% less re-rendering when switching threads
• Some streaming paths dropped to 0 unnecessary re-renders
• Expensive Git operations in large repos reduced by ~10-50x, depending on the operation
• Less UI churn across streaming responses, thread switching, and sidebar interactions
• Faster time to usefulness around startup and first interaction
Less background churn. More responsive coding.
You've been asking for this one...
Now in preview: Codex in the ChatGPT mobile app.
Start new work, review outputs, steer execution, and approve next steps, all from the ChatGPT mobile app. Codex will keep running on your laptop, Mac mini, or devbox.
Computer use lets Codex work across your apps without taking over your Mac.
@AriX talks with @romainhuet about what changes when agents can click, type, and keep working in the background.
The worst part of applying for jobs is having to write cover letters explaining my deep passion for a company I had no idea existed until I saw the job listing 2 hours ago😭😭😂😂😂😂😂😂😂😂😂😂