Took a weekend off X.
Turns out watching everyone build amazing things 24/7 is… kind of exhausting.
Stepped away. Reset. Touched grass.
Back now and actually excited to see what everyone's working on. What did I miss?
PSA for Claude Code users:
Don't always use ultracode.
Don't use orchestrate.
Unless you enjoy blowing through your entire 5-hour session in one sitting.
Learned this the hard way so you don't have to.
@quxiaoyin Agreed. I am running two main agents, and these two fan out to call the others, only coming back to me with updates or questions they cannot answer or decide on.
I would love to see how people manage 5 effectively. The context switching gets crazy.
Humans can only manage 5 AI agents max
I asked everyone I know: "How many agents can you manage simultaneously?"
The consensus: 3 is hard. 5 is the absolute limit. Nobody can effectively manage more than 5 agents at once.
Managing 3-5 agents turns you into a context-switching nightmare. Channel A, Channel B, Channel C. Your brain becomes a pinball machine.
Two key insights:
1. The future isn't humans managing agents - it's agents managing agents.
I can't personally manage 100 agents. My brain would explode. The only path to scale is having a meta-agent manage my agent workforce.
If I only manage 5 agents, I'm basically a small team lead. M0 level at Facebook - managing 5 direct reports.
But if I can manage 50 agents through AI management layers? That's a completely different power level.
2. The bottleneck is task duration, not task complexity.
If an agent bothers me every minute, I can only handle 1 agent.
If it's every 5 minutes, maybe 3 agents.
If it's every 10 minutes, possibly 5 agents.
The breakthrough everyone talks about - "long horizon tasks" - isn't just about AI doing complex work. It's about AI working independently long enough that humans can actually run multiple agents in parallel.
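A rough sketch of that interruption budget, using only the three data points quoted above (1 min, 5 min, 10 min). The step-function shape, the 0 below one minute, and the function name are my assumptions, not something measured:

```python
# Data points from the thread: (minutes between interruptions, agents one person can handle).
THREAD_ESTIMATES = [(1, 1), (5, 3), (10, 5)]


def manageable_agents(minutes_between_interruptions: float) -> int:
    """Return the thread's estimate for the largest listed interval not exceeding the input.

    Assumption: a step function between the quoted points. Below 1 minute this
    returns 0, and above 10 minutes it stays at the claimed ceiling of 5.
    """
    agents = 0
    for minutes, estimate in THREAD_ESTIMATES:
        if minutes_between_interruptions >= minutes:
            agents = estimate
    return agents


if __name__ == "__main__":
    for interval in (1, 5, 7, 10, 30):
        print(f"{interval} min between pings: {manageable_agents(interval)} agent(s)")
```

The point of the toy model: the only input is how long an agent runs before it needs you, not how hard the task is.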
Real-world implication:
Facebook now ranks engineers by token usage to measure AI adoption. But you can't burn serious tokens by manually managing agents one-by-one.
To hit the top of that leaderboard, you NEED agents managing other agents. That's the only way to achieve massive token consumption.
The human cognitive limit is real. 5 agents maximum.
Everything beyond that requires AI management layers.
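Back-of-the-envelope version of the layers math, assuming every manager (human or agent) tops out at the same span of 5. That uniform span is my simplification; nothing above says manager agents share the human limit, and the function name is made up for illustration:

```python
def layers_needed(total_agents: int, span: int = 5) -> int:
    """Smallest tree depth such that span ** depth >= total_agents.

    Depth 1 means one person directly managing the worker agents.
    Each extra layer is a tier of manager agents in between.
    """
    if total_agents < 1:
        raise ValueError("total_agents must be at least 1")
    if span < 2:
        raise ValueError("span must be at least 2")
    layers = 1
    capacity = span
    # Integer loop instead of math.log to avoid floating-point edge cases.
    while capacity < total_agents:
        capacity *= span
        layers += 1
    return layers


if __name__ == "__main__":
    for n in (5, 25, 50, 100):
        print(f"{n} worker agents -> {layers_needed(n)} layer(s)")
```

Under that assumption, 5 agents fit in one layer, 25 need two, and both 50 and 100 need three: you, plus two tiers of manager agents.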
We're exploring this at our company. I think "Agent Manager" as a product category will emerge very soon.
The question isn't "How good are you with AI?" It's "How many management layers can you orchestrate?"
That's where the real leverage lives.
Managed to get this working for my Whoop 5. For a 24 hr project it's so sick 🔥
The app had frame drops, raised a PR to fix it.
Raising more to make the experience like Whoop. This is so cool @b_nnett
Qwen3.6 35B A3B can't fill out a paper form on its own. But give it NVIDIA's LocateAnything-3B — the #1 trending model on HuggingFace — as its eyes, and the two small models get it done together.
(The test: place each element at the right pixel position on a blank form image, not type into a field.)
Setup:
> Qwen is the brain (main model), LocateAnything is the eyes (helper model acting as a tool).
> I gave Qwen a new tool: ask "where's the email field?" and LocateAnything returns the exact x, y, width, height.
> The blue boxes on the screen are its detections. Look how tight they are — it nails every field.
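A minimal sketch of what that tool round trip could look like. Everything here is hypothetical: the `Box` type, the function names, the padding, and the stand-in locator with made-up coordinates. It is not LocateAnything's or Qwen's actual interface, just the shape of "ask where, get x/y/width/height, place the value there":

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Box:
    """Bounding box in image pixels, as described in the setup: x, y, width, height."""
    x: int
    y: int
    width: int
    height: int


def text_origin(box: Box, padding: int = 4) -> tuple[int, int]:
    """Left-aligned, vertically centred point for writing a value inside a field."""
    return (box.x + padding, box.y + box.height // 2)


def fill_field(locate: Callable[[str], Box], query: str, value: str) -> dict:
    """One tool call: ask the locator where the field is, then pick where to draw the value."""
    box = locate(query)
    x, y = text_origin(box)
    return {"value": value, "x": x, "y": y}


def fake_locate(query: str) -> Box:
    """Stand-in for the helper model; returns fixed, made-up coordinates."""
    return Box(x=120, y=340, width=300, height=40)


if __name__ == "__main__":
    print(fill_field(fake_locate, "where's the email field?", "me@example.com"))
```

The main model never has to guess pixels. It only decides what to ask for and what to write.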
Result:
> Qwen3.6 35B A3B + LocateAnything-3B: form completed, all info correct.
> Name, DOB, ID, gender, marital status, nationality, email, phone, address, postal code: all landed in the right field areas.
> Character-box alignment still a touch loose, but every value is where it belongs.
> 9m10s, 224.5k input, 24.3k output, 21 turns.
Why it matters:
> Qwen alone can't finish this test. Bolt on a 3B model that does exactly one thing (locate) and suddenly it can.
> A combination of small models can do the work of a single large one.
In 2 years, "I can't code" will sound like "I can't use a computer."
The tools are here. The barrier is gone.
The only thing between you and building the thing in your head is whether you start.
Most people won't. That's the whole opportunity.
@ManavGarkel IMO best for e2e flows of business logic: how a user would interact with the app, and whether the code written leaves any gaps there.
There are also general test and code review agents in place to catch the bugs in between.
The AI coding loop nobody talks about:
Write code → looks right → ship it → bug in prod
The one that actually works:
Write code → prove it works → then ship it
I use Hermes Agent as a QA layer on top of my Claude Code sessions.
Caught ~40% of agent mistakes before they hit the repo.
Speed is meaningless if you're confidently shipping broken code.
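The "prove it works, then ship it" loop as a tiny gate. This is a generic sketch with hypothetical names, not how Hermes Agent or Claude Code work internally. Each check is just a callable that returns True or False:

```python
from typing import Callable, Iterable

Check = Callable[[], bool]


def ship_if_proven(checks: Iterable[tuple[str, Check]], ship: Callable[[], None]) -> list[str]:
    """Run every named check; call ship() only if none of them fail.

    Returns the names of the failed checks (empty list means it shipped).
    A check that raises counts as a failure instead of crashing the gate.
    """
    failures: list[str] = []
    for name, check in checks:
        try:
            passed = check()
        except Exception:
            passed = False
        if not passed:
            failures.append(name)
    if not failures:
        ship()
    return failures


if __name__ == "__main__":
    result = ship_if_proven(
        [("unit tests", lambda: True), ("e2e flow", lambda: False)],
        ship=lambda: print("shipped"),
    )
    print("blocked by:", result)
```

In practice the checks would wrap whatever you trust: a test runner, a QA agent's verdict, a manual sign-off.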
@Nanoswarm_net One of the ways to engage yourself is to go through a manual verification process.
Ask Claude/Codex to give you verification steps one by one. Then actually run through the code, UI or data. Verify and respond.
That's how you gain context and familiarity.