Still evolving the process but trying to reach the point where the following is consistent:
Use parallelism as a barometer.
If agents can't work independently, the problem is coupling in the architecture, not the agents. Conflicts and worktree surgery are symptoms.
One slice, one context, one agent.
Every agent task maps to a vertical slice inside a single bounded context, covering domain logic, API, UI, tests, and docs.
Cross-context edits are a smell.
When a task needs to touch files in several contexts, either the boundary is wrong or the task isn't parallelizable.
Slice vertically.
Cut work along small targeted features, not along controllers, services, and repositories.
Cross-context communication goes through contracts: events, interfaces, and APIs, nothing implicit. Each agent is confined to its slice and sees the contracts of its dependencies, not their implementations.
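A minimal sketch of what "contracts, not implementation" could look like in Python. The names here (`OrderPlaced`, `PaymentGateway`, `handle_order_placed`, `FakeGateway`, and the "ordering" and "billing" contexts) are hypothetical and only illustrate the idea:

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class OrderPlaced:
    """Event contract published by a hypothetical 'ordering' context."""
    order_id: str
    total_cents: int


class PaymentGateway(Protocol):
    """Interface contract exposed by a hypothetical 'billing' context."""

    def charge(self, order_id: str, amount_cents: int) -> bool: ...


def handle_order_placed(event: OrderPlaced, gateway: PaymentGateway) -> str:
    # The calling slice depends only on the contract, never on billing internals.
    if gateway.charge(event.order_id, event.total_cents):
        return "paid"
    return "payment_failed"


class FakeGateway:
    """Stand-in that satisfies the contract, e.g. for the slice's own tests."""

    def charge(self, order_id: str, amount_cents: int) -> bool:
        return amount_cents > 0
```

An agent working on the ordering slice would only need `OrderPlaced` and `PaymentGateway` in its context, not the billing code behind them.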
Don't obsess over DRY.
Repeated low-level code is cheaper than the coupling that comes from sharing it.
Verify the footprint.
A judge agent enforces the principles and rejects work that strays outside the declared slice.
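One mechanical part of the footprint check can be sketched without any model in the loop: compare the changed files against the declared slice. This is an assumption about how such a check might work, not a description of an actual judge agent. The function name `out_of_slice` and the directory layout are hypothetical, and paths are assumed to be normalized and repo-relative:

```python
from pathlib import PurePosixPath


def out_of_slice(changed_paths, slice_root, allowed_contracts=()):
    """Return the changed paths that fall outside the declared slice.

    Files listed in allowed_contracts (shared contract files the task
    may touch) are exempt from the check.
    """
    root = PurePosixPath(slice_root)
    allowed = {PurePosixPath(p) for p in allowed_contracts}
    violations = []
    for raw in changed_paths:
        path = PurePosixPath(raw)
        inside = path == root or root in path.parents
        if not inside and path not in allowed:
            violations.append(raw)
    return violations


changed = ["contexts/billing/api.py", "contexts/ordering/domain.py"]
# A non-empty result is grounds for rejecting the work.
print(out_of_slice(changed, "contexts/billing"))
```

The changed-file list could come from something like `git diff --name-only`; a judge agent would layer the less mechanical principles on top of a check like this.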
The more you code with AI, the more chaos and slop you get, and the more architecture you need.
You can't get great results out of coding agents without this skill.
My new video explains why.
@OpenAI @OpenAIDevs Seems to be good now. GLM 5.2 in @AmpCode worked well as a fallback. I plan to build this into my workflow, not just as redundancy but as a broader multi-harness, multi-model setup that routes by the shape and complexity of the task.
Codex is degraded right now: intermittent stream failures across both Codex Desktop and Pi agent, so it doesn't look harness-specific. Feels like an @OpenAI model/streaming or rate-limit issue. @OpenAIDevs
@MatthewBerman More an architectural issue than an agent problem. A decoupled, cohesive design doesn't require the whole codebase in memory, just the module under change and its contracts with adjacent dependencies.
@MatthewBerman Go trunk-based: branch by abstraction with feature flags. Optimize the architecture for decomposition so agent swarms can work in parallel with fewer conflicts. PRs are a handbrake in human teams, even more so for agents.
@TheRohanVarma Thread management and orchestration. I have a central orchestrator plus supervisors and workers for key tasks. Annotations are also useful for driving HTML dashboards generated by Codex, effectively replacing Markdown for planning, task coordination, and observability.
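A rough sketch of branch by abstraction behind a feature flag, so unfinished work can land on trunk without a long-lived branch. Everything here is hypothetical (`PriceCalculator`, `LegacyCalculator`, `NewCalculator`, the `new_pricing` flag, the 10% discount), and flags are passed as a plain dict rather than read from any real flag service:

```python
from typing import Protocol


class PriceCalculator(Protocol):
    """The abstraction both implementations sit behind."""

    def total(self, prices: list[int]) -> int: ...


class LegacyCalculator:
    def total(self, prices: list[int]) -> int:
        return sum(prices)


class NewCalculator:
    """In-progress replacement, merged to trunk but dark by default."""

    def total(self, prices: list[int]) -> int:
        # Hypothetical new rule: 10% discount, integer cents.
        return sum(prices) * 90 // 100


def make_calculator(flags: dict[str, bool]) -> PriceCalculator:
    # Callers depend only on the abstraction; the flag picks the implementation.
    if flags.get("new_pricing", False):
        return NewCalculator()
    return LegacyCalculator()


print(make_calculator({}).total([100, 200]))                     # 300
print(make_calculator({"new_pricing": True}).total([100, 200]))  # 270
```

Once the new path is proven, the flag and the legacy class can be deleted in a small follow-up change.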
Introducing Clips - 100% free, open source, agent-native alternative to Loom
Unlike Loom, agents can fully understand Clips just from a URL. Every Clip comes with APIs and metadata for agents to explore its contents.
Agents can "see and hear" anything in a Clip - not just transcripts, but everything visually in the video at any timestamp.
Easily share bug reports, feedback, analyses, or anything else in a way that you can easily pass to agents to use to improve products, reports, or more.
Also unlike Loom, you own the software, so no one can jack up prices on you suddenly like Loom did to us.
Clips is made to be customized. The built-in agent can customize its own code, so you can personalize the app to your needs and workflows.
This, in my opinion, is the future of software. Open-source, forkable, customizable with agents, to make your own personal version of anything.
You can also import Looms just from a URL and upload videos as well.
I got so sick of telling people "don't send me feedback as looms, I can't pass those to agents, I need text and images" that I had to just solve this once and for all.
There's a free hosted version you can use too, or you can fork and self-host. Will link to both in the replies.
@ArtificialAnlys Surprising, but maybe Fable's real test requires pushing the outer limits: groundbreaking innovation and open-ended invention on novel problems over time.
@JoeKingYou Same experience. @OpenAIDevs @OpenAI "YOLO" mode is basically severely degraded as a result. If intentional, additional options are required in the UI to cater for unattended or even HITL long-running loops.
@nurijanian https://t.co/JksNwMqpdA is awesome. The programmable CLI supports cross-agent thread management and orchestration, and custom pills/labelling are useful for keeping track of agents.
@levie Agreed, but AI has reduced friction when applying lean startup and MVP principles and data-driven product-market fit, rather than diving in and building out a full product no one wants.
@dakshgup 😂 thought leaders are useful for inspiration and advancing best practice but good engineers still question, experiment and think for themselves.
@plannotator @DanielGri @kirodotdev Nice :) Plannotator is a great tool for honing context. I'll play with the glimpse integration, but I generally generate an HTML visual explainer and have it open in a cmux side pane alongside an agent.
Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks.
On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.
@JoshJanssen @KitsoThato_ @ouraring Also considering the Fitbit Air. How's accuracy on sleep data compared to Oura? I know Whoop was significantly less accurate.
@unclebobmartin Agree, it's not about code speed or even quality. Incidental complexity, over-engineering, and poor design decisions that drift from best practices are often the biggest drags on productivity. AI helps mitigate them if used in the right way.