Cosmos Research Preview is here!
It is a new way to work with AI agents: as teammates with context and memory that take on larger roles, respond to events, and collaborate with you and others on the team.
100% right that the era of typing code is ending. But I'd push further: the era of single-agent prompting will be dead just as quickly as it arrived.
One agent on one task for 30 minutes is level 1. Five agents working a shared spec in parallel with verification gates is level 2.
Internally, we are already seeing glimpses of level 3.
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.
Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.
As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor the way things were done since computers were invented; that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
Laughing through tears... 😅
What’s coming to software development in 2026 is going to be disruptive and, in many ways, a lot to process.
The shape of the craft I’ve spent 20 years refining is changing in the blink of an eye.
Many of the skills I built over the years are becoming less valuable, or fading fast.
Example: I can spend 100 hours debugging a problem and push through long after most people would give up. But agents can now bring even more persistence to that kind of work, and often get to the answer too.
Another example: I can drop into a complex codebase and figure it out quickly. But agents can now do a lot of that too, or at least make it much easier for a human to do.
At the same time, some skills are becoming more important than ever: seeing the big picture, understanding the details, driving toward outcomes, switching contexts quickly, and acting as a bridge between human teams and agents.
this might be the single best timed called shot in the history of AIE. I now think about this talk almost daily and have direct line of sight to the nonstop onslaught of new post-IDE form factors for agentic engineering.
@RealGeneKim and @Steve_Yegge really nailed this one, no notes. and they called out one of the predominant shifts in 2026 coding, in Nov 2025. i'm still shocked how -I- myself have changed opinion this dramatically in the last 3 months, because unlike appearances I'm actually not an early adopter of things personally (although I serve early adopters for a living). So by the time -I, a perennially left-of-mid-bell-curve person- have come around to the idea, then you really, really know it's here.
well: it's here. @Wattenberger just showed me what she and the Augment team have been cooking and yeah, this is the "ADE" or whatever three letter acronym you wanna call it. Cursor 2.0 was a toe dip. Claude folded it into their chat app. Codex formalized the Conductor patterns. Amazon Kiro went hard on Spec Driven Dev. but Intent... this feels like every good idea i've heard in code agent management rolled into one app that, very generously, does not lock you into only using Augment's inhouse coding agent.
I'm in awe at all these smart people I get to talk to because yeah the future of how software is made is happening right in front of my eyes and people will ask us what it was like during this golden age for the rest of our lives.
Intent is our vision for what comes after the IDE.
AI has changed how we build software. But it’s also made our workflows messier.
One agent is great. Two work. Past that, things fall apart fast. Prompts go stale, context lives everywhere, and you end up spending more time on the tedious work of orchestrating agents.
The bottleneck isn’t writing code anymore. It’s keeping the agents aligned.
That’s why we built Intent.
Do humans still need to understand the code that AI is producing?
For projects below a certain threshold of complexity, you can Ralph-Wiggum it: have agents generate code that's only checked against automated or AI-verified success criteria.
Above that threshold of complexity, a human engineer needs to deeply understand every single change or else the project devolves into unmaintainable piles of AI slop.
There's an AI coding approach that lets you run seriously long-running AI agents (hours, days) that ship code while you sleep. I've tried it, and I'm not going back.
It's the Ralph Wiggum approach. Here it is in a nutshell:
Run a coding agent with a clean slate, again and again until a stop condition is met.
The Bash Script
Running Ralph involves a single bash script.
1. Set up a bash for loop that runs a set number of times, let's say 10. You choose a finite number to prevent the agent running infinitely.
2. Inside the loop, you get the coding agent (Claude Code, OpenCode, Codex etc) to work on a single feature in the repo until that single feature is done. You prompt it with: "If, after implementing, there is no further work to be done, reply with <promise>COMPLETE</promise>."
3. Check for <promise>COMPLETE</promise> inside the loop and exit early if it exists.
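The three steps above can be sketched roughly as follows. The post describes a bash script; this is a Python rendering of the same loop logic, not the author's actual script. The names `ralph_loop`, `run_agent_cli` and `COMPLETE_MARKER` are made up for illustration, and passing the prompt on stdin is an assumption: check how your agent's CLI accepts a non-interactive prompt.

```python
import subprocess
from typing import Callable, Sequence, Tuple

# The marker the agent is prompted to emit when no work remains.
COMPLETE_MARKER = "<promise>COMPLETE</promise>"


def run_agent_cli(command: Sequence[str], prompt: str) -> str:
    """Run a coding-agent CLI once and return its stdout.

    `command` is a placeholder for your agent's non-interactive invocation.
    Sending the prompt on stdin is an assumption, not a documented interface.
    """
    result = subprocess.run(command, input=prompt, capture_output=True, text=True)
    return result.stdout


def ralph_loop(
    run_once: Callable[[str], str],
    prompt: str,
    max_iterations: int = 10,
) -> Tuple[int, bool]:
    """Run the agent repeatedly with the same prompt.

    Returns (iterations_run, completed). The finite cap keeps the agent
    from running forever; the marker check exits early.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_once(prompt)  # each call is a fresh agent session
        if COMPLETE_MARKER in output:
            return iteration, True
    return max_iterations, False
```

Taking `run_once` as a callable keeps the loop independent of any particular agent: wrap `run_agent_cli` with your own command, or pass a stub when testing the loop itself.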
The Stop Condition
How does the LLM know when to emit <promise>COMPLETE</promise>? There are multiple approaches:
- Raw Prompting: Just pass a very clear stop condition to the prompt inside the bash loop. "The job is complete when X conditions are met"
- TODO list: Give the agent a TODO list to complete
- PRD: My preferred approach, explained below
Progress Reports
In your prompt, you MUST tell the agent to commit its work, and append its progress to a local progress.txt file.
Committing its work allows future agents to navigate what was done via the git history.
The progress.txt file is a standard long-running agent practice. Feed it into the agent via the prompt, and use the verb 'append' to make sure it doesn't update previous entries.
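A minimal sketch of that progress.txt handling, assuming the harness builds the prompt in Python. The helper names `build_prompt` and `append_progress` and the timestamp format are illustrative choices, not part of the original setup. In the approach described, the agent itself does the appending; `append_progress` only shows what append-only writing looks like.

```python
from datetime import datetime, timezone
from pathlib import Path


def append_progress(path: Path, note: str) -> None:
    """Add one entry to the end of the progress file without touching older ones."""
    timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with path.open("a", encoding="utf-8") as f:  # "a" = append mode
        f.write(f"[{timestamp}] {note}\n")


def build_prompt(base_prompt: str, path: Path) -> str:
    """Feed the current progress file into the prompt for the next agent run."""
    if path.exists():
        progress = path.read_text(encoding="utf-8")
    else:
        progress = "(no progress yet)"
    return (
        f"{base_prompt}\n\n"
        f"Progress so far:\n{progress}\n\n"
        f"Commit your work, then append your progress to {path.name}. "
        "Do not edit earlier entries."
    )
```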
Keep CI Green
Each commit MUST pass all tests and types. This means you have to prompt the agent to run typechecks and tests on each commit.
If you don't do this, you're hamstringing future agent runs with bad code, and they'll need to bisect to find bugs. Super nasty.
This means that building really healthy feedback loops is CRITICAL to Ralph's success.
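The post relies on prompting the agent to run the checks. One way to back that up from the harness side (an assumption beyond what is described here) is a small gate that runs your typecheck and test commands and reports whether all of them passed. The function name `checks_pass` is hypothetical, and the commands in the usage note are placeholders for whatever your project uses.

```python
import subprocess
from typing import Sequence


def checks_pass(commands: Sequence[Sequence[str]]) -> bool:
    """Run each check command in order; stop and return False at the first failure."""
    for command in commands:
        # A non-zero exit code means the check (typecheck, tests, ...) failed.
        if subprocess.run(command).returncode != 0:
            return False
    return True
```

For example, `checks_pass([["npm", "run", "typecheck"], ["npm", "test"]])` could run between iterations, assuming your project defines those scripts.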
The PRD
Two problems immediately emerge with Ralph.
The first is that the agent picks tasks which are too large. It doesn't scope the amount of work correctly and tries things which are too ambitious. This means it runs out of context window and just ends up failing.
The second is that the agent doesn't know when to stop.
To solve this, I use a PRD-based approach that formalizes the work I want Ralph to complete in a list of user stories.
It's a mix of a PRD and a TODO-list, where the PRD is a JSON file of user stories with 'passes: false'.
I then prompt the agent to pick the highest priority feature, and ONLY work on that feature. It then updates the passing status of the relevant PRD item at the end.
This scopes it down extremely effectively, meaning it utilizes only a small part of its context window, and thus isn't swamped by context rot.
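To make the PRD idea concrete, here is a rough sketch of what such a file and its selection logic might look like. Only the `passes: false` flag comes from the description above. The `id`, `priority` and `story` fields, the convention that a lower number means higher priority, and the helper names are all assumptions for illustration.

```python
import json
from typing import Optional

# Hypothetical PRD contents; in practice this would live in a JSON file in the repo.
EXAMPLE_PRD = """
[
  {"id": "story-1", "priority": 2, "story": "User can reset their password", "passes": false},
  {"id": "story-2", "priority": 1, "story": "User can log in with email", "passes": false},
  {"id": "story-3", "priority": 3, "story": "User can log out", "passes": true}
]
"""


def next_story(stories: list) -> Optional[dict]:
    """Return the highest-priority story that does not pass yet, or None if all pass."""
    pending = [s for s in stories if not s["passes"]]
    if not pending:
        return None
    # Assumption: priority 1 is the most urgent.
    return min(pending, key=lambda s: s["priority"])


def mark_passing(stories: list, story_id: str) -> None:
    """Flip the 'passes' flag for one story once its work is verified."""
    for s in stories:
        if s["id"] == story_id:
            s["passes"] = True


stories = json.loads(EXAMPLE_PRD)
```

When `next_story` returns `None`, every story passes, which gives the agent a clear point at which to emit the completion marker.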
Summary
- Bash script
- JSON-based PRD
- progress.txt
- Keep CI green
- Feedback loops
You'll have an AI coding setup that can ship while you kip.
@7Psychon 1. Quality: model, prompt and tools specifically selected and tuned for code review
2. Analytics: metrics and dashboard
3. UX: Review in GitHub rather than in IDE
Software leaders often ask me: "Now that we've adopted AI for coding, where should we deploy AI next?"
My answer is: Code Review AI. Every software team today should be using a great code review AI product.
Most teams have something that was set up 6-12 months ago or don't have a code review AI at all.
Today, @augmentcode launched the best Code Review AI on the market. With our Context Engine, our Code Review AI understands your complex code base better than any other solution on the market.
Free for all paid plans for a week! (You need at least an Indie plan to try it.)
Users love our Context Engine. Last week, we made it available to all agents as an MCP server.
Today, we're sharing our 🚀Context Engine SDK🚀. What we do with context, you can do too.
You can build agents and tools that retrieve from codebases, docs, configs (and more)!
📚 Quickstart: https://t.co/75VLgSMoHt
💡 Examples: https://t.co/QmwDOtvISe
More to come in the next couple of weeks!
Here's what you can build with it 🧵