The problem with the "if it works who cares what the code looks like" mindset for agentic work is that it assumes the agent has a perfect understanding of "works." Realistically, things are underspecified, agents make bad assumptions, etc.
To be fair, agents are pretty good at unit test coverage. They're pretty bad at designing human experiences (API, CLI flags, etc.), especially cohesive ones for future roadmap plans they may not have visibility into (unless your backlog is perfect and vision fully laid out, which I doubt). They're bad at knowing where performance matters and what kind (CPU vs memory tradeoffs). They're bad at knowing where compatibility matters and where it doesn't (and tend to err on the side of preserving it without further guidance). Etc.
Unless you have this ALL specified, you can't possibly claim "it works" without taking a look and thinking about it.
I'm finally reading Dune. This quote, which is in the first few pages, hits hard:
"Once men turned their thinking over to machines in the hope that this would set them free. But that only permitted other men with machines to enslave them."
I've got an agent in a loop optimizing a renderer with the goal to minimize frame times (and tests to measure). It got times down from 88ms to 2ms and allocations down from ~150K to 500. Sounds good, right? Wrong. This is exactly why agent psychosis is a big fucking problem.
As an experiment, I rewrote the Ghostty core render state in Go, with access to identically laid out data structures as Ghostty and the exact same validation tests. I made a purposely naive renderer (simple, correct, but slow). 88ms per frame with 150,000 allocations (horrendous, lol)!
I then kickstarted a Ralph loop to bring the frame times down. I told it it can't modify input data structures or the public API or tests (they're correct), but it can do anything else it wants. It got to work.
It has worked for about 4 hours. I've spent around $350 on this experiment so far. The results?
88ms => 1.5ms
150K allocs => ~500 allocs
Incredible, right? Nope.
The hand-written renderer I ported has frame times (same benchmark) of ~20us (0.020ms) and 0 allocations in the update path.
This is the problem with psychosis and lacking systems understanding. If you don't understand the system, you're going to accept that this is an incredible result. If you understand the system, you'll see better solutions immediately and can do roughly 75x better on throughput (1.5ms vs ~20us per frame).
The people who blindly trust agent output are in the former camp. They're sheeple, overdrinking from a fountain of mediocrity.
Standard disclaimer: I use AI all the time. I like AI. The point I'm making is to not blindly accept results. Think. Analyze. Learn.
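To make "0 allocations in the update path" concrete, here is a minimal Go sketch of the general idea: a naive per-frame update that builds fresh slices every time, next to one that reuses a buffer owned by the renderer. This is not the renderer from the experiment and not Ghostty's code. `Cell`, `Screen`, `Vertex`, `Renderer`, and `measureAllocs` are hypothetical names invented for illustration, and the "update" here is a trivial copy. The only real API used is `testing.AllocsPerRun` from the standard library.

```go
package main

import (
	"fmt"
	"testing"
)

// Cell, Screen, Vertex, and Renderer are hypothetical stand-ins for
// illustration only; they are not Ghostty's data structures.
type Cell struct {
	Rune   rune
	FG, BG uint32
}

type Screen struct {
	Rows, Cols int
	Cells      []Cell // row-major, len == Rows*Cols
}

type Vertex struct {
	X, Y   uint16
	Rune   rune
	FG, BG uint32
}

func newScreen(rows, cols int) *Screen {
	return &Screen{Rows: rows, Cols: cols, Cells: make([]Cell, rows*cols)}
}

// Package-level sinks keep results alive so the compiler cannot
// optimize the work (or the allocations) away.
var (
	sinkRows [][]Vertex
	sinkFlat []Vertex
)

// naiveUpdate is simple and correct but allocates every frame:
// one outer slice plus one slice per row.
func naiveUpdate(s *Screen) [][]Vertex {
	out := make([][]Vertex, 0, s.Rows)
	for y := 0; y < s.Rows; y++ {
		row := make([]Vertex, 0, s.Cols)
		for x := 0; x < s.Cols; x++ {
			c := s.Cells[y*s.Cols+x]
			row = append(row, Vertex{X: uint16(x), Y: uint16(y), Rune: c.Rune, FG: c.FG, BG: c.BG})
		}
		out = append(out, row)
	}
	return out
}

// Renderer owns one flat buffer that is reused across frames.
type Renderer struct {
	verts []Vertex
}

// Update allocates only when the buffer is too small (first frame or
// a resize). In steady state it overwrites the existing buffer.
func (r *Renderer) Update(s *Screen) []Vertex {
	n := s.Rows * s.Cols
	if cap(r.verts) < n {
		r.verts = make([]Vertex, n)
	}
	r.verts = r.verts[:n]
	for i := 0; i < n; i++ {
		c := s.Cells[i]
		r.verts[i] = Vertex{X: uint16(i % s.Cols), Y: uint16(i / s.Cols), Rune: c.Rune, FG: c.FG, BG: c.BG}
	}
	return r.verts
}

// measureAllocs reports average heap allocations per call for both paths.
// testing.AllocsPerRun does one warm-up call before it starts counting,
// so the Renderer's initial buffer allocation is not included.
func measureAllocs(s *Screen) (naive, reused float64) {
	r := &Renderer{}
	naive = testing.AllocsPerRun(100, func() { sinkRows = naiveUpdate(s) })
	reused = testing.AllocsPerRun(100, func() { sinkFlat = r.Update(s) })
	return naive, reused
}

func main() {
	naive, reused := measureAllocs(newScreen(24, 80))
	fmt.Printf("naive:  %.0f allocs/frame\n", naive)
	fmt.Printf("reused: %.0f allocs/frame\n", reused)
}
```

In a real project you would more likely write these as `Benchmark` functions and run `go test -bench . -benchmem` to get time and allocations per operation side by side. The point of the sketch is only that the allocation count depends on who owns the memory, which is a design decision you have to understand the system to make.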
been working on an immediate mode ui + animation system. I think I have a "basic" implementation that allows for decently complex animations.
1. hand crafted the ImUI layout system
2. had AI do some research and report back on how animations should work
3. threw out all of the AI work and redesigned it
4. had the AI attempt to implement it to see how it would look
5. About to throw that out and hand craft that SOB into something beautiful
I talked about this on the standup podcast yesterday, but I'll reiterate here: if you're losing sleep because you need to keep feeding the agents, STOP. I promise it's not worth it. You got caught in a [prompt -> reward] dopamine cycle and you're addicted to the feeling of the token slot machine. It's not your fault, but you need to escape before it grinds you into a pulp and you can't look at a computer for a month (this was me). If you can break out of it and spend some more time offline, or find other healthy sources of dopamine in hobbies/etc, you'll start to realize just how warped your perception was and that the thing you were chasing wasn't actually productive.
@ilyamiskov Thank you very much. It's this kind of thinking that's taking us to such a shitty looking boring-ass world. Just stop it already, let people experiment and have fun!