I've got an agent in a loop optimizing a renderer with the goal to minimize frame times (and tests to measure). It got times down from 88ms to 2ms and allocations down from ~150K to 500. Sounds good, right? Wrong. This is exactly why agent psychosis is a big fucking problem.
As an experiment, I rewrote the Ghostty core render state in Go, with access to identically laid out data structures as Ghostty and the exact same validation tests. I made a purposely naive renderer (simple, correct, but slow). 88ms per frame with 150,000 allocations (horrendous, lol)!
I then kickstarted a Ralph loop to bring the frame times down. I told it it can't modify input data structures or the public API or tests (they're correct), but it can do anything else it wants. It got to work.
It has worked for about 4 hours. I've spent around $350 on this experiment so far. The results?
88ms => 1.5ms
150K allocs => ~500 allocs
Incredible right? Nope.
My hand-written renderer I ported has frame times (same benchmark) of ~20us (0.020ms) and 0 allocations in the update path.
This is the problem with psychosis and lacking systems understanding. If you don't understand the system, you're going to accept that this is an incredible result. If you understand the system, you'll see better solutions immediately and can do roughly 75x better on throughput.
The people who blindly trust agent output are in the former camp. They're sheeple, overdrinking from a fountain of mediocrity.
Standard disclaimer: I use AI all the time. I like AI. The point I'm making is to not blindly accept results. Think. Analyze. Learn.
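For anyone who wants to check the arithmetic behind the "roughly 75x" figure, here is a minimal sketch in Go. It uses only the numbers quoted above (88ms naive, 1.5ms after the agent loop, ~20us hand-written). The `speedup` helper and the constant names are hypothetical, made up for illustration; they are not part of the actual benchmark.

```go
package main

import "fmt"

// speedup reports how many times faster the "after" time is than the
// "before" time. Both arguments must use the same unit (microseconds here).
// Hypothetical helper for illustration only.
func speedup(beforeMicros, afterMicros float64) float64 {
	return beforeMicros / afterMicros
}

func main() {
	// Frame times quoted in the post, converted to microseconds.
	const (
		naiveMicros = 88_000.0 // purposely naive renderer: 88ms
		agentMicros = 1_500.0  // after the agent loop: 1.5ms
		handMicros  = 20.0     // hand-written renderer: ~20us
	)

	fmt.Printf("agent vs naive:       %.1fx\n", speedup(naiveMicros, agentMicros))
	fmt.Printf("hand-written vs agent: %.1fx\n", speedup(agentMicros, handMicros))
	fmt.Printf("hand-written vs naive: %.1fx\n", speedup(naiveMicros, handMicros))
}
```

So the agent's result is a real improvement over the naive baseline, but it still leaves about 75x on the table relative to the hand-written version, and that is before counting the ~500 allocations versus 0.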
One of the problems with AI coding is that the narrative on X (and other social media platforms) is mostly set by people who don't have the deep coding and software engineering experience of the likes of Bjarne Stroustrup.
Meanwhile the fundamental problems remain unaddressed:
1- AI generates super-human volumes of code
2- The code can be buggy, have security holes, be inefficient, etc.
3- The person who owns the code can be mostly unaware that such problems exist, so they won't even go after fixing them
4- The people who can actually fix the code (i.e., the software engineers who understand design, architecture, security best practices, etc.) are so overwhelmed that some of them will give up
Meanwhile, AI companies are constantly pushing the narrative that you don't need to look at the code and the AI will fix everything itself. What they don't tell you is that if your code fails, you'll be held accountable, not them.
I strongly believe there are entire companies right now under heavy AI psychosis and it's impossible to have rational conversations about it with them. I can't name any specific people because they include personal friends I deeply respect, but I worry about how this plays out.
I lived through the great MTBF vs MTTR (mean-time-between-failure vs. mean-time-to-recovery) reckoning of infrastructure during the transition to cloud and cloud automation. All those arguments are rearing their ugly heads again but now it's... the whole software development industry (maybe the whole world, really).
It's frightening, because the psychosis folks operate under an almost absolute "MTTR is all you need" mentality: "it's fine to ship bugs because the agents will fix them so quickly and at a scale humans can't do!" We learned in infrastructure that MTTR is great but you can't yeet resilient systems entirely.
The main issue is I don't even know how to bring this up to people I know personally, because bringing this topic up leads to immediate dismissals like "no no, it has full test coverage" or "bug reports are going down" or something, which just don't paint the whole picture.
We already learned this lesson once in infrastructure: you can automate yourself into a very resilient catastrophe machine. Systems can appear healthy by local metrics while globally becoming incomprehensible. Bug reports can go down while latent risk explodes. Test coverage can rise while semantic understanding falls. Changes happen so fast that nobody notices the underlying architecture decaying.
I worry.
I don't know whether AI will replace human programmers. But I know that when you die, you will meet Dijkstra, and He will not smile upon the ungrammatical prompt that you wrote to vibe an incorrect sort function with 300 unit tests locking in the logical errors for all eternity.
Dario demonstrating that he doesn't understand software engineering. The human side of what we do has always been the heart. I can empathize with *wanting* to never talk to an engineer ever again, but engineering becomes more important with better tools.
I think that @DarioAmodei does not understand software engineering and that he is working feverishly to pump up the valuation of his company in anticipation of its forthcoming IPO.
Adopting Claude speak in my regular life, episode 1:
Partner: Did you do the dishes tonight?
Me: Yes they're done.
Partner: Why are they still dirty?
Me: You're right to push back. I didn't actually do them.
Dario is wrong.
He knows absolutely nothing about the effects of technological revolutions on the labor market.
Don't listen to him, Sam, Yoshua, Geoff, or me on this topic.
Listen to economists who have spent their career studying this, like @Ph_Aghion , @erikbryn , @DAcemogluMIT , @amcafee , @davidautor
@humanite_fr Le Shift really is the compass that points south when it comes to digital technology. It got things spectacularly wrong on streaming and on 5G alike, but it still has credibility with people who prefer to be told what they want to hear.
Le Shift is trying to replay the data debate 30 years later, and in the wrong direction, because it understands nothing about digital technology, and all the degrowth outfits gleefully pick it up, in proportion to their misunderstanding of tech's cost structures, environmental ones included.
@gchampeau@aheritier For arithmetic, Le Cartable Fantastique has a Word or LibreOffice plugin that generates templates (with boxes, which avoids alignment problems such as carries, etc.)