how to be good at your job
- realize this one thing is actually made up of two separate things
- realize instead of solving the direct problem you can solve a broader problem
- instead of implementing thing, implement other thing that makes it easier to implement thing
@deepfates wtf. mine gave me "Cesare, from Pavese's The Moon and the Bonfires.", when I asked who's that, it told me "Sorry, actually Cesare is the author of the book"
> *looks into Claude's thinking*
Not a single mention of either the book or the author.
I've got an agent in a loop optimizing a renderer with the goal of minimizing frame times (and tests to measure them). It got times down from 88ms to 2ms and allocations down from ~150K to 500. Sounds good, right? Wrong. This is exactly why agent psychosis is a big fucking problem.
As an experiment, I rewrote the Ghostty core render state in Go, with access to data structures laid out identically to Ghostty's and the exact same validation tests. I made a purposely naive renderer (simple, correct, but slow). 88ms per frame with 150,000 allocations (horrendous, lol)!
I then kickstarted a Ralph loop to bring the frame times down. I told it it can't modify input data structures or the public API or tests (they're correct), but it can do anything else it wants. It got to work.
It has worked for about 4 hours. I've spent around $350 on this experiment so far. The results?
88ms => 1.5ms
150K allocs => ~500 allocs
Incredible right? Nope.
The hand-written renderer I ported has frame times (same benchmark) of ~20us (0.020ms) and 0 allocations in the update path.
This is the problem with psychosis and lacking systems understanding. If you don't understand the system, you're going to accept that this is an incredible result. If you understand the system, you'll see better solutions immediately and can do roughly 75x better on throughput.
The people who blindly trust agent output are in the former camp. They're sheeple, overdrinking from a fountain of mediocrity.
Standard disclaimer: I use AI all the time. I like AI. The point I'm making is to not blindly accept results. Think. Analyze. Learn.
@elder_plinius Yes, the same tactics work again and again; sometimes you just need to craft a compelling enough setting to make the request actually valid.
And it's really hard to protect against such jailbreaks. Like, how would you even do it without lobotomizing the model?
@uncreativetom Hmm, I am not convinced by the drop resistance scores. I've been carrying my iPad Pro 13 without a case, and it has had a lot of crazy falls over the last two years. It has very few scratches.
@rtfeldman When I got confident enough in English to start reading books in it, I wondered what happened to the spaces around em dashes — and who these psychopaths were who left them out. 🙃
@josevalim @chris_mccord @antirez This is genuinely interesting! I trust your judgment here,
but conversationally the ChatGPTs feel impossible. 4h debugging a Pi Pico with one, and ughh, it kept circling and ignoring what I'd shared.
While Claude found the solution in half an hour!