I've got an agent in a loop optimizing a renderer with the goal to minimize frame times (and tests to measure). It got times down from 88ms to 2ms and allocations down from ~150K to 500. Sounds good, right? Wrong. This is exactly why agent psychosis is a big fucking problem.
As an experiment, I rewrote the Ghostty core render state in Go, with access to identically laid out data structures as Ghostty and the exact same validation tests. I made a purposely naive renderer (simple, correct, but slow). 88ms per frame with 150,000 allocations (horrendous, lol)!
I then kickstarted a Ralph loop to bring the frame times down. I told it it can't modify input data structures or the public API or tests (they're correct), but it can do anything else it wants. It got to work.
It has worked for about 4 hours. I've spent around $350 on this experiment so far. The results?
88ms => 1.5ms
150K allocs => ~500 allocs
Incredible right? Nope.
My hand-written renderer I ported has frame times (same benchmark) of ~20us (0.020ms) and 0 allocations in the update path.
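A minimal Go sketch of the difference between an allocating update path and a zero-allocation one. Everything here (`Cell`, `renderNaive`, `Renderer`, `measureNaive`, `measureReused`) is a hypothetical stand-in, not the actual Ghostty port or its benchmark. It only shows how per-frame allocations can be counted with `testing.AllocsPerRun` and removed by reusing one preallocated buffer.

```go
package main

import (
	"fmt"
	"testing"
)

// Cell is a hypothetical stand-in for a terminal cell, not Ghostty's real type.
type Cell struct {
	Rune rune
}

// sink keeps the naive result reachable so the compiler cannot
// keep the slices on the stack and hide the allocations.
var sink [][]Cell

// renderNaive builds a fresh grid every frame: one allocation for the
// outer slice plus one per row.
func renderNaive(rows, cols int) [][]Cell {
	frame := make([][]Cell, rows)
	for y := range frame {
		frame[y] = make([]Cell, cols)
		for x := range frame[y] {
			frame[y][x] = Cell{Rune: 'a'}
		}
	}
	return frame
}

// Renderer owns one flat buffer that is allocated once, up front.
type Renderer struct {
	rows, cols int
	cells      []Cell
}

// NewRenderer performs the only allocation this renderer ever makes.
func NewRenderer(rows, cols int) *Renderer {
	return &Renderer{rows: rows, cols: cols, cells: make([]Cell, rows*cols)}
}

// Update overwrites the preallocated buffer in place.
func (r *Renderer) Update() {
	for i := range r.cells {
		r.cells[i] = Cell{Rune: 'a'}
	}
}

// measureNaive reports average heap allocations per naive frame.
func measureNaive(rows, cols int) float64 {
	return testing.AllocsPerRun(100, func() { sink = renderNaive(rows, cols) })
}

// measureReused reports average heap allocations per in-place update.
func measureReused(rows, cols int) float64 {
	r := NewRenderer(rows, cols) // allocated outside the measured function
	return testing.AllocsPerRun(100, func() { r.Update() })
}

func main() {
	fmt.Printf("naive:  %.0f allocs/frame\n", measureNaive(24, 80))
	fmt.Printf("reused: %.0f allocs/frame\n", measureReused(24, 80))
}
```

The design point is that the allocation moves to construction time, so the hot path only writes into memory it already owns.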
This is the problem with psychosis and lacking systems understanding. If you don't understand the system, you're going to accept that this is an incredible result. If you understand the system, you'll see better solutions immediately and can do roughly 75x better on throughput.
The people who blindly trust agent output are in the former camp. They're sheeple, overdrinking from a fountain of mediocrity.
Standard disclaimer: I use AI all the time. I like AI. The point I'm making is to not blindly accept results. Think. Analyze. Learn.
We’ve had on the order of 3 memory bugs in 6 years of TigerBeetle. None RCEs.
On the other hand, our own simulators have proactively found hundreds of (devastatingly catastrophic) distributed systems correctness bugs per year.
Given how hard TigerBeetle’s domain is, in terms of mission critical financial transaction processing, I’ve never for one minute believed that writing TB in a memory safe language such as say TypeScript would somehow magically (!) make any material impact compared to the 100x correctness multiplier of TigerStyle.
That’s because—rather than fall for the fallacy of composition, i.e. to see distributed correctness as a language problem—TigerStyle instead takes ultimate responsibility for the “end to end” correctness of the distributed system as a whole.
Per systems engineering, correctness is always a systems design problem. For example: how do you build a reliable whole, (especially) out of unreliable parts, such as broken firmware, bitrot, programmer error, etc.? In other words, it is an application of the end to end principle.
But when you TigerStyle the design in this way, the world of systems engineering also completely opens up to you and changes how you evaluate systems languages (now things like “power to grammar ratio”, or explicitness, checked arithmetic and precision become more critical and valuable to you).
Of course, it is harder to care about correctness, to take responsibility for correctness end to end.
Yes, you’re forced to begin to worry about the more serious concerns, starting with the basics of static allocation, explicit limits, assertions, deterministic simulation testing and moving to more advanced topics like protocol-awareness and storage fault-tolerance.
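To make the basics concrete, here is a small Go sketch of static allocation, an explicit limit, and assertions working together. It is an illustration under assumptions, not TigerBeetle code: `maxInflight`, `assert`, and `Queue` are hypothetical names invented for this example.

```go
package main

import "fmt"

// maxInflight is a hypothetical explicit limit, fixed at compile time.
const maxInflight = 8

// assert panics when an invariant is violated, so a programmer error
// stops the program instead of silently corrupting state.
func assert(cond bool, msg string) {
	if !cond {
		panic("assertion failed: " + msg)
	}
}

// Queue is a fixed-capacity ring buffer. The backing array is part of
// the struct, so Push and Pop never allocate and the queue never grows.
type Queue struct {
	items [maxInflight]uint64
	head  int
	count int
}

// Push reports false when the limit is reached instead of growing.
func (q *Queue) Push(v uint64) bool {
	if q.count == maxInflight {
		return false
	}
	tail := (q.head + q.count) % maxInflight
	q.items[tail] = v
	q.count++
	assert(q.count <= maxInflight, "count exceeds limit")
	return true
}

// Pop removes the oldest item, reporting false when the queue is empty.
func (q *Queue) Pop() (uint64, bool) {
	if q.count == 0 {
		return 0, false
	}
	v := q.items[q.head]
	q.head = (q.head + 1) % maxInflight
	q.count--
	assert(q.count >= 0, "count went negative")
	return v, true
}

func main() {
	var q Queue
	for i := uint64(0); i < 10; i++ {
		if !q.Push(i) {
			fmt.Println("queue full, rejecting", i)
		}
	}
	v, _ := q.Pop()
	fmt.Println("popped", v)
}
```

Because the limit is explicit, the caller has to decide what happens when the queue is full, rather than discovering unbounded growth in production.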
But then again, TigerStyle is such a force multiplier that you achieve mission critical quality in less time and with greater velocity.
If you’re tired of production issues, and if you want to “engineer your engineering”, I would encourage you to lift up your thinking to the level of systems design and end to end correctness.
Start thinking about your methodology and begin embracing TigerStyle.
https://t.co/fgtyNCO7SE
Liftoff.
The Artemis II mission launched from @NASAKennedy at 6:35pm ET (2235 UTC), propelling four astronauts on a journey around the Moon.
Artemis II will pave the way for future Moon landings, as well as the next giant leap — astronauts on Mars.
@philogy Agree, I've basically one-shotted a feature for Solidity that shows the stack allocation on a "stack too deep" error. Wouldn't dare to even try in such a codebase otherwise.
@ZoidCTF @cmuratori I've been trying to find a working distro with Steam for the last few days. Bazzite just gets stuck on the Steam logo. I'm confused, because I have a Framework Desktop as well.
Re Windows: proton overhead is not negligible, a few tens of FPS might be a game changer, pun intended.
@optimizoor My understanding is that the optimizer fails to remove the unreachable jump destination because of the loop layout. If we look at the assembly listing, we can see that the unreachable tag_10 is referenced by itself and tag_11 is referenced by tag_10, so both are kept.
@optimizoor I've played a little with this approach and realized the legacy pipeline fails to remove the dead code if there's ... wait for it ... a loop. If you try to compile the following contract with the legacy optimizer, you will see both branches kept even though one is unreachable.
"A calculator app? Anyone could make that."
Not true.
A calculator should show you the result of the mathematical expression you entered. That's much, much harder than it sounds.
What I'm about to tell you is the greatest calculator app development story ever told.
The validation setup is now up and running, all thanks to @nethermind and @dappnode! And one more @LidoFinance CSM operator is patiently waiting for a deposit 😉
42 years ago, Hilligoss Bakery in Brownsburg, Indiana, plugged in their Commodore 64s to use as cash registers, and they are still in use.
The POS system is written in BASIC and has never had any problems, hackers, or internet.
It will outlast all of us.