I was able to remove 90% of my deterministic guardrails, all my pre-loop classifiers, and cut my system prompt by ~30% by moving from GPT 5.4 -> 5.5. (tbf it had built up a lot of cruft that 5.4 probably didn't need either)
5.5 did the refactor itself over ~8 hours total (never had anything run this long before! 3-4 sessions, mostly unattended), then it took me 2-3 hours to review and clean up. Would've taken me ~2 weeks on my own, probably.
Every time I catch myself reaching for clever system prompts + deterministic guardrails (old-world thinking) and instead rethink the problem as a few simple primitives, my little agent loops get like 2-3x more capable and 90% less brittle.
Also: evals. Evals evals evals.
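For anyone wondering what "evals" can look like at the smallest scale, here's a rough sketch, not my actual setup. The idea is just to have a fixed set of cases you can rerun before and after ripping out a guardrail, so you know whether the pass rate moved. Everything named here (`EvalCase`, `run_evals`, `toy_agent`) is hypothetical and made up for illustration; `toy_agent` is a stand-in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One eval: a prompt plus a check on the agent's output (hypothetical shape)."""
    prompt: str
    check: Callable[[str], bool]  # returns True if the output is acceptable


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through the agent and return the pass rate in [0, 1]."""
    passed = 0
    for case in cases:
        try:
            ok = case.check(agent(case.prompt))
        except Exception:
            ok = False  # a crash in the agent or the check counts as a failure
        passed += int(ok)
    return passed / len(cases) if cases else 0.0


def toy_agent(prompt: str) -> str:
    """Stand-in for a real agent loop; it just upper-cases the prompt."""
    return prompt.upper()


CASES = [
    EvalCase("hello", lambda out: out == "HELLO"),
    EvalCase("abc", lambda out: "B" in out),
    EvalCase("xyz", lambda out: out == "xyz"),  # deliberately fails
]

PASS_RATE = run_evals(toy_agent, CASES)
print(f"pass rate: {PASS_RATE:.0%}")
```

Swap `toy_agent` for the real loop, run it once with the guardrails in and once with them out, and compare the two numbers. If the pass rate holds, the guardrail probably wasn't earning its keep.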