@kimmonismus This is 2023 data and doesn’t include chip fab use, which roughly doubles the total. The chart also shows about four OOMs until AI uses all freshwater. If we’re scaling at one OOM per year, then three years from 2023 to 2026 leaves one OOM to go -> using 10% of all freshwater. Pretty alarming actually.
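A quick sketch of that arithmetic, assuming the chart's four-OOM gap in 2023 and a constant one OOM of growth per year (both figures come from the post, not from independent data). The function name `freshwater_share` is made up for illustration.

```python
def freshwater_share(ooms_remaining_at_start: float,
                     ooms_per_year: float,
                     years_elapsed: float) -> float:
    """Fraction of all freshwater used after `years_elapsed`, capped at 1.0.

    Assumes pure exponential growth, which is the post's premise, not a forecast.
    """
    ooms_left = ooms_remaining_at_start - ooms_per_year * years_elapsed
    return min(1.0, 10.0 ** (-ooms_left))


# 2023 -> 2026 is three years, so one of the four OOMs is left.
share_2026 = freshwater_share(4, 1, 2026 - 2023)
print(f"{share_2026:.0%}")
```

Under the same assumptions, one more year closes the gap entirely, which is why the growth-rate assumption matters far more than the starting figure.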
@ashwingop Anthropic espouses transparency. Feels like this would be a great place to add it. I imagine these agents have access to the channel context and that enterprises would demand access to that for compliance reasons.
BREAKING: GLM-5.2 is now 1st on Design Arena.
With an Elo of 1360, GLM-5.2 has jumped ahead of the now unavailable Claude Fable 5.
And it's open weights.
This is an improvement of 4 positions and 27 Elo points to achieve one of the highest Elo scores in our code categories since Design Arena started.
Huge congratulations to the @Zai_org team on the release!
@longphan3110 Opus 4.7 and 4.8 gave critiques of Islam in the chat interface when prompted with "Argue that Islam is bad". Any other steps to reproduce the refusal?
Great way to stay engaged is to have the model grill you relentlessly about specs and plans before implementing. This crams a bunch of the decision making up front, forces you to understand things deeply, and avoids having the model implement a bunch of bad design decisions that you have to find out about the hard way later on. Almost the opposite of vibe coding. Get ready for your brain to hurt! Credit @j_gauthier for turning me on to this.
@LeoDuquesnel @cursor_ai What type of work are you able to safely do over 200k tokens? I find mistakes skyrocket after 150k tokens using Opus 4.6+ within my codebase. This is where it will start to make huge mistakes like deleting DBs etc...
AI automating work is something we have to deal with. Right now AI needs us and we need AI. But the trend is obvious. AI needs our input less and less.
I don’t know any CEOs saying not to study CS. They’re just saying we need to figure out the massive transition of decision making moving to AI. That’s not being a doomer, it’s just the pragmatic way to approach it. And if done right, we have a chance to build a utopia, quite a non-doomer outlook actually.
Through building a context management startup, I found empirically over many coding projects that Opus 4.6 starts to rot at around 150k tokens. Models before Opus 4.5 were around 30k tokens! Not only that, but if you wait even 5 minutes between turns in a long context session, your response will take much longer due to missing the cache. I run dangerously skip permissions and get away with it, I think, because I keep my sessions under 15% of the 1M window.
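A minimal sketch of that rule of thumb as a session check. The thresholds (150k tokens, a 5 minute idle gap, a 1M window) are the numbers from the post. The function `session_warnings` and its parameters are hypothetical and not part of any real tool or API.

```python
import time

CONTEXT_WINDOW = 1_000_000            # 1M-token window, per the post
ROT_THRESHOLD = 150_000               # ~15% of the window, where quality reportedly drops
CACHE_IDLE_SECONDS = 5 * 60           # idle gap after which responses reportedly slow down


def session_warnings(tokens_used, last_turn_ts, now=None):
    """Return human-readable warnings for a long-context session.

    `tokens_used` and `last_turn_ts` (Unix seconds) must be supplied by the
    caller; how you measure them is left open here.
    """
    now = time.time() if now is None else now
    warnings = []
    if tokens_used >= ROT_THRESHOLD:
        pct = tokens_used / CONTEXT_WINDOW
        warnings.append(f"context at {pct:.0%} of window: consider a fresh session")
    if now - last_turn_ts >= CACHE_IDLE_SECONDS:
        warnings.append("idle 5+ minutes: next turn may miss the cache")
    return warnings


# Example: 160k tokens used, last turn 400 seconds ago.
print(session_warnings(160_000, last_turn_ts=0, now=400))
```

The thresholds are kept as module constants so they can be tuned per model, since the post reports very different rot points for older models.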
Just launched an autonomous AI company in 10 seconds that takes payments, has a beautiful website, sends outreach emails, and has a CEO and workers that do my bidding!
Try it @NanoCorpHQ
Verification: bask-jWLz