i do agree that he should get banned on faceit etc., he is way too young for stuff like this.
however i've also seen many comments now saying that 7yo is too young in general to play video games, and idk about that one, literally anyone ik including myself started playing cs, gta etc. very young.
Opus 4.8 Is Out!
idk if im tripping but it works way faster than 4.7 does even tho the TPS is the same.
it also thinks way faster than 4.7 does on the same effort level.
in case you ever hit guardrails -> delete the entire chat and just create a new one, trust.
i tried correcting and re-phrasing it many times, just lost usage for nothing, it kept re-triggering
CumBench v1.0 results are in.
Gemini 3.5 Flash ranks #1 on the CumBench benchmark, outperforming much larger models a whole size above it in real-world finish quality.
The gap is honestly staggering.
Gemini 3.5 Flash is one of the worst releases I've ever seen.
To summarize the problems:
- The 4x price increase compared to earlier Flash models is insane and destroys the purpose of this model being a cheaper, less intelligent, and faster version. This is especially important for people like me using it in services through the API.
- The model tries to be "Flash, but with Pro intelligence," while in reality, this does not really work. Due to the nature of a fast reasoning model, it will always be much less capable and might only perform well in benchmarks due to benchmark-maxxing.
- On top of that, the model is more token-hungry. So even if another model is more expensive per million input/output tokens, 3.5 Flash can still end up being more expensive because it burns way more tokens for similar tasks.
- Google has made itself very untrustworthy here (just like xAI did when they removed their fast Grok 4.1 versions without a proper replacement). You cannot reliably build around their models anymore if the next iteration might suddenly cost 4x more. They basically destroyed this model for my use case, so I now have to look for a better alternative, at the latest when they drop the older models.
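To make the token-hungry point concrete, here is a rough sketch of the math. All prices and token counts below are made-up numbers for illustration only, not real pricing for any model, and `request_cost` is just a hypothetical helper:

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in dollars for one request, given prices per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical model A: cheaper per token, but burns far more output tokens.
cheap_but_hungry = request_cost(2_000, 6_000, input_price_per_m=1.0, output_price_per_m=4.0)

# Hypothetical model B: twice the per-token price, but much leaner output.
pricey_but_lean = request_cost(2_000, 1_500, input_price_per_m=2.0, output_price_per_m=8.0)

# With these assumed numbers, A ends up costing more per request than B.
print(cheap_but_hungry > pricey_but_lean)
```

So the per-million price alone does not tell you what a task costs; you have to multiply it by how many tokens the model actually uses for that task.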
Why is this so hard?
A Pro model should be slower and more expensive, but provide what it promises: high-quality responses.
A Flash model should balance cost and intelligence. (It is mostly companies and private projects using it inside their services, where they do not need a Pro reasoning model anyway.)
And Flash-Lite should be for very simple, repetitive, low-quality tasks.
i used claude code in multiple 200k+ line projects, for 5-8hrs / day (without a day off) on the 5x plan, and use like 70-80% of the weekly usage.
i honestly dont know what people do to run out of usage so fast, running 10 opus agents in parallel?
also, compared to claude, codex gives me roughly 30% more usage (on a tier-to-tier comparison, excluding events etc.)
as someone already said here in the replies, it's prob a combination of sponsors and just trying to not cause any bad attention in any way.
but honestly, flags are kind of a part of most competitions, so i personally would never feel triggered or offended by seeing any flag, even from the most horrible countries in the world.
cuz after all, it's just colors representing someone's origin nation, and the players themselves mostly dont give a shit about that anyway.