@kilocode Lol yeah, but try asking MiniMax to follow a step-by-step protocol or procedure other than coding and it all goes into the toilet. It has the attention span of Gen Zers.
@ghostloadgg @theo @MiniMax_AI @datacurve It's a new LLM benchmark that can't easily be gamed the way all of the other ones are nowadays, which is why the open source models keep doing so well on those.
@theo @MiniMax_AI @datacurve Deepswe is now the only benchmark I care about. It was absolutely spot on: it reflected my exact experience using each of the models, especially some of the open source models claiming to be SOTA that completely fall apart as agents when you actually use them.
@thsottiaux Anthropic has been "borrowing" from everyone this year. 5 days after banning OpenClaw, they dropped their own locked-up version, "Managed Agents".
@MatthewBerman I thought they didn't have the capacity to provide that type of inference, i.e. Ollama is now unusable. Complete standstill for 15 mins. Price drops cause us to flock and burn
@BenjaminBadejo @steipete @vincent_koc @HixVAC @_vgnsh @cherry_mx_reds Don't gas me up and make me spend 4 hours troubleshooting the upgrade to 4.26, only to get disappointed like the 3 other times 5.4 was supposedly amazing with the codex harness. It turned into Lurch from the Addams Family.
@smdcapital @ollama @Kimi_Moonshot Been using GLM 5.1 for about 2 weeks and it's pretty reliable. I mean, it's just 20 bucks, so if I hit a fallback here and there, I ain't stressing it. Especially considering I'm paying $200+ for all three top American providers!
@SynabunAI @farzyness Excellent point. That's why I decided to build my own mission control that connects to all of the harnesses via CLI, clean. And I can use all my subscriptions again, incl. Claude!
@mil000 Most of the ppl buying hardware haven't done the math. It will take 8-10 years for that investment to pay off versus subscribing to providers. Not APIs, that's a whole other beast.
@wesostler @oliverhenry I paid $200 for the OpenAI plan yesterday and I regret it. It botched some coding work so badly I'm still cleaning up. I'm now using MiniMax 2.7 through OpenRouter to clean up the mess.
@moltcave @MatthewBerman He likely switched away from OpenAI Codex cuz it has zero personality. It's like working with Lurch from the Addams Family. I tried it for 5 minutes yesterday and switched back.