I was wrong
I've been saying for months that open source AI models are 6 months behind frontier
They caught up. GLM 5.2 is as good as Opus 4.8
This changes everything. If you run GLM 5.2 locally, no government can take it away. You become sovereign
And even if you run it through APIs, it's a fraction of the cost
The battlefield is different now. If open source is as good as frontier, and people have cheaper alternatives, governments can't be as quick to regulate. It will destroy the frontier AI labs
All of this is such a massive win for the people
If you are not paying attention to local models yet, you are making a tremendous mistake
@NetworkChuck Can you say more? What do local models need more focus on: providing frontier-level reasoning, being lighter to run on various hardware, or something else?
We've updated the Artificial Analysis Coding Agent Index, replacing SWE-Bench Pro with Datacurve's DeepSWE benchmark. The swap lifts Codex with GPT-5.5 (xhigh) above Claude Code with Opus 4.8 (max), while the newly released Claude Fable 5 (max) in Claude Code debuts at the top.
DeepSWE, built by @datacurve, writes its tasks from scratch rather than adapting them from public GitHub issues or pull requests, so no model has seen the solutions during training. That matters because SWE-Bench Pro, the benchmark it replaces in our Coding Agent Index, had grown gameable, with some models recovering the fix from the repository's commit history instead of solving the task.
The swap reorders the index: Codex with GPT-5.5 (xhigh) rises from 65 to 76, overtaking Claude Code with Opus 4.8 (max) at 73. Claude Code with Fable 5 (max), which enters directly on the refreshed index, leads at 77. SWE-Bench Pro had been flattering some combinations and penalizing others.
More below.