We got 74.4% on TerminalBench2 with Opus 4.6 simply by improving Terminus 2.
That's up from 62.9% on "Terminus 2 + Opus 4.6", making Opus 4.6 match "Simple Codex + GPT-5.3-Codex".
We'll share a short technical blog post + open-source the modified Terminus soon. Stay tuned.
@tessl_io "Viktor has volunteered (he doesn't know this yet) to package OpenSpec as a Tessl tile too, since he loves it so much. One registry, one install command, all your agent context."
Time to tell Viktor, I'll be happy to see it coming: https://t.co/l28qiAf3K9
Not sure how I feel about such features when Claude Code is running behind other terminal agents in terms of prompting/compression/UX (terminal bench). They lock us into Claude Code by banning users of Opencode or Droid, and what do they give us? This. How ironic can it be?
Claude Code 2.1.23 is out!
This update adds support for customizing the Spinner words, so you can choose what shows up while Claude is working
To enable it, add this to your settings.json:
"spinnerVerbs": {
  "mode": "replace",
  "verbs": [
    "Daniel Thinking",
    "Avila VibeCoding",
    "Daniel Architecting"
  ]
}
Check out my Avila VibeCoding spinner.
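If you'd rather not hand-edit the file, here is a minimal sketch of merging that key into an existing settings.json without clobbering whatever else is in there. The helper names (`with_spinner_verbs`, `update_settings_file`) are made up for illustration, and where your settings.json lives is up to your setup; only the "spinnerVerbs" shape comes from the announcement above.

```python
import json
from pathlib import Path


def with_spinner_verbs(settings: dict, verbs: list[str], mode: str = "replace") -> dict:
    """Return a copy of `settings` with "spinnerVerbs" set; other keys are untouched."""
    updated = dict(settings)
    updated["spinnerVerbs"] = {"mode": mode, "verbs": list(verbs)}
    return updated


def update_settings_file(path: Path, verbs: list[str]) -> None:
    """Read the JSON file if it exists, merge the spinner config, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    merged = with_spinner_verbs(current, verbs)
    path.write_text(json.dumps(merged, indent=2) + "\n")
```

Usage would look like `update_settings_file(Path("settings.json"), ["Daniel Thinking", "Avila VibeCoding"])`, with the path pointed at your actual settings file.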
@antigma_labs Interested to see how you rank with Opus. Terminal Bench is a good reference, but in the end, for complex architecture, Opus is still king. I'm sure many of us are only interested if it's not tuned towards models that only score well on the smaller Terminal Bench tests.
@thdxr Skills are unreliable in every tool I've tested. I'd greatly appreciate a terminal tool that would say "we don't do skills" instead of adding bells and whistles that simply don't work and only make you lose time. If skills need to be tagged, they are commands.
These "it does everything for me automagically" posts are getting tiring. No it doesn't.
Worktrees are the sane way to use agents, yes.
Planning is the most important step now (like before, but obvious now).
Not touching terminals/editors or not reviewing => failure later on.
So @AnthropicAI is now randomly banning accounts? Just started getting "This organization has been disabled", without any warning or explanation. Payment is of course processed. Support is 100% useless.
Dear @NeuralDSP, I love your products, but if you require cloud usage in the chain, fix your UI!
- Clicking one download button results in these calls, no wonder you can't handle load.
- I bought amp captures, can't even get an overview as the continuous scroll does not work.