@TheZvi The fix is a classifier that stops Mythos from fixing any bugs in software code.
That is a problem when your main business is software coding agents.
@rikkarth Depends a lot on how hard the project is. I was working on a 3D renderer and I went from Fable getting 8/10 prompts right to Opus 1/10 right.
I've switching to using ultra code with 40+ agents per prompt but Opus is still having a lot of trouble.....
@spicey_lemonade There is just no way for a classifier approach to 100% tell if the user is asking for a bug fixed or asking to identify a bug for purposes of exploit.
@Youssofal_ I've gone to doing a ultra code workflow at almost every prompt in the conversation to allow Opus 4.8 to continue work on the project I was working on with Fable.
Where as Fable could go correction free, Opus "thinking" is generally wrong and requires a 30-50 agent workflow.
@kimmonismus The jailbreak is essentially "find bugs in my codebase". It you disable that ability via classifier then you prevent the use of Fable for coding.
Is a real tension because cyber vulnerabilities are bugs and fixing bugs is a huge part of what coding agents do.
@ShanuMathew93 I give Claude a loop with /goal and the goal is to come back with a passing grade from Chat GPT 5.5.
Claude calls Chat GPT over and over with his attempts until he passes, and the /goal makes sure that he can't cheat or stop early.
@morganlinton It's especially helpful cross model but even cross session with same model the agent / critic is extremely underrated.
The issue is models like their own work just like humans do. You need a fresh session that didn't do the work to evaluate.
@morganlinton Would suggest Claude Code terminal and /goal which will have another agent assess it and loop.
I often have Chat GPT 5.5 as a critic of Opus 4.8 work (Opus is smart enough to code this up), and then the 4.8 goalkeeper task is to keep going till Chat GPT approves.
@Gavriel_Cohen@swyx@Barazany Ya..I found it exploring the source too. They have put an enormous effort into maximizing context caching and there is a lot of editing going on to make it happen.
@saurabh_shah2 How aggressively a harness works to minimize token consumption is going to be a major factor.
The is an enormous amount of context editing going on.
For example, the results of tool calls are generally not available to models on subsequent turns and can have a big impact.
@kimmonismus It is only that the release cycles have gotten so fast that very few people can keep up with it.
AI is continuing to diffuse into the workplace, but the average person doesn't have the bandwidth to keep up with what the current state of art is.
@developedbyed I saw something very similar in my tests.
GPT 5.4 is probably the better coder and smarter model, but it is lacking in taste and wants to over achieve on outputs.
@steipete@Cucho The are likely able to optimize cache across sessions ( if everyone is using the same Google harness ) that breaks down once everyone is bringing their own.
@MatthewBerman My guess is that it isn’t OpenClaw/OAuth that gets you banned but rather what OpenClaw does that could get you banned.
This is why Anthropic don’t want to come out and say that OpenClaw is allowed.
Anthropic has low trust in the guardrails.