@code_star Team worked through the night to roll this back. I think it was the wrong tradeoff to make and I’m glad we’ve changed it. Looks like there was a bit of a lag between the statement and this tweet: https://t.co/terQUflPPP
We’re rolling out changes to make Fable 5’s safeguards for frontier LLM development visible.
Starting this week, flagged requests will visibly fall back to Opus 4.8—the same as our safeguards for cyber and bio. You will see this every time it happens. On the API, any flagged requests will return a reason for their refusal (coming to server-side fallback in the next few days).
We wanted to deploy Fable 5 to our users quickly and safely. Visible safeguards can be probed, so they have to be robust, which takes time to get right. Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason—and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.
Making the safeguards visible makes them easier to work around, so keeping them robust to jailbreaks will unfortunately mean more false positives while we improve the classifiers. We’re also tuning our bio and cyber classifiers to trigger less often on harmless requests. We know this is frustrating and we’ll do our best to keep this period as short as possible.
If you think a request has been mistakenly flagged: run /feedback in Claude Code, click thumbs-down on the fallback in https://t.co/LtktniD5HY or Cowork, or submit the safeguard appeal form for API requests. Your reports help us tune these classifiers and we appreciate your feedback.
https://t.co/TDAAYRGqDt
@genekogan There was no AI involved at all in this one, just physical sources, PS, and AE. Much of the source material in this one was from the Brockhaus encyclopaedia and our internal team worked with some brilliant external animators on the finished product.
Glad to have played a small part in every model launch since Claude 3. It’s been a little while since I’ve felt quite this excited to share one with the world.
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.
Its capabilities exceed those of any model we’ve ever made generally available.