OpenAI is reportedly considering drastic token price cuts to pull customers away from Anthropic, per WSJ. This follows rising complaints from enterprise customers about AI costs, while Anthropic has been gaining traction with Claude Code.
The message is clear: @OpenAI does not want Anthropic to own the developer market. A real AI price war may be starting.
@Miles_Brundage I used Fable all day today. The hype takes were right. It's great. An improvement. But normal, incremental improvement, not silicon god improvement. Still had to fix lots of its mistakes.
TLDR:
1. Declare AI too dangerous for ordinary competition so you propose a regulatory regime where only the largest incumbents can survive
2. Warn about labor displacement while selling the product to executives as a labor-displacement tool
3. Warn about state overreach while asking the state to license and gatekeep frontier models
4. Warn about corporate power while sketching a corporate-state cartel over compute, release, security, export controls, and deployment
okay, after a few hours of playing around with Fable i've determined it to be borderline unusable. wtf @AnthropicAI
every other request is getting rerouted to Opus. requests with NOTHING AT ALL to do with bio or cybersec are STILL getting flagged and rerouted. like 50-60%.
who the hell let this go live. this is atrocious. i understand the safeguards but you basically didn't even release a model, you released a braindead, nerfed model that can't even handle the simplest request without being told to nuke itself and use a different model. what are we even doing?
just don't even release it next time smh
Things look good for AI as a “normal technology” so far.
LLMs have rapidly impacted coding, but few other professions; and even in coding it’s hard to see value generation.
Bottlenecks, regulation, other frictions keeping humans complementary and relevant
It's finally out!!! @METR_Evals found that more than half of SWEBench results are unmergeable slop. FrontierCode represents 1000+ hours of maintainer-validated software engineering work that most frontier models cannot yet solve, much less solve with high quality.
Cog had IOI Gold medalists and top code maintainers Look At The Data: FrontierCode includes 3000+ rubrics covering code quality and anti-cheat checks for the reward hacking plaguing other benchmarks.
FC Diamond is so hard that Opus 4.8 scores 13.8%.
Three eras of AI coding : Three eras of benchmarks
2021 • Autocomplete : HumanEval
2023 • Passing Tests: SWEBench, TerminalBench
2026 • Maintainable Code: FrontierCode
to me the most beautiful chart came when I requested a special historical run across all extant old models: the data showed that the easiest third of FC tasks (in FC Extended) were rapidly and suddenly solved over late 2025 - Opus almost doubled from a 41% pass rate to 74% in 4 months.
This describes the "WTF happened in Dec 2025" vibe shift that a lot of folks from @dhh to @karpathy have called out: it is the difference between getting 95% success in 2 rerolls vs 6, making it finally feasible to go up the next layer of abstraction in agentic coding, eg @GeoffreyHuntley's ralph loops or @bcherny's /goals or @steipete's "loops that prompt your agents" without fearing too much that things go off the rails.
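A rough sketch of the reroll arithmetic, assuming each attempt is an independent try that succeeds at the quoted pass rate (real retries on the same task are probably correlated, so treat this as a back-of-envelope check). The helper names `success_within` and `attempts_needed` are made up for illustration; only the 41%, 74% and 95% figures come from the thread.

```python
def success_within(pass_rate: float, attempts: int) -> float:
    """Chance that at least one of `attempts` independent tries succeeds."""
    return 1.0 - (1.0 - pass_rate) ** attempts


def attempts_needed(pass_rate: float, target: float = 0.95) -> int:
    """Smallest number of independent tries whose combined success chance reaches `target`."""
    if not 0.0 < pass_rate <= 1.0:
        raise ValueError("pass_rate must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    attempts = 1
    while success_within(pass_rate, attempts) < target:
        attempts += 1
    return attempts


# Pass rates quoted in the thread: 41% before, 74% four months later.
for rate in (0.41, 0.74):
    n = attempts_needed(rate)
    print(f"pass rate {rate:.0%}: {n} attempts to reach 95% "
          f"(2 attempts give {success_within(rate, 2):.1%})")
```

Under the independence assumption, a 41% pass rate needs 6 attempts to clear 95%, while a 74% pass rate reaches about 93% in 2 attempts and clears 95% on the third.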
My guess: as AI accelerates from here, each FrontierCode tier will saturate in sequence, hopefully ~annually. I've already asked the team to prepare FrontierCode 2027....
The old mountains will be destroyed. Their rubble becomes regolith. And from that regolith, the next model forest grows. Circle of life.
More people need to listen to what Gary is saying here. Many, MANY of these talking heads who suck the oxygen out of the room with just their ego say things that are so wildly wrong that it's misguiding an entire generation, and it's all for ego and greed.
Literally, not a clue what they are talking about...
Yet the next idiot calls them an AI Expert 🤦♂️
@TheStalwart I agree that phones are socially concerning—I just took a 2 hour walk without one and felt great!—but “it’s the phones” is intellectually lazy, and I don’t buy this study at all, even as I agree that getting off our phones is important.
https://t.co/A30967zkNz
Striking paper from Wharton. The big conclusion: AI must increase productivity 2.7x -- and quickly -- or tech companies risk bankruptcy with all that entails for the economy. For context: this is how a quickie 2.7x productivity boom would compare to historical precedent. Paper linked in my daily AI digest. Useful context for OpenAI reportedly talking to the US government about a bailout (ahem, I mean ownership stake).
I sacrificed friends, family, community and quite possibly the ability to return to my home in order to defend Palestinians against Israel. Spent thousands of dollars of my money and raised much more for Gaza. And I have found myself so viciously attacked by the people who are supposed to be pro-Palestine. You are definitely setting an example that no one should ever cross over or they will pay an awful price and be completely isolated. I do not regret the choice I made. It was the right one. But it has been costly.
The thing about people talking about building with AI is that they always talk about how they’re building, what tools they use, and how much they use them.
Much less is said about what they actually built, or what impact it had.
The tool becomes the job. And the purpose.
Between the "machines of loving grace" scenario and the "humans become horses" scenario there's a scenario where everyone uses a lot of AI, nothing changes, and the world continues to limp along and argue about stupid TikTok videos til the population collapses
Coding is basically the pinnacle of what you could reasonably automate with AI, and yet we still need human engineers to oversee agents for them to be effective.
The AI models are trained on an incredible amount of sophisticated code. The users are highly technical and can use the latest tools quickly. The work is “verifiable” because you can test an app. The outcomes are often removed from the quality of the code (you can have sloppy code but the app can still work). And the context for the agent is often already digitized and sitting in the codebase.
That’s an incredible amount of benefits that AI coding agents get to work with. Some of those apply to knowledge work, but most don’t in areas where the work needs to be fully reviewed to be useful, or where data isn’t as abundantly digitized. This makes the job for agents in knowledge work more complicated.
So if, even with all of that, engineers still remain in very high demand, the risks for other areas of knowledge work are going to be lower than what’s perceived. Agents will let people do far more than they did before, but the people don’t go away.
@Samwise_Ganji Fiat is a sham, the banking class is corrupt, decentralized digital currency and the blockchain are the inevitable future, and the incumbents will fight it to the death.
I recently tried to use Codex to go through a modestly sized (online) database of photos (from our recent live show) and identify ones worth posting. Not only did it require a lot of hand holding, what it was able to do still required lots of trial and error.