Sol is our new flagship, a step-function improvement over GPT-5.5.
Terra delivers performance competitive with GPT-5.5 at half the cost.
Luna is our most cost-efficient model, delivering strong capability at our lowest cost.
Together, the GPT-5.6 family gives people and developers more choice in how they balance intelligence, speed, and cost.
Work at OpenAI is being transformed by agents, in every department.
Across our entire company, people are using Codex to do work that is more complex, longer-running, and increasingly cross-functional.
Our internal usage offers an early look at how agentic tools may reshape work as they become more capable and broadly available.
Claude Tag is a Trojan horse. Not because Anthropic is doing anything evil. Because the incentives are obvious.
Day one, this looks like a great feature: tag Claude in Slack, let it follow the thread, remember context, connect to tools, break down tasks, chase work, and act like a teammate.
But that is exactly the problem. The moment your AI vendor becomes a shared coworker, it stops being just a model provider. It starts becoming the place where work is interpreted, remembered, routed, and eventually executed.
That is not model lock-in. That is context lock-in. You are now renting your company back from them.
Models can be swapped. Agents can be copied. But the memory of how your company actually works is much harder, maybe impossible, to move: the Slack scar tissue, the exception paths, the customer promises, the unfinished threads, the weird workflows, the implicit owners, the “we tried that in Q2 and it failed” knowledge.
Once that lives inside one vendor’s agent layer, you are not renting intelligence anymore. You are renting your company’s operating memory.
And the pricing model makes it even more dangerous. A human coworker has a salary. Claude has unbounded tokenized activity. The more work moves through it, the more the vendor captures not just IT spend, but labor spend.
This is the enterprise bargain people will regret: convenience now, and a rapid descent into dependency.
The right architecture is simple: rent the best intelligence from whoever is best this month. OpenAI, Anthropic, Gemini, open source, whatever. But own the context layer.
Your company memory should be inspectable, permissioned, portable, and model-neutral. It should not be buried inside the same vendor that sells you the intelligence and the workflow surface.
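One way to picture "own the context layer" is the minimal sketch below. It is only an illustration under stated assumptions: `MemoryRecord`, `ContextStore`, `ask`, and the role names are all hypothetical, and the model is just any function from prompt to reply, so no real vendor SDK is implied. The store stays in your hands (inspectable, permissioned, exportable) and the model is a swappable argument.

```python
import json
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class MemoryRecord:
    """One piece of company memory (hypothetical schema)."""
    text: str
    source: str
    allowed_roles: FrozenSet[str]


class ContextStore:
    """Company-owned memory, independent of any model vendor."""

    def __init__(self) -> None:
        self._records: List[MemoryRecord] = []

    def add(self, text: str, source: str, allowed_roles: Iterable[str]) -> None:
        self._records.append(MemoryRecord(text, source, frozenset(allowed_roles)))

    def visible_to(self, role: str) -> List[MemoryRecord]:
        # Permissioned: a role only sees records it was granted.
        return [r for r in self._records if role in r.allowed_roles]

    def export_json(self) -> str:
        # Portable and inspectable: plain JSON, no vendor format.
        return json.dumps([
            {"text": r.text, "source": r.source,
             "allowed_roles": sorted(r.allowed_roles)}
            for r in self._records
        ])


# Model-neutral: any provider can be wrapped as "prompt in, reply out".
ModelFn = Callable[[str], str]


def ask(store: ContextStore, model: ModelFn, role: str, question: str) -> str:
    """Build the prompt from owned context, then rent the intelligence."""
    context = "\n".join(f"- {r.text}" for r in store.visible_to(role))
    return model(f"Context:\n{context}\n\nQuestion: {question}")
```

Swapping vendors then means passing a different `model` function to `ask`; the records and their permissions do not move.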
Claude Tag is useful. That is why it is dangerous. Rent the intelligence, but own the context. Or regret it later.
We’ve designed and built our first AI chip: Jalapeño.
Designed from the ground up by OpenAI and brought to production with @Broadcom, Jalapeño is purpose-built for the LLM workloads powering ChatGPT, Codex, the API, and future agentic products.
Chips are foundational to the AI economy. Building our own expands our full-stack platform from products to models to infrastructure, and will help us scale intelligence, serve more people, and expand access to AI.
This "loop" automation is nuts inside of Codex.
"/goal go over every single feature in this app create a user story with expected behaviour based on the code keep a single canonical spreadsheet tracking the features status
- when done switch loop to testing every user story and documenting all errors
- when done fix every logistical error or ux error
- test every user behaviour again post fix"
Shoutout to @MatthewBerman for the heads up.
Hundreds of user stories being worked through like it's nothing.
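The shape of that prompt, finishing one phase for every user story before switching to the next while keeping one tracking sheet, can be sketched roughly as follows. This is not how Codex implements /goal; `run_phases`, `to_csv`, the phase names, and the column names are all hypothetical, and `step` stands in for whatever the agent actually does per story.

```python
import csv
import io
from typing import Callable, Dict, List

Story = Dict[str, str]

# Phases that follow the initial "write a user story per feature" pass.
PHASES = ["test", "fix", "retest"]


def run_phases(stories: List[Story], step: Callable[[str, Story], str]) -> None:
    """Complete each phase for every story before starting the next phase."""
    for phase in PHASES:
        for story in stories:
            story["status"] = phase
            story["notes"] = step(phase, story)


def to_csv(stories: List[Story]) -> str:
    """Serialize the tracker as a single canonical sheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=["feature", "story", "status", "notes"]
    )
    writer.writeheader()
    writer.writerows(stories)
    return buf.getvalue()
```

Each story is a row with `feature`, `story`, `status`, and `notes` keys, so the sheet always reflects the latest phase every story has reached.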
I’ve had a number of conversations with folks inside and outside government about the current situation with Anthropic, and here is what I believe to be true:
— As we know, Anthropic publicly released its Mythos class models earlier this week under the commercial name Fable.
— Fable is Mythos with guardrails. But if those guardrails fail, then you’ve exposed Mythos and its advanced cyber capabilities to people who shouldn’t have them. (Keep in mind that Anthropic itself widely promoted the idea that Mythos was a cyberweapon and needed to be regulated as such. They asked for government regulation of Mythos and championed the guardrails on Fable. If there is a vulnerability — big or small — it is Anthropic’s responsibility to patch.)
— A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.
— In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim that a jailbreak allowing operation of a cyberweapon is not “serious.”
— In the past, Anthropic has always said that safety must be the top priority and taken super seriously. In this case, Anthropic prioritized the continued offering of the consumer model over safety.
— In reaction, the Admin issued the export control. The Admin did this reluctantly. It has been very surprised that Anthropic hasn’t wanted to cooperate with a reasonable safety request (i.e., fixing the jailbreak issue). Anthropic’s reaction is very much at odds with their branding and ethos as a safe AI research community.
— The Admin’s hope now is that Anthropic remediates the safety issue, the export control is lifted, and Fable goes back into general release. The Admin wants all of this to happen as soon as possible. It is frankly bewildered that Anthropic hasn’t wanted to comply with safety requests that it previously said were its highest priority.
— Those trying to misdirect and tie this action to the prior DoW/Anthropic issues are wrong. The Admin values Anthropic’s technical capabilities and feels that this issue, while serious, should be easily resolved. The ball is in Anthropic’s court.
Things I really dislike about Fable:
1. Anthropic collects my prompt history, stores it, and does whatever they want with it for 30 days. No opt-out.
2. They can nerf their most expensive model whenever they want without telling me, while billing me the same amount and wasting my time.
Excited to share that MagicPath is now available as an official plugin for Codex, in collaboration with OpenAI!
It's incredibly easy to give Codex an infinite multiplayer canvas where it can design, build, and iterate with you.
Your Codex activity now has a home, and an easier way to share it.
Codex profiles show your activity graph, streaks, lifetime tokens, peak daily tokens, and top features like plugins and /fast mode.
Private by default. Share a card when you want to.
People keep saying GPT-5.5 + Codex mogs Claude Code.
So I got OpenAI PM @AbhiMuch to show me exactly how to use Codex like a pro:
03:48 - What Codex unlocks for PM work
05:34 - How he built his growth dashboard
11:32 - How to build in Codex with Playwright
14:52 - Moving away from PRDs to Codex prototypes
21:23 - 3 automations running before his day starts
28:37 - WhatsApp computer use demo setup
33:05 - Codex takes action inside WhatsApp
37:00 - Building a 1040 tax filing app in Codex
43:42 - What drove ChatGPT to 1B WAU
47:18 - ImageGen 2, the biggest Elo jump of any model
59:26 - How to break into OpenAI as a PM
@garrytan Just FYI, DD also introduced a similar cap on Claude usage recently, based on my convo with a friend at DD. But not on Codex yet, so employees are “dual modeling” now, especially considering Codex is more token efficient.