⚙️ Behind the build of self-improving tax agents with Codex
We co-built Tax AI with @ThriveHoldings around tax prep workflows so when reviewers fix any errors, Codex can trace the failure, improve the system, and test the change before it ships.
https://t.co/otI9oYp2A6
Codex is going to sneak up fast as a Cowork competitor for non-technical people
The desktop app is accessible and very powerful, esp. with 5.5. It handles complex tasks better than Cowork (IMO)
If they nail the brand, I expect mainstream knowledge worker Codex usage soon
Still wondering how you can use Codex for (almost) everything?
Codex can help with more of the work that supports the work, from organizing research to making spreadsheets, decks, and summaries.
We’re incredibly excited about our partnership with OpenAI and remain focused on building and delivering the capacity they need to support rapidly growing demand. We’re seeing firsthand how quickly adoption of their technology is accelerating, driven by the strength of their latest models.
OpenAI’s new 5.5 model is a significant step forward, and we expect continued momentum as access to their technology expands across cloud providers. Together, we’re enabling customers to bring powerful AI capabilities into production at scale.
1. We believe in iterative deployment; although GPT-5.5 is already a smart model, we expect rapid improvements. Iterative deployment is a big part of our safety strategy; we believe the world will be best equipped to win at the team sport of AI resilience this way.
2. We believe in democratization. We want people to be able to use lots of AI; we aim to have the most efficient models, the most efficient inference stack, and the most compute. We want our users to have access to the best technology and for everyone to have equal opportunity. We have been tracking cybersecurity as a preparedness category for a long time, and have built mitigations we believe in that enable us to make capable models broadly available.
3. We love you and we want you to win. We want to be a platform for every company, scientist, entrepreneur, and person. (My whole career has largely been about the magic of startups, and I think we are about to see that magic at hyperscale.)
Introducing GPT-5.5
A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done.
Now available in ChatGPT and Codex.
always a real feeling of magic to ask codex to perform a task that requires finding information scattered across slack, google docs, notion, and various internal tools, and it just figures it out
Ahhhh, Codex 5.3 (xhigh) with a vague prompt just solved a bug that I and others have been struggling to fix for over 6 months. Other reasoning levels with Codex failed, Opus 4.6 failed. Cost $4.14 and 45 minutes. Full trace, which includes the original issue: https://t.co/DbBACN2HLj
I know this prompt is relatively bad. Honestly, our stable release is in a week, and I was throwing some Hail Marys at the frontier models to see if I could get a clean, understandable fix for some of these bugs. By using `gh`, it grabs much better context from the issue, so it's not terrible.
The best thing that Codex did was eventually start reading GTK4 source code. That's where I ended up (see my GH issue), and I knew the answer was somewhere in there, but I didn't have the time or motivation to do it myself. The other models never went there, and lower reasoning efforts with 5.3 didn't go there either. Only xhigh went there. I think that was a critical difference.
The final fix was decent. It was small, all in a single file, and very understandable. It had one bug I identified (you can see in the trace), and then I manually cleaned up some style. But, it did a great job.
Definitely an "it's so over" moment. But at the same time, it feels amazing because now our next stable release will have this fix and I was able to spend the time working on other fixes as it went.
I spent last night with Andrew Strominger and Alex Lupsasca, two of the top physicists in the world
They just released a paper, co-authored with OpenAI, that seems to me like ASI
Andrew, who helped develop string theory, told me that a year ago, his view was that he didn’t know how helpful AI was going to be.
A year later, after some back and forth with GPT 5.2 pro, they submitted a final query to an internal model which solved AND proved a previously unsolved problem in quantum field theory…in 12 hours.
A model, doing something two of the smartest people in the world in their field couldn’t do. And, when I was with them, they were giddy with excitement for what might lie ahead.
Andy said “It is the first time I’ve seen AI solve a problem in my kind of theoretical physics that might not have been solvable by humans.”
They said, “two things changed: the model improved and we figured out how to talk to it.”
Andy also told me “I also now feel that with the recent advances, most physicists who want to keep up with the frontiers of progress will need to learn how to talk to it. That wasn’t true a year ago.”
ASI is here, just not evenly distributed.
Today Australia’s sporting champions showed they don’t just represent the nation; they speak for the nation.
In an incredibly powerful and unprecedented statement, 70 champions including Dawn Fraser, Jess Fox, Pat Rafter, Ian Thorpe, Nova Peris and Grant Hackett have called for a Royal Commission into antisemitism and the events leading up to the Bondi massacre. It's now up to the Prime Minister to act on their words:
“Today, we cannot remain silent.
This is not who we are.
This is not the Australia we represented.
As sporting leaders, we understand that leadership matters, especially when values are tested. We call on the Prime Minister and the Australian Government to show decisive national leadership by confronting extremism and terrorism in all its forms, without fear or hesitation.
We must also put an end to the unprecedented harassment, intimidation and violence that has been directed at the Australian Jewish community since October 7, 2023.
This is a national crisis, and it demands a national response.
This is bigger than politics.
It is about the character of our country and the Australia we want future generations to inherit.
With the Brisbane 2032 Olympic Games approaching, the eyes of the world will soon be upon Australia. The safety of our citizens, the integrity of our public spaces, and the values we project as a nation have never mattered more.
We call on the Australian Government to immediately establish a Commonwealth Royal Commission into antisemitism, radicalisation and the events leading up to the Bondi massacre as well as take other immediate action to protect the public.
A Royal Commission is the most credible and unifying pathway to understanding what went wrong, ensuring accountability, restoring social harmony and taking Australia forward with a meaningful, practical plan of action.
As Australians who have long championed unity and national pride - on the field and beyond it - we implore our leaders to act with urgency and moral clarity.
The safety of Australians, and the future cohesion of our nation, depends on it.”
Full statement 👇
GPT-5.2 is here and it’s the best model out there for everyday professional work.
On GDPval, the thinking model beats or ties human experts on 70.9% of common professional tasks like spreadsheets, presentations, and document creation. It’s also better at general intelligence, writing code, tool calling, vision, and long-context understanding so it can unlock even more economic value for people.
Early feedback has been excellent and I can’t wait for you to try it: https://t.co/ZGkMMhDg8b
Big milestone for @Uber last week: Uber + Uber Eats crossed 300 million trips in a single week for the first time, up from ~250 million a year ago. ~900K trips every 30 mins 🤯 Best part is we're just getting started, building on this (remarkable) operational & technical scale
Denise Dresser is joining OpenAI as Chief Revenue Officer.
Previously CEO of Slack, she brings deep enterprise and customer experience as she leads our global revenue strategy and support for customers at scale.
https://t.co/yQcgkhCdo0
At the heart of every business is a simple question:
Who are we really creating value for?
Customers want more for less.
Employees want higher wages.
Shareholders want stronger returns.
It looks like a fight over a fixed pie.
But it doesn’t have to be...🧵