Making AI accessible to all with our courses, blogs, tutorials, books & community. Since 2019 we helped teach over 400k AI. Now we also help corporate teams.
More of the iOS app loop, now inside Codex.
The Build iOS Apps plugin lets Codex view and test your iOS app in the in-app browser, open SwiftUI previews, and hot reload edits without leaving Codex.
You can halve costs today by moving from Claude Code/Cowork to Codex, so while Codex is only at 5 million users I don’t think people are serious about cutting token costs.
I'm not recommending canceling Claude, though; I use both.
Other model families are just too far behind on the most complex agentic work at the moment to be worth the frustration of the switch.
Long AI coding sessions don’t just fail because the context window gets full.
They fail because the agent starts carrying forward the wrong assumptions from the previous conversation.
That means even if you compact the chat, the next steps can still be biased by:
files it touched earlier
failed fixes it half-remembers
assumptions that were never corrected
outdated project state
a messy chain of reasoning from hours ago
A better workflow is to end long sessions with a small HANDOFF.md file.
Include only five things:
Original goal
Current state
Files changed
What was already tried and failed
Exact next step
Then start a fresh session and tell the next agent to read only that file first.
The point is not “more context.”
The point is cleaner context.
That is how you keep progress moving without dragging the entire messy conversation into the next agent.
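As a rough illustration of the handoff file described above, here is a minimal Python sketch that writes the five items to HANDOFF.md. The class name, field names, and the `write_handoff` helper are hypothetical, chosen only for this example; the post does not prescribe any particular layout.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Handoff:
    """The five items from the post; field names are illustrative assumptions."""
    original_goal: str
    current_state: str
    files_changed: list[str] = field(default_factory=list)
    tried_and_failed: list[str] = field(default_factory=list)
    exact_next_step: str = ""

    def to_markdown(self) -> str:
        def bullets(items):
            # Render a list as bullets, or a placeholder when it is empty.
            return "\n".join(f"- {item}" for item in items) or "- (none)"

        sections = [
            ("Original goal", self.original_goal),
            ("Current state", self.current_state),
            ("Files changed", bullets(self.files_changed)),
            ("What was already tried and failed", bullets(self.tried_and_failed)),
            ("Exact next step", self.exact_next_step),
        ]
        return "\n\n".join(f"## {title}\n{body}" for title, body in sections) + "\n"


def write_handoff(handoff: Handoff, directory: str = ".") -> Path:
    """Write HANDOFF.md into the given directory and return its path."""
    path = Path(directory) / "HANDOFF.md"
    path.write_text(handoff.to_markdown(), encoding="utf-8")
    return path
```

Keeping the file to these five sections is the point: the next session reads a short, corrected summary instead of inheriting the full conversation.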
Building apps has never been easier.
With Sites, Codex can turn your work, ideas, and plans into an interactive website or app your team can explore, use, and share with a URL.
Rolling out to Business and Enterprise plans, before expanding more broadly.
@maxjendrall @OpenAIDevs This is better than using Wispr Flow with the Codex app.
I can start dictating, open my browser while it keeps running, and then stop dictation without having to click back into the text box.
The scariest AI-generated code is not the code that fails immediately.
It is the code that passes one clean test and makes everyone relax.
A green checkmark only proves the tested path worked. It does not prove the edge cases, hidden assumptions, or real production risks are covered.
When the agent writes both the code and the tests, the first pass can create fake certainty.
Treat the first passing test as the start of review, not the finish line.
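A small, made-up Python example of how one green test can hide an edge case. The `average` and `safe_average` functions are hypothetical and stand in for any agent-written code. Both pass the happy-path check, but only one has a deliberate answer for empty input.

```python
def average(values):
    """Naive mean: fine on the happy path, unsafe on empty input."""
    return sum(values) / len(values)


def safe_average(values):
    """Same calculation, but the empty-input behavior is an explicit choice."""
    if not values:
        raise ValueError("average of an empty sequence is undefined")
    return sum(values) / len(values)


# The one "clean" test: it passes for both versions.
assert average([2, 4, 6]) == 4
assert safe_average([2, 4, 6]) == 4

# An edge case the happy-path test never touches.
try:
    average([])
except ZeroDivisionError:
    print("naive version crashes on empty input")
```

Reviewing after the first pass means asking which inputs the test did not exercise, such as empty, very large, or malformed values.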
here is the first look at realtime voice mode in codex
the ui is already in the app, but the codex backend endpoint does not seem to be live yet
i got it running by forcing the local feature gate and using my own openai api key, so it goes through the public realtime api instead
it's incredible, but this demo alone cost me ~$4
gpt realtime 2 is very expensive, which makes me wonder how they're going to handle it - part of the normal usage limit, or a separate limit like we had with gpt 5.3 codex spark?
i'm turning it off until the official rollout
but get excited, the next few weeks are going to be amazing
Claude Opus 4.8 has landed on DeepSWE Bench, posting a 58% Pass@1 and taking #2 overall behind GPT-5.5.
It continues a broader trend: slightly behind on raw score, but among the most reliable and efficient coding models across recent benchmarks.
The best coding agent is the one you let do less.
Bad prompts make the agent invent the task.
Good prompts narrow the scope.
Great prompts define the behavior, files, tests, and proof before code starts.
If your agent keeps making messy changes, the problem might not be the model.
It might be the amount of freedom you gave it.
Scope tighter. Name the files. Define the tests. Ask for proof before merging.
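One way to make that scoping concrete is a sketch of a helper that refuses to build a prompt until behavior, files, tests, and proof are all stated. The function name, the output wording, and the sample values are assumptions for illustration, not part of any tool's API.

```python
def build_scoped_prompt(behavior, files, tests, proof):
    """Render a task brief that states behavior, files, tests, and proof up front."""
    if not (behavior and files and tests and proof):
        raise ValueError("behavior, files, tests, and proof must all be provided")
    lines = [
        f"Behavior: {behavior}",
        "Only modify these files:",
        *[f"  - {path}" for path in files],
        "Tests that must pass:",
        *[f"  - {test}" for test in tests],
        f"Proof required before merging: {proof}",
        "Do not change anything outside this scope.",
    ]
    return "\n".join(lines)


# Hypothetical usage with made-up file and test names.
print(build_scoped_prompt(
    behavior="Reject sign-ups with an empty email address",
    files=["signup.py"],
    tests=["test_signup.py::test_empty_email"],
    proof="Paste the full test output",
))
```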