Congrats to @OpenAI for taking the top spot on our Audio MultiChallenge S2S leaderboard with the release of GPT-Realtime-2 🥇
GPT-Realtime-2 nearly doubles GPT-Realtime-1.5 on instruction retention, rising from 36.7% to 70.8% APR (about 1.9×), and also stands out on voice editing, especially when users repair or revise what they are saying in real time – crucial for voice agent use cases.
Excited to see the pace of progress as voice AI accelerates.
GPT-5.5 is now available in the API.
The model brings higher intelligence and stronger token efficiency to complex work, helping tasks get done with fewer retries.
gpt-image-2 is here, available today in the API and Codex.
The most capable image generation model yet, built for production-grade workflows with stronger text rendering, layout, editing, resolution, and multilingual rendering.
Today, we’re launching a Rather Large™ update to the OpenAI Agents SDK.
The Agents SDK now lets you scale Codex-style agents in production without building the whole harness yourself. We’ve brought the building blocks of modern agents (computer use, skills, memory, compaction, and more) to the same platforms you’re already using.
Build long-running agents with more control over agent execution.
New capabilities in the Agents SDK:
• Run agents in controlled sandboxes
• Inspect and customize the open-source harness
• Control when memories are created and where they’re stored
Your videos can go further now.
We’re introducing new Video API capabilities, powered by Sora 2:
• Custom characters and objects
• 16:9 and 9:16 exports
• Clips up to 20 seconds
• Video continuation to extend scenes
• Batch jobs for video generation
Voice workflows just got stronger with gpt-realtime-1.5 in the Realtime API.
The model offers more reliable instruction following, tool calling, and multilingual accuracy.
Demo with @charlierguo
This is the biggest jump in Image Arena that we've seen since Nano Banana
GPT-Image-1.5 has taken #1 on Image Arena with a significant lead
Huge congratulations to the team at @OpenAI for this achievement!
GPT Image 1.5 is now available in the API:
✏️ More precise image editing and preservation of logos & faces
🎯 Better instruction following and adherence to prompts
🔤 Improved text rendering, particularly for denser and smaller text
Learn more in docs: https://t.co/KWUqCUOIGZ
🆕 New audio model snapshots are now live in the Realtime API with improved reliability, lower error rates, and fewer hallucinations:
- gpt-4o-mini-transcribe-2025-12-15: 89% reduction in hallucinations compared to whisper-1
- gpt-4o-mini-tts-2025-12-15: 35% fewer word errors, as measured on Common Voice
- gpt-realtime-mini-2025-12-15: 22% improvement in instruction following and 13% improvement in function calling
Teams testing GPT-5.2 have reported steadier agents, stronger coding performance, and clearer reasoning over large contexts.
But don't take our word for it.
Here are the early impressions we're seeing:
GPT-5.1 is now available in the API.
It’s faster, more steerable, better at coding, and ships with practical new tools.
If you’re building apps or agents where intelligence, speed, and cost matter, GPT-5.1 should feel like a meaningful upgrade. https://t.co/guDlsARNic
🎥 Sora 2 and Sora 2 Pro are now in the API.
Also new: image and speech-to-speech models that are cheaper than their full-sized counterparts, with similar quality:
📸 gpt-image-1-mini (80% cheaper)
🗣️ gpt-realtime-mini (70% cheaper)
The Realtime API is officially out of beta and ready for your production voice agents!
We’re also introducing gpt-realtime—our most advanced speech-to-speech model yet—plus new voices and API capabilities:
🔌 Remote MCPs
🖼️ Image input
📞 SIP phone calling
♻️ Reusable prompts
We've improved image generation in the API. Editing with faces, logos, and fine-grained details is now much higher fidelity with features preserved. 🔍
Edit specific objects, create marketing assets with your logo, or adjust facial expressions, poses, and outfits on people.
🆕 Four updates to building agents with OpenAI: Agents SDK in TypeScript, a new RealtimeAgent feature for voice agents, Traces support for the Realtime API, and improvements to our speech-to-speech model.