My @levelsio #VibeJam2026 game: Monaco GP 🏎️
Time trial + live multiplayer — every driver online is on the same track with you, in real time.
https://t.co/zgfLcXdCeo
Built with Claude Code. Car model: GPT Image 2 × Tripo3D. Music: Suno.
We just launched Sites into Codex!
Software creation was always about more than writing code. Sites in Codex fundamentally gives the power of end-to-end software creation to every user, no matter their technical fluency.
These Sites are fully deployed to a URL, private to workspaces, come with authentication, can have static files, and can store dynamic data in databases.
It is in preview for business and enterprise teams and will be rolling out to all workspaces over the next day. Give it a try by typing @ Sites into Codex and ask it to build anything!
This project took a massive amount of effort across hundreds of people at OpenAI - proud that we were able to get this out and excited to see what you all build with it!
In the next version of Claude Code: run /usage to see a breakdown of which Skills, Agents, MCPs, and Plugins are using your tokens
CLI today, coming to Desktop next
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
"every day we're trying to obtain more compute to pass on to you, we're sorry if it takes some time but we're going to acquire as much as we can"
you heard the man
you know what
all of these "which is better" polls are silly
use codex or claude code, whatever works best for you
i am grateful we live in a time with such amazing tools, and grateful there is a choice
Claude now connects to the tools creative professionals already use.
With the new Blender connector, you can debug a scene, build new tools, or batch-apply changes across every object, directly from Claude.
I think "prompting" will keep being an incredibly high-leverage skill, like writing or public speaking.
It is the skill of talking to agents, mediated by the harness.
My main goal is to grow the bandwidth between humans and agents, to help us understand each other better.
🤯BREAKING: Alibaba just proved that AI Coding isn't taking your job, it's just writing the legacy code that will keep you employed fixing it for the next decade. 🤣
Passing a coding test once is easy. Maintaining that code for 8 months without it exploding? Apparently, it’s nearly impossible for AI.
Alibaba tested 18 AI agents on 100 real codebases over 233-day cycles. They didn't just look for "quick fixes"—they looked for long-term survival.
The results were a bloodbath:
75% of models broke previously working code during maintenance.
Only Claude Opus 4.5/4.6 maintained a >50% zero-regression rate.
Every other model accumulated technical debt that compounded until the codebase collapsed.
We’ve been using "snapshot" benchmarks like HumanEval that only ask "Does it work right now?"
The new SWE-CI benchmark asks: "Does it still work after 8 months of evolution?"
Most AI agents are "Quick-Fix Artists." They write brittle code that passes tests today but becomes a maintenance nightmare tomorrow. They aren't building software; they're building a house of cards.
The narrative just got honest: Most models can write code. Almost none can maintain it.
Claude can now build interactive charts and diagrams, directly in the chat.
Available today in beta on all plans, including free.
Try it out: https://t.co/tHPAZRgQkn
It’s extremely good that Anthropic has not backed down, and it’s significant that OpenAI has taken a similar stance.
In the future, there will be much more challenging situations of this nature, and it will be critical for the relevant leaders to rise to the occasion, for fierce competitors to put their differences aside. Good to see that happen today.
F*ck the OpenClaw hype.
I know, I've built all my AI workflows around OpenClaw, and I'm telling you most people shouldn't use it.
You'll understand why after reading this.
Doom-posting with no solutions is useless; here are real alternatives:
Impressive inference speed from Inception Labs’ diffusion LLMs. Diffusion LLMs are a fascinating alternative to conventional autoregressive LLMs. Well done @StefanoErmon and team!
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December and basically work since - the models have significantly higher quality, long-term coherence and tenacity and they can power through large and long tasks, well past enough that it is extremely disruptive to the default programming workflow.
Just to give an example, over the weekend I was building a local video analysis dashboard for the cameras of my home so I wrote: “Here is the local IP and username/password of my DGX Spark. Log in, set up ssh keys, set up vLLM, download and bench Qwen3-VL, set up a server endpoint to inference videos, a basic web ui dashboard, test everything, set it up with systemd, record memory notes for yourself and write up a markdown report for me”. The agent went off for ~30 minutes, ran into multiple issues, researched solutions online, resolved them one by one, wrote the code, tested it, debugged it, set up the services, and came back with the report and it was just done. I didn’t touch anything. All of this could easily have been a weekend project just 3 months ago but today it’s something you kick off and forget about for 30 minutes.
As a result, programming is becoming unrecognizable. You’re not typing computer code into an editor the way things were done since computers were invented; that era is over. You're spinning up AI agents, giving them tasks *in English* and managing and reviewing their work in parallel. The biggest prize is in figuring out how you can keep ascending the layers of abstraction to set up long-running orchestrator Claws with all of the right tools, memory and instructions that productively manage multiple parallel Code instances for you. The leverage achievable via top tier "agentic engineering" feels very high right now.
It’s not perfect, it needs high-level direction, judgement, taste, oversight, iteration and hints and ideas. It works a lot better in some scenarios than others (e.g. especially for tasks that are well-specified and where you can verify/test functionality). The key is to build intuition to decompose the task just right to hand off the parts that work and help out around the edges. But imo, this is nowhere near "business as usual" time in software.
Alright interns, we need to have some real talk here
I am tired of vibing on stream. I dont really like vibe coding unless its a tool i have no desire to build (how i manage things on my stream / how i write my youtube videos are great examples of things i would never build but i have). I dont like vibing the things i care about. I hate the code it generates, i hate the feeling of getting everything i ask for and nothing i want. I hate the subtle offness around vibe coded things. It is just driving me nuts. So for the next while i am going to be done vibing on stream.
I genuinely have been trying my hardest to make this work and i cannot quite put a finger on why i hate it, but i do. And i just feel so horribly guilty and wrong because i am not getting the results of "everyone else on twitter."
How am i, someone who prides themselves on making youtube videos that i think are actually good for people, videos that help people laugh at the silliness of tech or learn something new, not able to keep up with all these people claiming the sky is literally coming down? I just feel horrible and guilty about it.
Now i know the world is changing fast, and i want to be able to understand that change super well, be able to talk about it, be able to give really accurate opinions about it so for the last 3 months i have vibe coded an absurd amount of things. But now... i am just tired of it.
I dont want this any more. I want to be a tradcoder.
I dont know why i told everyone this, but i just have this growing sickness that is just eating me alive around vibing and i dont know how to express it.
You all are fired,
CEO ThePrimeagen