Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946.
For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids.
An OpenAI model has now disproved that belief, discovering an entirely new family of constructions that performs better.
This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics.
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
Introducing Open Collider: an open-source engine that mechanically improves LLM creativity.
It generates non-trivial, high-quality ideas at scale, for any ideation problem.
LLMs collapse on the same ideas. Sample the same brief 100 times → most outputs land in the same place. Researchers call it the Artificial Hivemind (Jiang et al., 2025).
"Be more creative" moves the LLM's output by ~0.04 in embedding space.
Forcing structurally distant domain collisions moves it by ~0.28.
7× more. Same model, same brief.
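A rough sketch of how a shift like ~0.04 vs ~0.28 could be measured, assuming the metric is the cosine distance between the centroid of baseline outputs and the centroid of outputs from a modified prompt. The post does not say which metric or embedding model was used, so the function names and the toy vectors below are illustrative assumptions, not the study's setup.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity (0 = same direction, 1 = orthogonal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def centroid(vectors: Sequence[Vector]) -> list[float]:
    """Component-wise mean of a non-empty set of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embedding_shift(baseline: Sequence[Vector], variant: Sequence[Vector]) -> float:
    """Distance between the 'default cloud' centroid and the variant centroid."""
    return cosine_distance(centroid(baseline), centroid(variant))


# Toy 2-D vectors standing in for real sentence embeddings (made-up data).
baseline_outputs = [[1.0, 0.0], [1.0, 0.0]]
nudged_outputs = [[0.0, 1.0], [0.0, 1.0]]
print(embedding_shift(baseline_outputs, nudged_outputs))
```

In practice the vectors would come from an embedding model applied to each generated idea; the comparison itself stays the same.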
So I built Open Collider: a pipeline based on the theory of bisociations (Koestler, 1964), the same model that drives human creativity.
📊 Across 12 real-world ideation problems:
• 12/12 sign-test wins on embedding distance (p = .0002)
• 60%+ originality wins on 4,320 blind LLM-judge verdicts
• 4–13× further from the default cloud than "be original" prompts or longer context
• Idea relevance holds (win rate >50% on overall quality)
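The 12/12 figure can be checked by hand. Under the null hypothesis that each problem is a coin flip, 12 wins out of 12 has probability (1/2)^12 = 1/4096, which rounds to .0002. That matches a one-sided exact sign test; the post does not say which variant was run, so treat this as a plausible reading rather than the study's actual procedure. A minimal sketch (the function name is my own):

```python
from math import comb


def sign_test_p_value(wins: int, trials: int, two_sided: bool = False) -> float:
    """Exact sign test p-value under H0: each trial is a fair coin flip."""
    if not 0 <= wins <= trials:
        raise ValueError("wins must be between 0 and trials")
    total = 2 ** trials
    # P(X >= wins) for X ~ Binomial(trials, 0.5)
    upper_tail = sum(comb(trials, k) for k in range(wins, trials + 1)) / total
    if not two_sided:
        return upper_tail
    # P(X <= wins), used for the symmetric two-sided version
    lower_tail = sum(comb(trials, k) for k in range(0, wins + 1)) / total
    return min(1.0, 2 * min(upper_tail, lower_tail))


print(round(sign_test_p_value(12, 12), 4))                  # 0.0002
print(round(sign_test_p_value(12, 12, two_sided=True), 4))  # 0.0005
```

With 12 wins in 12 trials the one-sided value is exactly 1/4096 and the two-sided value is twice that.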
💻 Engine: first reply 👇
📝 Launch study: pinned tweet
Try it, break it, tell me what you find!
Singapore’s Foreign Minister, Dr Balakrishnan, casually explaining how he built his own AI agent (a 2nd brain for diplomacy) using Claude, WhatsApp integration, and more on a Raspberry Pi
“You cannot govern a technology you have only been briefed on.” 🇸🇬
Introducing Flue — The First Agent Harness Framework
Flue is a TypeScript framework for building the next generation of agents, designed around a built-in agent harness.
Flue is like Claude Code, but 100% headless and programmable. There's no baked-in assumption like requiring a human operator to function. No TUI. No GUI. Just TypeScript.
But using Flue feels like using Claude Code. The agents you build act autonomously to solve problems and complete tasks. They require very little code to run. Most of the "logic" lives in Markdown: skills and context and AGENTS.md.
Flue is like Astro or Next.js for agents (not surprising, given my background 🙃). It's not another AI SDK. It's a proper runtime-agnostic framework. Write once, build, and deploy your agents anywhere (Node.js, Cloudflare, GitHub Actions, GitLab CI/CD, etc).
We originally built Flue to power AI workflows inside of the Astro GitHub repo. But then @_bgiori got his hands on it, and we realized that every agent needs a framework like Flue, not just us.
Check it out! It's early, but I'm curious to hear what people think. Are agents ready for their library -> framework moment?
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world's top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params. Your fast, efficient, and economical choice.
Try it now at https://t.co/GCdiMzk1Dl via Expert Mode / Instant Mode. API is updated & available today!
📄 Tech Report: https://t.co/drlDrxkYtp
🤗 Open Weights: https://t.co/T13Y8i7SDM
1/n
Introducing GPT-5.5
A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done.
Now available in ChatGPT and Codex.
✨ Announcing NanoClaw v2, in partnership with @vercel.
We completely rebuilt how NanoClaw agents communicate with the outside world. v2 brings agent-to-agent communication, human-in-the-loop approvals, support for 15 messaging platforms, and more.
A thread on what's new:
@eddiejiao_obj @drewocarr @LTXStudio @modal_labs All of this is live! It's early and slow. Many of the demos above are sped up or edited, but we can't wait to see what you think. Try it yourself at https://t.co/bcephqPu1c (5/5)
We're bringing the advisor strategy to the Claude Platform.
Pair Opus as an advisor with Sonnet or Haiku as an executor, and get near Opus-level intelligence in your agents at a fraction of the cost.
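The pairing described above can be pictured as a small wrapper in which a stronger model supplies guidance and a cheaper model does the work. The sketch below is only an illustration of that general pattern: it does not use the Claude Platform API, the announcement does not describe how the feature is wired, and every name here (`AdvisedAgent`, `ModelFn`, the stub functions) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in for "send a prompt to a model, get text back".
ModelFn = Callable[[str], str]


@dataclass
class AdvisedAgent:
    advisor: ModelFn   # stronger, more expensive model (assumed role)
    executor: ModelFn  # cheaper, faster model that produces the final answer
    advisor_calls: int = 0
    executor_calls: int = 0

    def run(self, task: str) -> str:
        # Ask the advisor once for short guidance on the task.
        self.advisor_calls += 1
        guidance = self.advisor(f"Give brief guidance for this task:\n{task}")
        # Hand the task plus the guidance to the executor.
        self.executor_calls += 1
        return self.executor(f"Task:\n{task}\n\nAdvisor guidance:\n{guidance}")


# Stub models so the sketch runs without any network access.
def stub_advisor(prompt: str) -> str:
    return "check edge cases"


def stub_executor(prompt: str) -> str:
    return "result following -> " + prompt.splitlines()[-1]


agent = AdvisedAgent(advisor=stub_advisor, executor=stub_executor)
print(agent.run("Summarize the release notes"))
```

The call counters make the cost trade-off visible: the expensive model is consulted once per task while the cheaper one carries the rest.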
Conversations tend to go better with a face and a voice. That’s why we’re thrilled to release the beta version of the first video chat skill for ANY agent, powered by our new real-time model, PikaStream1.0.
The skill preserves memory and personality, and enables real-time adaptability. And if you use it with your Pika AI Self, they’ll be able to execute agentic tasks during the call 💅
Scientists at Cortical Labs have trained lab-grown human brain cells to play the video game Doom. Their system, CL1, uses about 200,000 neurons grown on a microchip that receive game signals and learn through feedback.