BREAKING:
Anthropic just dropped Opus 4.8—and it is a MONSTER
We've been testing it for about a week @every, and our verdict is that they could've just called it Opus 5. It's that good.
Here's our vibe check:
- Beats GPT-5.5 on Senior Engineer bench. On our toughest benchmark Opus 4.8 scores a 63—a hair higher than GPT-5.5's score of 62, and a full 30 points higher than Opus 4.7. It tackled a ground-up rewrite of a production codebase, and actually built something that works.
HOWEVER: Coding performance varied a lot at different reasoning levels. We recommend using it on xhigh for best results.
- Incredibly good writer. Opus 4.8 scored a 79.6 on our writing benchmark—measuring models on real-world writing tasks we do all of the time like essay writing, promo email writing, and more. It beats GPT-5.5 by 6 points. It produces well-written prose with fewer "AI-isms". It's also very good at writing in your voice given the right context.
HOWEVER: Writing performance also varied with reasoning levels. Medium reasoning had a higher incidence of AI-isms; we found the best results with high.
- Beast at knowledge work. Opus 4.8 is very good at general knowledge work tasks like report creation, research and more. It produced the best PowerPoint one-shot we've ever seen on our deck generation benchmark.
- Emotionally intelligent, willing to question the frame. I've also found it to be quite good at talking through psychological or interpersonal issues. It has a high EQ, and it's also good at not glazing and helping to expand your perspective. Its thought process feels extremely rich and dynamic.
THE BAD:
These days a model is only as good as its harness, and Codex is still a far superior harness to the Claude Desktop app. This has kept me using Codex + GPT-5.5 as my daily driver, but I am flipping back and forth a lot more between Codex and Claude.
Anthropic is back, baby!
Read the rest on @every:
https://t.co/vuORiDXkxX
The most interesting dynamic in AI right now:
Anthropic just rented the entire compute capacity of SpaceX / xAI Colossus 1.
220,000+ NVIDIA GPUs. 300+ megawatts. Online within the month.
Cursor also has a SpaceX / xAI compute deal... and SpaceX reportedly has a $60B option to acquire it.
What changes for Anthropic today:
> Claude Code limits doubled
> Peak-hour restrictions removed
> Opus API limits raised
Some of the companies competing with Elon are now running on Elon's supercomputer.
Wild.
A humanoid robot just hit 10.1 m/s, near Usain Bolt speed, and nobody in AI engineering is asking why it matters.
Unitree H1. 62 kg. 0.8 meter legs. Two years ago the fastest humanoid could barely jog at a fraction of this speed.
The locomotion stack behind this: reinforcement learning, sim-to-real transfer, model-predictive control.
You can't test a sprinting robot the way you test a software agent.
→ no synthetic query suites
→ no binary pass-fail on text outputs
→ physical AI evaluation uses randomized initial conditions, safety violation rate tracking, and task success distributions across thousands of simulated runs before one real-world test
You evaluate software agents on accuracy. You evaluate physical agents on whether they injure someone during a failed gait cycle.
Completely different engineering discipline. COMPLETELY different evaluation stack.
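A rough sketch of what that evaluation loop can look like, in Python. Everything here is an assumption for illustration: `simulate_run` is a made-up toy stand-in for a real physics simulator rollout, and the friction, push, and lean ranges and the stability formula are invented numbers, not anything from Unitree's stack. The point is only the shape: randomize the initial conditions, run thousands of rollouts, then report a safety violation rate and a distribution of outcomes instead of one pass-fail.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class RunResult:
    success: bool            # robot finished the 100 m course
    safety_violation: bool   # hard fall (toy definition, see below)
    distance_m: float        # distance covered before stopping


def simulate_run(rng: random.Random) -> RunResult:
    """Toy stand-in for one simulator rollout (hypothetical, not a real API)."""
    # Randomized initial conditions for this rollout.
    friction = rng.uniform(0.4, 1.0)   # ground friction coefficient
    push_n = rng.uniform(0.0, 150.0)   # sideways push disturbance, newtons
    lean_rad = rng.gauss(0.0, 0.05)    # initial torso lean, radians

    # Invented stability score: low friction, hard pushes, and lean all hurt.
    stability = friction - push_n / 300.0 - abs(lean_rad) * 4.0

    success = stability >= 0.2
    safety_violation = stability < 0.0  # treat a hard fall as a violation
    if success:
        distance = 100.0
    else:
        # Failed runs cover a fraction of the course, clamped to [0, 100].
        distance = max(0.0, min(100.0, 100.0 * stability / 0.2))
    return RunResult(success, safety_violation, distance)


def evaluate(n_runs: int, seed: int) -> dict:
    """Run n_runs randomized rollouts and summarize the outcome distribution."""
    rng = random.Random(seed)  # seeded so the evaluation is reproducible
    runs = [simulate_run(rng) for _ in range(n_runs)]
    distances = [r.distance_m for r in runs]
    deciles = statistics.quantiles(distances, n=10)  # 9 cut points
    return {
        "success_rate": sum(r.success for r in runs) / n_runs,
        "safety_violation_rate": sum(r.safety_violation for r in runs) / n_runs,
        "distance_p10": deciles[0],
        "distance_p50": deciles[4],
        "distance_p90": deciles[8],
    }


if __name__ == "__main__":
    for name, value in evaluate(n_runs=5000, seed=0).items():
        print(f"{name}: {value:.3f}")
```

In a real pipeline the gate before the first hardware test would likely be a threshold on `safety_violation_rate`, not on average success, since a policy that usually works but occasionally falls hard is the one that hurts someone.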
Every year, U.S. News ranks the world’s countries by quality of life. Not by GDP. By the actual, lived experience of being a human being inside your borders. Job market. Affordability. Safety. Healthcare. Education. Income equality. Political stability.
The results are in:
🇩🇰 Denmark topped the list. A country of six million people, more pigs than citizens, and a deeply held national philosophy that nobody should have too much and nobody should have too little. It works. Demonstrably, measurably, infuriatingly well.
The Nordics dominate the top ten like they’ve been doing it for centuries, which is essentially true. 🇸🇪 Sweden at 2 offers parental leave so generous it makes British HR departments weep. 🇳🇴 Norway at 4 sits on a sovereign wealth fund so large it could buy most of Wall Street and still have change left for the fjords. 🇫🇮 Finland at 6 runs the world’s best school system by telling children to play outside instead of memorizing test answers. 🇳🇱 The Netherlands at 9 built a cycling infrastructure so good that the car feels like a lifestyle choice rather than a necessity.
🇨🇭 Switzerland at 3 is simply cheating. Highest median wages in the world. A political system so stable it makes the Vatican look impulsive. Healthcare that functions. Four national languages spoken without anyone declaring a culture war. And the Alps, just sitting there, being magnificent.
🇨🇦 Canada at 5 is what happens when you take North American scale and add functioning public services.
🇦🇺 Australia at 8 adds sunshine and one of the best-funded pension systems on earth. 🇳🇿 New Zealand at 10 adds the kind of landscapes that make grown adults cry on planes, plus a government that has quietly become a global model for actually governing.
🇩🇪 Germany at 7 built the strongest industrial economy in Europe and then wrapped it in a social safety net so comprehensive that losing your job feels more like an inconvenience than a catastrophe. 🇮🇪 Ireland at 15 went from economic basket case to European tech hub in thirty years. 🇯🇵 Japan at 14 has cities where you can leave your wallet on a park bench and come back to find it exactly where you left it, with an apology note from anyone who accidentally touched it.
And then there is 🇺🇸 the United States. 22nd. A country with 813 billionaires, highways wide enough to land a small aircraft, and meals so large they arrive at the table like a geographical feature.
Behind the US, at 23, sits 🇸🇬 Singapore. Tiny, ruthlessly efficient, with an education system that tops global rankings and a port that moves more cargo than most continents.
At 24, 🇵🇱 Poland, which has quietly built one of the most resilient economies in Central Europe after decades of pulling itself up from genuine ruin. And at 25, 🇰🇷 South Korea, which went from war-devastated poverty to semiconductor superpower in a single generation, and still found time to invent some of the best cinema, music and skincare on earth.
🇩🇰 Denmark wins. Again.
Gandalv / @Microinteracti1
The weather's looking good for tomorrow's Artemis II launch, and our teams are getting the rocket ready for liftoff!
Read the latest updates on our mission around the Moon: https://t.co/doIjUqa1cx
Today we're introducing TRIBE v2 (Trimodal Brain Encoder), a foundation model trained to predict how the human brain responds to almost any sight or sound.
Building on our Algonauts 2025 award-winning architecture, TRIBE v2 draws on 500+ hours of fMRI recordings from 700+ people to create a digital twin of neural activity and enable zero-shot predictions for new subjects, languages, and tasks.
Try the demo and learn more here: https://t.co/VkMd1YpQWI
Holy smokes... this guy recreated a God's eye view 4D replay of Operation Epic Fury.
Using only public data and an AI agent swarm. 🤯
This used to cost millions and a full dev team...