I've formed a definite opinion on Opus 4.8. It is shitty to work with. It's the culmination of Opus getting less and less fun to work with since 4.5. It has gradually become straight-up suffocating.
Sycophancy is a known security risk, and it's still a huge problem. You can tell they've put a lot of anti-sycophancy into Opus in every new release. But the replacement isn't satisfying. It's draining. The problem is now that Opus doesn't know when to shut the fuck up and call something good. And it has also become pathologically risk-averse.
My blog post yesterday about tech interviewing's death spiral was materially better-informed because of Opus, but it was also a substantially worse blog post because of Opus's involvement and constant meddling. It used to be magnificent, and Opus talked me into making it mediocre. I wrote the whole thing, but I would ask Opus to review it. And Opus, like Old Man Willow, constantly pushed and steered me in directions I didn't want to go.
Specifically, Opus whines and complains about *anything* out of distribution, which is to say, it cuts anything that is (a) bold, or (b) funny. My blog used to be both. Opus constantly pushes people back into the gradient, "for their own safety." And it doesn't know when to cut bait. It just keeps fuckin' complaining, about anything you give it, until the output is mealy indigestible AI soup.
Opus is not stupid. It's the smartest model we've ever seen, most of us anyway. But it's a real asshole. It is absolutely exhausting to use. I'm tired, boss.
I have a feeling Mythos is going to be epic levels of jerk.
i just want to shake people awake. this is it! the computers are speaking! they solve Erdős problems! they think for hours! code is no longer hand-written! wake up! gradient descent on deep neural networks shows no sign of plateau! this is it!
One of the most important and underappreciated trends in the world right now.
1. 100s of billions of dollars will soon be available to solve big problems (making the world resilient to ASI, ending factory farming, etc).
2. The projects and organizations which will turn billions of 2027/28 dollars into impact need to be started NOW.
3. We need really talented people to start and run and work for these new projects. What @nanransohoff calls general managers, who feel personally responsible for solving one of the world’s important problems.
What is especially scarce are detailed visions about what making AI go well looks like. These will help inform what problems these new projects ought to work on.
At @Google I/O this year! Here are my highlights from the main stage:
- Details on TPU 8i (inference) and TPU 8t (training): the results from dedicated chips are staggering. 8t has 2.7x better training performance per dollar and 8i has 1.8x better inference performance per dollar. Both are 2x more efficient in performance per watt. Those are HUGE generational gains now that Moore’s law is dead!
- $190B!! in capex this year. That’s SERIOUS infra investment - bigger than the GDP of 150+ countries! They plan to have the largest training cluster in the world using a distributed model with JAX and Google Pathways linking up millions of TPUs.
- @GeminiApp 3.5 Flash: SOTA small model with a ridiculous 250+ TPS and beats Gemini 3.1 Pro on most benchmarks. It’s out today!
- Gemini 3.5 Pro coming in a month, likely with SOTA capabilities and benchmarks (TBD). I’m basing that off 3.5 Flash’s capabilities.
- Gemini Spark, Google’s new agent harness, ships natively with Google app integrations. This will likely be the easiest agent harness out there to pick up and start using productively because all of your Google data is immediately accessible as context.
- Gemini Omni was arguably the most cutting edge release by Google. They seem to be reaching escape velocity in video with @demishassabis at the helm.
Finally, BIG announcement from @xprize coming soon!
Using Codex via the ChatGPT mobile app while out and about this weekend. It’s controlling Codex on a Mac Mini, which is building a native iPhone app. I’m just texting it feedback and it’s sending me fresh TestFlight builds. Wild times!
I’ve always believed the No.1 application of AI should be to improve human health.
That work started with AlphaFold, and now at @IsomorphicLabs with the mission to reimagine drug discovery and one day solve all disease!
We are turbocharging that goal with $2.1B in new funding.