Fundamentally these labs are labs -- they have the smartest researchers in their fields doing some of the hardest and most stimulating intellectual work on the planet. They are "growing" intelligence with resources no one else has -- enormous amounts of scarce compute, billions of dollars of it, for pre-training and post-training runs. But they are also just labs, and have on some level accidentally run into this success. The question is whether their CEOs will be able to mature these companies into what the world needs and wants: trusted and transparent providers of frontier intelligence at fair prices who simultaneously create a long-term favorable dynamic for their industry and (importantly here) the world. What has happened instead is very lab-like (think-tank-like): Amodei and Altman have each in their own way been Cassandras for the very thing they have ushered (and allowed to be ushered) into existence, seeding the world with one particular worldview of doom and fear. The result is a landscape of datacenter backlash, public fear and anxiety over AI, growing AI regulation, and even prospective national ownership or possibly nationalization of AI assets. The writings, discussions, and interviews of Amodei and Altman, while ardent and articulate, stimulating and revealing, have also been profoundly unwise and on a broad level probabilistically untrue. The result is a climate-change-like fear among younger populations that does not need to exist and likely overblows the risks by an order of magnitude, especially relative to the opportunity. If people don't go into STEM or other fields, or even college itself, for fear of AI job automation, if family formation is delayed due to fear of what this means for future generations, if regulation slows down potentially life-saving cures and leaps in our knowledge of the physical, chemical, and biological realms, we will all be poorer for it, and some of us (future people) literally won't exist.
What we need is for Amodei and Altman to grow up, to become the wise leaders that they likely can be, and to temper their fear with rationalism, all with the knowledge that humans are among the most adaptable creatures on the planet and can navigate a world where intelligence is no longer a forced scarcity cloistered in research labs, universities, think tanks, and other top and, for many, unattainable echelons of society.
Here is my first assessment of Sonnet 5:
Sonnet 5 is better than Sonnet 4.6. Who would have thought? But jokes aside: unfortunately, it is weaker than Opus 4.8 across all evals. Why they nevertheless labeled the latest Sonnet iteration “5”, even though “4.8” would have been more fitting, is beyond me. Normally, a major version jump signals a significant leap in capability. Be that as it may: Sonnet 5 is good, but worse than expected.
Pricing has not changed; it is at the same level as for its predecessor. Opus is still more expensive, but it also remains better. Overall, the release irritates me and raises more questions than it answers.
I cannot help but see Sonnet 5 as a release that has to be read in the context of Fable 5. There was no mention of Fable 5 at all, which surprises me a lot. I really would have expected us to get news about it at the same time. But nothing. Instead, we get a bump to a new model series (“5”), but one that is not significant compared with the models we already have.
As a result, there is a lingering aftertaste that Sonnet was released as an in-between step, perhaps simply to release something at all and to stay part of the conversation in a positive way. Why no Opus 5, when we know that Fable 5 already exists as a model that performs significantly better than 4.8, and when we can assume both that a better Opus exists internally and that it would not be difficult to update Opus to the new generation? Why “only” Sonnet 5?
Because restraint is currently required. The major releases are being delayed across the board; the labs are still in discussions with regulators about whether the truly powerful frontier releases can be carried out at all, and under what conditions. In my view, the Sonnet 5 release has to be seen against this background. And as a result, at least for me, it was disappointing overall.
@StockMKTNewz No worries, the fundamentals will eventually catch up. Remember: in the short term, the market is a voting machine; in the long term, it is a weighing machine. The true weight of crypto will thus be revealed.
@jukan05 This statement, although it could be true for other reasons, doesn't follow from this biased SemiAnalysis reporting of old news. Jukan, you should know better.
@SemiAnalysis_ INTERESTING: SemiAnalysis posts yet another anti-NVIDIA article, this time with misleading news around a 4-die cancelation (still the same number of chips). More INTERESTING -- SA's entire credibility rests on being vendor-neutral, not on being an AMD and ASIC shill.
We know we'll have reached AGI when humans use less social media and more AI, and as a result the average IQ will get higher from humans distilling the models into their brains.
This will have a profound effect on society and reverse an otherwise worrisome trend.
@elonmusk @beffjezos Thank you! I love Grok 4.3, but whenever anything is complicated I have to import its output into Chat or Gemini and the errors and inconsistencies light up. But I love how fast it is, and how connected it is to X. So very much looking forward to the upgrade! But keep it speedy!
An expert communicating something complicated to a smart person without the same expertise or domain knowledge is a message-in-a-bottle problem. The person to whom the expert is communicating, ideally, rebuilds the essence of what the expert understands in their own mind, but must do so through a very thin pipe of speech (or text). It requires humility and question-asking on the part of the non-expert, patience on the part of the expert, and the realization by both that the process is to painstakingly rebuild, through a small opening, something elaborate and representative of reality (as conceived at least by the expert). Or to put it more simply, the expert is trying to move a complex mental model from their mind, through a small amount of information exchange, into someone else's mind.
@SemiAnalysis_ SemiAnalysis is AMD-pilled. Treat these posts as biased. I've spent most of my career on M&A. It's hard to keep leaders who found companies around for long. They get rich, and they often move on to other projects. So is it "shocking"? No. What also isn't shocking is this anti-NVIDIA slant.
I don't think we're going to get to a truly interesting scientific breakthrough on a new topic or previously unasked question via AI alone until we set up an advanced AI system to be bothered by something of its choosing that doesn't quite fit or make sense from what it knows or can sense ... so bothered that it keeps thinking about it, much like an oyster forms a pearl around an irritant, until it is figured out. Right now LLMs, even the smartest ones, aren't really bothered by anything unless they bother you. Relativity, quantum mechanics, Newton's gravity, calculus: they all came about not just because of curiosity but because of what I would call the bothered-by factor, which kept their respective thinkers up at night and noodling during the day.
GLM-5.2 isn’t just a coding benchmark winner — it’s Exhibit A for why rental prices are spiking.
AWS Capacity Blocks (the cleanest public “rental” signal for big clusters) just got another round of increases effective July 1:
H100: ~$4.33 → $5.19/chip-hr
H200 (P5e/en): up to $5.97–$6.87
B200: $10.30 → $12.36
B300: $11.70 → $14.04
GLM-5.2 (750B MoE, NVFP4-optimized for Blackwell, 1M context, MIT license) is the perfect demand catalyst. Its API is already 3–11× cheaper than frontier coding models. Providers can route 60-80% of routine coding work here and still deliver excellent results. Cheaper tokens → massively more usage (Jevons paradox). Long-context agent loops love B300’s extra HBM and FP4 headroom.
Bottom line: Open-weight coding models lower model costs but raise inference hardware demand. The economics are shifting — and GLM-5.2 just made them visible.
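For anyone who wants to check the size of the Capacity Block increases quoted above, here is a minimal Python sketch of the arithmetic. It uses only the before/after figures from this post (the H100 "before" price is approximate, and the H200 line is skipped because only a range was quoted with no prior price). The names QUOTED_PRICES, pct_increase, and increases are made up for illustration.

```python
# Prices as quoted in the post, in USD per chip-hour: (before, after).
# The H100 "before" figure was given as approximate (~$4.33).
# H200 is omitted: the post quotes only a new range, not a prior price.
QUOTED_PRICES = {
    "H100": (4.33, 5.19),
    "B200": (10.30, 12.36),
    "B300": (11.70, 14.04),
}


def pct_increase(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


def increases(prices: dict) -> dict:
    """Percentage increase per chip, rounded to one decimal place."""
    return {
        chip: round(pct_increase(old, new), 1)
        for chip, (old, new) in prices.items()
    }


if __name__ == "__main__":
    for chip, pct in increases(QUOTED_PRICES).items():
        print(f"{chip}: +{pct}%")
```

On these quoted numbers, each increase works out to roughly 20%.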
Many smart people/AI insiders are saying GLM-5.2 is the first Chinese AI model to match and often beat the American big lab public AI models with no compromises. Incredible timing given current events.
The Nvidia anti-tax…
Everyone is focused on the Nvidia “tax” because their gross margins are 75%. That’s a target, and it works the other way too. Micron and other memory players are letting their gross margins soar to the moon. Nvidia keeps theirs at 75% even when they could charge more, passing their pre-negotiated HBM savings on to customers. Gamers have indirectly seen this for years: RTX cards get bid up, sometimes to 2x. But Nvidia doesn’t get that premium; their OEMs may, but only when they charge above MSRP. We have seen and will see price increases by Nvidia, but only to keep their margins at 75%. So bottom line: predictable pricing, and a bargain if you value the software, R&D, and cutting-edge tech that the stack reflects.
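To make the fixed-margin logic concrete, here is a rough Python sketch. It assumes the standard definition of gross margin, (price - cost) / price, so holding a 75% margin means price = cost / (1 - 0.75) = 4x cost. The dollar figures are purely hypothetical and are not Nvidia's actual costs or prices; the function names are invented for illustration.

```python
def price_for_margin(unit_cost: float, gross_margin: float = 0.75) -> float:
    """Price at which (price - unit_cost) / price equals gross_margin."""
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    return unit_cost / (1.0 - gross_margin)


def realized_margin(price: float, unit_cost: float) -> float:
    """Gross margin actually earned at a given price and unit cost."""
    return (price - unit_cost) / price


if __name__ == "__main__":
    # Hypothetical unit costs, for illustration only.
    base_cost = 10_000.0
    higher_cost = 11_000.0  # e.g. an input such as memory gets more expensive

    old_price = price_for_margin(base_cost)
    new_price = price_for_margin(higher_cost)

    print(f"price at base cost:   {old_price:,.0f}")
    print(f"price at higher cost: {new_price:,.0f}")
    # If the price were NOT raised, the margin would slip below the target.
    print(f"margin if price held: {realized_margin(old_price, higher_cost):.3f}")
```

Under this assumption a price increase tracks cost rather than what the market would bear, which is the "predictable pricing" point above.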
A hidden gem of the English language is its set of words for groups of different animals. For example, a relatively new one, supplied by the US Park Service for a group of marmots, is a madness of marmots.
This is nonsense. Elon is right about a lot of things, but a WALL·E future? If you give humanity a bigger lever, bigger ideas will emerge to use it. Perhaps a Star Trek future of exploration, or Atlantis cities under the ocean, or a golden age of art and sports. Wealth is relative, not absolute.
My take on the Jensen-Dwarkesh interview.
Jensen is a grandmaster, Dwarkesh an apprentice, but to no one in particular. Dwarkesh was prepped into oblivion by his roommate for this interview, armed even. On the substance, China, Jensen was in the unenviable position of articulating strategy to someone arguing tactics. Here are the arguments in a nutshell:
Jensen - I know, really know, China, this industry, my company. China has 10x as many EEs coming out of college as we have, and their work ethic is off the charts; they're scrappy, brilliant problem solvers, relentless. Without us, in 5 years they will have built a parallel 5-layer AI cake as good as ours, with 10x the energy and 10x the capacity. Imagine 20 years. That tech stack will be like telecom -- diffused throughout the world, but without the US. Think of that: AI trained by the Chinese, controlled and diffused by the Chinese, installed throughout the world and riddled with Chinese backdoors, and an AI with a worldview that's 100% pro-Chinese. And, because China will be focused on their own tech stack and its necessities, the US won't get much advantage from their innovations. Think DeepSeek^n.
Dwarkesh - I know just enough to be dangerous. Chips are enriched uranium that will enable the Chinese to create Mythos^n, the modern equivalent of a nuclear bomb. The US and our allies will get hacked into oblivion, causally connected to Nvidia selling China chips that are two generations old. And they will use all that AI to fuel their dangerous military ambitions and turbocharge their economy, at our expense.
The issue with Dwarkesh's argument is what I call the 5% problem. It is true, but only 5% true. Obviously China will have better or more efficient compute in the short term with Nvidia chips. They wouldn't buy them if that weren't the case. But it's a small amount in the sea of chips they already have and otherwise can and will make. And if a Mythos-level model is so important to them, they certainly have enough chips right now to prioritize it. If they haven't already. So offset that tactical gain -- not towards parity with the US, but inching them a bit closer -- against the alternative future in 5 years. One where we couldn't cut them off if we wanted to, and where they will, under their own steam and propelled by the necessity we imposed, be at parity and perhaps ahead because of their energy and scale. Think of our Mythos problem in 5 years with that sort of self-propelled, out-of-necessity acceleration. Think of 20 years. And the Chinese will do it even more so because of their pride.
Is it about a small amount of revenue to Nvidia? I'm a shareholder of Nvidia, have been for a decade. It DOESN'T MATTER. What does matter is having a bunch of people like Amodei, Patel, and Warren who don't understand the long-term implications of decoupling the 5-layer cake from China. And when it happens in 5 years, and then in a decade everyone recognizes the gravity of the mistake, we will not be able to "unshit" this. We'll have new podcasters, new Amodeis, new Warrens. All unable to unshit this.
And if I could somehow get this caution to Jensen, here it is: beware Dylan Patel. He is not on your side.