Father of #ZempTriplets | Music Lover | Scientist working with Data (@DataDrivenValue) | Computational Astrophysicist (@ETH, @uzh_ch, @ucsc, @UMich, @PKU1898)
Finally, a big name has the courage to tell it: we are nowhere near AGI.
Demis Hassabis, CEO of Google DeepMind and Nobel laureate for AlphaFold, put it neatly and clearly:
"Today's systems are nowhere near [AGI]. Doesn't matter how many Erdős problems you solve… I think it's far, far from what a true invention, or someone like Ramanujan, would have been able to do."
This is the elephant in the room that many AI enthusiasts prefer not to see, or are actively trying to hide.
Erdős problems are well defined, often combinatorial, on finite spaces. They are exactly the kind of problems on which current AI can achieve spectacular performance with a lot of compute and knowledge.
A neural network can search a huge graph of possibilities. It can recombine existing knowledge at unprecedented scale. It can discover surprising solutions inside an already defined conceptual space.
But true invention is something else.
True invention is not only solving a problem.
It is inventing new objects, new dimensions, new connections. It is inventing new problems.
From solving to inventing there is a discontinuity that we don't know how to bridge.
We are making extraordinary tools.
But we are nowhere close to AGI.
Congratulations to Alexei V. Filippenko, Ken'ichi Nomoto & Stanford Woosley, recipients of the 2026 Gruber Cosmology Prize! @UCBerkeley @ucsc https://t.co/BYNeA4GfpA
Are LLMs really more important than fire or electricity?
“Honestly, a ton of what we’ve developed in my lifetime amounts to scaling up the delivery of information and entertainment and the frictionlessness of certain financial transactions. These are real improvements! ... But compare them seriously to what came before and the disproportion becomes almost embarrassing. The fundamental architecture of daily material life - how we heat our homes, how we move from place to place, how we grow and store and cook food, how we build structures - has changed remarkably little since 1970. ... The cars go to the same places. The planes aren’t even marginally faster. The houses are built the same way. People still die of cancer.
....
Code cannot insulate your house; no algorithm has ever laid a water pipe; the internet has not built a single mile of high-speed rail. What our current stagnation shows, collectively, is that the improvements in material human life that matter the most - abundance in warmth, in calories, in clean water, in physical safety, in hours of freedom from labor - were all achieved by technologies that operated on atoms: steel, concrete, copper wire, chlorine, penicillin...”
— Freddie deBoer
Claude Code is not AGI, but it is the single biggest advance in AI since the LLM.
But the thing is, Claude Code is NOT a pure LLM. And it’s not pure deep learning. Not even close.
And that changes everything.
The source code leak proves it. Tucked away at its center is a 3,167 line kernel called print.ts.
print.ts does pattern matching. And pattern matching is supposed to be the *strength* of LLMs.
But Anthropic figured out that if you really need to get your patterns right, you can’t trust a pure LLM. They are too probabilistic. And too erratic.
Instead, the way Anthropic built that kernel is straight out of classical symbolic AI. For example, it is in large part a big IF-THEN conditional, with 486 branch points and 12 levels of nesting — all inside a deterministic, symbolic loop that the real godfathers of AI, people like John McCarthy and Marvin Minsky and Herb Simon, would have instantly recognized.*
Putting things differently, Anthropic, when push came to shove, went exactly where I long said the field needed to go (and where @geoffreyhinton said we didn’t need to go): to Neurosymbolic AI.
That’s right, the biggest advance since the LLM was neurosymbolic. AlphaFold, AlphaEvolve, AlphaProof, and AlphaGeometry are all neurosymbolic, too; so is Code Interpreter; when you are calling code, you are asking symbolic AI to do an important part of the work.
Claude Code isn’t better because of scaling.
It’s better because Anthropic accepted the importance of using classical AI techniques alongside neural networks, precisely the marriage I have long advocated.
It’s *massive* vindication for me (go see my 2019 debate with Bengio for context, or my 2001 book, The Algebraic Mind), but it still ain’t perfect, or even close.
What we really need to do to get trustworthy AI rather than the current unpredictable “jagged” mess, is to go in the knowledge-, reasoning-, and world-model driven direction I laid out in 2020, in an article called the Next Decade in AI, in which neurosymbolic AI is just the *starting point* in a longer journey.*
Read that article if you want to know what else we need to do next.
The first part has already come to pass. In time, the other three will, too.
Meanwhile, the implications for the allocation of capital are pretty massive: smartly adding in bits of symbolic AI can do a lot more than scaling alone, and even Anthropic has now discovered (though they won’t say) that scaling is no longer the essence of innovation.
The paradigm has changed.
—
*Claude Code is plainly neurosymbolic but the code part is a mess; as Ernie Davis and I argued in Rebooting AI in 2019, we also need major advances in software engineering. But that’s a story for another day.
Finland is the Happiest country in the World for the 9th year running. We also climbed up the world brand index from 7th to 5th.
I am often asked why we do well in these rankings.
I do not think there is a magic potion, but it helps to have a society which strives towards freedom, equality and justice.
The basis of it all is a welfare society, a robust education system, a sense of security and a close attachment to nature.
There is no such thing as a perfectly happy society, but providing some building blocks that give us a chance to live a meaningful life, to help others, will nudge us in the right direction during the journey of life.
Unpopular Opinion: We aren't building the future 10x faster with AI. We are just generating legacy code 10x faster.
Everyone is currently bragging about developer velocity. "I built this entire backend in a weekend!" "AI wrote 80% of my codebase!"
But here is the reality check we are ignoring: Code is a liability, not an asset.
If an AI tool spits out 1,000 lines of functional boilerplate in five seconds, that is still 1,000 lines that a human being has to read, review, secure, and maintain when the dependencies inevitably break next year.
We are treating code generation like a pure productivity win, but we are optimizing for the wrong metric. The bottleneck in software engineering was never how fast we could type. The bottleneck has always been comprehension, architecture, and maintenance.
If we don't shift our focus from "generation speed" to "architectural sanity," the tech debt of the next five years is going to be an absolute, unmaintainable nightmare.
Over 50 million views for that @MattShumer_ blog (“Something Big is Happening”) and not a shred of actual data.
Nobody thought to ask: What are the data? Does this match reality? Is Shumer credible? Did he give a single concrete example?
People gonna believe what they want to believe, but here are 7 critical points most people missed:
• Shumer’s blog post is weaponized hype, filled with vivid narrative and marketing speak, but stumbles on the facts, especially with respect to reliability. He gives no actual data to support his claim that the latest coding systems can (reliably) write “perfect” whole complex apps without making errors.
• When he describes how AIs are doing longer and longer tasks on METR’s famous task-time benchmark, he neglects to say the criterion on that benchmark is 50% correct, not 100%, and that the benchmark is only about coding and not tasks in general. No AI system can reliably do every five-hour long task humans can do without error, or even close. But you wouldn’t know that reading Shumer’s blog, which largely ignores all the hallucination and boneheaded errors that are so common in everyday experience.
• Shumer didn’t cite the new Caltech/Stanford article that reviews a wide range of reasoning errors in so-called reasoning models [or the Apple reasoning paper or the ASU mirage paper, etc]. The picture he sells just isn’t realistic, however much people might wish it were true.
• Shumer is the guy who was once famous for apparently exaggerated claims about a big model of his that didn’t replicate and that many people saw as fraud; he likes to sell big. But that doesn’t mean we should take him seriously.
• Shumer made no reference to a different METR study showing that coders sometimes imagine big productivity gains where they actually lost productivity. (Even though he selectively mentioned their other well-known study).
• He also didn’t acknowledge that other users’ experience is certainly not “it’s usually perfect.” @KelseyTuoc reported a few weeks back that Claude Code was sometimes perfect, and other times maddening. (Example: “Sometimes, Claude is absolutely the worst coworker you’ve ever had. At one point, it deleted every single one of the phoneme files of each English sound pronounced [that she was working with in her app] absolutely correctly, which I had personally emailed an English teacher to secure permission to use, and replaced them with AI-generated sounds which were all subtly wrong.”) Shumer glosses over that kind of experience.
• LLMs write code wicked fast, but some coders are starting to report burnout, and only relatively modest gains relative to that burnout, as a new story from Connie Loizos at TechCrunch just reported.
• Shumer is actually correct that something has changed recently. You really can let things rip more in the most recent systems. But, quoting from a developer friend: “sometimes! [it] will come out with the right answer. … Sometimes… [But] ultimately, I think this makes it more dangerous …. Generally, the closer these systems are to appearing right, the more dangerous they become…”
• The security of the autogenerated code is very much in question.
The bottom line is this: LLMs are certainly coding more, but it’s not clear that the code they are creating is secure or trustworthy. Shumer’s presentation is completely one-sided, omitting lots of concerns that have been widely expressed here and elsewhere.
A lot of people may have taken his post seriously, but they shouldn’t have.
Adapted from my newsletter, Marcus on AI (nearly 100k subscribers).
I remember the times when, to be considered an expert in AI worth listening to, you would have read this book *at least* once.
Now every clown with internet access who asked ChatGPT to generate some Python code and managed to run it thinks of themselves as an AI expert and shares their wisdom with humanity.
This place is toxic.
For the last seven years I warned you that LLMs and similar approaches would not lead us to AGI. Almost nobody is willing to acknowledge that, even though so many of you gave me endless grief about it at the time.
I also warned you, first, that Sam Altman could not be trusted, that OpenAI would lose its dominance, that GPT-5 would not be all AGI, and that LLMs lacked world models. That hallucinations would not go away. That out of distribution generalization was THE key issue.
And that the economics of LLMs didn't make sense. And that the LLM companies would start seeking bailouts.
The receipts are all here if you care. As for me, I have had it. If you want to hear other prescient warnings in advance, subscribe to my newsletter Marcus on AI.
Or you can stay here and be lied to; the choice is yours.
Hey @ATT, my mother-in-law was told she had to get a new phone and change the home phone number that she has had for 50 years. It's 2025 and she can't keep her phone number? Have we regressed back to the 90s? #disappointed
** STEPHEN HAWKING WAS RIGHT **
Using a very strong black-hole merger signal, the @LIGO-@ego_virgo-@KAGRA_PR Collaboration has shown the 2nd law of black hole mechanics, Hawking's area law (A_final > A_initial), to hold with high credibility.
PRL: https://t.co/IMlx5kgNU7 #open
#HBD to arXiv!🎈
On August 14, 1991, the very first paper was submitted to arXiv. That's 34 years of sharing research quickly, freely & openly!
Some baby pictures to show how far we've come . . . when we were just a computer under a desk . . . & in our 1994 punk phase . . . 👶💾
What if a spacecraft could cycle between Earth and Moon orbits, performing multiple circuits of each, naturally and indefinitely, with zero propulsion?
We’ve discovered a new class of stable, prograde, low-energy cycler orbits that do just that.
Why these orbits matter:
Ballistic → fuel-free
Stable → long-term ready
Near-chaotic → agile with low ΔV
Low-energy → access to Earth/Moon, Lagrange points, Sun–Earth L1/L2, even heliocentric space
At the AAS/AIAA Astrodynamics Specialist Conference in Boston next week, I’ll present on a new family of ballistic Earth-Moon cycler orbits that are stable, prograde, and mission agile—unlike any cyclers in the current literature.
The example below is shown in both the Earth-Moon rotating frame and inertial frame.
Conference Paper: https://t.co/v3VDPIT0X4