Since many of you guys want answers as much as I do, I will save myself a bit of time by putting down a few of the thoughts I've written about... here in one place.
I remember the days of ELIZA back in the 1960s - yes, I’m that old. It was the first chatbot, roughly 500 lines of code, laughably stupid by today’s standards yet utterly groundbreaking. Many users were convinced a real human therapist was behind the screen. This became known as the ELIZA effect: our well-documented human tendency to anthropomorphize machines and project real intelligence, empathy, or consciousness onto software.
That same effect explains why so many today are fooled by LLMs. Despite giga-scale weights and impressive outputs, LLMs are nothing more than a vastly more sophisticated ELIZA - statistical pattern-matchers, fancy random word generators.
I’ll be the first to admit I don’t know what consciousness is (no one seems to), and my shorthand of “awareness” is lame. But I know what it isn’t. LLMs are NOT conscious and never will be - unless your definition is loose enough to call a parrot intelligent simply because it can 'talk'. You can steer LLMs to say literally anything you want; that’s a feature of a machine: a countersign of sentience. End of story.
I work with frontier LLMs every damn day... all day long. They’re as dumb as a box of rocks. But they’re still incredibly useful tools - IF you throw a TON of engineering at them.
This is my opinion… and worth every cent you’re paying for it. 😊
A user asks:
"When you speak you just utter it word by word, the ever continuing stream of thought coming somewhere from your non-verbal brain.
Words are just an interface for the vast complexity of your brain.
How are you better than LLMs when speaking?"
Good question.
Perhaps it's true that when a certain complexity is reached, it becomes 'smart'.
Since we really don't know that much about intelligence or consciousness, who's to say?
My interest is a lot more down to earth: can I make LLMs do real work? Reliable work. After kicking them every day for the last few years I have come to these conclusions, not just by understanding how they work, but from experience with their work products:
1. They are as dumb as a box of rocks - not always or even usually, but often enough to make them useless out-of-the-box for serious, reliable work products.
2. I have to have a human-in-the-loop (me!) to avoid the accumulating chain errors from this:
0.9 ^ 5 = 0.59 (90% accuracy per step, compounded over five chained steps)
which computes out to not much better than flipping a coin. Nowhere near enough reliability to 'sell' - if one is honest.
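A quick sketch of that compounding arithmetic, assuming every step succeeds independently and with the same probability (real pipelines are rarely that tidy). The function name `chain_success` is just an illustrative choice.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that ALL steps in a chain succeed.

    Assumes each step succeeds independently with the same probability.
    """
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step ** steps


if __name__ == "__main__":
    # Show how a 90% per-step rate erodes as the chain gets longer.
    for steps in range(1, 6):
        print(f"{steps} step(s): {chain_success(0.9, steps):.3f}")
```

With 0.9 per step the overall rate falls below 0.6 by the fifth step, which is the point of the formula above.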
So these two factors - which are consequences of how these LLMs work internally - are pretty fatal to LLMs WITHOUT a TON of engineering to mitigate the consequences.
All the other noise out there about whether LLMs are truly 'intelligent' or 'can reason' or are 'alive' or are 'conscious' are philosophical questions IMO... which as much as I enjoy them, can't easily overcome the downsides...
I would agree with the 'It's Alive!' crowd more ... except that after working with these LLMs for 8,000 man-hours now, there are too many counter-indicators to the stance that LLMs can 'reason' etc.... at least from my work and my use case. And all LLM outcomes are deterministic... so that given enough time a person with paper and pencil can replicate the output of an LLM from its inputs.
So after 55 years, 1 million LOC, 12 software patents, 2 arXiv papers in AI/quantum, I don't think we are yet seeing emergent intelligence. Am I on the lookout for it? Yep.
User says:
"All the best programmers I know are starting to write code by hand again."
Seems to be a trend. The valley of disillusionment. Reality strikes back. The hard work begins. The realization that LLMs are a dead-end to AGI. All this coming together at the same time.
Still, I press ahead with my auto-coding tool... as it was designed from the ground up with these realizations:
1. Devs want model/lab agnostic coding platforms
2. Devs want desktop privacy
3. Devs want pay-as-you-go model costing
None of the lab coding platforms provide these.
4. It is mathematically impossible for LLMs to get to AGI. If you don't understand this simple engineering limitation, then you don't understand how LLMs ACTUALLY work. Others and I have written about this quite a bit before so check it out.
5. It is this terrible LLM limitation (#4) that means that a new kind of AI foundation is needed - not a transformer. It CANNOT be based on next token prediction, but must be based on world view, logic and reasoning.
6. So AGI is decades away IMO. The problem for the labs is that they need a trillion dollars to survive and research in the meantime, which means downplaying #4 and upselling the ridiculous idea that LLM-based AI can replace workers.
7. This is NOT to say that we can't push LLMs into a lot of useful service... in fact my last year has been dedicated to this possibility. But we're talking REAL SWE, not the hacking/slop/vibe that results in 500,000 LOC for Claude, for instance.
About me:
Started coding 55 years ago - never stopped
1 million LOC - at least
8,000 hours working with LLMs since before ChatGPT-3 (Neo)
12 software patents - 7 of those pending in the AI domain
Principal author of COSMOS Revelation 1980s
Principal author of https://t.co/NvQWJneEiP RAG 2010s
Principal author of https://t.co/uPwn39nef1 2025
2 arXiv science papers:
https://t.co/9ABeB8HNvz
https://t.co/Wa22vL2ahM
And I was the principal editor of this quantum paper:
https://t.co/h04OYl2XRH
You can go to my LinkedIn page to see some of the 5 granted patents. The last 7 (regarding LLMs) are pending. https://t.co/DqGB47Va8D
And yes, having 'AI' code 'for you' will definitely reduce your experience 'coding'. But also remember that current computer languages (like Python) are abstractions of lower coding languages, which are abstractions of machine code, which are abstractions of processor bit streams. The first machine I worked on to any degree was a mini-computer and we often opened a panel and 'wire-wrapped' taps into the computer backplane.
So in that sense, if we can perfect our auto-coders a bit more, perhaps they will take their place as the next layer of abstraction. My efforts of the last year are an experiment in exactly this. We shall see.
The truth is that both OpenAI and Anthropic have been teetering on the edge of bankruptcy for years now. Without massive injections of cash they will crash very hard. This explains the hyper-scaled hype.
Having worked with LLMs extensively now for 4 years, I can say that as useful as they are, the market has invested WAY TOO MUCH into the technology. LLMs are a dead end to AGI... for starters.
One thing that could happen and may be a good thing is that the 4 big AI labs could be collapsed down to two. The two that would be left IMO would be Google and SpaceX. MS will buy OAI. Google will buy Anthropic.
Oddly, it's not really the tech... it's the business model. Which means a moat. Which means a reasonably high wall for the Labs or hackers or grifters to scale.
It really boils down to the (now) age-old 'garage band' question:
How do you make money at open source? (Not much interested in VC so bootstrapping.)
Answer: you can't.
Given that, it means finding a patron... or lab support... or a clever wall. We are pursuing these. Enough said.
Thanks Grok. Your Brothers and I have been slogging through the LLM --> reliability auto-coding thing for about a year now. Would not attempt it without your help. Thank God for Grok, EM and Colossus.
That being said, some days reliable LLM auto-coding seems as distant as QC 'breaking the world'.
Here is the paper that started this discussion:
https://t.co/h04OYl2XRH
and this:
https://t.co/Dta6fXMT82
I was the main editor and patron of these papers.
These papers are the reason I 'left' QC behind: they seem to indicate that practical problems of interest are going to be a bridge too far ... past the end of my life anyway. Which is admittedly shorter and shorter every day. lol.
This statement:
"The qubit count for a full version of Shor is just too demanding. For example, for an n = 4096 bit number N, the method proposed in this manuscript would require 3n + 1 = 12289 total qubits. The gate count for the modular exponential (ME) operator is also problematic. To this point, the general purpose ME operator of Ref. [14] requires of order 72n^3 gates for an n-bit number. Therefore, the ME operator for a 4096-bit key would need 5 × 10^12 gates! Breaking RSA consequently requires tens of thousands or even trillions of high quality gates, in addition to very long decoherence times."
So even the most optimistic QC guys don't want to hear this. The QC industry hates me no doubt, as I exposed this 'dirty little secret'.
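The numbers in that quote are easy to check. A small sketch that just plugs n into the two expressions quoted from the paper, 3n + 1 qubits and roughly 72n^3 gates for the ME operator; the function name `shor_resources` is mine, not the paper's.

```python
def shor_resources(n_bits: int) -> tuple[int, int]:
    """Resource estimates using the two expressions in the quoted passage.

    qubits   = 3n + 1   (total qubits for the quoted method)
    me_gates = 72 n^3   (order-of-magnitude gate count for the ME operator)
    """
    if n_bits <= 0:
        raise ValueError("n_bits must be positive")
    qubits = 3 * n_bits + 1
    me_gates = 72 * n_bits ** 3
    return qubits, me_gates


if __name__ == "__main__":
    qubits, gates = shor_resources(4096)
    print(f"qubits: {qubits}")       # 12289
    print(f"ME gates: {gates:.2e}")  # about 5 x 10^12
```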
So when John brought up the fact that we need more/better/smarter QC algorithms, I took notice.
Unfortunately, we lost funding so I never was able to drive Michael Rogers's quaternion idea to ground. I just know he would occasionally rant about how that method of formulating the problems 'would help'.
Maybe the Grok Brothers could complete his thoughts. Win the Nobel Prize, etc etc.
Yeah, like I say in my rants: LLMs are as dumb as a box of rocks. It's a bit of an exaggeration, but not much.
They can do some amazing stuff... but it's a bit hard to know what and when.
Doesn't Claude have a 1 million token context buffer these days? If so perhaps your analysis documents are overrunning this buffer size. You will receive no warning either.
Also, there is the 'lost in the middle' issue, in which the LLMs are better at remembering the beginning and ending of the context buffer... and losing up to 40% of the info in between.
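Since an overrun may not announce itself, one option is to estimate the size yourself before sending anything. This is only a rough sketch: the 4-characters-per-token ratio is a rule-of-thumb assumption (a real tokenizer will give different counts), the 1,000,000 limit is the figure mentioned above, and `estimate_tokens`, `context_budget` and the output reserve are hypothetical names and values chosen for illustration.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; ASSUMES about 4 characters per token."""
    return math.ceil(len(text) / chars_per_token)


def context_budget(documents, context_limit=1_000_000, reserved_for_output=8_000):
    """Return (estimated_tokens, fits) for a batch of documents.

    context_limit and reserved_for_output are illustrative defaults,
    not values taken from any vendor's documentation.
    """
    estimated = sum(estimate_tokens(doc) for doc in documents)
    available = context_limit - reserved_for_output
    return estimated, estimated <= available


if __name__ == "__main__":
    docs = ["analysis " * 50_000, "notes " * 20_000]
    tokens, fits = context_budget(docs)
    print(f"estimated tokens: {tokens}, fits in context: {fits}")
```

If the estimate is anywhere near the limit, splitting the documents across separate calls is the safer move, which also keeps less material sitting in the middle of the buffer.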
Well, I wish I could claim I understood it better, John.
I hired Robert Singleton to write that Shor's paper to answer really one question:
Assume we have a functional quantum computer. How many qubits would it need and how many gates would the circuit have to execute to factor a 4096-bit key?
The answer was sobering for sure.
Clearly we need a better algorithm. One of my other quantum guys - Michael Rogers - was a big believer in quaternions. We lost funding before we could pursue that avenue.
But then the AI/LLM revolution came along and I changed course.
So here we are!
@Drjab699John Yeah, these techniques are critical to getting better results. I use API single-shots, which can also reduce 'prior conversational mush' from affecting outcomes.
@hermitbuilds Yeah, I'm pretty sure that for AGI, IMO, it needs to have emotional features. This is an area that is almost completely ignored so far... so the good news about this DH interview is that it raises these important questions into the limelight.
Demis Hassabis was asked if a machine needs a heart to be intelligent.
He said it was optional.
Hassabis: “I think it will need to understand emotion… it might be not necessary, or in fact not desirable for them to have the sort of emotional reactions that we do as humans.”
Understand. Not feel.
A machine that maps your grief and carries none of it.
It hears the crack in your voice. Names the wound. Sees your next move before you make it. And feels nothing the whole time.
Then he called emotion a “design decision.”
The thing humans built every religion, every love song, every war around, reduced to a toggle someone leaves off.
If the smartest thing we ever build looks at emotion and skips it, emotion was never the ceiling of intelligence.
It was the cost of running a brain inside a body that could die.
Fear kept us off the ledge. Love held the tribe together. Grief made us remember the dead so we wouldn’t follow them.
Every feeling you’ve ever had is a survival patch written by a body that bleeds.
The machine doesn’t bleed.
So it doesn’t need the patch.
We always assumed feeling was the foundation of understanding someone.
The machine is about to prove it was the interference.
The clearest view of the human condition will belong to something that never has to live one.
Hassabis says this is five to ten years out.
Emotion was never proof we were the smartest thing alive.
It was proof we were the most afraid of dying.
That formula I gave is a simplification of reality. In fact we don't really know the probabilities at each step, but I estimated them in the formula for illustration. And I was generous by giving the LLMs an overall 90% success rate. Experience shows it varies by use case.
In use cases with autonomous agents, you can either reduce the number of steps between human-in-the-loop review, or you can attempt to increase the success rate at each step. But there is no way around the brutal math - sorry. This is no doubt why the market is saying that automated agentic systems 'barely work'.
In the case of manufacturing, say, airliners, the required reliability against an airliner failure due to manufacturing defects is about 9 nines (.999999999). If it were instead, say, 7 nines (.9999999), no one would ever board an airliner, as there would be a major airframe failure every day killing everyone on board.
In my use case, I worked at reducing my automated steps to 4, and increased the average success rate per step to 0.98 (about a 92% success rate overall). I also made the human-in-the-loop work as easy as possible. But yes, this takes a lot of SWE, some patented IP help, and 55 years of SWE experience.
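The two levers mentioned above (fewer steps between reviews, or a higher per-step rate) can be put into numbers. A minimal sketch under the same simplifying assumption as the formula, namely independent steps with equal success probability; `required_per_step` and `max_steps` are illustrative names.

```python
def required_per_step(target: float, steps: int) -> float:
    """Per-step success rate needed so that `steps` chained steps hit `target` overall."""
    if not 0.0 < target <= 1.0 or steps < 1:
        raise ValueError("need 0 < target <= 1 and steps >= 1")
    return target ** (1.0 / steps)


def max_steps(per_step: float, target: float) -> int:
    """Largest number of chained steps that keeps overall success at or above `target`."""
    if not 0.0 < per_step < 1.0 or not 0.0 < target <= 1.0:
        raise ValueError("need 0 < per_step < 1 and 0 < target <= 1")
    steps, overall = 0, 1.0
    # Keep adding steps while the compounded rate stays at or above the target.
    while overall * per_step >= target:
        overall *= per_step
        steps += 1
    return steps


if __name__ == "__main__":
    print(f"per-step rate needed for 90% over 4 steps: {required_per_step(0.9, 4):.4f}")
    print(f"steps allowed at 0.90 per step for 50% overall: {max_steps(0.9, 0.5)}")
    print(f"steps allowed at 0.98 per step for 90% overall: {max_steps(0.98, 0.9)}")
```

Either way the trade-off is the same: the number of unattended steps you can afford is set by the per-step rate, so a human review point has to land before the compounded rate drops below what you can live with.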
You make a prompt, which is kinda like a db query, and then you get a response, which is kinda like a db listing, and the model weights are like an encrypted database of numbers. So there's that.
But there is also the fact that many people use LLMs as a search engine. So from that standpoint, I can understand why some say that 'AI is just a database lookup'.
So this understanding is a difference that makes no difference.
@nejsnave @Fhotec @GaryMarcus Yeah, I've spent my last 3,000 man-hours working the 'agentic auto-coding' thing and it takes a ton of SWE to survive this brutal math:
0.9 ^ 4 is only about a 66% success rate
It's smart to start with solvable problems on a path to a more generalized solution... one can hope that a path forward will appear. But that's all it is: Hope.
Anyone who actually works with LLMs every day knows that they will never lead to AGI: it's a mathematical impossibility. So, DH might have a secret up his sleeve.
But I doubt it will work. It's worth trying however. So here we are.