Researchers at EPFL proved your AI is lying to you.
Not sometimes. Most of the time.
Working with the Max Planck Institute, they built one of the hardest hallucination tests ever made. 950 questions. Four domains where being wrong actually hurts. Legal. Medical. Research. Coding.
Then they ran every top model on it.
The results.
GPT-5. Wrong 71.8% of the time.
Claude Opus 4.5. Wrong 60% of the time.
Gemini 3 Pro. Wrong 61.9% of the time.
DeepSeek Reasoner. Wrong 76.8% of the time.
These are the smartest AI models on Earth. The ones you trust with your career. Your health. Your money.
You think turning on web search fixes it.
It doesn't.
Claude Opus 4.5 with web search. Still wrong 30.2% of the time.
GPT-5.2 thinking with web search. Still wrong 38.2% of the time.
The internet attached. Still lying to you in 1 out of every 3 answers.
Now the part that should scare you.
Medical questions. The one place being wrong can kill you.
GPT-5 hallucinated 92.8% of the time on medical guidelines.
Claude Haiku 4.5 hallucinated 95.7% of the time.
Gemini 3 Flash hallucinated 89% of the time.
Nine out of ten medical answers from popular AI models. Wrong.
It gets worse.
The longer you talk to it, the more it lies.
Early mistakes cascade. The model starts citing its own earlier hallucinations as facts. Your third message is more wrong than your first.
The paper, in its own words: "hallucinations remain substantial even with web search."
This is what hundreds of millions of people are doing right now. Asking software that lies in the majority of its answers. About their health. About their job. About their legal case. About their code.
Most are not checking.
Most never will.
But please. Keep using ChatGPT for medical advice.
The doctors need a break.
https://t.co/dHBP5CDpTM
Yann LeCun (AMI Labs Founder): "The AI industry is completely LLM-pilled. Everybody is working on the same thing. They're all digging the same trench."
LeCun explains why no lab dares break from the pack:
"They are stealing each other's engineers. So they can't afford to do something different because if they start going on a tangent, they're going to fall behind the other guys. And so they're all doing the same thing."
This groupthink is exactly what drove him out of Meta.
"Meta also became LLM-pilled with sort of recent reshuffling. And it's fine, a strategic decision that maybe makes sense for them. It's just not what I'm interested in."
For @ylecun, the problem runs deeper than strategy.
LLMs are missing something essential about how intelligence actually works:
"I cannot imagine that we can build agentic systems without those systems having an ability to predict in advance what the consequences of their actions are going to be. The way we act in the world is that we can predict the consequences of our actions and that's what allows us to plan."
His broader critique is that the industry has mistaken fluency for intelligence.
Language turned out to be the easy part. The hard part is the physical world.
It's why we still don't have domestic robots or level-five self-driving cars, even though today's systems can pass the bar exam and write code.
It’s time to expose a huge scam in AI startups: Contracted ARR
Many AI startups are crushing revenue records because they are using a dishonest metric.
The biggest funds in the world are supporting this and misleading journalists for PR coverage.
The setup: Company signs 3-year enterprise deals. Year 1 is discounted (say $1M), Year 2 steps up ($2M), Year 3 is full price ($3M).
They report $3M as “ARR” — even though they’re only collecting $1M right now.
The worst part: The customer has an opt-out option at 12 months! It’s not actually a 3-year contract.
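The arithmetic in the setup above can be sketched in a few lines of Python. This is only an illustration using the post's example figures ($1M, $2M, $3M, opt-out at 12 months). The `Deal` class and the function names are hypothetical, and "contracted ARR" is taken to mean the final-year price, as the post describes it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Deal:
    """Hypothetical model of a multi-year deal with step-up pricing."""
    yearly_prices: List[int]                     # price per contract year, in USD
    opt_out_after_months: Optional[int] = None   # None means no opt-out clause


def contracted_arr(deal: Deal) -> int:
    # The headline number: the full final-year price (assumed definition).
    return deal.yearly_prices[-1]


def in_effect_arr(deal: Deal, current_year: int = 1) -> int:
    # What the customer is actually paying in the current contract year.
    return deal.yearly_prices[current_year - 1]


def committed_value(deal: Deal) -> int:
    # Revenue the customer cannot walk away from, given the opt-out clause.
    if deal.opt_out_after_months is None:
        return sum(deal.yearly_prices)
    committed_years = deal.opt_out_after_months // 12
    return sum(deal.yearly_prices[:committed_years])


deal = Deal(yearly_prices=[1_000_000, 2_000_000, 3_000_000], opt_out_after_months=12)

inflation = contracted_arr(deal) / in_effect_arr(deal)
print(f"Reported ARR:    ${contracted_arr(deal):,}")
print(f"In-effect ARR:   ${in_effect_arr(deal):,}")
print(f"Inflation:       {inflation:.1f}x")
print(f"Committed value: ${committed_value(deal):,} of ${sum(deal.yearly_prices):,} nominal")
```

With the opt-out at 12 months, only the Year 1 payment is actually committed, even though the nominal three-year total is six times larger.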
In the chart below, by Q5 the company is trumpeting ~$100M “ARR” to press, while actual cash-generating, in-effect ARR is ~$35M. That’s ~3x inflation.
On top of this, enterprise AI companies are bundling full-time “forward deployed engineers” into deals, massively reducing margins and sometimes producing negative Year 1 margins.
At some point customers are going to start triggering their opt-out clauses or aggressively negotiating down Year 3 pricing.
And a wave of enterprise AI companies may collapse.
at this rate, ai engineers and influencers will discover data engineering by end of 2026, reinventing medallion architecture, analytics, etl, and finally learning about ai and machine learning.
15 years after the field has matured.
UC Berkeley professor Stuart Russell: An AI with IQ 150 could upgrade itself to 170, then 250, very soon leaving humans way behind.
A recent Meta paper also warned that self-improving AI is promising but risky, as removing humans can worsen misalignment.
Don’t use a calculator until you can do the math on your own.
Don’t vibe code until you can code – and debug and maintain code – on your own.
It’s that simple.
This is more evidence that current frontier models remain completely reliant on content-level memorization, as opposed to higher-level generalizable knowledge (such as metalearning knowledge, problem-solving strategies...)
Nearly 50% of all relationship advice on Reddit is “leave.” 15 years of data, 52 million comments, and the trend line only goes one direction.
A researcher filtered r/relationship_advice down to 1,166,592 quality comments and tracked what people actually recommend. In 2010, “End Relationship” sat around 30%. By 2025, it’s approaching 50%.
“Communicate” dropped from 22% to 14%. “Compromise” collapsed from 7% to 3%. “Give Space” fell from 25% to 13%. Every category that requires patience lost ground every single year.
The one category growing faster than “leave” is “Seek Therapy,” which went from 1% to 6%. The subreddit is slowly learning to say “this is above my pay grade.”
Train a model on this dataset and it would absolutely tell people to break up. The training data is 50% “leave” and climbing. The model wouldn’t be broken. It would be accurately reflecting what 52 million commenters actually believe about your relationship.
A 50% prior that you should leave, a 14% prior that you should talk about it, and a 6% prior that you need a professional. That’s not LLM psychosis. That’s the median human opinion on your relationship, backed by the largest advice dataset ever assembled.
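To make the "prior" framing concrete, here is a minimal Python sketch of what a model with no information about the individual case would learn from such labels: the empirical class frequencies. The shares are the post's approximate 2025 figures. The "Other" bucket is an assumption added only so the shares sum to 100, and the function name is hypothetical.

```python
from collections import Counter

# Approximate 2025 shares quoted in the post, in percent.
# "Other" is an assumed remainder so the total is 100.
shares = {
    "End Relationship": 50,
    "Communicate": 14,
    "Give Space": 13,
    "Seek Therapy": 6,
    "Compromise": 3,
    "Other": 14,
}

# Build a toy label set: 100 comments distributed according to the shares.
labels = [label for label, pct in shares.items() for _ in range(pct)]


def fit_class_prior(labels):
    """Maximum-likelihood estimate of the label distribution (the class prior)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


prior = fit_class_prior(labels)

# With no features about the specific relationship, the best single guess
# is simply the most frequent label in the training data.
most_likely = max(prior, key=prior.get)

for label, p in sorted(prior.items(), key=lambda kv: -kv[1]):
    print(f"{label:<17} {p:.0%}")
print("Default advice:", most_likely)
```

A real language model conditions on the text of the question, so its answers would shift away from this prior. The sketch only shows the starting point the label distribution implies.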
I've come to have mixed feelings on "AGI" and "ASI" as terms to convey where this technology is headed.
On one hand, AGI has played a key role in motivating talented people who obsessed over this problem (@demishassabis, @ShaneLegg , @ilyasut , many others) and enabled the flow of capital that made many of our advancements possible.
On the other hand, "AGI" or "ASI" or any variant currently actively harms discourse around how the most interesting technology of our lifetime gets built and used.
a) it's not an accurate description of where we're headed, at least how most people interpret the term. Look at the recent conversations with @karpathy and @dwarkesh_sp and you instantly see how far we are from anything resembling true human intelligence. No proof of takeoff, timelines keep expanding. We are building very useful technology which could transform how businesses work or how tech is built but has nothing to do with "general intelligence".
b) it's become so overloaded that I've found almost no two people define it the same way or agree on timelines (whether we've already reached it or are 30 years away). It's why every blog post on AGI has to conjure up its own local definition to proceed. I mostly subscribe to @random_walker's view on why "intelligence" is almost always used in an incoherent way.
c) most importantly, it invokes fear—connected to historical usage in sci-fi and philosophy (think 2001, Her, anything invoking the singularity) that has nothing to do with the tech tree we're actually on. Makes every AI discussion incredibly easy to anthropomorphize and detour into hypotheticals.
We may need a different term for what we're trying to build at the end of all this and what it means for business and society.
BREAKING: The $610 Billion AI Ponzi Scheme Just Collapsed
Last night at 4pm EST, something unprecedented happened. Nvidia stock rallied 5% on earnings, then crashed into negative territory within 18 hours. Wall Street algorithms detected what humans couldn’t: the numbers don’t add up.
Here’s what they found.
Nvidia reported $33.4 billion in unpaid bills, up 89% in one year. Customers who bought chips haven’t paid for them yet. The average wait time for payment stretched from 46 days to 53 days. That extra week represents $10.4 billion that may never arrive.
Meanwhile, Nvidia stockpiled $19.8 billion in unsold chips, up 32% in three months. But management claims demand is insane and supply is constrained. Both cannot be true. Either customers aren’t buying or they’re buying without cash.
The cash flow tells the real story. Nvidia generated $14.5 billion in actual cash but reported $19.3 billion in profit. The gap is $4.8 billion. Healthy chip companies like TSMC and AMD convert over 95% of profits to cash. Nvidia converts 75%. That’s distress level.
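The conversion figure can be reproduced from the two numbers quoted above. This is a small Python sketch that only checks the post's arithmetic. It assumes "actual cash" means operating cash flow and "profit" means net income, which the post does not specify, and the function name is hypothetical.

```python
def cash_conversion(cash_generated: float, reported_profit: float) -> float:
    """Share of reported profit that showed up as cash (assumed definition)."""
    return cash_generated / reported_profit


# Figures as quoted in the post, in billions of USD.
cash_generated = 14.5
reported_profit = 19.3

gap = reported_profit - cash_generated
ratio = cash_conversion(cash_generated, reported_profit)

print(f"Gap: ${gap:.1f}B")             # Gap: $4.8B
print(f"Cash conversion: {ratio:.1%}")  # Cash conversion: 75.1%
```

The calculation confirms that the quoted gap and the 75% figure are internally consistent. It says nothing about whether the inputs or the comparison to other chip companies are accurate.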
Here’s where it gets criminal.
Nvidia gave $2 billion to xAI. xAI borrowed $12.5 billion to buy Nvidia chips. Microsoft gave OpenAI $13 billion. OpenAI committed $50 billion to buy Microsoft cloud. Microsoft ordered $100 billion in Nvidia chips for that cloud. Oracle gave OpenAI $300 billion in cloud credits. OpenAI ordered Nvidia chips for Oracle data centers.
The same dollars circle through different companies and get counted as revenue multiple times. Nvidia books sales, but nobody actually pays. The bills age. The inventory piles up. The cash never comes.
AI company CEOs admitted it themselves last week. Airbnb’s CEO called it vibe revenue. OpenAI burns $9.3 billion per year but makes $3.7 billion. That’s a $5.6 billion annual loss. The $157 billion valuation requires $3.1 trillion in future profits that MIT research shows 95% of AI projects will never generate.
Peter Thiel sold $100 million in Nvidia on November 9. SoftBank dumped $5.8 billion on November 11. Michael Burry bought put options betting Nvidia crashes to $140 by March 2026.
Bitcoin, which tracks AI speculation, dropped from $126,000 in October to $89,567 today. That’s a 29% crash. AI startups hold $26.8 billion in Bitcoin as collateral for loans. When Nvidia falls another 40%, those loans default, forcing $23 billion in Bitcoin sales, crashing crypto to $52,000.
The timeline is now certain. February 2026, Nvidia reports fourth quarter and reveals how many bills aged past 60 days. March 2026, credit agencies downgrade. April 2026, the first restatement. The fraud that took 18 months to build unwinds in 90 days.
Fair value for Nvidia: $71 per share. Current price: $186. The math is simple.
This is the fastest moving financial fraud in history because algorithms detected it in real time. Human investors are 90 days behind.
Read the full data driven deep dive article here - https://t.co/sDEf5Mdrtc
@jxmnop thx - @garrytan is a hype dude without much understanding of tech and I usually ignore his posts but got upset at this one too. Let him fail on ROI.
this post is complete misinformation
LLMs are lossy compressors! of *training data*.
LLMs losslessly compress *prompts*, internally. that’s what this paper shows.
source: i am the author of “Language Model Inversion”, the original paper on this
@rohanpaul_ai spreading what is trending is better when you consider all angles first.. your history is now in training data and includes all the things
Today’s edition of my newsletter just went out.
🔗 https://t.co/tEFjPubUgK
Consider subscribing. It's free, and I write it every day.
🔬 Microsoft research finds AI is not yet ready for real-world medical diagnosis
🏆 Google DeepMind just released its very first robotics AI models, called Gemini Robotics 1.5 and Gemini Robotics-ER 1.5.
🛠️ New paper shows a great way to make RAG much faster and more accurate
💼 AI just passed a brutal finance exam most humans fail