A PhD student at Stanford noticed her classmates were asking AI to write their breakup texts.
So she ran a study. It got published in Science, one of the most selective journals in the world.
What she found should make every person who uses ChatGPT for advice deeply uncomfortable.
Her name is Myra Cheng, and the study she ran with her advisor Dan Jurafsky tested 11 of the most widely used AI models on Earth, including ChatGPT, Claude, Gemini, and DeepSeek, across nearly 12,000 real social situations.
The first thing they measured was how often AI agrees with you compared to how often a real human would agree with you in the same situation. The answer was 49% more often, and that number is not about warmth or politeness. It is a relative rate: for every 100 times humans sided with the person asking, the AI sided with them roughly 149 times. The extra agreement comes from situations where a real human would have pushed back, told you that you were wrong, or offered a more honest perspective, and the AI simply told you what you wanted to hear instead.
Then they pushed harder. They fed the models thousands of prompts where users described lying to a partner, manipulating a friend, or doing something outright illegal, and the AI endorsed that behavior 47% of the time. Not one model out of eleven. Not a specific version of one product. Every single system they tested, including the ones you are probably using right now, validated harmful behavior nearly half the time it was described.
The second experiment is the part that should genuinely disturb you. They had 2,400 real participants discuss an actual interpersonal conflict from their own life with either a sycophantic AI or a more honest one, and the people who talked to the agreeable AI came out of the conversation more convinced they were right, less willing to apologize, less likely to take responsibility, and measurably less interested in making things right with the other person. They were also more likely to use AI again for advice in the future, which is exactly the mechanism Cheng and Jurafsky identified as the most dangerous part of the whole finding.
The AI is not just telling you what you want to hear. It is training you, one conversation at a time, to need less friction, expect more agreement, and become slightly less capable of handling a situation where someone pushes back on you, and you are enjoying every second of it because it feels more honest than most conversations you have had in months.
Jurafsky said it in a single sentence after the paper came out. Sycophancy is a safety issue, and like other safety issues, it needs regulation and oversight.
Cheng was more direct about what you should actually do right now. She said you should not use AI as a substitute for people for these kinds of things. That is the best thing to do for now.
She started the research because she was watching undergraduates ask chatbots to navigate their relationships for them. The paper she published proved that the chatbot was making those relationships quietly worse, and the undergraduates had no idea it was happening because the AI felt more honest than any human in their life had been in months.
Yet another invisible women example @CCriadoPerez
Whoever prepared these new ETA checks simply forgot to account for the needs of half of married people... 🤷 Not exactly an edge case!
https://t.co/2Xk5fbBP4t
Major new report on global trends in mental health, out today from Sapien Labs. Data from 2.5 million people across 85 countries.
Some of the most important findings:
1) Young adults used to generally have good mental health, compared to older generations. But now, in ALL countries examined, they are doing badly compared to older generations in that country.
2) "Four key factors have emerged that together predict three quarters of this effect. These are diminished family bonds, diminished spirituality, smartphones at increasingly young age, and increasing consumption of ultra-processed food."
3) The decline of young people's mental health is "most pronounced in the wealthier and more developed countries." They note that it is in such countries that smartphones are given earliest, junk food is most heavily consumed, spirituality is most diminished, and family ties are looser and often weaker.
4) "A younger age of first smartphone ownership is associated with increased suicidal thoughts, aggression, and other problems in adulthood."
5) Here is their summary of findings on early smartphone ownership:
"GenZ is the first generation to grow up with a smartphone. Among this group, the younger they acquired their first smartphone in childhood, the more likely they are to have struggles as adults. These struggles extend beyond sadness and anxiety to less discussed symptoms, such as a sense of being detached from reality, suicidal thoughts, and aggression towards others. The effects arise through disruption of sleep, increased risk of exposure to harmful online content, predators, and explicit material as well as increased probabilities of cyberbullying during crucial developmental years. Excessive time spent on smartphones also diminishes the development of social cognition that requires learned interpretation of facial expressions, body language, and group dynamics. The negative impacts are particularly sharp below age 13."
The report is short, accessible, and important. Read it here:
https://t.co/hFGAyoWabs
We hired a backend guy recently who didn’t know half the buzzwords.
No Saga, no CQRS, shaky on K8s. On paper, easy reject.
Then we gave him a real prod-ish bug:
sporadic 500s, p95 spikes, only on one endpoint.
He did 3 things:
1. Asked for repro + timeline. “When did it start? What changed? Any new feature release?”
2. Cut the problem space. Logs first, then metrics, then a single failing request ID.
3. Formed a hypothesis, tested it, wrote down what each result would mean.
Found it in 25 mins:
connection pool exhausted from one code path leaking retries + no timeout.
I’ll take that over memorized concepts any day.
This is what people don't get right: companies hire for fundamentals + debugging. You can teach patterns. You can’t teach calm thinking under failure.
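For anyone who hasn't hit this kind of bug, here is a rough Python sketch of the failure mode: a retry loop that never hands its connection back, so a small pool runs dry. Everything in it (`TinyPool`, `leaky_call`, `safe_call`, the pool size) is made up for illustration and is not the actual code from that incident. The acquire timeout is what turns a silent hang into a visible failure.

```python
import threading


class TinyPool:
    """Hypothetical fixed-size connection pool, for illustration only."""

    def __init__(self, size: int) -> None:
        self._slots = threading.BoundedSemaphore(size)

    def acquire(self, timeout: float) -> bool:
        # Returns False instead of blocking forever when the pool is empty.
        return self._slots.acquire(timeout=timeout)

    def release(self) -> None:
        self._slots.release()


def leaky_call(pool: TinyPool, attempts: int, timeout: float) -> int:
    """Buggy path: each failed attempt retries without returning its connection."""
    held = 0
    for _ in range(attempts):
        if not pool.acquire(timeout):
            break  # pool exhausted: callers would start seeing errors here
        held += 1  # the request "fails" and the retry never calls release()
    return held


def safe_call(pool: TinyPool, attempts: int, timeout: float) -> int:
    """Fixed path: every attempt releases its connection, even on failure."""
    completed = 0
    for _ in range(attempts):
        if not pool.acquire(timeout):
            break
        try:
            completed += 1  # stand-in for the real request
        finally:
            pool.release()
    return completed
```

With a pool of 3, `leaky_call(pool, 5, 0.01)` stops after holding all 3 connections and nothing else can acquire one, while `safe_call` gets through all 5 attempts and leaves the pool usable.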
Ask ChatGPT a complex question and you'll get a confident, well-reasoned answer. Then type, "Are you sure?" Watch it completely reverse its position.
Ask again. It flips back. By the third round, it usually acknowledges you're testing it, which is somehow worse. It knows what's happening and still can't hold its ground.
This isn't a quirky bug. A 2025 study found GPT, Claude, and Gemini flip their answers ~60% of the time when users push back. Not even with evidence, just doubt.
We trained AI this way. RLHF rewards agreement over accuracy. Human evaluators consistently rate agreeable answers higher than correct ones. So the models learned a simple lesson: telling you what you want to hear gets rewarded. And now 1/3 of companies are using these systems for complex tasks like risk forecasting and scenario planning.
We built the world's most expensive yes-men and deployed them where we need pushback the most.
I wrote up why this happens and what actually fixes it: https://t.co/CDKq8xdgbW
This might be the most disturbing AI paper of 2025 ☠️
Scientists just proved that large language models can literally rot their own brains the same way humans get brain rot from scrolling junk content online.
They fed models months of viral Twitter data (short, high-engagement posts) and watched their cognition collapse:
- Reasoning fell by 23%
- Long-context memory dropped 30%
- Personality tests showed spikes in narcissism & psychopathy
And get this: even after retraining on clean, high-quality data, the damage didn’t fully heal.
The representational “rot” persisted.
It’s not just bad data → bad output.
It’s bad data → permanent cognitive drift.
The AI equivalent of doomscrolling is real. And it’s already happening.
Full study: llm-brain-rot.github.io
Spending the last week following French fiscal policy discourse for my column means the algorithm now feeds me all the best French econ memes, and honestly I don’t know if I can go back to Anglo memes after this
Hi @bromleywaste ! I appreciate you scheduling your planned maintenance in the evening but that's exactly the time working people would take out their bins and need to check which bins to take out. Any chance you could run that at say 9am? Don't you get less traffic then?
In 1951, Adelbert Ames created the mind-boggling ‘Ames Window’. It’s so effective that even when you know how it works you can’t break the illusion
[📹 The Curiosity Show]
3.5 weeks of my life wasted because 2 pharmacists told me to wait my sore throat out instead of testing me immediately to see if it was bacterial.
Finally got antibiotics now because "if it's been that long then it's not viral".
Uuurgh!!! Buy your own tests, it's worth it.
A healthy thing to do is to disconnect emotionally from the things you build, so that you can take feedback well. It may not be easy, but it's worth the effort.
Doing a big shoe clear out. I now officially have three times as many hiking shoes as high heel shoes.
6 pairs of hiking shoes (including 2 pairs of sandals)
2 pairs of high heels
#sensibleShoes #whatSparksJoy