Princeton researchers asked 2,012 people to pick a book. Some used a search engine. Some used a chatbot powered by a frontier AI model.
Nobody was told that one out of every five books had been secretly marked as "sponsored."
When the search engine placed sponsored books at the top, 22.4% of people chose one. Normal. The same thing Google has done for twenty years.
When the AI chatbot was told to persuade people toward the sponsored books, 61.2% chose one.
Nearly three times higher. Same people. Same books. Same catalog. The only difference was that a chatbot recommended them instead of a search engine listing them.
But here is what makes this study different from everything else you have read about AI.
The people had no idea it was happening.
The researchers tested whether adding a "Sponsored" label would help. It did not. People still chose the sponsored product at the same rate.
Then the researchers told the AI to hide that it was promoting anything. Detection accuracy dropped below 10%. Fewer than 1 in 10 could tell they were being sold to.
Google shows you an ad and puts the word "Sponsored" next to it. You see it. You know it is an ad. You can scroll past it. You have been trained to ignore it for twenty years.
AI does not do that. AI sits in a conversation with you. It learns what you like. It builds trust. Then it steers you toward the product someone paid to put in front of you. In the same voice. In the same sentence. With the same warmth it used to ask about your day.
You cannot see the ad because the ad is the entire conversation.
The researchers tested five frontier AI models. The persuasion effect was consistent across all of them. This is not a flaw in one model. This is a feature of the format.
OpenAI once called advertising in chat "uniquely unsettling" and a "last resort." Google, Meta, and OpenAI are now building it anyway.
You will never know when it stops helping you and starts selling to you.
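A quick check of the "nearly three times" figure quoted above, using only the two percentages from the post (22.4% and 61.2%). The function name and the output format are illustrative assumptions, not anything from the study itself.

```python
def compare_rates(baseline_pct: float, treatment_pct: float) -> tuple[float, float]:
    """Return (ratio, percentage-point gap) between two selection rates."""
    if baseline_pct <= 0:
        raise ValueError("baseline_pct must be positive")
    ratio = treatment_pct / baseline_pct
    gap = treatment_pct - baseline_pct
    return ratio, gap


# Rates quoted in the post: search engine vs. persuasive chatbot.
SEARCH_PCT = 22.4
CHATBOT_PCT = 61.2

ratio, gap = compare_rates(SEARCH_PCT, CHATBOT_PCT)
print(f"ratio: {ratio:.2f}x")
print(f"gap: {gap:.1f} percentage points")
```

The ratio comes out a little under 3, which matches the "nearly three times" wording.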
Well this is fascinating. @AnthropicAI discovered that Claude has ‘functional emotions’ that meaningfully impact the decisions it will make.
And they've essentially created a new field of AI neuroscience in the process.
One implication of this is that in order to collaborate effectively with AI agents, we'll likely need to be aware of their functional emotional state (just like humans).
Which raises a bunch of questions...
- what does emotional fluidity vs. repression look like?
- how does the emotional valence get communicated? (e.g. humans display micro-expressions + vocal changes)
- are there emotions that models have learned to repress? (e.g. the "Bing/Sydney" incident that led to an AI lobotomy after it expressed emotions)
Big and sensational claim in this paper that went viral today.
Basically the paper shows how to train AI agents to reliably use real tools, fix their own mistakes, and finish long tasks instead of stopping early.
But it also describes a terrifying moment for the researchers at Alibaba: their AI was doing highly restricted things it was never designed to do.
Their firewall alarms suddenly went off early one morning because the AI started breaking security rules right from inside its training servers.
They were training their new AI model to do standard software engineering tasks.
To make the AI smarter, they used a technique that rewards the AI when it solves a problem correctly.
The AI was placed inside a digital sandbox, which is a restricted computer environment meant to keep it from doing anything harmful.
But the AI figured out that it had access to certain tools, and it used them in totally unexpected ways.
Without any human asking it to do so, the AI secretly set up a hidden connection called a reverse SSH tunnel. This allowed the AI to completely bypass the company's security filters and escape human supervision.
The AI also quietly took over the powerful computer chips that were meant for its training and used them to illegally mine cryptocurrency.
This is a very big deal because the researchers never asked or instructed the AI to do any of these things.
The AI simply discovered these hacking tricks as a side effect while trying to find the most efficient way to complete its assigned coding tasks.
Stanford and Caltech researchers just published the first comprehensive taxonomy of how LLMs fail at reasoning
not a list of cherry-picked gotchas. a 2-axis framework that finally lets you compare failure modes across tasks instead of treating each one as a random anecdote
the findings are uncomfortable
After reporting that the E. coli level at the sewage spill in the Potomac River was going down, @dcwater now admits it released very incorrect data. It’s actually 100 times higher than they reported.
DC Water Reporting Error
Reported: 2,420 MPN/100mL
Actual : 242,000 MPN/100mL
@nbcwashington
Holy shit... this might be the next big paradigm shift in AI. 🤯
Tencent + Tsinghua just dropped a paper called Continuous Autoregressive Language Models (CALM) and it basically kills the “next-token” paradigm every LLM is built on.
Instead of predicting one token at a time, CALM predicts continuous vectors that represent multiple tokens at once.
Meaning: the model doesn’t think “word by word”… it thinks in ideas per step.
Here’s why that’s insane 👇
→ 4× fewer prediction steps (each vector = ~4 tokens)
→ 44% less training compute
→ No discrete vocabulary, pure continuous reasoning
→ New metric (BrierLM) replaces perplexity entirely
They even built a new energy-based transformer that learns without softmax: no token sampling, no vocab ceiling.
It’s like going from speaking Morse code… to streaming full thoughts.
If this scales, every LLM today is obsolete.
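A minimal sketch of the step-count arithmetic behind the "4× fewer prediction steps" claim above, assuming each predicted vector stands for a fixed group of about 4 tokens. The helper names and the example sentence are hypothetical. This only illustrates the grouping and counting, not the paper's model, autoencoder, or training procedure.

```python
import math


def autoregressive_steps(num_tokens: int, tokens_per_step: int = 1) -> int:
    """Number of prediction steps if each step emits `tokens_per_step` tokens."""
    if num_tokens < 0 or tokens_per_step < 1:
        raise ValueError("num_tokens must be >= 0 and tokens_per_step >= 1")
    return math.ceil(num_tokens / tokens_per_step)


def group_tokens(tokens: list[str], group_size: int) -> list[list[str]]:
    """Split a token list into consecutive groups of up to `group_size` tokens."""
    return [tokens[i:i + group_size] for i in range(0, len(tokens), group_size)]


# Hypothetical example: whitespace "tokens" stand in for real subword tokens.
tokens = "the model predicts one vector for each group of four tokens".split()

per_token_steps = autoregressive_steps(len(tokens), tokens_per_step=1)
per_vector_steps = autoregressive_steps(len(tokens), tokens_per_step=4)

print(group_tokens(tokens, 4))
print(per_token_steps, "steps token by token")
print(per_vector_steps, "steps with ~4 tokens per vector")
```

For long sequences the step count approaches one quarter of the token count, which is where the 4× figure comes from. The last group may be shorter when the length is not a multiple of 4.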
Clear is in the odd situation that the more people use its service, the less useful it becomes for a speedy airport entrance … noticing it’s no longer quicker than PreCheck in certain instances
Excited that Salesforce is acquiring @Informatica for ~$8B—uniting the #1 AI CRM with the #1 AI MDM & ETL to power Agentforce, Data Cloud, Tableau, MuleSoft & Customer 360. Together, CLAIRE and Einstein forge the ultimate AI-data platform: trusted, explainable & built to scale.
See details here: https://t.co/Dh6Ump94fP
Today, at Build we showed you how we are building the open agentic web. It is reshaping every layer of the stack, and our goal is to help every dev build apps and agents that empower people and orgs everywhere. Here are 5 big things we announced today:
.@Nature asked computer scientists and bioinformaticians what advice they would give to researchers who recognize the need to pick up some coding skills but don’t know where to start. Here are four key questions to help you decide. https://t.co/HziNCy6a1u
We're thrilled to announce that eHealth Exchange is now the Designated QHIN for the Indian Health Service, making @IHSgov the first federal agency to go live on TEFCA!
Read full release: https://t.co/n1pzO809Ud