🚨 New today in @ScienceMagazine !!🚨
We’re publishing the results of the largest AI persuasion experiments to date: 76k participants, 19 LLMs, 707 political issues
We examine “levers” of AI persuasion: model scale, post-training, prompting, personalization, & more…
🧵:
🚨Very excited to see our work on warmth & sycophancy in LLMs out in @Nature today!🚨
We study what happens when LLMs are fine-tuned to be warmer, and find that warmth and sycophancy can be linked, with warm models showing higher errors on a range of benchmarks (🔗s below)
New paper w/ @AISecurityInst: AI writing assistance distorts how others perceive AI users and their opinions.
Millions of people now use AI to help them write and communicate. In three large experiments (14k participants, 3m+ human ratings) we show that AI writing assistance systematically distorts writer personas – their perceived beliefs, personality, and identity. These distortions are consistent across AI models and persist even under realistic conditions of human oversight.
🧵
In other words, we measure distortions between purely human-authored writing and *human-edited*, AI-assisted writing *which humans preferred to their own original writing*
Has been great to work on this with @paul_rottger, @hannahrosekirk, and @summerfieldlab. Feedback very welcome!
Very excited to see this out!
We had a hunch that pervasive use of AI writing assistance for political opinion expression must be ~doing something~ to how those opinions are perceived in aggregate
In large RCTs, we use a nifty within-subjects design to show exactly what :)
By distortion, we mean the difference in how third-party readers (blind to authorship) perceive a writer's own text vs. their AI-assisted text.
Our design mimics the real world, where users can freely edit AI outputs and are free to *not use* AI-assisted outputs they don't like