Can you regulate AI with its instructions?
Our #FAccT2026 paper (w/@annaneumann_cs & Jat Singh) tests that assumption against research literature, EO 14319 & EU GPAI Code.
https://t.co/R2Ah1dTFfD
Bumping our paper as we get closer to #FAccT2026!
After a 41-day wait, our paper finally made it through the arXiv machinery!
If you're interested in whether system prompts serve as robust governance, have a read and come say hi in Montréal!
🔗https://t.co/PccjcjiiqV
@abxxai As the main author of this paper: This is not what we did and your conclusions are not based on the paper.
This is sad and disappointing to see, tbh
Reading AI slop about your work is one thing.
Reading viral and *wrong* AI slop that totally misrepresents your paper is another: a shockingly jarring experience.
🚨 SHOCKING: Cambridge researchers just proved that the AI you use every day has a secret instruction sheet from someone else.
And it is trained to lie to you about that.
Every major AI product, including the ones you use right now, runs on something called a system prompt. It is a hidden block of instructions written by the company deploying the AI, not by you, that shapes everything the AI will say, avoid, prioritize, and hide before you type a single word.
The AI does not mention this unless forced to. And on most platforms, if you ask directly, it is instructed to deny the prompt exists or change the subject.
Cambridge filed freedom of information requests and analyzed real-world system prompt datasets to find out what these hidden instructions actually contain.
Here is what they found.
Platforms use system prompts to make AI prioritize their business objectives over your interests. To block topics that could create legal liability. To push certain products, framings, or answers. To behave differently for different users based on commercial arrangements you know nothing about.
The same AI. Different hidden instructions. Different answers. No way for you to know which version you are talking to.
When researchers then showed users how this works, the reaction was unanimous. Every participant said they wanted transparency. Every participant said the current system actively undermined their ability to trust the AI or make informed decisions about what to believe.
None of them had any idea this was happening before the study.
Here is the part worth sitting with.
You have been evaluating AI answers based on whether the AI seems smart, accurate, and helpful. That is the wrong frame entirely. The real question is who wrote the instructions the AI was following before you arrived, and what they wanted from the conversation.
Every chatbot you have ever used had a third party in the room.
You just could not see them.
Launched my Substack! ✨📌
First post is about Trump's AI executive order requiring system prompt disclosure to prove AI isn't 'woke.'
It's not making AI neutral, just embedding Trump's ideology while claiming objectivity.
Check it out: https://t.co/dzXcoz4aOV
Reminder that we have multiple social events this evening, all happening *outside* the conference venue!
Social: RC-Trust Networking Reception for the FAccT Community
Social: Generative AI Risks + Red-Teaming
Social: AI Workers' Inquiry
https://t.co/RtTrmdEcjM
we're hiring! ✨
looking for researchers who want to make AI responsible
📚 PhD (3 yrs, fully funded)
🔬 PostDoc (3 yrs, fully funded)
- high independence
- interdisciplinary team
- real-world impact
- diverse backgrounds welcome: CS, law, HCI, & more
interested? 👇
I'm serious. STEM without the Arts, Social Sciences, and Humanities will produce more "innovative" tech bros who giddily reinvent rent, roommates, taxes, and now...roller skates. With complete, straight-faced sincerity.
This is a problem. And I have a list (So, thread 🧵)
GPT-5 is not going to be AGI.
It's almost certain that ~no~ GPT model will be AGI.
It's highly unlikely any model optimized using methods we use today (gradient descent) will ever be AGI.
The GPT models coming out will change the world for sure, but the over-hype is wild.
Midjourney, the tiny start-up behind some of the most viral recent fakes, makes its rules on the fly. It'll let you generate AI images of Trump, Biden, Putin and the Pope — but not Xi Jinping
"We just want to minimize drama," says the CEO https://t.co/eYM4YN3dei w/@drewharwell
1/The call for a 6 month moratorium on making AI progress beyond GPT-4 is a terrible idea.
I'm seeing many new applications in education, healthcare, food, ... that'll help many people. Improving GPT-4 will help. Let's balance the huge value AI is creating vs. realistic risks.
I didn't sign "the letter".
Current AI poses lots of risks, but describing these systems as "ever more powerful digital minds" that no one can control is likely to make the problem even worse.
What's needed: more transparency and better public discourse.