This is being read as a philosophical farewell. It’s a resignation letter from the head of Anthropic’s Safeguards Research Team, and the most important sentence is buried in paragraph three.
“I’ve repeatedly seen how hard it is to truly let our values govern our actions. I’ve seen this within myself, within the organization, where we constantly face pressures to set aside what matters most.”
That’s the person responsible for keeping Claude safe telling you the pressures to ship are winning.
Mrinank Sharma built the Constitutional Classifiers system, developed defenses against AI-assisted bioterrorism, and authored one of the first AI safety cases ever written. Two years of work at the exact intersection of “make the model safe” and “ship the model fast.” And he just walked away.
Now zoom out. Dylan Scandinaro, another Anthropic AI safety researcher, left last week to become OpenAI’s Head of Preparedness. Harsh Mehta and Behnam Neyshabur, both senior technical staff, also departed in the past two weeks. Four notable exits in a single month from the company that sells itself as the responsible AI lab.
Meanwhile, Anthropic is in talks to raise at a $350B valuation and just launched Opus 4.6 last Thursday. The commercial engine is accelerating. The safety talent is dispersing.
This is the core tension of every AI company right now: the people building the guardrails and the people building the revenue targets occupy the same org chart, but they optimize for different variables. When the pressure to scale wins enough internal battles, the safety people don’t fight forever. They leave and write beautifully worded letters about integrity.
Sharma’s next move tells you everything. He’s pursuing a poetry degree. When your head of safeguards research decides the most authentic use of his time is writing poems instead of writing safety cases, that’s a signal about what he believes the safety cases were actually accomplishing.
This is shocking.
Facebook gave Netflix all your private messages on Messenger in exchange for all your watch history, while Netflix paid them $100M+ for ads.
Meta will sell your data in a heartbeat for profit.
Imagine a bank got robbed and now they are blaming the cleaning lady who allegedly forgot to close a window.
This is effectively IT security's reaction to ransomware incidents after somebody clicked on a link.
Also: We had a blast at #SCS23 yesterday! The Human Factor is key.
IT workers dispatched and contracted by North Korea to work remotely with companies in St. Louis and elsewhere in the U.S. have been using false identities to get jobs, funneling their earnings to North Korean weapons programs.
https://t.co/UyblTPyh8a
@ChristinaLekati: “The stimulus-response effect in human triggers is consistent, and exploiting these vulnerabilities is consistently successful.”
@swisscyberstorm 2023
#SCS23
"Embrace and account for the inevitable human error in your design of security systems and programs. Although many solutions are tech focused, we have to design for people and how we behave."
Really good closing keynote by @_YLMV_ at @swisscyberstorm #SCS23
Speaking @swisscyberstorm 2023
@_YLMV_ : “The Human OS: U Can’t Tech This”
Specialist in human cyber risk, legal data protection and digital ethics; look for her TEDx talk “Why Burnout Culture is a Cyber Risk”!
Program: https://t.co/caloImlPHt
Tix: https://t.co/CI6KHF1jsW
#SCS23
Organisations wanting to strengthen their #cybersecurity can learn much from @MJGold’s 3 keys for repeated success:
1. Be consistent
2. Know yourself
3. Commit to the fundamentals
This will give you the confidence needed to face and overcome challenges.
@Infosecurity