Peter Wallich

@PeterWallich

Accelerating AI safety research & building talent pipelines @ConstellOrg. Expert advisor @MITAIRisk. Ex UK AISI, BCG. Views my own. Likes/RTs != endorsements.

London

Joined April 2019

576 Following

143 Followers

8 Posts

Peter Wallich

@PeterWallich

22 days ago

@geoffreyirving Sad to see you leave, Geoffrey. Thanks again for all you have done for AISI. Hope to see you in the Bay Area and excited to learn more about the new org!

PeterWallich retweeted

Uzay Macar

@uzaymacar

about 2 months ago

🧵New Anthropic Fellows research: We studied mechanisms of "introspective awareness" in LLMs. LLMs can sometimes detect steering vectors injected into their residual stream. But is this worthy of being called introspection, or attributable to some uninteresting confound?👇


420

335

47K

PeterWallich retweeted

billy @billyhumblebrag

about 2 months ago

Haha those doofuses at ai2027 predicted we'd have professional level hacking abilities and the top ai company would be at $26B in revenue in May 2026. It's April and we already have superhuman hacking and $30B in revenue, why would you take forecasters this bad seriously???


280

594

186K

PeterWallich retweeted

Yong Zheng-Xin

@yong_zhengxin

2 months ago

🚨New paper! How safe and aligned is Kimi K2.5? We found concerning dual-use capabilities, sabotage and self-replication tendencies, political censorship on Chinese-language queries, and potential agentic misuse risks. (1/N)


105

23K


Peter Wallich

@PeterWallich

2 months ago

Super excited for this program!

🚀Henry is leading AI Safety Research Programs

@sleight_henry

2 months ago

🚀 Applications are now open: Constellation's Astra Fellowship 🚀 Fully funded, 5-month fellowship at our Berkeley research institute. Pair with mentors across empirical AI safety research, strategy, and governance at @ConstellOrg! 📅 Apply by May 3rd (begins Sep 2026) 🔗 https://t.co/pxtOduDBFh


168

232K

PeterWallich retweeted

Anthropic

@AnthropicAI

6 months ago

We’re opening applications for the next two rounds of the Anthropic Fellows Program, beginning in May and July 2026. We provide funding, compute, and direct mentorship to researchers and engineers to work on real safety and security projects for four months.


112

350
