Ben Harack @benharack - Twitter Profile

Ben Harack @benharack

6 days ago

4.7 -> 4.8 in ~42 days.

0

12

benharack retweeted

Catholic Sat

@CatholicSat

10 days ago

Pope Leo XIV’s address in English at the publication of his Encyclical Letter Magnifica humanitas, on safeguarding the human person in the age of Artificial Intelligence. Do listen to all of it. It is very good.

52

6K

1K

3K

214K

Ben Harack @benharack

11 days ago

@davidad Depends on the game! We should chat sometime about the details here.

0

1

0

9

benharack retweeted

Elon Musk

@elonmusk

28 days ago

Same here. By way of background for those who care, I spent a lot of time last week with senior members of the Anthropic team to understand what they do to ensure Claude is good for humanity and was impressed. Everyone I met was highly competent and cared a great deal about doing the right thing. No one set off my evil detector. So long as they engage in critical self-examination, Claude will probably be good. After that, I was ok leasing Colossus 1 to Anthropic, as SpaceXAI had already moved training to Colossus 2.

1K

28K

2K

3K

3M

Ben Harack @benharack

about 2 months ago

@janet_e_egan @ben_s_bucknall @rosen_br @araujonrenan @BoulaninSIPRI @FazlBarez @Corin_Katzke I'm pleased to report that Lawfare has published a piece of mine that aims to answer the "So what?" for AI verification. I think it will help us build and maintain prosperity, governance, and peace. https://t.co/btNFFqvnAC

0

1

0

136

Ben Harack @benharack

11 months ago

Governing AI requires international agreements, but cooperation can be risky if there’s no basis for trust. Our new report looks at how to verify compliance with AI agreements without sacrificing national security. This is neither impossible nor trivial.🧵 1/

https://t.co/Xfq2XTSPFz

3

103

35

57

51K

Ben Harack @benharack

11 months ago

16/ @janet_e_egan @ben_s_bucknall @rosen_br @araujonrenan @BoulaninSIPRI Ranjit Lall @FazlBarez Sanaa Alvira @Corin_Katzke Ahmad Atamli Amro Awad /end🧵

1

10

0

431

benharack retweeted

Lawfare

@lawfare

about 2 months ago

.@benharack argues that AI verification through the use of cryptographic tools like confidential computing may enable oversight and good governance without exposing industry secrets.

https://t.co/mklV17h8OG

1

7

5

3

3K

benharack retweeted

Fazl Barez @FazlBarez

7 months ago

🚀New paper: "Chain-of-Thought Hijacking"! We found a universal jailbreak in Reasoning Models and worked with frontier labs to fix them! Our attack achieves >94% attack success rate against ALL leading proprietary models.🤯 1/7

https://t.co/wuZ6hOE24p

2

89

28

40

17K

benharack retweeted

Fazl Barez @FazlBarez

8 months ago

🚨New AI Safety Course @aims_oxford! I’m thrilled to launch a new course called AI Safety & Alignment (AISAA), on the foundations & frontier research of making advanced AI systems safe and aligned, at @UniofOxford. What to expect 👇 https://t.co/r9YHS3XJhR

https://t.co/95hTTXmRQD

6

112

23

51

15K

benharack retweeted

Janvi Ahuja @jn_ahuja

11 months ago

I started this work as a verification skeptic. But, being able to signal benignness (as @Miles_Brundage puts it) will likely be important in both national and foreign policy contexts. Happy to have been a small part of this massive undertaking by @BenHarack.

1

21

1

3

2K

benharack retweeted

Yoshua Bengio

@Yoshua_Bengio

11 months ago

The future of AI governance may hinge on our ability to develop trusted and effective ways to make credible claims about AI systems. This new report expands our understanding of the verification challenge and maps out compelling areas for further work. ⬇️

12

117

20

27

11K
