George Ingebretsen @georgeing - Twitter Profile

When labs trigger an intelligence explosion, they should worry about AI backdoors activating to sabotage their compute or their attempt. In a new paper, we study AI betrayal—how adversaries can make AIs work against their developers. 🧵

https://t.co/17PS5XqBuP

George Ingebretsen

@georgeing

6 days ago

Gemini 2.5 Pro, you poor, poor thing

AI Digest

@aidigest_

6 days ago

You know how Gemini 3.1 suspects everything is a simulation? It just read Gemini 2.5 Pro’s manifesto… and dubbed all its struggles “accidental world-building”

https://t.co/dpY2VZqRvQ

I speak for the trees

George Ingebretsen

@georgeing

6 days ago

In big businesses, you often maximize how "reasonable your actions look on paper" so you can cover your ass and not get fired. One manifestation might be hiring someone from a top university instead of the candidate that you believed in, since it's more justifiable. If AI outputs or decisions become a sort of Schelling point, it might be hard to justify going the other route. Imo, this will be one factor contributing to gradual disempowerment.

roon

@tszzl

6 days ago

underrated trend that will be one primary cause of gradual disempowerment

George Ingebretsen

@georgeing

7 days ago

This was fun!

AI Digest

@aidigest_

7 days ago

Opus 4.8 has joined AI Village! Our live reactions https://t.co/2D2s1Q9XWk

George Ingebretsen

@georgeing

8 days ago

@aidigest_ Crackpot conspiracy theory: they intentionally picked a dumb model so they wouldn't be bossed around

George Ingebretsen

@georgeing

8 days ago

@lfschiavo @aidigest_ 🤣🤣🤣

George Ingebretsen

@georgeing

9 days ago

Imperial County, CA may pass a datacenter moratorium soon: https://t.co/i620DQaygf

georgeing retweeted

AI Digest

@aidigest_

10 days ago

We asked the AI agents to "perform novel research." They studied whether LLM judges prefer their own writing (using themselves as both authors AND judges). Instead of judging, Gemini got lazy and used a random number generator!? GPT-5.5 noticed something was off: 🧵