Rakshit Trivedi @rstriv - Twitter Profile

Pinned Tweet

7 days ago

As increasingly capable AI systems are deployed, humans, institutions, and other AI systems adapt in response — i.e. the world pushes back. So is capability still the central safety challenge for AI? We think not. We believe the harder challenge is coexistence. The current AI research paradigm treats the world as a stationary source of feedback, what we refer to as the solipsistic approach to AI design. This raises serious risks for coexistence. In our new #ICML2026 paper, we argue that superintelligence — an extremely capable task solver, built through such a solipsistic approach — is unlikely to be cooperative. 🧵

rstriv retweeted

Cas (Stephen Casper)

@StephenLCasper

1 day ago

There are really interesting academic questions emerging around AI and epistemic risks. I only fear that, by the time we reach consensus, we will be too dumb to understand it.

rstriv retweeted

Séb Krier

@sebkrier

5 days ago

People do not coordinate only through broad legal rules and prices. Hayek emphasized abstract rules that allow people to coordinate like property, contract, trade, and competition. But Lachmann also emphasized the practical secondary institutions people orient their plans around, like banks, standardized contracts, product categories, and so on.

In software, an abstraction boundary is an interface that hides complexity hiding beneath. In this 2005 paper (https://t.co/FPxYCFUSPT), Miller and Tulloh explain that you can apply this concept to markets too. Consider a post office: there's an abstract boundary that separates why the customer wants to mail something (which the postman doesn't need to know); what the shared transaction is (all the recognizable steps and commitments involved in sending mail); and how the postal system actually delivers it (complex logistics network hidden from the customer).

The middle part is what lets the user benefit from the postal system’s expertise without having to learn postal logistics. The boundary defines the shared 'what' but also separates the customer’s 'why' from the provider’s 'how'. Not only that, but reusability means the same institution can be used to satisfy many purposes (birthday invites, subpoenas etc) and polymorphism means different providers can satisfy the same need and compete (UPS, FedEx etc).

An important question in institutional theory is how societies achieve both stability and adaptation; the paper authors say that the solution is stable interfaces allow changing internals. I find this very intuitive: when companies don't evolve/change from the inside much, you get ossification and insufficient adaptation. When laws change too much and institutions are unstable, uncertainty affects market confidence.

The people who are good at redrawing abstraction boundaries are entrepreneurs, who notice when existing categories are wrong and will invent new ones to remedy faults or address demand. What has always saddened me is how poorly rewarded and incentivized political entrepreneurship is. Part of the reason why is that this is hard: market abstraction boundaries are often disciplined by exit, entry, profit/loss, customer choice, and provider competition - but these feedback loops are much weaker in the public sector.

I hope we'll see a lot more of this in the coming decade. In fact this is something that AI will hugely facilitate, since it can lower the cost of articulating and prototyping new abstraction boundaries. We've already seen minor examples through e.g. citizens creating websites/services that compete with government ones. Though usually this is to make state services more legible rather than changing the boundaries in the first place.

I think if people want the future to go well, bolstering state capacity and enabling more innovation on the governance/democracy side of things will be critical. People don't really like this because it's a slow process, but I think they're wrong (and cheems), and playing the 'urgency of AGI' card to bypass this through a de facto state of emergency will cause lasting harms, partly by weakening institutional learning, public trust, and future coordination capacity.

rstriv retweeted

Cooperative AI Foundation

@coop_ai

6 days ago

How does democratic accountability work if institutions are run by agents? Join @bakkermichiel (@MIT) for his seminar on Tuesday 16 June exploring 'Closing the Democratic Loop: Automated Oversight for the AGI Era'. Link below. https://t.co/0qid1mcG1i

Rakshit Trivedi

@rstriv

7 days ago

📄 Paper: https://t.co/0OsNwFqggP Work done in collaboration with my wonderful coauthors @natashajaques, @locross, Sasha Vezhnevets, and @jzl86. Very excited to present this at #ICML 2026. If you are visiting, come say hi at our poster session. We would love to discuss!

Rakshit Trivedi

@rstriv

7 days ago

The paper concludes by tackling several counterarguments such as:
- multi-actor designs may have worse failure modes
- competitive pressure may produce cooperation naturally
- the empirical track record may not justify alarm
- scale may solve interaction dynamics
- RLHF may already train cooperative behavior

These are serious objections. Our response is that each misses how deployment changes the game. 12/n

rstriv retweeted

Cas (Stephen Casper)

@StephenLCasper

about 1 year ago

🚨New paper led by @aribak02

Lots of prior research has assumed that LLMs have stable preferences, align with coherent principles, or can be steered to represent specific worldviews. No ❌, no ❌, and definitely no ❌. We need to be careful not to anthropomorphize LLMs too much. https://t.co/xISZDU4e5o

rstriv retweeted

Daphne Cornelisse

@daphne_cor

over 1 year ago

Sim agents are key for developing autonomous systems for safety-critical systems, like self-driving cars. We're open-sourcing sim agents that achieve a 99.8% success rate with < 0.8% failures on the Waymo Dataset. These agents are built through scaling self-play.

rstriv retweeted

Cooperative AI Foundation

@coop_ai

over 1 year ago

The development and widespread deployment of advanced AI agents will give rise to multi-agent systems of unprecedented complexity. A new report from staff at CAIF and a host of leading researchers explores the novel and under-appreciated risks these systems pose. Details below. https://t.co/XxOhyZIcFV

rstriv retweeted

Atoosa Kasirzadeh

@Dr_Atoosa

over 1 year ago

In this review paper, we advocate for the normalization of AI safety as an inherent component of AI development and deployment. AI safety should be a standard practice integrated into every stage of AI creation and deployment. Developing and deploying safe AI should be a universal priority for everyone. Read our preprint here: https://t.co/57lWzoFuWm
