Prathmesh Pandey @file_mutex - Twitter Profile

1 day ago

@charliermarsh both will eventually end up measuring the same thing; as the latter tends to 100%, the former will tend to zero.


Prathmesh Pandey

@file_mutex

1 day ago

In short, Anthropic is asking for an IOC-like distribution mechanism controlled by the US govt. What tech exists, and who is allowed to share what with whom, would essentially need to be controlled.

Andrew Curran

@AndrewCurran_

1 day ago


Anthropic says Recursive Self Improvement is approaching faster than they expected.

Quoting from the blog:

'What should we do?

If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe. Without a global coordination mechanism, companies and governments will have to make difficult decisions about safety while under competitive and geopolitical pressures.

We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology. The Anthropic Institute will conduct research—in collaboration with many others—and take actions to help build the systems that a credible slowdown or pause would require. These systems would enable frontier AI developers to verify that others globally have actually stopped or slowed, and that a bad actor could not use the auspices of a coordinated slowdown to jump ahead in secret. If such systems existed, we expect that we would slow down or temporarily pause, if other developers at or near the frontier also did so in a verifiable manner.

A meaningful slowdown or pause would require multiple well-resourced labs at or near the frontier, in multiple countries, agreeing to stop under the same conditions. It would also require that each can verify that the others have actually stopped. Due to the unique characteristics of AI systems, the detectability (a lower standard than verifiability) element of this arms control problem is much more challenging than with other technologies. Training runs are far easier to conceal than missile silos, their inputs are general-purpose, and the incentive to defect quietly is enormous, because whoever continues while others pause could inherit the lead. A credible pause also has to specify what triggers it, what lifts it, and who adjudicates.

None of this is necessarily impossible in principle—the world has built verification regimes for other complex technologies (e.g., the Intermediate-Range Nuclear Forces Treaty)—but those regimes took decades to build both the infrastructure and the trust. We don’t have that long. A unilateral pause by one lab, by contrast, is achievable immediately, but accomplishes much less: it would change who the front-runner is, but it would not create the wider deliberative process that is currently missing.

In the coming months, we will organize conversations where policymakers, researchers, civil society, and other AI companies can help answer some of the questions this piece raises, especially around full recursive self-improvement and how to create better options for coordination and deliberation. We’ll publish what comes out of it. The window to investigate the questions together is here, and people outside AI companies should be involved in this deliberation.'


Prathmesh Pandey

@file_mutex

1 day ago

@scaling01 was clear to anyone paying attention to Anthropic's public GitHub PRs https://t.co/C0uUgyhvIu

Prathmesh Pandey

@file_mutex

about 2 months ago

My hunch was correct. Anthropic had been testing Mythos since February 24 -- and this model is literally a beast. I won't be surprised if you can point it at an existing medium-scale system, and it iteratively builds a faster version of it over a few days. https://t.co/YENvEpRfAA


Prathmesh Pandey

@file_mutex

2 days ago

markets are sitting on multiple trillions in liquidity -- a few billion are peanuts. do your dd

Chandra R. Srikanth

@chandrarsrikant

2 days ago

Came and scooped up $45 Billion, with plans for another $40 Billion. Used the window just ahead of three mega IPOs: SpaceX, Anthropic and OpenAI.



Prathmesh Pandey

@file_mutex

2 days ago

Devs burning $10K of compute on a $200/mo plan are the biggest risk to model providers. Unlike sticky chat users, devs have zero loyalty and will easily migrate the second someone else drops a better coding model.

Peter Gostev

@petergostev

2 days ago

Who is using more compute - 1B ChatGPT users or 5M Codex users?


Prathmesh Pandey

@file_mutex

4 days ago

@davidcrawshaw scratch VMs?


Prathmesh Pandey

@file_mutex

4 days ago

@uzairansar or you can symlink one of those


Prathmesh Pandey

@file_mutex

4 days ago

@MillionInt why do you want your agent to continually learn?


Prathmesh Pandey

@file_mutex

4 days ago

@_arohan_ that's what will happen if you keep going to xoogler companies


Prathmesh Pandey

@file_mutex

4 days ago

@bryancsk dude no one says gemini is winning lol


Prathmesh Pandey

@file_mutex

4 days ago

@RihardJarc Google has become a bureaucratic hell -- but the unique thing about them is the number of knobs they can turn to incrementally squeeze out higher revenue each quarter. I expect the stock to double from here, but eventually the company will be sold for parts.


Prathmesh Pandey

@file_mutex

4 days ago

@Conor_D_Dart Yes, @AnthropicAI did the right thing, and I had pointed out this exact issue with @OpenAI's resets https://t.co/v7MZQgQEM6

Prathmesh Pandey

@file_mutex

2 months ago

This is simply unethical. While the token limit was reset to 100%, you also pushed back the weekly timer. This means I just effectively lost 3 days because my clock was restarted midweek. @sama


Prathmesh Pandey

@file_mutex

4 days ago

@doodlestein true, for me the quota seems to be filling up 3-4x faster than the same workflow would normally take.


Prathmesh Pandey

@file_mutex

12 days ago

+1. IME, GPT-5.5 Xhigh consistently beats Claude Opus 4.7 Max and Gemini 3.1 Pro. Like 99 out of 100 times.

Jackson Atkins

@JacksonAtkinsX

13 days ago

My current experience with coding models.


Prathmesh Pandey

@file_mutex

15 days ago

@fooobar million-line has been done already. in fact, frontier harnesses can handle 10x that with ease.


file_mutex retweeted

OpenAI

@OpenAI

17 days ago

Today, we share a breakthrough on the planar unit distance problem, a famous open question first posed by Paul Erdős in 1946. For nearly 80 years, mathematicians believed the best possible solutions looked roughly like square grids. An OpenAI model has now disproved that belief, discovering an entirely new family of constructions that performs better. This marks the first time AI has autonomously solved a prominent open problem central to a field of mathematics.


Prathmesh Pandey

@file_mutex

18 days ago

Conflict of interest anybody?

Financial Times

@FT

18 days ago

Google DeepMind’s Demis Hassabis emerges as early Anthropic investor https://t.co/Oy5UO5Aes5


Prathmesh Pandey

@file_mutex

18 days ago

Gemini TL wants you to derive the Chinchilla scaling laws from first principles before he'll talk to you. Imagine the guts when Gemini is a distant third behind ChatGPT and Claude. Promo maxers lol.

Vlad Feinberg

@FeinbergVlad

19 days ago

How to land a job at a frontier lab https://t.co/oHIqLgBMbC


Prathmesh Pandey

@file_mutex

23 days ago

won't need a jack to change the tires lol

CCL

@CCL2K30

24 days ago

BYD Leopard Bao 8 - redefine off-road


Prathmesh Pandey

@file_mutex

about 1 month ago

The QoS on Gemini is horrendously bad. You may have your entire quota available but you can't use any of it because @GeminiApp can't figure out how to serve traffic. I guess they are diverting compute to another chat app.

Prathmesh Pandey

@file_mutex

about 1 month ago

@sundarpichai dude when will you fix the Gemini throttling issues? the service uptime is abysmal.

