Puneesh Deora @puneeshdeora - Twitter Profile

Pinned Tweet

6 months ago

Check out our recent work on understanding and comparing generalization of Shampoo/Muon with GD 👇

6 months ago

📢 Late post Our recent work studies when and why spectrum-aware optimizers like Muon generalize better than standard Euclidean gradient descent (GD) 🧵below (1/N)

bhavya_vasudeva's tweet photo. 📢 Late post

Our recent work studies when and why spectrum-aware optimizers like Muon generalize better than standard Euclidean gradient descent (GD)

🧵below (1/N) https://t.co/UnUxVak1Su

1

13

2

3K

0

6

0

1

932

puneeshdeora retweeted

Bhavya Vasudeva @bhavya_vasudeva

5 days ago

📢 Excited to announce the 2nd edition of MOSS workshop at @COLM_conf. We welcome work that highlights the potential and limits of small-scale in the era of large-scale ML. CfP: https://t.co/qi8t8xMiJH Deadline: June 30, 2026 Please help spread the word and see you in SF!

0

17

1

2

3K

puneeshdeora retweeted

Daniel Litt

@littmath

10 days ago

My guess is that eventually AI has advantage over us in all aspects of mathematical research. But I think there are two possible worlds where this doesn't happen soonish: (1) AI exceeds our capabilities in all measurable capabilities, but there are still some (very intangible) intangibles where humans have higher capabilities. In this world we see AI making huge numbers of amazing discoveries but for some weird reason humans sometimes also do so. This could happen if there are some skills that are useful for math research but very hard to measure or train for. I think this is unlikely but possible. (2) Rather than AI exceeding humans in all capabilities, AI becomes more like *one* (or several) amazing mathematician(s). By which I mean, very technically strong but with a certain point of view, which makes them more likely to make certain discoveries than others. In this world humans are useful because of their greater cognitive diversity--they sometimes think of approaches/questions/ideas that don't occur to the AI. This happens now--there are mathematicians much stronger than I am but I'm still useful (plausibly for reasons other than comparative advantage) because I have some different points of view. I think this is closer to what the near-term trend looks like, but the medium-term is less clear. In the (IMO more likely, in the medium-to-long term) world where AI does have absolute advantage over us in all aspects, I don't think research ends. Plausibly we have AI producing huge numbers of papers, asking all sorts of interesting questions, and so on. Probably these questions do not include all questions we might ask, so there's still some role where we ask questions (even if the models are better at this, me still might care about questions they haven't asked)! And IMO the goal of science isn't just "papers," it also includes things like "understanding." Maybe the models have this in their weights or something, but I also want to understand stuff. So we have to extract this understanding from papers or model weights or w/ever. This might look more like hermeneutics than research currently does, but my experience is that trying to understand mathematics that someone else has produced is not actually so different from research. You end up doing a lot of work, developing your own intuition, and so on.

20

280

24

83

34K

Puneesh Deora @puneeshdeora

15 days ago

Now that the hype has settled a little, why is the immediate reaction to dunk on math people the second an AI helps prove a result? Why are some of you so desperate to laugh at mathematicians, what did even they do to you?

5

53

3

7

4K

Who to follow

Aniket Singh

@an1kets1ngh03

IIT Roorkee ‘20 | Engineering Lead @usebarq (10M users) | Join 4000+ software engineers learning how to ace system design interviews 👇

Flip bits for a living! Engineering @Razorpay, previously @Microsoft

puneeshdeora retweeted

Talia Ringer 🕊🪬 @TaliaRinger

19 days ago

I suspect for the next few years, AI for math will be most useful for pushing back against our biases that (1) certain false conjectures are true, and (2) certain approachable problems are intractably hard. Mostly by trying things our judgment is now too clouded to try

4

63

6

7

3K

Puneesh Deora @puneeshdeora

18 days ago

I totally get the speeding up new findings angle with AI in Math, but the process of discovery using the traditional way we've known for a while is very rewarding in itself. It is now being killed at a very fast pace and we're not talking much about it.

0

2

0

85

Puneesh Deora @puneeshdeora

about 1 month ago

Claude be like

Anthropic

@AnthropicAI

about 1 month ago

In fact, NLAs suggest Claude suspects it’s being tested across many of our evaluations, even when it doesn’t verbalize its suspicions.

AnthropicAI's tweet photo. In fact, NLAs suggest Claude suspects it’s being tested across many of our evaluations, even when it doesn’t verbalize its suspicions. https://t.co/dSD429ZlKA

30

1K

97

425

977K

0

5

0

1

610

Puneesh Deora @puneeshdeora

about 1 month ago

@PontiEdoardo @icmlconf Been saying for a while that ACs need to be held more accountable. Arguably the most important role in the current system. Revs have some consequences for not doing their job properly, it's time ACs have some too.

0

8

1

0

2K

Puneesh Deora @puneeshdeora

about 1 month ago

@icmlconf "lowk" "y'all" words I never expected icml account to use. Also, no caps at the start of sentences, missing periods, tough.

1

41

0

1

6K

puneeshdeora retweeted

Kimon Fountoulakis

@kfountou

about 1 month ago

I use GPT-5.5 Pro daily, and I have been using GPT models daily since 5.2. It does pose questions that seem like they could lead to a paper, but then I spend multiple days on these ideas and they flop. The good ideas come from humans. The models don’t have good taste yet.

8

70

5

11

9K

puneeshdeora retweeted

Bhavya Vasudeva @bhavya_vasudeva

about 1 month ago

I'll be presenting our recent work on how transformers learn to do contextual recall @scifordl workshop tomorrow poster session 1, 11:45 am

bhavya_vasudeva's tweet photo. I'll be presenting our recent work on how transformers learn to do contextual recall @scifordl workshop tomorrow poster session 1, 11:45 am https://t.co/xmaq7sbbZi

0

20

7

5

1K

Puneesh Deora @puneeshdeora

about 1 month ago

I'm at #ICLR2026 and will be presenting our recent work on in-context learning analogue of benign overfitting @scifordl workshop tomorrow poster session 1, 11:45 am.

puneeshdeora's tweet photo. I'm at #ICLR2026 and will be presenting our recent work on in-context learning analogue of benign overfitting @scifordl workshop tomorrow poster session 1, 11:45 am. https://t.co/zYl4Ilxb02

0

24

4

1

1K

puneeshdeora retweeted

Bhavya Vasudeva @bhavya_vasudeva

about 2 months ago

2⃣ On when and why Muon generalizes better than (S)GD 🕙 Poster session 6, 3:15pm-5:45pm 📍Pavilion 3 #809 https://t.co/ZV406xSX8w

0

9

4

2

959

puneeshdeora retweeted

Bhavya Vasudeva @bhavya_vasudeva

about 2 months ago

I'm at #ICLR2026 and will present two posters on Day 3: 1⃣ On how Transformer-based language models disentangle and compose latent concepts for in-context learning 🕙 Poster session 5, 10:30am-12pm 📍Pavilion 4 #4008

bhavya_vasudeva's tweet photo. I'm at #ICLR2026 and will present two posters on Day 3:

1⃣ On how Transformer-based language models disentangle and compose latent concepts for in-context learning
🕙 Poster session 5, 10:30am-12pm
📍Pavilion 4 #4008 https://t.co/IWQeddoOIt

1

61

15

26

3K

puneeshdeora retweeted

ICML Conference @icmlconf

2 months ago

Announcing the #ICML2026 tutorials! All ten tutorials will be presented the first day of the conference, Monday July 6. Read the blog post for more details on the selection process!

icmlconf's tweet photo. Announcing the #ICML2026 tutorials!

All ten tutorials will be presented the first day of the conference, Monday July 6.

Read the blog post for more details on the selection process! https://t.co/rHoko8Y3kZ

2

98

21

41

16K

puneeshdeora retweeted

Juno KIM @junokim_ai

2 months ago

Excited to share our new paper on sharp capacity scaling of the Muon optimizer! Joint work with @EshaanNichani Denny Wu @albertobietti @jasondeanlee: https://t.co/v1k1B4mSkG (1/7)

4

125

31

71

21K

Puneesh Deora @puneeshdeora

3 months ago

@f14bertolotti I was looking for a wall clock time comparison instead of just steps

0

2

0

257

Puneesh Deora @puneeshdeora

3 months ago

A big part is learning things, learning how to think, what to think, developing taste. Sending papers in the void is no good, but still formalizing and writing things helps. I'm not sure agent-written code would help with these

Jon Barron

@jon_barron

3 months ago

If I was a grad student today, I would: 1) Not write papers, 2) push my (agent-written) code to a public repo ~weekly, 3) maintain (via agents) a writeup.tex (manually verified) and a skill.md in the repo, and 4) work towards establishing skill usage as the new "citation" format.

16

573

29

470

95K

0

1

0

1

101

puneeshdeora retweeted

Edgar Dobriban @EdgarDobriban

3 months ago

AI is getting great at math, but how good is it at solving real research problems in areas outside of those covered by Erdős problems? Towards gauging this, I have started putting together a list of unsolved research problems in mathematical statistics and machine learning, sourced from recent papers in a leading statistics journal, the Annals of Statistics (with some bonus COLT open problems: https://t.co/aT9kyozYsv. Currently >100 problems. In my view, much of the value of AI for researchers in the mathematical sciences stems from helping with their own research problems. These are problems without known solutions. There are many math benchmarks, but few with the following properties: (1) of a realistic research-level, so that solving them can potentially lead to a publication in a top journal (problems discussed in papers already, not contest math, not Millenium problems, not problems created for a benchmark, not problems that have a known solution); I'd say Erdős problems are the best example of this. (2) cover problems outside of the usual focus (combinatorics, number theory, ... ) of Erdős problems. Especially under-represented are domains of applied math, along with statistics, operations research, etc. I'm interested in statistics and ML, so that's where I started, but this could grow over time. Hope this can grow into something useful to the community! Happy to hear your thoughts...

EdgarDobriban's tweet photo. AI is getting great at math, but how good is it at solving real research problems in areas outside of those covered by Erdős problems? Towards gauging this, I have started putting together a list of unsolved research problems in mathematical statistics and machine learning, sourced from recent papers in a leading statistics journal, the Annals of Statistics (with some bonus COLT open problems: https://t.co/aT9kyozYsv.

Currently >100 problems.

In my view, much of the value of AI for researchers in the mathematical sciences stems from helping with their own research problems. These are problems without known solutions. There are many math benchmarks, but few with the following properties:

(1) of a realistic research-level, so that solving them can potentially lead to a publication in a top journal (problems discussed in papers already, not contest math, not Millenium problems, not problems created for a benchmark, not problems that have a known solution);
I'd say Erdős problems are the best example of this.

(2) cover problems outside of the usual focus (combinatorics, number theory, ... ) of Erdős problems. Especially under-represented are domains of applied math, along with statistics, operations research, etc.
I'm interested in statistics and ML, so that's where I started, but this could grow over time.

Hope this can grow into something useful to the community! Happy to hear your thoughts...

32

433

73

334

55K

Puneesh Deora @puneeshdeora

3 months ago

I can finally pack the 8D oranges in my kitchen I’ve been putting off for a while. Feeling quite relieved and satisfied.

Math, Inc.

@mathematics_inc

3 months ago

We are pleased to share that using Gauss, we have completed a ~200K LOC formalization of Maryna Viazovska’s 2022 Fields Medal theorems on optimal sphere packing in dimensions 8 and 24. This is the only Fields Medal-winning result from this century to be completely formalized, and is the largest single-purpose Lean formalization in history. We are honored to have assisted @SidharthHarihar1 and the rest of the sphere packing team in this achievement. https://t.co/DhGDQzLkpH

45

2K

340

556

410K

1

8

0

550

Puneesh Deora @puneeshdeora

3 months ago

@miniapeur Some of the issues I think we could also fix ourselves; I've seen this lot in papers citing other papers https://t.co/bN32pv1VFr

Puneesh Deora @puneeshdeora

4 months ago

Idk if people know this, but google scholar does not index the second reference format type (first it does), and the second is the bibtex you get from arxiv.

puneeshdeora's tweet photo. Idk if people know this, but google scholar does not index the second reference format type (first it does), and the second is the bibtex you get from arxiv. https://t.co/eRWX7En4as

1

2

0

2

225

0

1

0

143

Puneesh Deora

@puneeshdeora

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users