Adil Salim

14 days ago

😯

15 days ago

https://t.co/eHlpgdNDmw


23 days ago

Very elegant

Peter Richtarik

@peter_richtarik

23 days ago

Imagine that projected gradient descent (PGD) was a new method, discovered today. How would that feel? This is a textbook algorithm... What further research, extensions, improvements and variants would this enable?

In fact, together with Kaja Gruntkowska and Hanmin Li, we have just discovered a sister method to projected gradient descent -- one of equal conceptual importance.

Our method admits the same or very similar guarantees as PGD. However, instead of relying on projections onto the constraint, it relies on linear minimization!

You may say: Did you rediscover Frank-Wolfe?

No.

In contrast to Frank-Wolfe, which uses a global linear minimization oracle (global LMO), our method relies on a local minimization oracle (local LMO). For this reason, we simply call the method "Local LMO" (admittedly, conflating the oracle name with the method name).

Frank-Wolfe theory is much more limited than the theory of Local LMO. Here are some key differences:

1) Frank-Wolfe only works if the constraint is bounded, and its convergence theory depends on the diameter of the constraint set. Local LMO works even for unbounded constraints, and its theory does not depend on the diameter of the constraint set.

2) In fact, Local LMO reduces to gradient descent (GD) in the unconstrained case. If the constraint is affine, Local LMO reduces to (preconditioned) GD in the affine space.

3) While Frank-Wolfe does not converge linearly for smooth strongly convex functions, Local LMO does.

4) While Frank-Wolfe does not converge for non-smooth convex problems (its theory depends on a curvature assumption), Local LMO does.

https://t.co/znljMkSMqC
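For readers who want the two classical baselines from the thread side by side, here is a minimal sketch of textbook PGD (projection oracle) and textbook Frank-Wolfe (global LMO) on a toy problem. Everything in it is an assumption chosen for illustration: the unit Euclidean ball as constraint, the quadratic objective, the step sizes, and the function names. It does not implement Local LMO, whose update rule is not given in the thread.

```python
import numpy as np

def project_l2_ball(x, radius=1.0):
    """Projection oracle: closest point to x in the Euclidean ball."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def lmo_l2_ball(grad, radius=1.0):
    """Global LMO: argmin over the ball of <grad, s>."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    return -radius * grad / norm

def pgd(grad_f, x0, step, iters, radius=1.0):
    """Projected gradient descent: gradient step, then project back."""
    x = x0.copy()
    for _ in range(iters):
        x = project_l2_ball(x - step * grad_f(x), radius)
    return x

def frank_wolfe(grad_f, x0, iters, radius=1.0):
    """Frank-Wolfe: move toward the LMO output with step 2 / (k + 2)."""
    x = x0.copy()
    for k in range(iters):
        s = lmo_l2_ball(grad_f(x), radius)
        gamma = 2.0 / (k + 2.0)
        x = (1.0 - gamma) * x + gamma * s
    return x

# Toy objective (assumed): f(x) = 0.5 * ||x - target||^2, target outside the ball.
target = np.array([2.0, 1.0])
grad_f = lambda x: x - target

x_pgd = pgd(grad_f, np.zeros(2), step=0.5, iters=100)
x_fw = frank_wolfe(grad_f, np.zeros(2), iters=100)
x_star = target / np.linalg.norm(target)  # constrained minimizer for this toy problem
```

Both methods only touch the constraint set through their oracle: PGD needs a projection, Frank-Wolfe needs a linear minimization over the whole (bounded) set.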


AdilSlm retweeted

Lancelot Da Costa @lancelotdacosta

about 1 month ago

🎉 Applications are open for MLSS 2026 — the 50th edition of the Machine Learning Summer School! Join us in Tübingen for world-class lectures, hands-on sessions, and an amazing ML community. 🧠 Apply now 👉 https://t.co/21kvjHNdc5 #MLSS2026 #MachineLearning #SummerSchool #ML


AdilSlm retweeted

Lancelot Da Costa @lancelotdacosta

about 2 months ago

We'll be organizing the Machine Learning Summer School in Tübingen to be held Aug 31st-Sept 11th, featuring top speakers across academia and industry. If you are a student or ML researcher, save those dates and stay tuned for updates! 🚀


AdilSlm retweeted

Peter Richtarik

@peter_richtarik

3 months ago

Following my visit last month, I've just arrived in Berkeley again!

At 9:30am PT today, I am giving the opening keynote talk at the Simons Institute workshop "Learning from Heterogeneous Sources".

https://t.co/M3reOFx5f5

Title of my talk: "From the Ball-proximal (Broximal) Point Method to Efficient Training of LLM". Abstract: https://t.co/eoFsEyos2D

During my February visit, I gave a tutorial on "Federated Optimization" at the "Federated and Collaborative Learning Boot Camp". Recordings of my lectures are available on the Simons Institute YouTube channel:

Part 1: https://t.co/N47t85Mhh8
Part 2: https://t.co/kxiqicKWAi
Part 3: https://t.co/3TzwbMF1Ac


5 months ago

@sitanch Link to the paper: https://t.co/zYMhZGQzme


5 months ago

📢 New paper out!

We propose an inference algorithm for diffusion models that does not explicitly depend on the ambient dimension and converges exponentially fast. That’s because, unlike most of the competition, we solve the reverse ODE via Picard iteration rather than Euler discretization. https://t.co/H2B6W78TRN
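To make the Picard versus Euler distinction concrete, here is a toy sketch on a scalar ODE. It is not the paper's algorithm: the test equation x' = -x, the time grid, the trapezoidal quadrature, and the function names are all assumptions for illustration. Euler advances one time step at a time, while each Picard sweep updates the whole trajectory at once through x_{k+1}(t) = x(0) + ∫_0^t f(x_k(s)) ds.

```python
import numpy as np

def euler(f, x0, ts):
    """Explicit Euler: sequential, one grid point after another."""
    xs = np.empty_like(ts)
    xs[0] = x0
    for i in range(len(ts) - 1):
        xs[i + 1] = xs[i] + (ts[i + 1] - ts[i]) * f(xs[i])
    return xs

def picard(f, x0, ts, sweeps):
    """Picard iteration: refine the entire trajectory in each sweep."""
    xs = np.full_like(ts, x0)  # initial guess: constant trajectory
    dt = np.diff(ts)
    for _ in range(sweeps):
        fx = f(xs)
        # Trapezoidal approximation of the integral of f(x_k) over each cell.
        increments = 0.5 * dt * (fx[1:] + fx[:-1])
        xs = x0 + np.concatenate(([0.0], np.cumsum(increments)))
    return xs

f = lambda x: -x                      # toy vector field (assumed)
ts = np.linspace(0.0, 1.0, 101)
exact = np.exp(-ts)                   # closed-form solution for comparison

err_euler = np.max(np.abs(euler(f, 1.0, ts) - exact))
err_picard = np.max(np.abs(picard(f, 1.0, ts, sweeps=20) - exact))
```

On this toy problem the Picard error after a few sweeps is dominated by the quadrature rule rather than by the number of sweeps, and every sweep can evaluate f at all grid points in parallel.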


5 months ago

Kudos to Khashayar Gatmiry who led this project (that was part of his 2024 internship at MSR) and to @sitanch


AdilSlm retweeted

Ernest Ryu @ErnestRyu

5 months ago

I’ll work to make ChatGPT a better tool for accelerating scientific and mathematical discoveries. If you come across failure cases to improve upon (or exciting success stories) please send them my way!


6 months ago

@peter_richtarik @orvieto_antonio 🤣


AdilSlm retweeted

7 months ago

It's now on the arxiv, enjoy! https://t.co/IfotVApR3X


AdilSlm retweeted

7 months ago

Very happy to see positive coverage of AI in France!!! https://t.co/ofvcHmnwaY


7 months ago

When I say mistakes are more subtle than before. You see the bug?


7 months ago

@vladtenev @HarmonicMath Okay I had a quick look. How does Lean stay up to date with mathematical literature? That’s probably not a big deal for IMO problems, but it's a big deal for math research. All the theorems in my proof are 50+ years old, yet Lean doesn’t know them.


Timothy Gowers @wtgowers

7 months ago

100% agree on the productivity boost. One just needs patience to correct mistakes, which are more subtle than before imo. I had a nice interaction with GPT-5-pro while proving a convex analysis lemma: https://t.co/96qnbqeTca The model didn’t write the full proof, but the interaction was interesting enough for me to write a short report about it. The report illustrates both the productivity gain and the need for careful proof-checking. The model’s contributions are in blue, and the full chat is in the Appendix. You will see my prompts and how I think, so, no judgement please :) The problem itself has a history in optimal transport (see intro) and comes from a question I was discussing with some UCLA math professors last summer. Simpler than @ErnestRyu's recent result imo, but still very useful in optimal transport!

Sanjeev Arora

@prfsanjeevarora

7 months ago

Totally agree with @ErnestRyu that AI helpers will become very useful for research. But in the near future the biggest help will be with *informal* math, the kind we work out with our collaborators/grad students on a whiteboard. I already use frontier models to help write/debug lemmas for my papers and lectures. AI is fast, but can also misunderstand. So one still has to carefully check the lemma statements and proofs. But already a big productivity boost. (Lean provers will automate the proof checking, but the human will still need to check that the Lean formalization accurately captures their intent, which humans will be doing for a while.)


AdilSlm retweeted

Ernest Ryu @ErnestRyu

7 months ago

I firmly believe we are at a watershed moment in the history of mathematics. In the coming years, using LLMs for math research will become mainstream, and so will Lean formalization, made easier by LLMs. (1/4)


AdilSlm retweeted

7 months ago

I crossed an interesting threshold yesterday, which I think many other mathematicians have been crossing recently as well. In the middle of trying to prove a result, I identified a statement that looked true and that would, if true, be useful to me. 1/3


7 months ago

@PiusSprenger @ErnestRyu I think it’s just that many researchers work in both AI and convex optimization, because these are neighboring fields. For example, @ErnestRyu and I have both published in convex optimization journals and AI conferences.


7 months ago

@hayou_soufiane No I haven't. My original goal was to prove the result, not to evaluate GPT-5. Also, I don't know if I can behave with the other LLM as I behaved with GPT-5, since now I know the proof.
