100% agree on the productivity boost. One just needs patience to correct mistakes, which are more subtle than before imo.
I had a nice interaction with GPT-5-pro while proving a convex analysis lemma: https://t.co/96qnbqeTca
The model didn’t write the full proof, but the interaction was interesting enough for me to write a short report about it. The report illustrates both the productivity gain and the need for careful proof-checking. The model’s contributions are in blue, and the full chat is in the Appendix. You will see my prompts and how I think, so, no judgement please :)
The problem itself has a history in optimal transport (see intro) and comes from a question I was discussing with some UCLA math professors last summer. Simpler than @ErnestRyu's recent result imo, but still very useful in optimal transport!
Totally agree with @ErnestRyu that AI helpers will become very useful for research. But in the near future the biggest help will be with *informal* math, the kind we work out with our collaborators/grad students on a whiteboard. I already use frontier models to help write/debug lemmas for my papers and lectures. AI is fast, but can also misunderstand. So we still have to carefully check the lemma statements and proofs. But already a big productivity boost. (Lean provers will automate the proof checking, but the human will still need to check that the Lean formalization accurately captures their intent, which humans will be doing for a while.)
ChatGPT solved an optimization problem that puzzled me for a very long time. Back in summer 2020, I started to work with coauthors on algorithms for optimal transport. We then invented a very simple and elegant algorithm: Bregman Douglas-Rachford splitting method (BDRS).
Imagine that projected gradient descent (PGD) was a new method, discovered today. How would that feel? This is a textbook algorithm... What further research, extensions, improvements and variants would this enable?
In fact, together with Kaja Gruntkowska and Hanmin Li, we have just discovered a sister method to projected gradient descent -- one of equal conceptual importance.
Our method admits the same or very similar guarantees as PGD. However, instead of relying on projections onto the constraint, it relies on linear minimization!
You may say: Did you rediscover Frank-Wolfe?
No.
In contrast to Frank-Wolfe, which uses a global linear minimization oracle (global LMO), our method relies on a local minimization oracle (local LMO). For this reason, we simply call the method "Local LMO" (admittedly, conflating the oracle name with the method name).
Frank-Wolfe theory is much more limited than the theory of Local LMO. Here are some key differences:
1) Frank-Wolfe only works if the constraint set is bounded, and its convergence theory depends on the diameter of the constraint set. Local LMO works even for unbounded constraints, and its theory does not depend on the diameter of the constraint set.
2) In fact, Local LMO reduces to gradient descent (GD) in the unconstrained case. If the constraint is affine, Local LMO reduces to (preconditioned) GD in the affine space.
3) While Frank-Wolfe does not converge linearly for smooth strongly convex functions, Local LMO does.
4) While Frank-Wolfe does not converge for non-smooth convex problems (its theory depends on a curvature assumption), Local LMO does.
https://t.co/znljMkSMqC
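To make the comparison above concrete, here is a minimal sketch of the two textbook baselines it mentions: projected gradient descent (projection oracle) and Frank-Wolfe (global LMO). It is not the Local LMO method, whose update rule is not given in this thread. The toy problem (minimizing 0.5 * ||x - b||^2 over the Euclidean unit ball), the step sizes, and all function names are assumptions chosen purely for illustration.

```python
import math

def grad(x, b):
    # Gradient of the toy objective f(x) = 0.5 * ||x - b||^2 (an assumed example).
    return [xi - bi for xi, bi in zip(x, b)]

def project_ball(x, radius=1.0):
    # Euclidean projection onto the ball {x : ||x|| <= radius}.
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= radius:
        return list(x)
    return [radius * xi / norm for xi in x]

def lmo_ball(g, radius=1.0):
    # Global LMO over the ball: argmin_{||s|| <= radius} <g, s> = -radius * g / ||g||.
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return [0.0] * len(g)
    return [-radius * gi / norm for gi in g]

def pgd(b, steps=100, step_size=1.0):
    # Projected gradient descent: gradient step, then projection onto the constraint.
    x = [0.0] * len(b)
    for _ in range(steps):
        g = grad(x, b)
        x = project_ball([xi - step_size * gi for xi, gi in zip(x, g)])
    return x

def frank_wolfe(b, steps=100):
    # Frank-Wolfe: call the global LMO, then move toward its output
    # with the classical step size 2 / (k + 2).
    x = [0.0] * len(b)
    for k in range(steps):
        s = lmo_ball(grad(x, b))
        gamma = 2.0 / (k + 2.0)
        x = [(1.0 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x

x_pgd = pgd([3.0, 4.0])
x_fw = frank_wolfe([3.0, 4.0])
```

For b = (3, 4) the constrained minimizer is b / ||b|| = (0.6, 0.8), and both loops reach it. Note that `lmo_ball` only makes sense because the ball is bounded, which is the limitation of Frank-Wolfe raised in point 1.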
🎉 Applications are open for MLSS 2026 — the 50th edition of the Machine Learning Summer School!
Join us in Tübingen for world-class lectures, hands-on sessions, and an amazing ML community. 🧠
Apply now 👉 https://t.co/21kvjHNdc5
#MLSS2026 #MachineLearning #SummerSchool #ML
We'll be organizing the Machine Learning Summer School in Tübingen to be held Aug 31st-Sept 11th, featuring top speakers across academia and industry. If you are a student or ML researcher, save those dates and stay tuned for updates! 🚀
Following my visit last month, I've just arrived in Berkeley again!
At 9:30am PT today, I am giving the opening keynote talk at the Simons Institute workshop "Learning from Heterogeneous Sources".
https://t.co/M3reOFx5f5
Title of my talk: "From the Ball-proximal (Broximal) Point Method to Efficient Training of LLM". Abstract: https://t.co/eoFsEyos2D
During my February visit, I gave a tutorial on "Federated Optimization" at the "Federated and Collaborative Learning Boot Camp". Recordings of my lectures are available on the Simons Institute YouTube channel:
Part 1: https://t.co/N47t85Mhh8
Part 2: https://t.co/kxiqicKWAi
Part 3: https://t.co/3TzwbMF1Ac
📢New paper out!
We propose an inference algorithm for diffusion models that does not explicitly depend on the ambient dimension and converges exponentially fast. That’s because, unlike most of the competition, we solve the reverse ODE via Picard iteration and not via Euler discretization.
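For readers unfamiliar with the distinction, here is a toy sketch of Euler discretization versus Picard iteration on a scalar ODE. It is not the paper's algorithm: the test equation x' = -x, the grid size, the trapezoidal quadrature, and the function names are all assumptions made for illustration.

```python
import math

def euler(f, x0, t_end, n):
    # Explicit Euler: march forward one small step at a time.
    h = t_end / n
    x = x0
    for _ in range(n):
        x = x + h * f(x)
    return x

def picard(f, x0, t_end, n, sweeps):
    # Picard iteration: repeatedly apply x_{k+1}(t) = x0 + integral_0^t f(x_k(s)) ds
    # to the whole trajectory on a fixed grid (trapezoidal quadrature, an assumed choice).
    h = t_end / n
    traj = [x0] * (n + 1)  # initial guess: the constant trajectory
    for _ in range(sweeps):
        fx = [f(x) for x in traj]  # all evaluations of f in a sweep are independent
        new_traj = [x0]
        integral = 0.0
        for i in range(1, n + 1):
            integral += 0.5 * h * (fx[i - 1] + fx[i])
            new_traj.append(x0 + integral)
        traj = new_traj
    return traj[-1]

def decay(x):
    # Right-hand side of the toy ODE x' = -x, whose exact solution is exp(-t).
    return -x

exact = math.exp(-1.0)
x_euler = euler(decay, 1.0, 1.0, 100)
x_picard = picard(decay, 1.0, 1.0, 100, 20)
```

Each Picard sweep refines the entire trajectory at once, so the evaluations of `f` within a sweep can run in parallel, whereas Euler steps are inherently sequential. On this toy problem a handful of sweeps already matches exp(-1) closely; how this carries over to the reverse ODE of a diffusion model is the subject of the paper, not of this sketch.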
I’ll work to make ChatGPT a better tool for accelerating scientific and mathematical discoveries. If you come across failure cases to improve upon (or exciting success stories) please send them my way!
@vladtenev @HarmonicMath Okay I had a quick look. How does Lean stay up to date with mathematical literature? That’s probably not a big deal for IMO problems, but it's a big deal for math research. All the theorems in my proof are 50+ years old, yet Lean doesn’t know them.
I firmly believe we are at a watershed moment in the history of mathematics. In the coming years, using LLMs for math research will become mainstream, and so will Lean formalization, made easier by LLMs. (1/4)
I crossed an interesting threshold yesterday, which I think many other mathematicians have been crossing recently as well. In the middle of trying to prove a result, I identified a statement that looked true and that would, if true, be useful to me. 1/3
@PiusSprenger @ErnestRyu I think it’s just that many researchers work in both AI and convex optimization, because these are neighboring fields. For example, @ErnestRyu and I have both published in convex optimization journals and AI conferences.
@hayou_soufiane No I haven't. My original goal was to prove the result, not to evaluate GPT-5. Also, I don't know if I can behave with the other LLM as I behaved with GPT-5, since now I know the proof.