This is a remarkable milestone: our agent can work on a research problem for a very long time, then come back and tell us whether it has succeeded or failed! We visualize the inference cost Aletheia decided to spend on each candidate solution (as a multiple of the inference cost for solving Erdős-1051; see our previous work https://t.co/yEXZdMSXuW).
P7 is extremely interesting. It has been an open problem for several years, and nobody else came close to solving it in the FirstProof contest per @tonylfeng. We initially thought Aletheia had no chance; it turned out it was right! Aletheia spent the most compute on P7, 16x the amount we used for Erdős-1051. Remarkably, per @kimshmath, "This was the first case that I have ever seen that an AI applies several deep mathematical results (by Cartan/Leray/Borel/Atiyah/Quillen/Novikov/Kasparov...) flawlessly. It is a very unique instance."
Thrilled to share: #Aletheia, our math research agent, just solved 6/10 notoriously hard FirstProof problems autonomously, the best result in the inaugural challenge! To me, this is even bigger than our historic IMO-gold achievement last year; these problems challenge even top mathematicians. We share our results transparently, see paper and full thoughts in the thread. 👇
Yes, we provided 3 things for AI-assisted math:
* Human-AI interaction (HAI) card (photo), inspired by model cards
* Full transcripts https://t.co/NvO8p4Wiva
* A label for novelty-autonomy, inspired by SAE Levels of autonomy, see #Aletheia paper https://t.co/8pLHmZZQO4
6 months in, after the IMO-gold achievement, I’m very excited to share another important milestone: AI can help accelerate knowledge discovery in mathematics, physics, and computer science! We’re sharing two new papers from @GoogleDeepMind and @GoogleResearch that explore how Gemini #DeepThink together with agentic workflows can empower mathematicians and scientists to tackle professional research problems. Some highlights:
The first paper built a research agent #Aletheia, powered by an advanced version of Gemini Deep Think, that can autonomously produce publishable math research and crack open Erdős problems.
The second paper, built on similar agentic reasoning ideas, helped resolve bottlenecks in 18 research problems, across algorithms, ML and combinatorial optimization, information theory and economics.
See the thread for details about the two papers and the joint blog post.
A new proof reveals a surprising new link between graph theory and the Fourier transform. “It is a little bit like the moon landing or the 4-minute mile,” said Tom Sanders of the University of Oxford. “It’s not clear ahead of time what this is going to open up.”
https://t.co/hkZa9ueMIL
https://t.co/CikpPtCIOx
The concluding remark from the introduction (I didn't write this part, but I could not agree more with it):
"... we caution against overexcitement about its mathematical significance. (1/3)
... As AI-generated mathematics grows, the community must remain vigilant of “subconscious plagiarism”, whereby AI reproduces knowledge of the literature acquired during training, without proper acknowledgment. (2/3)
* Strongly Polynomial Time Complexity of Policy Iteration for L∞ Robust MDPs
https://t.co/33MxUt3Ln1
Humans found a theorem in complexity theory, and AI produced a generalization. (6/6)
Six very recent AI-math papers from our team:
* Irrationality of rapidly converging series: a problem of Erdős and Graham
https://t.co/gdEUfyXDkc
Erdős problem 1051 was solved and generalized jointly by an AI and human authors. It required a *lot* of human effort, especially in the generalization part. (1/6)
* Arithmetic volumes of moduli stacks of shtukas
https://t.co/GthiTPmgHb
A difficult computation was first done by humans by brute force, and then done again elegantly by AI. In the end, the authors chose the AI proof for the draft. (5/6)
* Eigenweights for arithmetic Hirzebruch Proportionality
https://t.co/CBT6XtySlJ
Fully autonomously, AI was able to compute certain arithmetic-analytic quantities. (4/6)
* Lower bounds for multivariate independence polynomials and their generalisations
Collaborating with AI, the authors (including 이준경) prove a fundamental inequality regarding multivariable versions of independence polynomials. (3/6)
* Semi-Autonomous Mathematics Discovery with Gemini: A Case Study on the Erdős Problems
https://t.co/CikpPtCIOx
We describe our experiences using AI for mathematics research. We explain why we should care, and why we should still *not be overexcited*. (2/6)