Researcher in maths+formal methods+ml. Working on using formal verification to train models for reliable mathematics, software, and reasoning @harmonicmath
I used @HarmonicMath's Aristotle to formalize Erdős problem #426 in Lean 4…
and ended up fully verifying a stronger bound that the original paper only suggested 🧵
Erdős offered $25 for a disproof, and $100 if the conjecture was true 👇
"Aristotle's proof is correct, simple, elegant, and beautiful. It uses techniques in the original paper and adds its own new ideas. I am amazed and impressed by what Aristotle has done."
This is what Melvyn Nathanson, a leading additive number theorist and longtime Erdős collaborator, wrote to me after reading solutions by Aristotle (@HarmonicMath) to two problems he had posed earlier this year.
Our paper answers Nathanson's Problems 10 and 11 on product intersection sets in semigroups, and also settles the second parts of Problems 4 and 7 as corollaries.
Kind of crazy. I had a rough idea for an Erdős problem, gave it to GPT-5.4 Pro, went for a walk, came back to a solution. I verified it, formalized it with Aristotle from @HarmonicMath together with @Tomodovodoo. Incredible how powerful these tools are in the right hands!
No amount of testing will ever prove your software is secure. People are waking up to this. Formal verification isn’t academic anymore, it’s inevitable.
Another very cool use case of Aristotle in https://t.co/tgWe6wYQAr, fixing a gap in GPT-5.4's proof. This is exactly why we need verification in the age of LLM-produced mathematics; as LLMs get more reliable and more convincing, we will increasingly use them for harder and harder tasks, and in domains like mathematics where correctness is key we need ways of verifying (and in this case repairing) that can keep up.
A decade ago, I abandoned the first math problem my PhD advisor ever gave me.
This week, I finally solved it—and formally verified it—using @AnthropicAI's Claude Code, @OpenAI's Codex, and @HarmonicMath's Aristotle.
Here’s how AI turned my 10-year-old notes into a 15,000-line Lean 4 proof. 🧵
@wu_s_john Try using `lake exe cache get` to download all the build artifacts for mathlib, looks like your PC is failing to build them all locally for some reason (mathlib is pretty big by now!)
🦾Meet Aristotle Agent, the world’s first autonomous mathematician — live and currently free of charge. We designed Aristotle Agent to solve and formalize the world’s most challenging mathematical research problems. It is now:
☑️#1 in Formal Math: We’re the #1 formal math model according to ProofBench, by @ValsAI, ahead of the closest competitor by 15%. Aristotle Agent can autonomously prove/formalize for up to 24 hrs without human intervention.
☑️Fully Agentic: Give it an English problem and it will prove/formalize from scratch, or it can work and edit files directly inside your Lean project / repository.
☑️GitHub-ready: Aristotle Agent produces repo-quality code; project leads are increasingly merging Aristotle-drafted PRs with no modifications.
Now live across web, CLI, and API. 🔥
"Lean in my view is the best programming language ever created." 😍
Great interview with @tachim and @vladtenev on @CogRev_Podcast!
🔗https://t.co/RL7N58n8I8
For the first time since 2018 I'll be attending an Ethereum venue: EthCC[9]. I'll speak about one critical choice for getting your code formally verified. This will be my response to "So, Computers Can Prove Theorems (in Lean), What Next?" by @AlexJBest.
Today we're donating $300k to @leanprover as the inaugural sponsor!
We believe the future of mathematical reasoning lies in formal verification. Our model, Aristotle, uses Lean to eliminate errors and verify results. We're thrilled to support the tools and people that make safe, accurate Mathematical Superintelligence possible.
JUST IN: You can now submit questions in natural language through our web interface. You no longer need to use the API / TUI.
Give it a try, and let us know what you think!
We formalized FRI soundness in Lean, using @HarmonicMath and Claude Code.
- FRI analysis by @nico_mnbl and collaborators
- turned into a Lean proof by @pirapira
🔥
We’re excited to announce $1,000,000 of sponsorships directly to students and researchers to encourage further exploration using AI and formal verification.
More details 👇
Ever had a critical Python script hang after days of runtime? 0% CPU. No errors. Restarting wipes the debug state and existing tools just show the thread as "idle".
Today, we’re open-sourcing python-memtools to solve exactly this. 🧵👇
Milestone reached: we have finished formalizing Section 3 of the Noperthedron paper!
Reuben Steenekamp's work on https://t.co/WLzCaEqTwK gave us a significant boost, and @HarmonicMath's Aristotle has been repeatedly helpful for filling in proofs.
Here I present a complete auto-formalization of a recent maths paper (again!)
https://t.co/c6bJgi4YOl
Barańczuk, Stefan. "Reducing the Number of Equations Defining a Subset of the n-Space over a Finite Field." Annales de la Faculté des sciences de Toulouse : Mathématiques, ser. 6, vol. 33, no. 1 (2024): 177–182. https://t.co/LdPVSE9mCU
I spent a few days on this project. First, I ran Aristotle by @HarmonicMath, which in about 15 hours completely auto-formalized the proof. Then, with the great help of @PietroMonticone, I managed to set up a blueprint version of the proof. This is a version in which all parts of the documentation in LaTeX become interactive and can be inspected and studied. We can see the dependencies in the proof and study their relations.
In the post-processing stage, I also used Grok Heavy and Codex CLI with GPT-5.2 in xhigh mode to write a line-by-line analysis of the formal proof. This is a great help for people who are not professional Lean 4 programmers. You can really internalize all the steps of the proof.
I want to summarize my impressions and what I learned from this experience. @vladtenev @Leonard41111588 @HarmonicMath @llllvvuu @littmath @AlexKontorovich @jdlichtman @KenOno691 @CarinaLHong @gdb @hongyuan_mei
Today we’re open sourcing pbcc, a streamlined Protobuf compiler for Python. Built for high-performance workloads, it handles massive datasets with reduced overhead and a much cleaner Python API.