What if every quantum researcher had an army of students to help write quantum algorithms? LLMs are starting to serve as such a resource. We’ve partnered with @HivergeAI to use AI for quantum algorithm discovery, exploring practical quantum chemistry. https://t.co/RLeQKVTNCO
Really happy to share our new paper on using AlphaEvolve for mathematical exploration at scale, written with Javier Gómez-Serrano, Terence Tao, and @GoogleDeepMind's Bogdan Georgiev. We tested it on 67 problems and documented all our successes and failures. 🧵
- TTA: Skip for easy examples
- Thermal throttling: Sleep for 8s between runs (only affects the average, not the record time)
Note: The authors reported a time of 2.02 seconds. My reproduction (torch 2.7.0; hardware seen below) had a min time of 1.99s.
Code: https://t.co/qPN6oebF5T
3/3
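The cooldown note above (sleeping between runs so thermal throttling does not inflate later timings) can be sketched roughly as follows. This is a minimal illustration, not the code from the linked repo: `benchmark`, `run_once`, `cooldown_s`, and the injectable `sleep`/`clock` arguments are all hypothetical names chosen for the example, and only the 8s default comes from the post.

```python
import time
from statistics import mean


def benchmark(run_once, n_runs=5, cooldown_s=8.0,
              sleep=time.sleep, clock=time.perf_counter):
    """Time `run_once` n_runs times, pausing between runs (hypothetical helper).

    The pause happens outside the timed region, so it never adds to any
    single run's duration. It only gives the hardware time to cool down,
    which mainly helps the later runs and therefore the average.
    """
    durations = []
    for i in range(n_runs):
        start = clock()
        run_once()  # the workload being measured, e.g. one full training run
        durations.append(clock() - start)
        if i < n_runs - 1:
            sleep(cooldown_s)  # cooldown between runs, skipped after the last one
    return {
        "min": min(durations),    # the "record" time
        "mean": mean(durations),  # the average across runs
        "runs": durations,
    }
```

Calling `benchmark(train_fn)` with some hypothetical `train_fn` would return both the minimum and the mean, which makes it easy to report a record time and an average side by side, as the note in the post does.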
We challenged our intern @ramadan_al76760 (zero prior AI experience) to beat the CIFAR-10 training speed record using @hivergeai's algorithmic discovery engine.
Result: Sub-2-second (!!) training for the first time ever.
Falsest friend conjecture:
"Inhabitable" is the only word (or else the longest one) such that:
- it has the exact same spelling in two different languages (English & Spanish)
- it means the exact opposite thing in each
We took on the challenge and we’ve put our system to work on the nanoGPT benchmark. @hivergeai tech discovered new algorithmic improvements beyond the existing optimizations.
Check out the results in the PR https://t.co/CIS5phXK04 and read our blogpost https://t.co/92Bi764Ofv!
Love this project: nanoGPT -> recursive self-improvement benchmark. Good old nanoGPT keeps on giving and surprising :)
- First I wrote it as a small little repo to teach people the basics of training GPTs.
- Then it became a target and baseline for my port to direct C/CUDA re-implementation in llm.c.
- Then that was modded (by @kellerjordan0 et al.) into a (small-scale) LLM research harness. People iteratively optimized the training so that e.g. reproducing GPT-2 (124M) performance takes not 45 min (original) but now only 3 min!
- Now the idea is to use this process of optimizing the code as a benchmark for LLM coding agents. If humans can speed up LLM training from 45 to 3 minutes, how well do LLM Agents do, under different kinds of settings (e.g. with or without hints etc.)? (spoiler: in this paper, as a baseline and right now not that well, even with strong hints).
The idea of recursive self-improvement has of course been around for a long time. My usual rant on it is that it's not going to be this thing that didn't exist and then suddenly exists. Recursive self-improvement began a long time ago and is underway today in a smooth, incremental way. First, even basic software tools (e.g. coding IDEs) fall into the category because they speed up programmers in building the N+1 version. Any of our existing software infrastructure that speeds up development (Google Search, git, ...) qualifies. And if you insist on AI as something special and distinct, most programmers already routinely use LLM code completion or code diffs in their own programming workflows, collaborating on increasingly large chunks of functionality and experimentation. This amount of collaboration will continue to grow.
It's also worth pointing out that nanoGPT is a super simple, tiny educational codebase (~750 lines of code) that covers only the pretraining stage of building LLMs. Production-grade codebases are *significantly* (100-1000X?) bigger and more complex. But for the current level of AI capability, it is imo an excellent, interesting, tractable benchmark that I look forward to following.
Quietly building, testing, iterating. Today we’re out of stealth. Excited to finally share what we’ve been building at @hivergeai: an algorithm factory that creates new algorithms, optimised for real-world impact. Stay tuned!
Thrilled to announce @hivergeai. Our goal is to build algorithmic superintelligence.
See how we accelerate AI training and solve large-scale planning problems: https://t.co/rWH3KcIXro
After 1.5 years of work, I'm so excited to announce AlphaEvolve – our new LLM + evolution agent!
Learn more in the blog post: https://t.co/UwbM3jjN4t
White paper PDF: https://t.co/KpZUHAZeFm
(1/2)
Excited to share our latest work on EvoTune, a novel method integrating LLM-guided evolutionary search and reinforcement learning to accelerate the discovery of algorithms! 1/12🧵
The Beluga™ Competition results are in!
🏆Congrats to teams led by @ber24, Daniel Gnad & Jean Jodeau!
Thanks also to the hundreds who registered & explored the challenge.
👉Check the winning teams https://t.co/cvrOvOqBHh
#TrustworthyAI #ExplainableAI
Our team @GoogleDeepMind has released AlphaTensor-Quantum, a new method that improves quantum circuits by reducing T gates needed for quantum ops such as those used in Shor’s alg & chemistry sims. A step towards scalable Quantum Computing that can transform Science&Security.
I love when people notice the secret sauce that is: things should just work!
@jetscott:
“I lifted my hand off the mouse, hand tracking was instantly back in action. Android XR made to whatever inputs are available: hands, eyes, voice, keyboards, mice or connected phones” 🧵
A clear step towards achieving my dream: building AI that assists competitive programmers 🧑‍💻
“This is an exciting approach to combine work of human competitive programmers and LLMs, to achieve results that neither would achieve on their own.” --Petr Mitrichev
Details below! 🧵
We also need more awareness of existing tools among organizations and governments. Please take a look at Google's Floodcast tool (https://t.co/f7f68VO37p), powered by Google DeepMind's GraphCast model (https://t.co/xdsZQ0pQVp).
Great episode! Asked whether an evolutionary approach is needed to get to a higher level @_rockt said "Science, the way humans do it, is evolutionary search and I don't see any other way of how automated scientific process can work differently. It has to be evolutionary". 💯%
I had a fantastic time talking with @samcharrington (@twimlai) about @orionbooks' "Artificial Intelligence: 10 Things You Should Know" (https://t.co/SDRrIQ7p30) and many exciting 2024 research papers (some of them from my teams) in the Open-Endedness community by outstanding researchers like @merrierm @jennyzhangzt @jeffclune @chrisantha_f @2ne1 @edwardfhughes @MichaelD1729 @ber24 @akbirkhan @_chris_lu_ @cong_ml @RobertTLange @j_foerst @hardmaru ...
Levels of AGI: Morris et al. Levels of AGI: Operationalizing Progress on the Path to AGI. ICML 2024. https://t.co/E8icsfd2Oj
OMNI: Zhang et al. OMNI: Open-endedness via Models of human Notions of Interestingness. ICML 2024. https://t.co/i5kRepu7hV
Promptbreeder: Fernando et al. Promptbreeder: Self-Referential Self-Improvement Via Prompt Evolution. ICML 2024. https://t.co/PUqg6uxbG6
Epistemology (@DavidDeutschOxf): Deutsch, D. (2012). The Beginning of Infinity. Penguin Books.
Open-Endedness: Hughes et al. Open-Endedness is Essential for Artificial Superhuman Intelligence. ICML 2024. https://t.co/bMu6dKqjlT
Rainbow Teaming: Samvelyan et al. Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts. NeurIPS 2024. https://t.co/bA3HDiJQ0q
FunSearch: Romera-Paredes et al. Mathematical discoveries from program search with large language models. Nature, 625(7995), 468–475. https://t.co/E0pvrs2j6g
AI Debate: Khan et al. Debating with More Persuasive LLMs Leads to More Truthful Answers. ICML 2024. https://t.co/QIySisEpb3
AI Scientist: Lu et al. The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv 2024. https://t.co/DXfAdbROJd
Great to see FunSearch featured as one of this year’s top contributions to the field in the State of AI Report!
As always, a highly recommended report on how everything around AI keeps evolving. It even includes yesterday’s great news about the Nobel Prize awarded to Demis & John
🪩The @stateofai 2024 has landed! 🪩
Our seventh installment is our biggest and most comprehensive yet, covering everything you *need* to know about research, industry, safety and politics.
As ever, here's my director’s cut (+ video tutorial!) 🧵
I'm excited to speak at the next ML in PL Conference in November, thanks @MLinPL for organising!
I'm planning to talk about our latest work, FunSearch. If you have any questions or topics you’d like me to cover, related or not to FunSearch, let me know.