Julien Pourcel

11 months ago

Introducing SOAR 🚀, a self-improving framework for prog synth that alternates between search and learning (accepted to #ICML!)

It brings LLMs from just a few percent on ARC-AGI-1 up to 52%.

We’re releasing the finetuned LLMs, a dataset of 5M generated programs and the code.

🧵 https://t.co/P7TVjcN0kM

189

135

32K

PourcelJulien retweeted

Demis Hassabis

@demishassabis

2 months ago

Excited to launch Gemma 4: the best open models in the world for their respective sizes. Available in 4 sizes that can be fine-tuned for your specific task: 31B dense for great raw performance, 26B MoE for low latency, and effective 2B & 4B for edge device use - happy building!

https://t.co/Sjbe3ph8xr

326

874

991K

PourcelJulien retweeted

Qwen

@Alibaba_Qwen

3 months ago

🚀 Introducing the Qwen 3.5 Small Model Series
Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B

✨ More intelligence, less compute.
These small models are built on the same Qwen3.5 foundation (native multimodal, improved architecture, scaled RL):
• 0.8B / 2B → tiny, fast, great for edge devices
• 4B → a surprisingly strong multimodal base for lightweight agents
• 9B → compact, but already closing the gap with much larger models
And yes, we’re also releasing the Base models.
We hope this better supports research, experimentation, and real-world industrial innovation.
Hugging Face: https://t.co/wFMdX5pDjU
ModelScope: https://t.co/9NGXcIdCWI

911

21K

14K

Nothing special. ML, OR, RL. #Umontreal, #Concorde, #Mcgill.

4 months ago · Talence

@silviasapora @GuillaumeAP time to improve your LLM rewards!

4 months ago

@ADarmouni @jonashuebotter Really cool paper🔥

129

PourcelJulien retweeted

6 months ago

ARC Prize 2025 Winners Interviews

Paper Award 2nd Place

@PourcelJulien, @cedcolas, @pyoudeyer discuss SOAR - a self-improving evolutionary program synthesis framework that fine-tunes an LLM on its own search traces - without human-engineered DSLs or solution datasets.

PourcelJulien retweeted

Cédric @cedcolas

6 months ago

Our self-improving genetic algorithm received the 2nd place paper award for the @arcprize! Congrats in particular to @PourcelJulien, the experiments wizard! We proposed a simple, general algorithm ⬇️

6 months ago · Yucca Valley

@82deutschmark @arcprize @cedcolas @pyoudeyer Awesome !

6 months ago · San Diego

@shubhramishra_ @fchollet Thanks!

6 months ago · San Diego

Check out the awesome Oral paper of my brother @GuillaumeAP now in NeurIPS #2101 👀

333

PourcelJulien retweeted

François Chollet

@fchollet

6 months ago

Congrats to the ARC Prize 2025 winners! The Grand Prize remains unclaimed, but nevertheless 2025 saw remarkable progress on LLM-driven refinement loops, both with "local" models and with commercial frontier models. We also saw the rise of zero-pretraining DL approaches like HRM and TRM. Lots of new learnings!

517

112

74K

PourcelJulien retweeted

6 months ago

@jm_alexia ARC Prize 2025 / Paper Award Winner Second Place / SOAR @PourcelJulien, @cedcolas, @pyoudeyer Interview: https://t.co/soy5aNndAY

12K

PourcelJulien retweeted

6 months ago

ARC Prize 2025 Paper Award Winners

1st / "Less is More: Recursive Reasoning with Tiny Networks" (TRM) / A. Jolicoeur-Martineau / $50k

2nd / "Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI" (SOAR) / J. Pourcel et al. / $20k

3rd / "ARC-AGI Without Pretraining" / I. Liao et al. / $5k

279

148

133K

PourcelJulien retweeted

Greg Kamradt

@GregKamradt

6 months ago

ARC Prize 2025 competition concluded today - The year of Refinements

Our goal is to bring meaningful open source research into the community, and today we awarded $137K to 14 teams.

Benchmarks matter, but their true value comes from the progress they catalyze. ARC Prize 2025 was designed to inspire the community to publish research aimed at building more generalized systems.

The grand prize remains unclaimed, but the leaderboard reflects strong advances, and all submissions and solutions are now open sourced.

Here is a recap of the winners. For more, check out the great recap by @mikeknoop (link below).

** Paper Prizes **

1/ Alexia Jolicoeur-Martineau (@jm_alexia) - TRM
Tiny Recursive Model (TRM) is a tiny 2-layer network that does recursive reasoning: it keeps a latent state z and a current answer y, repeatedly updates z using the puzzle and y, then refines y from z over many “deep supervision” steps, so it can gradually fix its own mistakes without needing a huge model. It simplifies the Hierarchical Reasoning Model (HRM).

2/ Julien Pourcel (@PourcelJulien) - Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI
SOAR is a self-improving evolutionary program synthesis system: it uses an LLM to sample and refine Python programs for ARC tasks (Sample & Refine phase), then turns all those attempts, both successes and failures, into new problem–solution pairs via hindsight relabeling, and fine-tunes the same LLM so it gets better at both sampling and refinement next time.

3/ ARC-AGI Without Pretraining - Isaac Liao (@LiaoIsaac91893)
CompressARC shows that lossless information compression alone can produce intelligent behavior on ARC-AGI: for each puzzle, it builds a randomly initialized neural network and uses gradient descent at inference time to find a compact representation (like a VAE-style loss: cross-entropy + KL) that best “compresses” all the given example grids.
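To make the hindsight relabeling idea in the SOAR summary concrete, here is a minimal sketch. It is not the SOAR code: `hindsight_relabel`, `flip_horizontal`, and the toy grids are hypothetical names and data chosen for illustration. The assumption is simply that any program that runs without crashing is, by construction, a correct solution to the task defined by its own outputs.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]
Program = Callable[[Grid], Grid]


def hindsight_relabel(program: Program, inputs: List[Grid]) -> List[Tuple[Grid, Grid]]:
    """Pair each input grid with whatever the program actually produces.

    Even if the program failed the original task, the returned pairs form
    a synthetic task that this program solves exactly.
    """
    relabeled = []
    for grid in inputs:
        try:
            relabeled.append((grid, program(grid)))
        except Exception:
            return []  # a crashing program yields no usable training example
    return relabeled


def flip_horizontal(grid: Grid) -> Grid:
    """Hypothetical candidate program: mirror every row."""
    return [list(reversed(row)) for row in grid]


# Toy input grids (illustrative only, not real ARC data).
task_inputs = [[[1, 0], [2, 3]], [[4, 5, 6]]]
pairs = hindsight_relabel(flip_horizontal, task_inputs)
```

Each `(input, output)` pair, together with the program source, could then serve as a problem–solution example for fine-tuning, which is the role the summary assigns to relabeled attempts.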
** Top Scores **

1/ NVARC (@JFPuget, Ivan Sorokin)
The NVIDIA team built a huge synthetic dataset of ARC-AGI puzzles, then turned those summaries into Python programs that produce consistent input/output grid pairs. Used test-time fine-tuning (TTFT) plus a fast Depth-First Search decoding process to adapt each model to the hidden test puzzles.

2/ the ARChitects (@dvhrtm, Daniel Franzen, @JDisselh)
The ARChitects fine-tune an LLM on ARC-style grids and then use it at test time in two roles: 1) as a generator that, via depth-first search (DFS) over token probabilities, systematically explores the space of high-probability candidate solutions (not just random samples), and 2) as a scorer that evaluates how likely each complete solution is.

3/ MindsAI @ Tufa Labs (@MindsAI_Jack, @DriesSmit1, @MohamedOsmanML, @bayesilicon)
Trained a trimmed CodeT5 encoder–decoder model for years on the massive ARC-AGI Mega dataset (100M+ examples) using span corruption, reversals, and BPE dropout so it learned structure, not surface patterns. At inference, they ran large-scale test-time training (TTT) on thousands of permuted and augmented versions of the test set, then applied AIRV.

4/ Lonnie
Lonnie reused the 2024 ARChitects pipeline but treated the random seed as a hyperparameter, systematically exploring seeds to exploit variance on the small 240-task evaluation set, which pushed an otherwise baseline-style system up to 5th place on the private leaderboard.

5/ Guillermo Barbadillo @ Veridas (@guille_bar)
Guillermo believes that ARC will ultimately be solved by a search-and-learn approach that combines program synthesis with test-time training (TTT) and hindsight relabeling, so the system can search over code, learn from failed attempts, and steadily refine its solutions.

We're going bigger in 2026! Let's go!!
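The depth-first search over token probabilities mentioned for the ARChitects can be sketched roughly as follows. This is an illustrative toy, not their pipeline: `dfs_decode`, `toy_model`, the `<eos>` token, and the `min_prob` threshold are all assumptions. The idea shown is that a branch is pruned as soon as its cumulative probability drops below the threshold, so every sufficiently likely completion is enumerated rather than sampled at random.

```python
import math
from typing import Callable, Dict, List, Tuple

NextTokenFn = Callable[[Tuple[str, ...]], Dict[str, float]]


def dfs_decode(next_token_probs: NextTokenFn, min_prob: float,
               eos: str = "<eos>", max_len: int = 8) -> List[Tuple[Tuple[str, ...], float]]:
    """Enumerate every finished sequence whose total probability is >= min_prob."""
    log_threshold = math.log(min_prob)
    results: List[Tuple[Tuple[str, ...], float]] = []

    def visit(prefix: Tuple[str, ...], logp: float) -> None:
        if prefix and prefix[-1] == eos:
            results.append((prefix, math.exp(logp)))  # complete candidate
            return
        if len(prefix) >= max_len:
            return  # give up on over-long branches
        for token, p in next_token_probs(prefix).items():
            if p <= 0.0:
                continue
            new_logp = logp + math.log(p)
            if new_logp >= log_threshold:  # prune low-probability branches early
                visit(prefix + (token,), new_logp)

    visit((), 0.0)
    return sorted(results, key=lambda item: -item[1])


def toy_model(prefix: Tuple[str, ...]) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM's next-token distribution."""
    if len(prefix) == 0:
        return {"1": 0.7, "2": 0.3}
    if len(prefix) == 1:
        return {"0": 0.6, "<eos>": 0.4}
    return {"<eos>": 1.0}


candidates = dfs_decode(toy_model, min_prob=0.2)
```

With the toy distribution above, both sequences starting with "2" fall below the 0.2 threshold and are never expanded to completion, while the two sequences starting with "1" survive. A real system would replace `toy_model` with model logits and could reuse the accumulated probability as the score for the second, "scorer" role.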

PourcelJulien retweeted

6 months ago

Announcing the ARC Prize 2025 Top Score & Paper Award winners

The Grand Prize remains unclaimed

Our analysis on AGI progress marking 2025 the year of the refinement loop https://t.co/Lbap0VVFs9

310

124

223K

6 months ago

@francoisfleuret Baguettes are safe? 🍷🥖

PourcelJulien retweeted

Cédric @cedcolas

6 months ago

In San Diego for #NeurIPS. Happy to chat about open-endedness, self goal-generation, intrinsic motivations, self-improvement, human-machine collective intelligence. Open to hearing about research scientist opportunities too. Don't hesitate to reach out!

6 months ago

@ClementRomac Thanks @ClementRomac !

6 months ago

Big news: I’m officially a 2025 Google PhD Fellow! 🎓✨ I’m also heading to #NeurIPS2025 in SD! Happy to chat about LLMs, code gen, evolutionary algorithms, open-endedness, self-improvement, enhancing LLM diversity, ARC-AGI, and other subjects. Open to hearing about summer internships. ☀️

968