Our paper proposing a first study of LLMs grounded in an interactive environment with Online RL got accepted to ICML 2023 🎉
We also released new results 🚀
🥳 🎊 🎉 Exciting News! Our paper "Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning" has been accepted to ICML 2023.
We have posted a new version on arXiv with 𝗶𝗺𝗽𝗿𝗼𝘃𝗲𝗱 𝗿𝗲𝘀𝘂𝗹𝘁𝘀 to celebrate 🎉📝
https://t.co/HzzAsc5Hqe
We looked into the weeds of hindsight experience replay and came up with an efficient way of learning from all goals at once in off-policy goal-conditioned RL!
It works well for reasonable numbers (100s to 1000s) of non-exclusive sparse-reward goals.
I want to share Alan Code, my own implementation of a Claude Code-like agent 🤖
What it adds:
- Designed as a Python library to build your systems on, not just a CLI
- Browser GUI with an LLM Perspective panel and Git integration
- Project and global memory + live cost tracking
Our self-improving genetic algorithm received the 2nd place paper award for the @arcprize!
Congrats in particular to @PourcelJulien the experiments wizard!
We proposed a simple, general algorithm ⬇️
New research engineer position open in my lab @FlowersINRIA at @Inria on generative AI agentic architectures for educational technologies! This is for a large-scale collaborative project with the edtech company EvidenceB.
More info here: https://t.co/OLsUcyLyH7
🚀 New internship positions available in the Flowers AI & CogSci lab!
Topics: curiosity-driven deep RL, program synthesis with SWE LLM agents, data science for edtech, and more!
Only for students currently enrolled in a master's program.
Info here: https://t.co/Cv49BOyAj0
After four incredible years in the Flowers AI & CogSci Lab, I’m thrilled to share that I’ve officially become a Doctor! 🎓💫
My thesis: “Language as a Cognitive Tool for Open Agents.”
It explores how language helps artificial agents explore, adapt, and learn efficiently.
And just like that, @OpenAI gpt-oss is now the number one trending model on @huggingface, out of almost 2M open models 🚀
People sometimes forget that they've already transformed the field: GPT-2, released back in 2019, is HF's most downloaded text-generation model ever, and Whisper has consistently ranked in the top 5 audio models.
Now that they are doubling down on openness, they may completely transform the AI ecosystem, again. Exciting times ahead!
Visiting @FlowersINRIA this week to dive deeper into a project I’ve been working on with @ClementRomac, @cedcolas, and @pyoudeyer, all about balancing exploration and exploitation in autotelic RL agents. Preview soon 👀
Introducing SOAR 🚀, a self-improving framework for program synthesis that alternates between search and learning (accepted to #ICML!)
It brings LLMs from just a few percent on ARC-AGI-1 up to 52%
We’re releasing the finetuned LLMs, a dataset of 5M generated programs and the code.
🧵
I’m attending #ICML this week! We’ll be presenting MAGELLAN during the poster session on Thursday with @CartaThomas2 & @ClementRomac
If you’re not in Vancouver, we recorded a talk presenting the paper last week; it’s available on YouTube (link below)
I'm attending ICML 2025 this week in Vancouver where we're presenting our MAGELLAN paper along with @LorisGaven and @CartaThomas2!
📅 Come discuss at our poster session on July 17 at 11 am, East Exhibition Hall A-B, E-2803
Or reach out for a chat!
https://t.co/Uk8HKmcOHM
🚀 Introducing 🧭 MAGELLAN, our new metacognitive framework for LLM agents! It predicts its own learning progress (LP) in vast natural language goal spaces, enabling efficient exploration of complex domains. 🌍✨ Learn more: 🔗 https://t.co/uGLBSsOgMn #OpenEndedLearning #LLM #RL
We just released SmolTalk 2, a carefully balanced dataset to unlock dual-mode reasoning in LLMs through multi-stage training 🔥!
Models trained on SmolTalk 2 can:
🎲 Reason across multiple turns
🌎 Converse in 6 languages: en, it, es, de, pt, fr
🛠️ Use tools with & without long CoT
Link to the dataset ⤵️
Thrilled to finally share what we've been working on for months at @huggingface 🤝@pollenrobotics
Our first robot: Reachy Mini
A dream come true: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community.
Tiny price, small size, huge possibilities. A robot built to code, learn, and share with AI builders of all ages, all around the globe, using the latest vision, speech, and text AI models. A first robot for today's and tomorrow's AI builders.
Read more and order now at https://t.co/UpjRipw5tP
First deliveries expected right after the summer.
Two weeks left to register (for free) for the stand-alone workshop on intrinsically motivated open-ended learning (IMOL) @UniofHerts, UK, 8 to 10 September 2025.
Details on website:
https://t.co/A2X4lKe1iA
Please share with interested parties.
#AI #IntrinsicMotivation
We're giving a public and remote talk on MAGELLAN!
1h talk + 1h Q&A, to discuss MAGELLAN but also autotelic (LLM) agents, learning progress, curiosity-driven learning...
All you need is to register 😀
🔔 Join our MAGELLAN talk on July 2!
We'll explore how LLM agents can monitor their own learning progress and choose what to learn next, like curious humans 🤔
1h presentation + 1h Q&A on autotelic agents & more!
📅 July 2, 4:30 PM CEST
🎟️ https://t.co/aYo8Xutm4q