Excited to be at #ICML2025 this week! 🌟
Delighted to present two main-track conference papers and honored to receive an oral presentation slot at the Computer Use Agents Workshop. Looking forward to connecting and discussing cutting-edge developments in LLM agents and in-context planning:
"EvoControl: Multi-Frequency Bi-Level Control for High-Frequency Continuous Control"
A novel hierarchical RL method combining PPO and Evolution Strategies to achieve robust high-frequency control without manual tuning, enabling safer and faster interactions in real-world robotic tasks. Work done during my internship at Google DeepMind.
🗓️ Main conference, Thu 17 Jul, 4:30–7:00 pm PDT, West Exhibition Hall B2-B3 (#W-401)
📄 https://t.co/pgGTFVX8BX
“G-Sim: Generative Simulations with Large Language Models and Gradient-Free Calibration”
A new LLM-agent-based method for automatically generating and calibrating stochastic simulation environments for use in planning.
🗓️ Main conference, Tue 15 Jul, 4:30–7:00 pm PDT, East Exhibition Hall A-B (#E-1400)
📄 https://t.co/Ivg2mlOo4e
"Improving LLM Agent Planning with In-Context Learning via Atomic Fact Augmentation and Lookahead Search"
An in-context planning framework that equips LLM agents with atomic-fact memory and depth-limited lookahead search, enabling trustworthy, long-horizon decision-making in complex, embodied settings such as people’s homes, through online in-context RL without any model weight updates.
🗓️ Oral Talk @ Workshop on Computer Use Agents, 11:45–11:55 am PDT, West Meeting Room 211-214
📄 https://t.co/OiwpMHIqXM
Huge thanks to my incredible collaborators: @MihaelaVDS, @m_wulfmeier, @FranceRomano, @tdavchev, Dhruva Tirumala, @erikfrey, @MaxRuizLuyten, Antonin Berthon, Thomas Pouplin.
DM me if you are at ICML and are interested in discussing how to advance the state-of-the-art in LLM agents!
#MachineLearning #ReinforcementLearning #HierarchicalRL #ICML #Robotics #LLMAgents #AI #Conference #ICML2025 #OnlineRLLLMAgent
Gemini 3 Flash is now available ⚡
Since introducing the Gemini 3 series last month, we've seen you vibe code simulations to learn about complex topics, build and design interactive websites, and understand multimodal content. Now we're introducing Gemini 3 Flash, our latest model with frontier intelligence built for speed.
Gemini 3 Flash is starting to roll out today as the default model in the @GeminiApp and in Search via AI Mode, and is also available in our developer and enterprise products.
I am attending NeurIPS 2025 (@NeurIPSConf) in San Diego this week and would be very happy to connect with people interested in LLM agents and RL.
My recent work has focused on reinforcement learning for and with large language models: agents with memory and tools, long-context generation, and automating parts of the scientific pipeline. More broadly, I am interested in building universal AI assistants that understand and remember context, use tools, act on users’ behalf across devices and, in the long term, help accelerate scientific discovery.
If you would like a concrete example of what I have been working on recently at the end of my PhD, this is a recording of my ICML 2025 workshop (WCUA) oral talk, “Improving LLM Agent Planning with In-Context Learning via Atomic Fact Augmentation and Lookahead Search.”
I will be around the conference all week (and spending some time at the @GoogleDeepMind booth), and would be very happy to grab a coffee to talk about LLM agents, RL, evaluation, or automating aspects of science and engineering. I am also happy to chat with students who are curious about research or student researcher opportunities.
@DonRumataEstor @Cambridge_Uni @GoogleDeepMind You are right, the picture is from one of my PhD papers on hierarchical RL for long-horizon control tasks. We evaluated it in control/robotics settings, and extending it to other domains is open future work. Happy to chat more.
I am happy to share that I have completed my PhD in Machine Learning (2021–2025) at @Cambridge_Uni and have now joined @GoogleDeepMind as a Research Scientist. I am working on LLM agents and RL towards universal AI assistants that understand and remember context, use tools, act on users’ behalf across devices, and, in the long term, help accelerate scientific discovery.
During my PhD with Prof. Mihaela van der Schaar FRS (@MihaelaVDS) in the Machine Learning and Artificial Intelligence group, my work focused on RL for and with LLMs: agents, memory, tool use, long-context generation, and automating parts of the scientific pipeline. I am very grateful to Mihaela, to my mentors at Google DeepMind (including @m_wulfmeier, @FranceRomano, and Nicolas Heess), and to my collaborators and colleagues at both Cambridge and Google DeepMind for their support and guidance. I am very excited to be starting in this role and to be working with such a talented team.
Want to learn more? Come chat with us at #ICML2025!
🗓️ When: Today! 4:30–7:00 pm PDT 📍 Where: West Exhibition Hall B2-B3 (#W-401)
📄 Paper: https://t.co/pgGTFVX8BX
Huge thanks to my incredible collaborators:
@m_wulfmeier, @FranceRomano, @tdavchev, Dhruva Tirumala, @erikfrey, Ben Moran, @atiliscen, Antoine Laurens, @yixin_lin_, Nicolas Heess
Excited to share our new @icmlconf paper, "EvoControl"
EvoControl is a hierarchical (bi-level) reinforcement learning (HRL) framework for long-horizon continuous control tasks. A slow (30Hz) high-level policy sets latent actions/goals for a fast (500Hz+) low-level policy that acts.
A thread 🧵👇
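For readers who want a feel for the two-rate idea, here is a minimal toy sketch of a loop in which a slow policy refreshes a goal and a fast policy acts on every tick. This is not the EvoControl implementation: the function names, the placeholder policies, and the 1-D state are all assumptions made for illustration. Only the 30Hz and 500Hz rates come from the post.

```python
# Toy sketch of a two-rate (bi-level) control loop. NOT the EvoControl code:
# the policies, names, and 1-D state below are hypothetical placeholders.
HIGH_HZ = 30   # slow high-level policy rate (from the post)
LOW_HZ = 500   # fast low-level policy rate (the post says 500Hz+)


def high_level_policy(state):
    """Placeholder: return a latent goal; here, always drive the state to 1.0."""
    return 1.0


def low_level_policy(state, goal):
    """Placeholder: take a small proportional step toward the current goal."""
    return 0.1 * (goal - state)


def run(duration_s=1.0, state=0.0):
    n_steps = round(duration_s * LOW_HZ)
    goal = None
    high_updates = 0
    # Integer accumulator decides when the slow policy fires, which avoids
    # floating-point drift. Starting at LOW_HZ forces an update on step 0.
    acc = LOW_HZ
    for _ in range(n_steps):
        if acc >= LOW_HZ:
            acc -= LOW_HZ
            goal = high_level_policy(state)   # slow: refresh the goal
            high_updates += 1
        state += low_level_policy(state, goal)  # fast: act on every tick
        acc += HIGH_HZ
    return state, high_updates, n_steps


if __name__ == "__main__":
    final_state, high_updates, n_steps = run(1.0)
    print(f"low-level steps: {n_steps}, high-level updates: {high_updates}")
```

Because 500 is not a multiple of 30, the high-level policy fires every 16 or 17 low-level steps. The accumulator keeps the long-run rate at 30 updates per simulated second.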
We even validated EvoControl in the real world! We demonstrated zero-shot sim-to-real transfer on a 7-DoF Franka Emika Panda robot. The policy, trained entirely in sim, worked effectively at high frequency & reduced collision forces vs a standard controller. #Sim2Real
Thrilled to share our new @icmlconf paper, G-Sim! 🤖
We built an LLM agent that automatically creates and calibrates complex simulators for "what if?" analysis in critical domains like healthcare and logistics.
A thread 🧵👇
Want to learn more? Come chat with us at #ICML2025!
🗓️ When: Today! 4:30–7:00 pm PDT 📍 Where: East Exhibition Hall A-B (Poster #E-1400)
We're excited to discuss how we can build more reliable, causally-informed simulators for real-world decisions.
Huge thanks to my incredible collaborators:
@MihaelaVDS, @MaxRuizLuyten, Antonin Berthon
#AI #LLMs #Simulation #Agent