How can an agent learn state representations that capture exactly the information needed for control, no more, no less?
Our new ICML paper shows that Empowerment is an answer!
📝Paper: https://t.co/K9Oh4N1I2F
📽️ Video: https://t.co/09eLnUtBPG
Narrow finetuning on bad data can cause broad misalignment.
Can inoculation prompting or diluting bad data with good prevent this emergent misalignment?
We find such interventions hide misalignment rather than remove it: it reappears when prompts contain cues (sometimes surprising ones) that evoke the bad data.
Really enjoyed working on this with @OwainEvans_UK, @BetleyJan, and @anna_sztyber during the Astra Fellowship at @ConstellOrg!
What if we represent a state as a "list" of similarities to all other states? In our recent ICLR paper, we studied this "dual" representation.
Come visit our poster at #4608 10:30a-1p on Fri (morning, 2nd day)!
Paper: https://t.co/zYKFjyO0i4
Blog post: https://t.co/lw1Port5k6
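To make the "list of similarities" idea concrete, here is a minimal sketch. It is not the paper's method: the four-state chain, the `chain_similarity` function, and the `dual_representation` helper are all hypothetical choices made for illustration. Each state is represented by the vector of its similarities to every state.

```python
import math

def dual_representation(states, similarity):
    """Map each state to the list of its similarities to all states."""
    return {s: [similarity(s, other) for other in states] for s in states}

def chain_similarity(a, b):
    # Assumed similarity: decays exponentially with distance on a 1-D chain.
    return math.exp(-abs(a - b))

# Hypothetical toy environment: four states arranged in a line.
states = [0, 1, 2, 3]
reps = dual_representation(states, chain_similarity)

for s, vec in reps.items():
    print(s, [round(v, 3) for v in vec])
```

With a symmetric similarity like this one, each state's vector peaks at its own index, so nearby states end up with similar vectors.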
1/ Reinforcement learning is usually framed as maximizing rewards. But can we cast it as reaching the right goals?
New blog on bridging RL, goal-conditioned RL, and stochastic shortest path:
https://t.co/WBpEtx9Wiu
Also #ICLR2026 Poster: Thu 10:30 AM–1:00 PM, P4 #4611.
🧵⬇️
Which representations are meaningful for control?
We're presenting TD-JEPA as an oral at ICLR🇧🇷: a zero-shot reinforcement learning algorithm using self-prediction (JEPA) to learn representations that are predictive of long-term, policy-dependent behavior. It works pretty well!🧵
Optimize materials yourself! 🔥
No cloning, no install — just open our interactive Google Colab demo and start optimizing crystals for formation energy or band gap in < 2 minutes.
→ Try it here: https://t.co/aUyEH9M1EE
→ Or grab the repo + open weights: https://t.co/rm6QID9xAK https://t.co/FPkjesbMoL
🧠🔭Today's AI models synthesize knowledge acquired from the internet/books/etc. Ultimately, that knowledge usually derives from real experiments. We know (say) the moon's mass because a human did a science experiment.
How well do AI models fare at generating knowledge? 🤔
I spent some time evaluating the best AI models on interactive block-building tasks. I am surprised by 1) the fragility of these systems when solving tasks that require generating creative ideas and updating hypotheses via interaction, and 2) the vast, but often unnecessary, amount of knowledge and compute they are willing to throw at a task.
I, along with @karthik_r_n and @ben_eysenbach, wrote a blog post about these findings and limitations, and what a satisfying solution might look like:
https://t.co/H7wToskK88
Can strong models like Claude Opus 4.6, GPT 5.2, Gemini 3 Flash learn to solve novel tasks which require exploration and creativity?
It seems like the answer is, "not yet"!
Introducing BuilderBench 🏗️ - We propose a setup of physically building architectures via interaction, to show how the current crop of AI agents struggles to learn to solve novel tasks.
We hypothesize that exploration, both in the space of physical interactions and in the space of thoughts, is the primary bottleneck.
Blogpost with detailed failure modes - https://t.co/H7wToskK88
Paper - https://t.co/pzHxevBMDL
Representation learning is all about capturing the right prior. What is the right prior for *reinforcement learning*?
We propose a new unsupervised pre-training method for RL: https://t.co/6GgNRVK0np.
🧵⬇️
I have published my most important scientific results at leading computer science conferences such as STOC, FOCS, ICML, AAAI, IJCAI, and SODA. Under the changes proposed by the ministry, the score for these publications is cut from 200 points to 100. In other words:
- the achievements for which I received four European Research Council (ERC) grants will now be worth 100 points instead of 200 - the ERC (https://t.co/yz7XXo4Jef) is the most prestigious system of research grants in the European Union,
- the score for the work for which I was admitted to the European Laboratory for Learning and Intelligent Systems (ELLIS) will be cut from 200 points to just 100 - the admission criteria for this network can be found at https://t.co/eQoI6nUdFL and, with one exception, refer to conference publications,
- my publications that count toward the subject-level Shanghai Ranking (GRAS) will now be worth 100 points instead of 200 - this ranking considers only publications at the 26 top computer science conferences and ignores journal publications entirely; the list is here: https://t.co/HXXPtai1xT.
The national standing of my publications, which are recognized internationally, is being drastically reduced. I find it very hard to understand the reasons behind this change, which will marginalize my achievements in Poland and, in the long run, takes away computer scientists' incentives to publish in the most important and most recognized venues in the world.
A description of the proposed changes can be found here: https://t.co/GsT0MulD8D.
Bring new robot testing environments to life with @theworldlabs and Isaac Sim. 🤖 If you can describe a world 🌎, you can start testing in it the same day.
Learn how to:
1. Export scenes from World Labs' Marble as Gaussian splats
2. Convert to USD using @nvidiaomniverse NuRec
3. Import into NVIDIA Isaac Sim
4. Add a robot and run the simulation
Read the guide ➡️ https://t.co/0kCSFIxhVC
#SIGGRAPHAsia2025
Great week at #NeurIPSanDiego: packed, intense, and genuinely inspiring. Grateful for all the discussions and feedback. Now looking forward to some quieter days and cooking up new stuff 🚀
🚀I'm excited to present our work at NeurIPS!
I will present our poster on "Normalizing Flows are Capable Models for Continuous Control". We show that Normalizing Flows can be used as plug-and-play models in imitation learning, offline RL, and unsupervised RL algorithms to reap the significant performance benefits of exact inference!
I will be presenting on Friday at 4:30-7:30 pm in Exhibit Hall C,D,E #312.
Original Tweet - https://t.co/RYITt3FOXJ
Feel free to stop by or reach out to chat about RL and probabilistic inference, exploration, long-horizon RL, and open problems in RL. Also happy to chat about meta-research stuff and the PhD life!
🏆1000 Layer Networks for Self-Supervised RL wins a Best Paper Award at #NeurIPS25 !
Proud of @kevin_wang3290 @IJ_Apps @m_bortkiewicz for all the hard work they put into this!
👇for key results and open problems!
It’s a perfect day to announce that I’ve joined Mistral as an AI scientist: our new flagship model has just arrived :).
Obviously, I did not contribute to this one, but I have high hopes about the next one :).
I am very excited about this opportunity for a few reasons. On the personal level, it's going to be exciting, with a lot of learning and cool stuff.
More broadly, it is the first time a frontier lab has started operations in Warsaw. I’m really proud of the speed of development that the Polish AI ecosystem has witnessed, and I hope to see many more great things happening :).
“Contrastive Representations for Temporal Reasoning” at NeurIPS this week!
If you’re around, we’d love to chat about representation learning, temporal reasoning, and all things RL.
📍 Poster #2615 — Exhibit Hall C, D, E
📅 Wednesday, Dec 3, 2025
⏰ 11:00 AM – 2:00 PM
Come say hi!
I’m so excited to be presenting at NeurIPS again this year! We’ll be showing our poster “Contrastive Representations for Temporal Reasoning” on Wednesday at 11am. @princeton_rl