AI Scientist - Safety & Security @MistralAI | PhD from @MetaAI and @Polytechnique
I confront equations and inequalities💡
my tweets reflect my own views only.
🧵 For 2 RL checkpoints trained differently, you can just weight extrapolate them and it works!
Bonus: these extrapolated checkpoints are complementary policies
-> Get exploration and diversity for free
-> Better inference scaling when ensembling
Paper: https://t.co/zU0LH0TOdm
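For readers wondering what "weight extrapolating" two checkpoints could look like in practice, here is a minimal sketch. It assumes plain linear extrapolation, theta = theta_a + alpha * (theta_b - theta_a) with alpha outside [0, 1]; that formula, the `extrapolate` helper, and the toy checkpoints are illustrative assumptions, not the method confirmed by the paper.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of weights.
StateDict = Dict[str, List[float]]


def extrapolate(theta_a: StateDict, theta_b: StateDict, alpha: float) -> StateDict:
    """Return theta_a + alpha * (theta_b - theta_a), parameter by parameter.

    alpha in [0, 1] interpolates between the two checkpoints;
    alpha < 0 or alpha > 1 extrapolates beyond them (assumed setup).
    """
    if theta_a.keys() != theta_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    out: StateDict = {}
    for name in theta_a:
        a, b = theta_a[name], theta_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        out[name] = [x + alpha * (y - x) for x, y in zip(a, b)]
    return out


# Toy checkpoints, made up for illustration only.
ckpt_a = {"w": [0.0, 1.0]}
ckpt_b = {"w": [1.0, 3.0]}

beyond_b = extrapolate(ckpt_a, ckpt_b, alpha=1.5)   # past checkpoint B
beyond_a = extrapolate(ckpt_a, ckpt_b, alpha=-0.5)  # past checkpoint A
```

Under this assumption, sweeping alpha on either side of [0, 1] would give a family of distinct policies that could then be sampled from or ensembled.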
I'm actually in Rio 🇧🇷☀️ this week for ICLR to present Winter Soldier ❄️!
Keen to meet folks working on: AI Security, Reasoning, Memory...😏🤐
We also have open positions at Mistral AI on several topics, including AI Safety! 🛡️
Drop a reply or DM me if you want to chat!
Excited to see the recent discussions around trait transfer in LLMs! 🦉
It validates the idea of Indirect Data Poisoning we introduced in our Winter Soldier paper (https://t.co/lNHMDpE4TP) which predates the Subliminal Learning work.
It's important to connect these lines of work.
Our paper on Subliminal Learning was just published in Nature!
Last July we released our preprint. It showed that LLMs can transmit traits (e.g. liking owls) through data that is unrelated to that trait (numbers that appear meaningless).
What’s new?🧵
I'm particularly proud of the Winter Soldier project, being my last PhD project, and it's great to see the broader community engaging with these ideas 🤗
Data are not just passive inputs, and training on them could lead to similar risks as running an untrusted program 🛡️💻
1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵
We have tons of terminal apps but all rely on the same outdated multiplexer...
Use a terminal that supports tmux's control mode (`tmux -CC`) and you'll never have to learn a single tmux "short"cut again
I hate tmux
It's so incredibly user unfriendly
The shortcuts make no sense
I wish someone would make a better tmux
Even just logging into tmux attaching the screen is an illogical hell to type
Again I hate tmux, it's so shit
✨Vision Transformer finetuning benefits from non-smooth components
🔍Our new paper shows that high-plasticity transformer modules adapt better during finetuning.
🍕 Whether you prefer theory or experiments, we hope you'll find something you like in this work.
Details below 🧵
My first PhD paper is out! 🎓
"What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?"
tl;dr: JEPA-WMs for robotics: learn dynamics on top of visual encoders, optimize actions towards goal 👇
w/ @JimmyTYYang1, Jean Ponce, @AdrienBardes, @ylecun
Big milestone 🎓✨
I’ve successfully defended my PhD thesis at @Polytechnique in collaboration with @AIatMeta !
"Towards Secure and Trustworthy Machine Learning: From Data Poisoning to Ownership Verification"
Grateful to my advisors, jury, and everyone who supported me 🙏
I am happy to share the work of our team. The outcome of a collaborative effort, by a joyful group of skilled and determined scientists and engineers! Congrats to the team on this amazing milestone!
Say hello to DINOv3 🦖🦖🦖
A major release that raises the bar of self-supervised vision foundation models.
With stunning high-resolution dense features, it’s a game-changer for vision tasks!
We scaled model size and training data, but here's what makes it special 👇
Peer review ML conferences:
> Only "top" X% get in
> Jobs, grants, bonuses hinge on it
> No penalty for bad-faith reviews
> No cost for flooding submissions
Who could have seen this going wrong? 🤭
🚀New paper alert! 🚀
In our work @AIatMeta we dive into the struggles of mixing languages in largely multilingual Transformer encoders and use the analysis as a tool to better design multilingual models to obtain optimal performance.
📄: https://t.co/3qxUWDkoN5
🧵(1/n)
@OwainEvans_UK Interesting work!
We found that similar misalignment can be implanted via Indirect Data Poisoning at pre-training time:
https://t.co/lNHMDpE4TP
Would love to hear your thoughts 😄
heading to @icmlconf #ICML2025 next week! come say hi & i'd love to learn about your work :)
i'll present this paper (https://t.co/4rFtApYs2Q) on the pitfalls of training set inclusion in LLMs, Thursday 11am
here are my talk slides to flip through: https://t.co/kdM992vkTv