🚨 NEW PAPER! (this is a big one; 3B and 10B models included)
Masked diffusion LLMs are getting a lot of attention. They outperform other diffusion types (such as uniform diffusion) at small scales.
But what if I told you that uniform diffusion actually scales better? 🧵👇
A fantastic opportunity to join the new lab of my former master's thesis co-advisor, @f_dangel! He is an incredible researcher and mentor. I really recommend working with him.
🎓 Looking for MSc or PhD opportunities in Machine Learning for Fall 2026?
Join my group at @Concordia and @Mila_Quebec!
🔍 Focus: autodiff, second-order optimization, and Hessian-based methods for LLMs & scientific ML.
📅 Apply by Dec 1: https://t.co/qJ3AcgmpUQ
I would highly recommend using this library for any research on influence functions.
Implementing scalable IFs (usually ≡ K-FAC) is a massive pain, especially for modern architectures. With curvlinops, getting plots like the one below for diffusion models is relatively easy.
We’re thrilled to welcome Sander Dieleman, Research Scientist at Google DeepMind, to ML in PL Conference 2025!
Sander Dieleman is a Research Scientist at Google DeepMind in London, UK, where he has worked on the development of AlphaGo, WaveNet, Imagen 4, Veo 3, and more. He obtained his PhD from Ghent University in 2016. His current research interests include representation learning and generative modelling of audio, images and video.
📍 15–17 October 2025, Copernicus Science Centre, Warsaw, Poland
🔗 To learn more and secure your spot https://t.co/gMRele492y
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨
Hiring. I'm looking for multiple PhD students: both those able to start in Fall 2025 (i.e., as soon as possible) and those applying through centralized programs like CLS, IMPRS, and ELLIS (the deadlines are in November) to start in Spring–Fall 2026. I'm also searching for postdocs, master's thesis students, and research interns. Fill out the Google form below if you're interested!
Research group. We will focus on developing algorithmic solutions to reduce harms from advanced general-purpose AI models. We're particularly interested in alignment of autonomous LLM agents, which are becoming increasingly capable and pose a variety of emerging risks. We're also interested in rigorous AI evaluations and informing the public about the risks and capabilities of frontier AI models. Additionally, we aim to advance our understanding of how AI models generalize, which is crucial for ensuring their steerability and reducing associated risks. For more information about research topics relevant to our group, please check the following documents:
- International AI Safety Report,
- An Approach to Technical AGI Safety and Security by DeepMind,
- Open Philanthropy’s 2025 RFP for Technical AI Safety Research.
Research style. We are not necessarily interested in getting X papers accepted at NeurIPS/ICML/ICLR. We are interested in making an impact: this can be papers (and NeurIPS/ICML/ICLR are great venues), but also open-source repositories, benchmarks, blog posts, even social media posts—literally anything that can be genuinely useful for other researchers and the general public.
Broader vision. Current machine learning methods are fundamentally different from what they used to be pre-2022. The Bitter Lesson summarized and predicted this shift very well back in 2019: "general methods that leverage computation are ultimately the most effective". Taking this into account, we are only interested in studying methods that are general and scale with intelligence and compute. Everything that helps to advance their safety and alignment with societal values is relevant to us. We believe getting this—some may call it "AGI"—right is one of the most important challenges of our time.
Join us on this journey!
We are delighted to introduce our next ML in PL Conference 2025 speaker: Alexey Dosovitskiy!
Alexey Dosovitskiy is a distinguished AI researcher who gained prominence at @GoogleResearch as lead author of the "An Image is Worth 16x16 Words" paper, which introduced Vision Transformers (ViT). In February 2024, he joined Inceptive as a Member of Technical Staff, where he's now applying machine learning techniques to RNA research.
👉 Check all the conference details on our event page: https://t.co/QhkAstO7d4
Is equivariance necessary for a good 3D molecule generative model? Check out our #icml2025 paper, which closes the performance gap between non-equivariant and equivariant diffusion models via rotational alignment, while also being more efficient (1/7):
https://t.co/kFiZptFwsr
We are delighted to introduce you to our next ML in PL Conference 2025 Speaker: Federico Tombari @fedassa!
Federico Tombari is Research Director at @Google, leading Computer Vision and Machine Learning teams across North America and Europe. His team has contributed CV/ML technology to Google Lens, Maps, Android, ARCore, and Pixel. He's also a Lecturer at @TU_Muenchen with 300+ peer-reviewed publications in CV/ML, covering robotics, autonomous driving, healthcare, and AR. He co-founded a 3D perception startup that was acquired by Google in 2018-19. Federico serves as Area Chair and Associate Editor for top conferences like @NeurIPSConf, @CVPR, and @eccvconf, and has received Google Faculty Awards, an Amazon Research Award, and multiple Outstanding Reviewer Awards.
👉 Stay up to date with ML in PL news by following us on social media!
👉 ML in PL Conference 2025 website: https://t.co/Kf1fXquI7j
👉 ML in PL Facebook Page: https://t.co/OrfVZaZB4D
👉 ML in PL LinkedIn Page: https://t.co/olmAd99Mdq
👉 ML in PL X Profile: https://t.co/BypkKmH7Jr
📣 Calling all ML researchers, students and professionals: ML in PL Conference 2025 is now accepting submissions for talks, posters, and tutorials.
We’re accepting submissions across a wide range of machine learning topics, including (but not limited to):
- Core ML & Optimization: Classification, Clustering, Learning Theory, Online/Semi/Unsupervised Learning
- Deep Learning: Architectures, Generative Models, Recurrent Networks, DL Optimization
- Reinforcement Learning: Bandits, Control, MDPs, Planning, Multi-Agent Systems
- Probabilistic & Causal Methods: Bayesian Inference, Gaussian Processes, Graphical Models
- Applications: NLP, Vision, Robotics, Audio, Biology, Neuroscience, Physics, Social Good
- Responsible AI: Fairness, Privacy, Robustness, Safety, Ethics, Bias, Explainability
- Technical Resources: Datasets, Tools, Software, Distributed ML, Open Competitions
- ML Stories: Case studies, startup journeys, and lessons from real-world deployments
Whether you're a PhD student sharing your first major results, a senior researcher, or a seasoned ML practitioner, we welcome your contribution—be it theoretical or applied, early-stage or production-ready.
Submission Deadline: July 31st
📄 Talks & Posters → https://t.co/C7k2ObGnQz
💻 Tutorials → https://t.co/0mSi5fX5ZN
Thinking of attending instead?
Early Bird registration is open until July 31st: https://t.co/JUVVnPUQst
A string may get 17 times less probability if tokenised as two symbols (e.g., ⟨he, llo⟩) than as one (e.g., ⟨hello⟩)—by an LM trained from scratch in each situation! Our #acl2025nlp paper proposes an observational method to estimate this causal effect! Longer thread soon!
We are thrilled to announce that Antonio, a leading researcher in deep learning and optimization, will be joining us as a speaker at the ML in PL 2025 Conference!
Antonio studied Control Engineering in Italy and Switzerland. He holds a PhD in Computer Science from ETH Zürich and spent time at DeepMind (UK), Meta (US), Mila (CA), INRIA (FR), and HILTI (LI). He is currently a Hector Endowed Fellow and Principal Investigator (PI) at the ELLIS Institute Tübingen and Independent Group Leader at the MPI for Intelligent Systems, where he leads the Deep Models and Optimization group. He received the ETH medal for outstanding doctoral theses and the Schmidt Sciences AI2050 Early Career Fellowship.
In his research, Antonio strives to improve the efficiency of deep learning technologies by pioneering new architectures and training techniques grounded in theoretical knowledge. His work encompasses two main areas: understanding the intricacies of large-scale optimization dynamics and designing innovative architectures and powerful optimizers capable of handling complex data. Central to his studies is exploring innovative techniques for decoding patterns in sequential data, with implications in biology, neuroscience, natural language processing, and music generation.
👉 Early Bird Registration is open until the 31st of July.
https://t.co/JUVVnPUQst
Check all the conference details on our event page: https://t.co/QhkAstO7d4
Excited to share another paper at #ICLR2025
We introduce iSCMs—a new method for generating synthetic data for causal discovery benchmarks that avoids variance and covariance artifacts.
Joint work with Scott Sussex, Lars Lorch, @bschoelkopf & @arkrause.
Key insights👇
1/8
Our code is publicly available, so you can generate iSCM-based datasets and easily benchmark causal discovery methods.
🔗 Code: https://t.co/tRHlKoxsQx
7/8