Data mixing - determining ratios across your training datasets - matters a lot for model quality. While building Olmo 3, we learned it’s hard to set up a method that finds a strong mix, and hard to maintain that mix as datasets change throughout development.
Introducing Olmix👇
Thrilled to have contributed to Olmo 3! The best fully open 32B model (data, training recipes, checkpoints and more!)
As an intern at AI2 these last 8 months, I’ve grown to deeply appreciate the careful science, iteration, and collaboration that go into models like this and have learned so much from the team. I am more optimistic than ever about the future of open-source and data-centric research.
My particular contribution was working on the Dolma 3 data mix 👩‍🍳 I was able to apply ideas from some of my earlier mixing work, explore new problem settings, and see firsthand the data challenges that arise when building datasets intended for real models at scale. More on this coming soon!
Very satisfied with some neat results on imitation learning. When distribution matching isn’t possible, what’s even the role of demonstrations? Cloning/log-loss minimization? We propose directly encoding reward structure—motivating new algorithmic ideas. https://t.co/QZlmCBSzlr
Gave a talk at @OpenAI on our work 🌸 POPri “Policy Optimization for Private Data”. POPri is a huge improvement in synthetic data generation under security+privacy constraints! Learn more: