Released torchdistill v1.1.0 last night!
⚗️ Key updates ⚗️
- @PyTorch 2.2.1 support
- 3 new KD methods
- Custom low-level loss support
https://t.co/IXWsYr2nsx
For the new KD methods (SRD w/ @Miles12Roy), we reproduced ImageNet results (2 wins, 1 loss)
https://t.co/zjDmK5RAgU
I’m happy to share that I’m starting a new position as Founding Research Engineer @SpiffyAI !
This is a totally new adventure for me. I am very excited to join such an amazing team and work together on challenging problems!
LLMs have problems that are difficult to identify and explain. To understand them better, we need to study the relationship between LLMs and their pre-training data, and with the OLMo release, @allen_ai helps make this possible!
https://t.co/KYX8Ahzo3U
OLMo-7b is finally out 🎉, and we are releasing everything: weights, intermediate checkpoints, training code and logs, training data and toolkit, and evaluation and adaptation code and data.
Most of it has been released, and the rest is coming soon. OLMo-65b and Adapted OLMo-7b are training, and we plan extensions to future modalities. Check the report for details: https://t.co/LEH7gBZUTd
Thanks to all of @allen_ai, especially those who took the project from zero-to-one: @mechanicaldirk, Pete Walsh, @akshitab93, Rodney Kinney, @oyvindtafjord, @AnanyaHarsh, @hamishivi, @IanMagnusson, @yizhongwyz, Kyle Richardson, @LukeZettlemoyer, @soldni, @nlpnoah, @HannaHajishirzi