New Preprint Alert ⏰
We propose Dr. Post-training 🩺, a data regularization framework that makes your data more effective with ZERO overhead
Experiments show faster training convergence than SOTA data selection methods across SFT, RLHF, and RLVR, opening up new data optimization designs!
We've raised $65 billion in Series H funding at a $965 billion post-money valuation, led by @AltimeterCap, Dragoneer, @Greenoaks, and @sequoia.
This investment will help us advance our research and expand our capacity to meet growing demand for Claude.
a while ago @joemelko told me that the post-training technique I'm working on (https://t.co/SJOcHdEWR4) will also work in pretraining, and if not then it's a skill issue.
now given this promising signal I'm ready. only problem is where's the gpu credit 😭
There is now a smarter way to pick data for training LLMs!
Enter OPUS!
This is an ICML Oral paper from SJTU, Alibaba, UW–Madison, UIUC, and Mila - Quebec AI Institute.
The proposed method dynamically and intelligently selects the most impactful data for LLM pre-training in every single training iteration, bringing principled, continuous data optimization to the forefront.
This approach aims to significantly boost training efficiency and yield higher-quality LLMs, outperforming conventional static data selection methods across diverse language tasks.
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration
Paper: https://t.co/zgAzuwoTJf
Our report: https://t.co/tUCDBOHV5q
📬 #PapersAccepted by Jiqizhixin
This is really my best work so far, and I'm genuinely proud to share it with anyone who is interested.
Side note: I'm currently in the Anthropic team-matching process after the fellowship. If anyone knows of teams working on data, please DM me!!!
Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
fun little artifact, i worked on something similar to freon last year and started writing an (unedited) post that is hidden on my blog: https://t.co/4flVlmPEnC
very naive implementation of steepest descent under various p using full svd: https://t.co/civIcS5XP4