We have released a technical report (+ a Qwen checkpoint) for the reproduction of Self-Distillation Fine-Tuning (SDFT). Check it out here: https://t.co/o1MmWQ67t8
Self-distillation has been an awesome idea! However, since the amazing paper came out (https://t.co/nB4IX9duXZ), many people have had difficulty reproducing the results, and many issues have been raised!
I am happy to tell you that the problem is not the idea: we have been able to reproduce the results on Tool Usage!
To help the community push this forward, we wrote a small technical report that describes how to reproduce those results.
Not only that, we have also open-sourced HF checkpoints for the fine-tuned models for you to use.
We still think self-distillation is amazing ♥️♥️
Report: https://t.co/jrdynIs1E7
Checkpoints: https://t.co/PjKXpCaBda
Happy weekend! You can take it from there!!
#AI #MachineLearning
I had so much fun writing this position paper with some of the most amaaazing people in robotics!
Have a look at it here: https://t.co/zM3NBtobkx
#AI #MachineLearning #Robotics
We just published internal data on how much of Claude's development is already being done by Claude:
- Over 80% of all code merged into our codebase is now written by Claude
- It's been months since many researchers at Anthropic hand-wrote code
- The typical Anthropic engineer ships 8x as much code as they did in 2024
- On the most open-ended engineering tasks, Claude's success rate jumped from ~26% to 76% in 6 months
- When research sessions went off-track, Claude proposed a better next step than the human took 64% of the time
We're not at recursive self-improvement yet, but it could come sooner than most expect. I highly recommend reading the full blog post.
@hbouammar @MarcHoelle Our experiments were based on the codebase released by the authors, which is built entirely on TRL: https://t.co/tcBrj8ZlUF
I believe it was copied into the TRL library after release, so you should see similar results if you try TRL.
Couldn’t reproduce the “continual learning” claim of the original paper. Their released checkpoint showed drastic performance drops on prior tasks. No updated releases for months now.
Great claims should come with great reproducibility.
Happy to see this effort; hope it works!
New in Claude Code (research preview): dynamic workflows.
Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks.
Use the word "workflow" in a prompt to get started.
BREAKING: Micron stock, $MU, officially hits $1 trillion in market cap for the first time in history.
12 months ago, this stock was worth just $70 billion.