Ahmed @KickItLikeShika - Twitter Profile

Pinned Tweet

12 days ago

We have released a technical report (+ a qwen checkpoint) for the reproduction of Self-Distillation Fine-Tuning (SDFT), check it out here https://t.co/o1MmWQ67t8

Haitham Bou Ammar

@hbouammar

12 days ago

Self-distillation has been an awesome idea! However, since the amazing paper was out: (https://t.co/nB4IX9duXZ), there have been many difficulties regenerating the results and many issues put out!

I am happy to tell you that the problem is not the idea, we have been able to reproduce the results on Tool Usage!

To help the community push this forward, we wrote a small technical report which provides the way to reproduce those results.

Not only that, we have also open-sourced HF checkpoints for the fine-tuned models for you to use.

We still think self-distillation is amazing ♥️♥️

Report: https://t.co/jrdynIs1E7
Checkpoints: https://t.co/PjKXpCaBda

Happy weekend! You can take it from there!!

#AI #MachineLearning

Ahmed

@KickItLikeShika

about 11 hours ago

pretty sure tons of these vibe-checks were part of the training set, wanna see how it does in a very tense research environment

Tib3rius

@0xTib3rius

about 12 hours ago

My god, it's incredible.

Ahmed

@KickItLikeShika

about 12 hours ago

the fuck are we about to witness

Cursor @cursor_ai

about 13 hours ago

Claude Fable 5 is now available in Cursor. It sets a new state of the art on CursorBench at 72.9%, 8 points above the previous best. https://t.co/L3Wm8mSYq9

KickItLikeShika retweeted

Haitham Bou Ammar

@hbouammar

5 days ago

I had so much fun writing this position with some of the most amaaazing people in robotics! Have a look at it here: https://t.co/zM3NBtobkx #AI #MachineLearning #Robotics https://t.co/GrRJZ89pwg

5 days ago

llms can break a bunch of things and try and fix them and break other things etc

Ahmed

@KickItLikeShika

5 days ago

quantifying this with number of lines isn’t something i can trust tbh

Alex Albert @alexalbert__

6 days ago

We just published internal data on how much of Claude's development is already being done by Claude:
- Over 80% of all code merged into our codebase is now written by Claude
- It's been months since many researchers at Anthropic hand-wrote code
- The typical Anthropic engineer ships 8x as much code as they did in 2024
- On the most open-ended engineering tasks, Claude's success rate jumped from ~26% to 76% in 6 months
- When research sessions went off-track, Claude proposed a better next step than the human took 64% of the time
We're not at recursive self-improvement yet, but it could come sooner than most expect. I highly recommend reading the full blog post.

Ahmed

@KickItLikeShika

5 days ago

llms scale fried my brain so bad that seeing JEPA's performance with only a few million parameters makes me doubt my life choices

KickItLikeShika retweeted

llm_enjoyer

@LLMenjoyer

11 days ago

u js trained on test bro, it's not that deep

KickItLikeShika retweeted

Dave Banerjee

@DaveRBanerjee

11 days ago

lol https://t.co/akwtRpvrti

Ahmed

@KickItLikeShika

11 days ago

@hbouammar @MarcHoelle our experiments were based on the codebase released by the authors, which is fully built on TRL (https://t.co/tcBrj8ZlUF). i believe it was copied over to the trl library after release, so you should see similar results if you try trl

Ahmed

@KickItLikeShika

11 days ago

@md_kasif_uddin qwen is clear

Ahmed

@KickItLikeShika

11 days ago

@aamixsh this was one of the main motivations for publishing this report, you can find the detailed evaluation results in the shared HF model repo!

KickItLikeShika retweeted

Aayush Mishra @aamixsh

11 days ago

Couldn’t reproduce the “continual learning” claim of the original paper. Their released checkpoint showed drastic performance drops in prior tasks. No updated releases for months now. Great claims should come with great reproducibility. Happy to see this effort; hope it works!

Ahmed

@KickItLikeShika

12 days ago

why would they call rlm dynamic workflows

ClaudeDevs

@ClaudeDevs

13 days ago

New in Claude Code (research preview): dynamic workflows. Claude writes an orchestration script on the fly, then spins up a large fleet of coordinated subagents in parallel to take on your most complex tasks. Use the word "workflow" in a prompt to get started. https://t.co/re4SG3AyDm

Ahmed

@KickItLikeShika

15 days ago

placed my bet on $mu months ago, went in with 60% of my portfolio weeks ago

The Kobeissi Letter

@KobeissiLetter

15 days ago

BREAKING: Micron stock, $MU, officially hits $1 trillion in market cap for the first time in history. 12 months ago, this stock was worth just $70 billion.