Kashif Rasul @krasul - Twitter Profile

6 days ago

@PavloMolchanov Thanks for extending and analysing the shortcomings of GOLD. Enjoyed the paper and its insights. Let's try to add to TRL!

1

6

0

6K

Kashif Rasul @krasul

7 days ago

@llllvvuu You're right, compute_delta is imprecise on where to cut: it should trim to the last EOS before subtracting (otherwise the <|im_end|>\n separator gets dropped on ChatML templates). We'll tighten the post. TRL already handles this correctly.

0

1

0

257
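One way to read the compute_delta fix described in the reply above is sketched below on plain token-id lists. This is an illustration only, not the code from the post or from TRL: the function names `trim_to_last_eos` and `compute_delta_sketch`, and the token ids for `<|im_end|>` and `\n`, are all hypothetical. It assumes the rendered prefix ends with the EOS token followed by a newline separator, as ChatML templates do.

```python
# Hypothetical token ids, chosen only for illustration.
EOS_ID = 2    # stands in for <|im_end|>
NL_ID = 198   # stands in for the "\n" that ChatML emits after <|im_end|>


def trim_to_last_eos(tokens, eos_id):
    """Return tokens up to and including the last EOS (unchanged if none)."""
    for i in range(len(tokens) - 1, -1, -1):
        if tokens[i] == eos_id:
            return tokens[: i + 1]
    return tokens


def compute_delta_sketch(prefix_tokens, full_tokens, eos_id):
    """Tokens that `full_tokens` adds after the EOS-trimmed prefix."""
    trimmed = trim_to_last_eos(prefix_tokens, eos_id)
    if full_tokens[: len(trimmed)] != trimmed:
        raise ValueError("full rendering does not start with the trimmed prefix")
    return full_tokens[len(trimmed):]


# Rendered first turn ends with <|im_end|> followed by "\n".
prefix = [1, 10, 11, EOS_ID, NL_ID]
# Rendering with one more turn appended.
full = prefix + [1, 20, 21, EOS_ID, NL_ID]

naive_delta = full[len(prefix):]                    # separator is not included
delta = compute_delta_sketch(prefix, full, EOS_ID)  # separator is included
```

Under these assumptions, a running sequence that stops at the generated EOS only stays aligned with the template rendering if the delta starts with the separator, which is what trimming before subtracting achieves.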

Kashif Rasul @krasul

15 days ago

@SungjinAhn_ Fashion-mnist please!

0

1

0

100

krasul retweeted

Sergio Paniego

@SergioPaniego

28 days ago

OpenEnv already ships 🚢 with a ready-to-deploy RLM environment on free HF Spaces

Drop "Attention Is All You Need", write code that spawns parallel LLM calls → ✅ answer in 4.2s

Run GRPO (TRL) → model learns to write that search strategy itself

👀@lateinteraction @a1zhang https://t.co/DnReX4CDM4

2

48

10

28

4K

Kashif Rasul @krasul

about 1 month ago

@willccbb we have recently added these methods in TRL if you want to try them out https://t.co/AaNqLtHlDv and also https://t.co/ZLyrEoRg8I

0

6

1

3

634

krasul retweeted

Sergio Paniego

@SergioPaniego

about 2 months ago

Earlier this month, Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy

and… it's already supported in TRL, built by @krasul. you can really feel the pace of development in the team 🐎

paper by @onloglogn, @richard_baihe, @UnderGroundJeg, Navdeep Jaitly, @trebolloc, @YizheZhangNLP at Apple 🍎

how it works: the model generates completions at a training-time temperature (T_train) with top_k/top_p truncation, then fine-tunes on them with plain cross-entropy. no labels or verifier needed

you can try it right away with this ready-to-run example (Qwen3-4B on rStar-Coder):
https://t.co/zizfISD6bq
or benchmark a checkpoint with the eval script:
https://t.co/mKlafTyKSe

one neat insight from the paper: T_train and T_eval compose into an effective T_eff = T_train × T_eval, so a broad band of configs works well. even very noisy samples still help

want to dig deeper?

paper: https://t.co/aj1ZAcr8Mw
trainer docs: https://t.co/TNVz93kZi9

6

225

36

206

38K
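The T_eff = T_train × T_eval relation quoted in the tweet above can be checked numerically in an idealized setting. The sketch below is not the paper's derivation or the TRL trainer: it assumes the fine-tuned model reproduces the T_train-tempered distribution exactly and ignores top_k/top_p truncation. The logits and temperature values are made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up next-token logits of the base model.
logits = [2.0, 1.0, 0.5, -1.0]
t_train, t_eval = 1.5, 0.6

# Idealized self-distillation: the fine-tuned model's T=1 distribution
# equals the base model's distribution sampled at T_train.
trained_probs = softmax(logits, t_train)
trained_logits = [math.log(p) for p in trained_probs]

# Sampling the fine-tuned model at T_eval ...
two_step = softmax(trained_logits, t_eval)
# ... matches sampling the base model at T_train * T_eval.
one_step = softmax(logits, t_train * t_eval)

max_gap = max(abs(a - b) for a, b in zip(two_step, one_step))
```

In this idealized case `max_gap` is zero up to floating-point error, because dividing the tempered log-probabilities by T_eval only shifts all logits by a constant, which the softmax cancels.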

krasul retweeted

Benjamin Bossan @BenjaminBossan

about 2 months ago

Today, we released PEFT v0.19.0 and it's a big one. Not only did we add 9 new PEFT methods, the release also contains a bunch of improvements to make PEFT more useful. Check the thread for details: https://t.co/zhDK5jIh5C

2

13

2

3

541

krasul retweeted

Quentin Gallouédec @QGallouedec

2 months ago

France is about to pass a law punishing support for the genocide in Palestine! 🇫🇷🇵🇸

just kidding. it’s actually a proposal to restrict criticism of Israel, in the so-called country of human rights and free speech.

@SandrineRunel, I call on you to vote against the Yadan law. https://t.co/b0mjkMLwmN

0

6

2

0

569

Kashif Rasul @krasul

2 months ago

@m_sirovatka try using an RLM harness?

0

4

0

1

791

krasul retweeted

Sergio Paniego

@SergioPaniego

3 months ago

check out this new notebook by @krasul on TimesFM 2.5, Google's time series foundation model which is now supported in transformers

zero-shot forecasting, quantile predictions, LoRA fine-tuning, and forecasting with exogenous covariates

https://t.co/aUKP813nIw https://t.co/dIo5z6q1e8

0

23

6

5

1K

krasul retweeted

Stas Bekman

@StasBekman

3 months ago

Good news! Ulysses Sequence Parallelism from the Snowflake AI Research and the Deepspeed teams has been integrated into @huggingface Trainer, Accelerate and TRL

For extensive details please see this writeup:
https://t.co/2xDWUk8p3V

Thanks a lot to @krasul for helping make it happen. Also the others in the HF team who helped with integration.

4

116

20

43

18K

Kashif Rasul @krasul

3 months ago

@RisingSayak @vivs4092 @huggingface Thanks @vivs4092 yes about to add llada-2.1 flash to diffusers

1

0

69

Kashif Rasul @krasul

4 months ago

@JeffDean @GoogleResearch Also being added to transformers https://t.co/kWbXEunzfx

1

3

0

1

908

Kashif Rasul @krasul

4 months ago

@jxmnop How is P sampled BTW? I assume it's Gaussian with 1/sqrt(r) normalization?

0

62

Kashif Rasul @krasul

4 months ago

@natolambert added a section on on-policy distillation if interested https://t.co/JqUvLDrsTq

1

6

0

2

725

Kashif Rasul @krasul

5 months ago

@m_sirovatka I also recall this https://t.co/tUWkQ0MoSL from @_lewtun

0

3

0

2

147

Kashif Rasul @krasul

5 months ago

@m_sirovatka I keep going back to https://t.co/xh31jksnkL by @QGallouedec

0

7

4

11

533

Kashif Rasul @krasul

5 months ago

@EyubogluSabri added cartridges natively to PEFT: https://t.co/qchop7r26p

0

4

2

237

Kashif Rasul @krasul

5 months ago

@m_sirovatka @jackminong feel free to check the liger-kernels; we have added support for a lot of RL losses there. also see this open PR: https://t.co/CKnAcEqjiT

0

2

0

1

82
