snwfdhmp @snwfdhmp - Twitter Profile

snwfdhmp retweeted

24 days ago

So excited to be opening up OpenEnv to the whole community. It will now be owned by @huggingface, Meta-PyTorch, @reflection_ai, @UnslothAI, @modal, @PrimeIntellect, @NVIDIAAI, @mercor_ai, and @fleet_ai. the reason is: frontier labs train the model and the harness together, so the model is fitted to its harness. that coupling is a chunk of why claude code and codex feel so good. open source can't do that. you bring whatever harness, whatever model, whatever env, whatever trainer. which is the whole point of open source and also the problem for training. openenv is the socket in between all of this. in short: it's a protocol layer, not a reward framework. it does not have opinions about your rewards or your training loop. those live in the libs that are actually good at them. read more in the blog post. it's early, come break it.

16

298

58

209

91K

snwfdhmp retweeted

Maxime Labonne

@maximelabonne

about 1 month ago

So proud of this release! It's the first step towards agents running on device. We learned so, so much post-training this model (stay tuned!). Massive congrats to the team, you've been amazing to work with ♥️

26

309

27

32

20K

snwfdhmp retweeted

Maxime Labonne

@maximelabonne

2 months ago

This is so cool: LFM2-VL deployed directly on a satellite! Building HAL 9000 one step at a time. 🫡

1

24

3

2

2K

snwfdhmp retweeted

Leonie

@helloiamleonie

about 2 months ago

Spent the weekend crossing one thing off my "to learn" list: GRPO

In this blog, we walk through:
• What is GRPO and how does it work
• Fine-tune @liquidai's LFM2.5-1.2B-Instruct
• using @UnslothAI and some free @kaggle T4s

Blog: https://t.co/vv3VK4GF1j

Kaggle Notebook: https://t.co/hXOV9z4mK3

7

630

76

953

73K

snwfdhmp retweeted

Maxime Labonne

@maximelabonne

2 months ago

🧮 Synthetic pretraining for sub-1B reasoning models

Cool write-up from Tufa Labs (Matteo Saponati) on whether synthetic data augmentation actually helps very small (<1B) models reason better.

They pretrain a 0.8B model with the Qwen3 architecture from scratch on 12B tokens of MegaMath-Web-Pro-Max, and compare the original corpus against three synthetic rewrite prompts.

→ The synthetic-pretrained models match the original's final accuracy with 3-6x fewer training tokens on GSM8K and ~2.5x fewer on MATH500.

→ The generator is Qwen3.5-0.8B in non-thinking mode, same parameter count as the student. This shows that a larger teacher is not necessarily needed.

→ The few-shot gap widens as you add more demonstrations (synthetic models pull 2-3x further ahead at higher shot counts), and it holds when demonstrations are randomized per question.

→ All three rewriting prompts beat the baseline despite very different output lengths (1.75x token ratio for the lightest "rephrasing" prompt, up to 3.53x for "first principles"). The shortest one is still competitive, which is interesting from a generation cost perspective.

I enjoyed the fact that it was nicely self-contained and focused on small language models. It's particularly interesting to see that the shortest rephrasing prompt performs comparably to the longer ones at a much lower generation cost.

6

168

16

121

9K

snwfdhmp retweeted

Maxime Labonne

@maximelabonne

3 months ago

Massive unlock: pre-training for $0

3

76

5

24

16K

snwfdhmp retweeted

Maxime Labonne

@maximelabonne

3 months ago

FACT: If you don't train your 350M model on 28T tokens, you're not optimal

1

121

8

41

16K

snwfdhmp retweeted

Greg Brockman

@gdb

5 months ago

how did we ever write all that code by hand

1K

14K

903

875

1M

snwfdhmp retweeted

Cloudflare

@Cloudflare

5 months ago

Time to consider not just human visitors, but to treat agents as first-class citizens. Cloudflare’s network now supports real-time content conversion to Markdown at the source using content negotiation headers. https://t.co/B7wYH4PtA8

167

5K

549

3K

2M
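
For readers unfamiliar with content negotiation, here is a minimal sketch of what a client-side request might look like. The Cloudflare post only says "content negotiation headers"; the `text/markdown` media type, the `q=0.8` fallback weight, the example URL, and the helper names `build_markdown_request` and `is_markdown` are all assumptions made for illustration, not details taken from the post.

```python
from urllib.request import Request

# Assumed media type for Markdown; the post does not name the exact header value.
MARKDOWN_TYPE = "text/markdown"


def build_markdown_request(url: str) -> Request:
    """Build a GET request that prefers Markdown and accepts HTML as a fallback."""
    accept = f"{MARKDOWN_TYPE}, text/html;q=0.8"
    return Request(url, headers={"Accept": accept})


def is_markdown(content_type: str) -> bool:
    """Return True if a Content-Type header value names Markdown, ignoring parameters."""
    return content_type.split(";")[0].strip().lower() == MARKDOWN_TYPE


# Hypothetical URL; nothing is sent over the network here.
req = build_markdown_request("https://example.com/")
```

A client would pass `req` to `urllib.request.urlopen` and check the response's `Content-Type` with `is_markdown` before deciding whether to parse the body as Markdown or fall back to HTML handling.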

snwfdhmp retweeted