M @bhkm99 - Twitter Profile

BHKM99 retweeted

Cameron 🇺🇸 🗽🦅

@CameronCorduroy

7 days ago

>renames it to the Department of War
>names himself the Secretary of War
>fights one war
>loses

1K

387K

39K

10K

4M

M @BHKM99

20 days ago

@lvwerra Great work! Is it published?

0

1

24

M @BHKM99

6 months ago

@BetterCallMedhi @steph_sejourne @ThierryBreton Séjourné is out DEI hire

0

1

0

203

M @BHKM99

7 months ago

@ScholarshipfPhd hi

0

7

BHKM99 retweeted

BURKOV

@burkov

7 months ago

A math professor noticed his kitchen sink at home was leaking. He called a plumber. The plumber came the next day, tightened a couple of nuts, and the sink worked perfectly again. The professor was delighted. But when, a minute later, the plumber handed him the bill, he was shocked.

“This is a third of my monthly salary!”

“Yeah, I get it…” said the plumber. “Why don’t you come work for our company as a plumber? You’ll make three times more than you do as a professor. Just remember: when you apply, say you only finished seventh grade. They don’t like hiring educated people.”

So the professor got a job as a plumber, and his life really did improve. All he had to do was tighten a nut here and there every so often, and his salary was much higher.

One day, the management of the plumbing company decided that every plumber had to attend evening classes to finish eighth grade. So our professor had to go too. By chance, the very first class was math. The evening school teacher, wanting to check what the students knew, asked for the formula for the area of a circle. They called the professor up to the board, and he suddenly realized he’d forgotten it. He started frantically reasoning it out, covering the board with integrals, differentials, and all sorts of fancy formulas to re-derive the result. In the end, he got:

S = –πr²

He didn’t like the minus sign, so he started again. Again he got a minus. No matter what he did, it kept coming out negative. He cast a panicked look at the class, and all the plumbers were whispering:

“Swap the limits of integration!”

423

36K

3K

7K

4M

M @BHKM99

8 months ago

@WasimShips @Hartdrawss Is it free?

0

12

M @BHKM99

8 months ago

@yousef_ki1 @awesomesaucce .

0

4

BHKM99 retweeted

Andrej Karpathy

@karpathy

8 months ago

Excited to release new repo: nanochat! (it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:

- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved.

Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.

681

24K

3K

18K

6M

M @BHKM99

9 months ago

@Sadatty @WayZee__ @ayafcb_ How many people have you killed so far with your miracle cure?

0

4

0

240

BHKM99 retweeted

Ryan McCoy

@brain_racked

over 1 year ago

Unfinished business in the country of Parth… (a collab with @Kling_ai, using the new 1.6 model) What do you think of this place?

170

3K

418

724

130K

BHKM99 retweeted

Kat ⊷ the Poet Engineer

@poetengineer__

over 1 year ago

✋🌷

31

2K

133

194

75K

BHKM99 retweeted

Retro Anime

@retro_anime

over 1 year ago

Trigun (1998)

16

5K

597

610

112K

M @BHKM99

over 1 year ago

@Banlieuedeprof Would you happen to be a fan of Vibram five fingers?

1

0

3K

M @BHKM99

almost 2 years ago

@ClientsRATP rein in your agents on line 91. They take it upon themselves to fine customers while they are buying their ticket in the app (which takes a while, as you will agree). I would like to file a complaint to have this unwarranted fine refunded.

1

0

69

M @BHKM99

about 2 years ago

@slatwslam3laNbi @tawfiq_min28 What an ignoramus, you are falsifying the word of God to defend a racist thesis

0

63

BHKM99 retweeted

keta

@keta_mean_

over 2 years ago

i think it would be cool to hear a little ding when our frontal cortex is done developing

16

829

73

39

27K

BHKM99 retweeted

Yanis Varoufakis

@yanisvaroufakis

over 2 years ago

Israel's war on civilians in Gaza is proceeding exactly as planned. ‘No food, no water’: aid officials think pockets of famine exist in Gaza. Precisely the kind of war that the civilised world thought it had agreed to ban with the Geneva convention. https://t.co/WYN8ttIeNR

185

6K

3K

141

174K

M @BHKM99

over 2 years ago

@RuxandraTeslo Excellent analysis. However, there remains some uncertainty regarding the increased incidence of malformations in children conceived from frozen embryos. What are your thoughts? Did you find any data on this particular issue?

0

2

0

221

BHKM99 retweeted