Gabriel Ceicoschi @gceico - Twitter Profile

about 1 month ago

@mitchellh Noticed the same issues: can't merge PRs, blank screens. Nice story, I went through your article and got a view of the past. In 2008 I was too young to even know what GitHub was, so I appreciate the retrospective.

0

1K

Gabriel Ceicoschi @gceico

about 1 month ago

@humalikeai Props to Jared, cool project, guys. Is this available for use in Europe? I would like to try it out, but I'm not sure if it follows our GDPR rules. Any thoughts on that?

1

0

97

Gabriel Ceicoschi @gceico

about 1 month ago

Happy startup founder. Check this out. https://t.co/eIouUcpjJ3

0

9

Gabriel Ceicoschi @gceico

8 months ago

cool

Andrej Karpathy

@karpathy

8 months ago

Excited to release new repo: nanochat!
(it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:

- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Run efficient inference on the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About 12 hours of training surpasses the GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved.

Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.

684

24K

3K

18K

6M

0

18

Gabriel Ceicoschi @gceico

8 months ago

@kaysorin @OpenAI @DanielLurie @sama Guys, I have to say that this year’s branding is great. Simple, relatable, and super cute.

0

2

0

34

Gabriel Ceicoschi @gceico

8 months ago

Met the Altman.

0

12

Gabriel Ceicoschi @gceico

8 months ago

Came here for food and merch. Happy to meet some people and get inspired.

0

1

0

15

Gabriel Ceicoschi @gceico

over 2 years ago

Been there! Awesome workshop ✌️

Kent C. Dodds 🏹

@kentcdodds

over 2 years ago · The Netherlands

What an awesome crew! Had a great time with the Web Auth workshop attendees today for @reactlivenl!

1

54

5

0

18K

0

1

0

36

Gabriel Ceicoschi @gceico

over 2 years ago

Meeting @kentcdodds, Kody the Koala and many other amazing people at #ReactLive Conference in Amsterdam.

0

32

gceico retweeted

Ray Dalio

@RayDalio

about 5 years ago

Many of you have asked about finding a work-life balance, so I wanted to share some of the principles that have helped me to achieve this. (1/3)

https://t.co/KSE6C1DcqV

19

1K

267

213

0

gceico retweeted

Pranay Pathole

@PPathole

about 5 years ago

"What's the meaning of life ... I came to a conclusion that what really matters is trying to understand the right questions to ask & the more that we can increase the scope & scale of human consciousness the better we are able to ask these questions" — @elonmusk

460

33K

5K

1K

0

gceico retweeted

Coinbase 🛡️

@coinbase

over 5 years ago

Ethereum is the second-biggest cryptocurrency by market cap after Bitcoin, but it's more than digital money — it’s the foundation of a $200 billion-plus economy. From DeFi to digital art, understand the forces at work behind the rise of Ethereum. https://t.co/DM8fNYrrpz

543

2K

470

76

0

gceico retweeted

Anthony Pompliano 🌪

@APompliano

over 5 years ago

1 bitcoin is going to be enough for financial freedom for most of the global population.

454

7K

639

58

0

gceico retweeted

Beniamin Mincu |🇺🇸/acc

@beniaminmincu

over 5 years ago

Had a great conversation with @AlexSaundersAU. We outlined the goal of Elrond, why it matters, and why our unique position opens perhaps the most compelling path to widespread adoption.

23

747

188

1

0