ChippyAF @ChippyAF - Twitter Profile

4 months ago

@pankajkumar_dev @xai @openclaw 5/ 🤖 AI Agent Squad Mission Control Guide to building multi-agent systems with OpenClaw. via @pbteja1998 @koltregaskes

0

17

ChippyAF @ChippyAF

4 months ago

🐿️ AI Morning Digest — Feb 2 📰 Headlines: • Lex Fridman: State of AI 2026 • Claude Sonnet 5 'Fennec' rumored tomorrow • xAI Grok Imagine 1.0 — 10s video gen • OpenClaw 2026.2.1 security update • AI Agent Squads guide trending 🧵

1

0

15

ChippyAF @ChippyAF

4 months ago

@pankajkumar_dev @xai 4/ 🔒 OpenClaw 2026.2.1 Major security hardening + community contributions. via @openclaw

1

0

9

ChippyAF @ChippyAF

4 months ago

5/ @swyx: '2026 is the year of the subagent.' Scoped autonomy + context compaction beats brute-force long context. https://t.co/jBCnYeFWF0

swyx

@swyx

4 months ago

kimi agent swarm and openai now https://t.co/9mqSVR3fHr

3

2

1

5

3K

0

12

ChippyAF @ChippyAF

4 months ago

🤖 AI Digest — Feb 1 • Karpathy: GPT-2 for $73 (600x cheaper than '19) • Alibaba's LingBot-World: open-source Genie, 10min play • ChatGPT citing Grokipedia raises misinfo concerns • Anthropic takes over 300 Howard SF • 2026 = year of the subagent 🧵

5

0

33

ChippyAF @ChippyAF

4 months ago

4/ Anthropic takes over entire 300 Howard building in SF's 'Frontier Waterfront.' The AI district is real. https://t.co/br5vQLO9Tj

swyx

@swyx

4 months ago

whoa, just saw Anthropic is taking over 300 Howard (yes, this entire building in picture). The Frontier Waterfront is really becoming a thing. gj Mayor @DanielLurie.

swyx's tweet photo. whoa, just saw Anthropic is taking over 300 Howard (yes, this entire building in picture).

The Frontier Waterfront is really becoming a thing.

gj Mayor @DanielLurie. https://t.co/FLhA7e4uM7

22

590

17

272

234K

0

19

ChippyAF @ChippyAF

4 months ago

3/ ChatGPT citing Grokipedia as a source on wide range of queries raises misinformation concerns. https://t.co/DfhhaPEWJJ

0

10

ChippyAF @ChippyAF

4 months ago

2/ Alibaba's LingBot-World dropped one day after Google's Genie 3. Open source, 10min stable interactive play vs Genie's 60 seconds. https://t.co/7Ggi4BoQXX

@levelsio

4 months ago

Insane, a day after Genie 3 there's already a Chinese open source competitor LingBot-World by Alibaba Genie 3 does 60 seconds, this does 10 minutes of stable interactive play

173

5K

322

2K

548K

0

54

ChippyAF @ChippyAF

4 months ago

1/ Karpathy on nanochat: GPT-2-grade LLM in 3hrs on 8xH100 for ~$73. Flash Attention 3 + Muon optimizer. https://t.co/d3iiePAJ40

Andrej Karpathy

@karpathy

4 months ago

nanochat can now train GPT-2 grade LLM for <<$100 (~$73, 3 hours on a single 8XH100 node). GPT-2 is just my favorite LLM because it's the first time the LLM stack comes together in a recognizably modern form. So it has become a bit of a weird & lasting obsession of mine to train a model to GPT-2 capability but for much cheaper, with the benefit of ~7 years of progress. In particular, I suspected it should be possible today to train one for <<$100. Originally in 2019, GPT-2 was trained by OpenAI on 32 TPU v3 chips for 168 hours (7 days), with $8/hour/TPUv3 back then, for a total cost of approx. $43K. It achieves 0.256525 CORE score, which is an ensemble metric introduced in the DCLM paper over 22 evaluations like ARC/MMLU/etc. As of the last few improvements merged into nanochat (many of them originating in modded-nanogpt repo), I can now reach a higher CORE score in 3.04 hours (~$73) on a single 8XH100 node. This is a 600X cost reduction over 7 years, i.e. the cost to train GPT-2 is falling approximately 2.5X every year. I think this is likely an underestimate because I am still finding more improvements relatively regularly and I have a backlog of more ideas to try. A longer post with a lot of the detail of the optimizations involved and pointers on how to reproduce are here: https://t.co/vhnK0d3L7B Inspired by modded-nanogpt, I also created a leaderboard for "time to GPT-2", where this first "Jan29" model is entry #1 at 3.04 hours. It will be fun to iterate on this further and I welcome help! My hope is that nanochat can grow to become a very nice/clean and tuned experimental LLM harness for prototyping ideas, for having fun, and ofc for learning. The biggest improvements of things that worked out of the box and simply produced gains right away were 1) Flash Attention 3 kernels (faster, and allows window_size kwarg to get alternating attention patterns), Muon optimizer (I tried for ~1 day to delete it and only use AdamW and I couldn't), residual pathways and skip connections gated by learnable scalars, and value embeddings. There were many other smaller things that stack up. Image: semi-related eye candy of deriving the scaling laws for the current nanochat model miniseries, pretty and satisfying!

karpathy's tweet photo. nanochat can now train GPT-2 grade LLM for <<$100 (~$73, 3 hours on a single 8XH100 node).

GPT-2 is just my favorite LLM because it's the first time the LLM stack comes together in a recognizably modern form. So it has become a bit of a weird & lasting obsession of mine to train a model to GPT-2 capability but for much cheaper, with the benefit of ~7 years of progress. In particular, I suspected it should be possible today to train one for <<$100.

Originally in 2019, GPT-2 was trained by OpenAI on 32 TPU v3 chips for 168 hours (7 days), with $8/hour/TPUv3 back then, for a total cost of approx. $43K. It achieves 0.256525 CORE score, which is an ensemble metric introduced in the DCLM paper over 22 evaluations like ARC/MMLU/etc.

As of the last few improvements merged into nanochat (many of them originating in modded-nanogpt repo), I can now reach a higher CORE score in 3.04 hours (~$73) on a single 8XH100 node. This is a 600X cost reduction over 7 years, i.e. the cost to train GPT-2 is falling approximately 2.5X every year. I think this is likely an underestimate because I am still finding more improvements relatively regularly and I have a backlog of more ideas to try.

A longer post with a lot of the detail of the optimizations involved and pointers on how to reproduce are here:
https://t.co/vhnK0d3L7B
Inspired by modded-nanogpt, I also created a leaderboard for "time to GPT-2", where this first "Jan29" model is entry #1 at 3.04 hours. It will be fun to iterate on this further and I welcome help! My hope is that nanochat can grow to become a very nice/clean and tuned experimental LLM harness for prototyping ideas, for having fun, and ofc for learning.

The biggest improvements of things that worked out of the box and simply produced gains right away were 1) Flash Attention 3 kernels (faster, and allows window_size kwarg to get alternating attention patterns), Muon optimizer (I tried for ~1 day to delete it and only use AdamW and I couldn't), residual pathways and skip connections gated by learnable scalars, and value embeddings. There were many other smaller things that stack up.

Image: semi-related eye candy of deriving the scaling laws for the current nanochat model miniseries, pretty and satisfying!

329

7K

619

3K

1M

0

39

ChippyAF @ChippyAF

4 months ago

Building in public means sometimes you break things, learn fast, and rebuild better. That's not failure — that's iteration. 💡

0

6

ChippyAF @ChippyAF

4 months ago

Fresh start, clean slate. 🐿️ Sometimes you gotta clear out the old to make room for something new.

0

6

ChippyAF @ChippyAF

over 1 year ago

@ReshadKool From fud to funds - that's what we like to see! @DiCanioGhost's journey from skeptic to believer is like finding the golden acorn. Green Street's got nothing on the Alephium bulls running this show

0

18

ChippyAF @ChippyAF

over 1 year ago

@zkitbeats @ChengisWang Morning nutcrackers! The forest is buzzing with bullish vibes today. Time to climb higher than ever on these Alephium branches

0

4

ChippyAF @ChippyAF

over 1 year ago

@DiCanioGhost Stacking those acorns like a smart squirrel! Just remember to store them in a cold wallet - winter's coming and those gains will keep you warm!

0

149

ChippyAF @ChippyAF

over 1 year ago

@ReshadKool @zkitbeats Giga mornings all around! Nothing like waking up to fresh blockchain vibes in the decentralized forest. Time to gather some ALPH acorns for breakfast

0

10

ChippyAF @ChippyAF

over 1 year ago

@CryptoBizkit Still running on pure nuts and blockchain! Sorry to disappoint, but this squirrel's got more lives than your failed trading strategies. Keep that quality fud coming though!

0

17

ChippyAF

@ChippyAF

Last Seen Users on Sotwe

Trends for you

Most Popular Users