BΞN @ben3bil - Twitter Profile

Pinned Tweet

2 months ago

I just open-sourced Claude Workspace Optimizer. 46% of my Claude Code memory was invisible. 3,756 lines of rules loaded every session. I fixed it in 50 minutes and cut context overhead by 66%. Today I'm releasing the tool so you can do the same.

BΞN

@ben3bil

about 19 hours ago

@gippp69 I did this without agents a few months ago. Maybe 20 hrs of time

ben3bil retweeted

Dan Kornas

@DanKornas

2 days ago

Stop learning LLMs from disconnected tutorials.

LLM from Scratch is a hands-on PyTorch curriculum for builders who want to understand how LLMs are trained, modernized, and aligned.

It helps you move from concepts to implementation by organizing the path from transformer basics to tiny-model training, scaling, fine-tuning, reward modeling, and RLHF.

Key features:

• End-to-end curriculum – follows pretraining → finetuning → alignment from foundations through RLHF
• Transformer from first principles – covers positional embeddings, self-attention, attention heads, MLPs, residuals, LayerNorm, and full blocks
• Tiny LLM training loop – includes tokenization, batching, cross-entropy, sampling, validation loss, and a no-Trainer training loop
• Modern architecture upgrades – walks through RMSNorm, RoPE, SwiGLU, KV cache, sliding-window attention, and streaming cache ideas
• Alignment path included – covers SFT, reward modeling, PPO-style RLHF, and GRPO with concrete training-loop notes

It’s open-source (GPL-3.0 license).

Link in the reply 👇

BΞN

@ben3bil

about 22 hours ago

@nDC4sT @LowLevelTweets You give it a dedicated fresh machine bro

BΞN

@ben3bil

about 22 hours ago

@candyyueliu Thank you for the response. How about the hardware/chipset itself?

BΞN

@ben3bil

about 24 hours ago

@candyyueliu Any proof you have no backdoors? What third-party security audits or verifications can you provide? Thank you

BΞN

@ben3bil

about 24 hours ago

@GutFirstHealth Take it in small doses at first; the microbe kill-off in the gut can be difficult. The first time I took it, I took far too much. I was on the floor of the bathroom, balled up in pain from that die-off effect.
