I just open-sourced Claude Workspace Optimizer. 46% of my Claude Code memory was invisible. 3,756 lines of rules loaded every session. I fixed it in 50 minutes and cut context overhead by 66%. Today I'm releasing the tool so you can do the same.
Stop learning LLMs from disconnected tutorials.
LLM from Scratch is a hands-on PyTorch curriculum for builders who want to understand how LLMs are trained, modernized, and aligned.
It helps you move from concepts to implementation by organizing the path from transformer basics to tiny-model training, scaling, fine-tuning, reward modeling, and RLHF.
Key features:
• End-to-end curriculum – follows pretraining → fine-tuning → alignment, from foundations through RLHF
• Transformer from first principles – covers positional embeddings, self-attention, attention heads, MLPs, residuals, LayerNorm, and full blocks
• Tiny LLM training loop – includes tokenization, batching, cross-entropy, sampling, validation loss, and a no-Trainer training loop
• Modern architecture upgrades – walks through RMSNorm, RoPE, SwiGLU, KV cache, sliding-window attention, and streaming cache ideas
• Alignment path included – covers SFT, reward modeling, PPO-style RLHF, and GRPO with concrete training-loop notes
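To give a flavor of one of the architecture upgrades named above, here is a minimal sketch of RMSNorm next to LayerNorm. It is plain Python rather than PyTorch so it runs anywhere, and it is not taken from the curriculum itself: the function names `rms_norm` and `layer_norm` and the `eps` default are assumptions made for illustration.

```python
import math

def layer_norm(x, weight, bias, eps=1e-6):
    """LayerNorm: subtract the mean, divide by the standard deviation,
    then apply a per-feature scale and shift."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root-mean-square only (no mean subtraction,
    no bias), then apply a per-feature scale."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

# Illustrative input: one 4-dimensional activation vector with unit weights.
x = [1.0, 2.0, 3.0, 4.0]
ones = [1.0] * len(x)
zeros = [0.0] * len(x)

print("layer_norm:", layer_norm(x, ones, zeros))
print("rms_norm:  ", rms_norm(x, ones))
```

The visible difference is that RMSNorm skips the mean subtraction and the bias term, so its output keeps the sign and relative offset of the input while still having a root-mean-square close to 1.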
It’s open-source (GPL-3.0 license).
Link in the reply 👇
@GutFirstHealth Take it in small doses at first; the microbe die-off in the gut can be difficult. The first time I took it, I took far too much. I was on the bathroom floor, balled up in pain from that die-off effect.