Built an open-source RLM (Recursive Language Models) package in Python!
Solves LLM "context rot" by treating context as a programmable object the model explores recursively.
Credits:
- @rasmus1610 for the initial implementation inspiration
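To make "context as a programmable object" concrete, here is a toy sketch of the general idea. It is not the package's API: `recursive_answer`, `keyword_llm`, and `max_items` are hypothetical names, the "model" is a keyword-matching stub, and the fixed halving strategy is an assumption (in an actual RLM the model itself decides how to slice and query the context).

```python
from typing import Callable, List

# Hypothetical stand-in for a model call: (query, context_text) -> answer_text.
LLM = Callable[[str, str], str]


def keyword_llm(query: str, context: str) -> str:
    """Stub 'model': returns the context lines that mention the query."""
    return "\n".join(line for line in context.splitlines() if query in line)


def recursive_answer(query: str, context: List[str], llm: LLM, max_items: int = 8) -> str:
    """Answer `query` over `context` without ever passing the whole context at once."""
    # Base case: the slice is small enough to hand to the model directly.
    if len(context) <= max_items:
        return llm(query, "\n".join(context))
    # Recursive case: explore each half separately, then combine the partial answers.
    mid = len(context) // 2
    left = recursive_answer(query, context[:mid], llm, max_items)
    right = recursive_answer(query, context[mid:], llm, max_items)
    return llm(query, left + "\n" + right)


if __name__ == "__main__":
    docs = [f"filler {i}" for i in range(100)]
    docs[37] = "needle: 42"
    print(recursive_answer("needle", docs, keyword_llm))
```

Each call only ever sees at most `max_items` lines or two short partial answers, which is the property that sidesteps long-context degradation in this toy setting.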
Thank you @AirIndiaX for the lovely check-in experience and hospitality. 10/10 would recommend. Special thanks to Simran Raina, Jagdish Raj, Abhinay Kumar and Dev Singh.
🧵 I wrote my first GPU kernel solving @UnslothAI's Challenge A
Here's everything I learned about NF4 quantization and Triton
Starting from absolute zero → working kernel in 3 days
Let's go
Full post includes:
- Toy examples (dequantize one number by hand)
- Every broken version of the kernel
- Line-by-line explanation of the final code
- What custom PTX assembly could add
For anyone learning GPU programming: https://t.co/WEBhJuRkpK
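As a rough illustration of the "dequantize one number by hand" step, here is a minimal pure-Python sketch, not the kernel from the post. The 16 codebook values are the NF4 levels as published with QLoRA and used in bitsandbytes, to the best of my knowledge; the nibble order (first value in the high 4 bits) and the `dequantize_byte` helper are assumptions for this example.

```python
from typing import Tuple

# The 16 NF4 levels (assumed to match the QLoRA / bitsandbytes codebook).
NF4_CODEBOOK = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def dequantize_byte(packed: int, absmax: float) -> Tuple[float, float]:
    """Unpack one byte into two 4-bit indices and rescale by the block's absmax."""
    hi = (packed >> 4) & 0xF  # assumed: first weight lives in the high nibble
    lo = packed & 0xF         # second weight lives in the low nibble
    return NF4_CODEBOOK[hi] * absmax, NF4_CODEBOOK[lo] * absmax


if __name__ == "__main__":
    # 0x7F -> indices 7 and 15 -> codebook values 0.0 and 1.0, scaled by absmax.
    print(dequantize_byte(0x7F, 0.5))  # (0.0, 0.5)
```

A Triton kernel does the same lookup-and-scale per element, just vectorized across a block of packed bytes.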
Thanks to @UnslothAI for Challenge B that started this rabbit hole.
What began as a competition became a masterclass in distributed systems + quantization internals.
Sometimes the best learning comes from "impossible" problems.
The lesson: tools built by different teams don't always compose cleanly.
The skill: reading source code, understanding assumptions, finding escape hatches.
Full blog with complete code: https://t.co/v5K7Ha209f
@UnslothAI Challenge E on memory-efficient backpropagation 🧵
Problem: LLMs with 128K vocab = 4-8GB VRAM just for logits
Solution: Custom autograd + gradient checkpointing
Results: 46% memory savings, verified on real models
Here's what I learned
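For a sense of where the 4-8GB figure can come from, here is a back-of-the-envelope sketch. The token counts (8,192 and 16,384), the fp32 element size, and the chunk size are illustrative assumptions, not numbers from the post, and the chunked estimate does not reproduce the 46% result; it only shows why avoiding a full logits tensor helps.

```python
GIB = 1024 ** 3


def logits_bytes(num_tokens: int, vocab_size: int, bytes_per_elem: int = 4) -> int:
    """Memory needed to materialize a [num_tokens, vocab_size] logits tensor."""
    return num_tokens * vocab_size * bytes_per_elem


def chunked_peak_bytes(chunk_tokens: int, vocab_size: int, bytes_per_elem: int = 4) -> int:
    """Peak logits memory if only one chunk of tokens is materialized at a time."""
    return logits_bytes(chunk_tokens, vocab_size, bytes_per_elem)


if __name__ == "__main__":
    vocab = 128 * 1024  # 128K vocabulary entries
    for tokens in (8192, 16384):  # assumed batch_size * seq_len
        print(tokens, "tokens:", logits_bytes(tokens, vocab) / GIB, "GiB")
    # Hypothetical chunk of 1,024 tokens processed at a time.
    print("chunked peak:", chunked_peak_bytes(1024, vocab) / GIB, "GiB")
```

With these assumptions the full tensor costs 4 GiB and 8 GiB respectively, while a 1,024-token chunk peaks at 0.5 GiB, which is the intuition behind computing the loss piecewise inside a custom autograd function.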