abhishek sharma @asreddevils - Twitter Profile

Pinned Tweet

7 months ago

Built an open-source RLM (Recursive Language Models) package in Python! Solves LLM "context rot" by treating context as a programmable object the model explores recursively.
Credits:
- @rasmus1610 for the initial implementation inspiration

Marius Vach

@rasmus1610

7 months ago

Here is the solveit dialog implementing RLMs by @a1zhang using `lisette` and `toolslm` by @answerdotai (h/t @jeremyphoward)


abhishek sharma @asreddevils

5 months ago

Thank you @AirIndiaX for the lovely check-in experience and hospitality. 10/10 would recommend. Special thanks to Simran Raina, Jagdish Raj, Abhinay Kumar and Dev Singh 😍🎊🔥


abhishek sharma @asreddevils

6 months ago

Unsloth Puzzles: https://t.co/JK6KFzzyok


abhishek sharma @asreddevils

6 months ago

🧵 I wrote my first GPU kernel solving @UnslothAI's Challenge A.
Here's everything I learned about NF4 quantization and Triton.
Starting from absolute zero → working kernel in 3 days.
Let's go 👇


abhishek sharma @asreddevils

6 months ago

Full post includes:
✅ Toy examples (dequantize one number by hand)
✅ Every broken version of the kernel
✅ Line-by-line explanation of the final code
✅ What custom PTX assembly could add
For anyone learning GPU programming: https://t.co/WEBhJuRkpK
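A "dequantize one number by hand" toy example might look roughly like the sketch below. It is not taken from the linked post. The 16-entry table is the NF4 codebook published with QLoRA / bitsandbytes (rounded here to 4 decimals); the helper names, the high-nibble-first packing order, and the sample `absmax` are assumptions for illustration only.

```python
# NF4 codebook, rounded to 4 decimals (values as published with QLoRA / bitsandbytes).
NF4_CODEBOOK = [
    -1.0, -0.6962, -0.5251, -0.3949, -0.2844, -0.1848, -0.0911, 0.0,
    0.0796, 0.1609, 0.2461, 0.3379, 0.4407, 0.5626, 0.7230, 1.0,
]


def dequantize_nibble(index, absmax):
    """Map one 4-bit NF4 index to a float: codebook lookup scaled by the block's absmax."""
    if not 0 <= index <= 15:
        raise ValueError("NF4 index must fit in 4 bits")
    return NF4_CODEBOOK[index] * absmax


def dequantize_byte(packed, absmax):
    """Unpack two NF4 indices from one byte (assumption: first value sits in the high nibble)."""
    high = (packed >> 4) & 0x0F
    low = packed & 0x0F
    return dequantize_nibble(high, absmax), dequantize_nibble(low, absmax)


# Hypothetical input: byte 0x7F holds indices 7 and 15; the block's absmax is 2.0.
first, second = dequantize_byte(0x7F, absmax=2.0)
print(first, second)  # 0.0 2.0
```

Index 7 is the exact-zero entry and index 15 is the largest entry (1.0), so the byte decodes to 0.0 and 2.0 once scaled. A real kernel would do the same lookup for a whole block of bytes in parallel.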


abhishek sharma @asreddevils

6 months ago

Thanks to @UnslothAI for Challenge B that started this rabbit hole. What began as a competition became a masterclass in distributed systems + quantization internals. Sometimes the best learning comes from "impossible" problems. 🦙


abhishek sharma @asreddevils

6 months ago

🧵 How to train Llama 3.1 8B on 2 consumer GPUs using FSDP2 + QLoRA. Here's how it works (and what broke along the way):


abhishek sharma @asreddevils

6 months ago

The lesson: tools built by different teams don't always compose cleanly.
The skill: reading source code, understanding assumptions, finding escape hatches.
Full blog with complete code: https://t.co/v5K7Ha209f


abhishek sharma @asreddevils

6 months ago

If you're into ML systems or LLM training, let's connect! 🚀 #MachineLearning #PyTorch #LLM


abhishek sharma @asreddevils

6 months ago

@UnslothAI Challenge E on memory-efficient backpropagation 🧵
Problem: LLMs with 128K vocab = 4-8GB VRAM just for logits
Solution: Custom autograd + gradient checkpointing
Results: 46% memory savings, verified on real models
Here's what I learned 👇
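The "4-8GB just for logits" figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is an illustration, not the author's measurement: it assumes the Llama 3 vocabulary size of 128,256 is the "128K vocab" meant, and the token count of 8,192 (batch size times sequence length) is a hypothetical workload.

```python
def logits_bytes(num_tokens, vocab_size, bytes_per_element):
    """Size of a dense (num_tokens, vocab_size) logits tensor in bytes."""
    return num_tokens * vocab_size * bytes_per_element


VOCAB_SIZE = 128_256  # Llama 3 family vocabulary size (assumed to be the "128K vocab")
NUM_TOKENS = 8_192    # hypothetical batch_size * sequence_length

fp32_logits = logits_bytes(NUM_TOKENS, VOCAB_SIZE, 4)  # float32 = 4 bytes per element
print(f"fp32 logits: {fp32_logits / 1e9:.2f} GB")             # 4.20 GB
print(f"logits + gradient: {2 * fp32_logits / 1e9:.2f} GB")   # 8.41 GB
```

Under these assumptions the float32 logits alone take about 4.2 GB, and keeping a same-shaped gradient alongside them roughly doubles that, which lands in the range the tweet quotes. Avoiding materializing the full tensor at once is what a custom autograd function would aim for.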


abhishek sharma @asreddevils

6 months ago

Full writeup with code, mistakes, and insights: https://t.co/zelJSP47xh

