Pritam @pritamstudyai - Twitter Profile

Pritam @Pritamstudyai

43 minutes ago

@saurabhtwq such a calming and pretty setup. I would love to have one like this for myself one day.🥹

0

3

Pritamstudyai retweeted

Arjun Virk

@virkvarjun

1 day ago

I just spent months handwriting a 200 page guide on the entirety of ML foundations and math from scratch. The guide features:

- Neural Nets (Backprop, Adam, SGD, Batch Norm)
- ML Algorithms (SVM, Grad Boosting, K-means, PCA)
- Hardware (Tensor Cores, Systolic Arrays, CUDA)
- Transformers (Multi-Head Attn, KV Cache, LoRA)
- Vision (ViT, Convolutions, MAE, IoU, NMS, VLM)
- Agents (OpenClaw, ReAct, Memory, Orchestration)

Everything I wish I had years ago, for free.

128

2K

275

4K

206K

Pritamstudyai retweeted

susun

@SuJinYan123

about 24 hours ago

1. https://t.co/FT8VpQYAz0 TL;DR: A first look at CUDA 13.3 cuTile, from naive matmul to tile-level programming and cuBLAS-level performance.

3

52

9

36

3K

Pritamstudyai retweeted

0xkato

@0xkato

4 days ago

LLMs explained without all that yucky math stuff https://t.co/cpgVukSbJx

9

583

41

2K

162K

Pritam @Pritamstudyai

about 4 hours ago

@henrylhtsang Nice insights. Followed

0

1

0

47

Pritamstudyai retweeted

henry tsang @henrylhtsang

about 10 hours ago

1. Blackwell (and to some extent hopper) is insanely powerful. Understanding all its features itself is hard
2. GPU programming, no matter which DSL thingy you picked, still depends on compilers
3. A lot of optimization is probably done when designing the kernel. after that,

3

105

4

33

7K

Pritamstudyai retweeted

aditya

@adxtyahq

1 day ago

MIT CSAI CMU CSAI STANFORD CSAI NTU CSAI EDINBURGH CSAI

19

677

23

145

56K

Pritamstudyai retweeted

vx-underground

@vxunderground

1 day ago

All these fucking dorks at Anthropic do is yap about how insane their product is and how end-of-the-world it will be Someone tell these jabronis to shut the fuck up, holy Christ they're so annoying

160

6K

405

259

227K

Pritamstudyai retweeted

shrey birmiwal @shreybirmiwal

2 days ago

Hot take the infra stuff around ai is way cooler than model training

57

1K

74

267

148K

Pritamstudyai retweeted

Kris @Krishna70284154

2 days ago

Nice roadmap! Instead of a year, if you spend enough time, you could get there in 6 months.

3

432

35

808

55K

Pritam @Pritamstudyai

1 day ago

taste is very important.

himanshu

@himanshustwts

2 days ago

anthropic blog was quite pragmatic. as a human, you have got 3 comparative advantages which will matter the most (at least for now!)
> research taste and judgement (choosing which problems matter)
> which results to trust
> when an approach is a dead end.

7

306

9

108

12K

0

2

0

12

Pritamstudyai retweeted

Atlas Press

@realAtlasPress

4 days ago

Leonardo da Vinci's greatest paragraph

88

33K

6K

10K

619K

Pritamstudyai retweeted

Logan Thorneloe

@loganthorneloe

4 days ago

Read this to get started learning ML infra.

This is an excellent high-level overview of important considerations in ML training from CMU. It touches on:

- hardware
- memory
- the ML experimentation process

https://t.co/RTWm0Ecni1

5

1K

99

2K

41K

Pritamstudyai retweeted

Cameron R. Wolfe, Ph.D.

@cwolferesearch

5 days ago

Interested in learning how to run RL at scale? Here are the best resources to read…

Research on Scaling RL
1. The Art of Scaling RL compute for LLMs: https://t.co/PGjI6Gwgv0
2. Scaling Behaviors of LLM RL Post-Training: https://t.co/2u2saB3C0h
3. Optimally Scaling Sampling Compute for LLM RL: https://t.co/rUSdUvJyNH
4. Scaling up RL: https://t.co/O8vV6z8ymx
5. ProRL V2 - Prolonged Training Validates RL Scaling Laws: https://t.co/vu72juvRW4
6. Polaris - A Recipe for Scaling RL with Reasoning Models: https://t.co/rMibSAeJbg

RL Frameworks
1. Hybrid Flow (early outline of the verl framework): https://t.co/GnWXx131uD
a. More up-to-date info can be found here: https://t.co/j801HcJmPP
2. AReal - Large-Scale Async RL: https://t.co/qhOvsQK09N
3. PipelineRL - Fast On-Policy RL: https://t.co/iRM7KzySXe
4. AsyncFlow - Async Streaming RL: https://t.co/YwmzFtiU2q

RL for Agents
1. DeepSWE - Open Coding Agent Trained w/ RL: https://t.co/GHQHcmtE6F
2. AutoForge - Environment Synthesis for Agentic RL: https://t.co/mr3WDIL5vq
3. Agent-R1 - Training Agents w/ End-to-End RL: https://t.co/xpfQJGgzEv
4. AgentRL - Scaling RL for Multi-Turn, Multi-Task Agents: https://t.co/7fbVl0RWXG
5. The Landscape of Agentic RL: https://t.co/OMnSV4rgdW
6. Training SWE Agents with RL: https://t.co/YqMqySbyXS

Case Studies & Tech Reports
1. Kimi tech reports:
a. Kimi K2 - Open Agentic Intelligence: https://t.co/aAw17SXrIw
b. Kimi End-to-end Agentic RL: https://t.co/ProBpOPIiI
c. Kimi K1.5 - Scaling RL for LLMs: https://t.co/kRGOxY9Jvp
2. Composer series from Cursor:
a. Composer 2: https://t.co/K0v8rNCE6Z
b. Composer 2.5: https://t.co/D9PYimfOMU
3. Olmo 3 (also has open code / data): https://t.co/khetJFvp6N
4. MiniMax tech reports:
a. MiniMax-M2: https://t.co/HApb0OB80S
b. MiniMax-M1: https://t.co/mZj9UQsrnC
5. Nemotron 3 (NVIDIA): https://t.co/lCpE1GzxSi

18

799

135

1K

34K

Pritamstudyai retweeted

elie

@eliebakouch

4 days ago

microsoft MAI tech report is a gold mine, one of the most transparent for a model at this scale.

this model uses zero synthetic data or distillation from previous models. this means reasoning, agentic behavior, tool use are all learned fully during post-training with no cold start. bold choice that makes it harder and requires more iterations to reach sota, but you get FULL control over your model series and it proves they are serious about being a frontier lab.

the tech report is insanely detailed and precise about numbers. to give an example, they give the exact MFU across all the iterations of the model, with the exact changes etc. they also share the full scaling ladder recipe, to my knowledge this is the first time i've seen this in a tech report at this scale

let's look at all of this in this likely very long thread 🧵

41

2K

264

2K

275K

Pritamstudyai retweeted

Edward Z. Yang @ezyang

6 days ago

New devlog post from yours truly: When does fragmentation occur in the CUDA caching allocator? https://t.co/ocAdv4mjy2 -- this post is LLM authored but I heavily prompted/edited, and Natalia also helped fact check.

8

136

14

86

11K

Pritam @Pritamstudyai

6 days ago

@sarthak2143 Congrats 👏🏻🎆

0

30

Pritamstudyai retweeted

Parmita Mishra

@parmita

6 days ago

RAS finally getting drugged is one of the great stories in modern biology, and almost nobody outside oncology understands why it's such a big deal.

YOU'LL LEARN SOMETHING AWESOME TODAY.

i am going to keep this as understandable (and simple) as i can.

OPEN THE THREAD.

🧵

24

2K

424

1K

261K

Pritamstudyai retweeted

Mark Lewis, MD, FASCO

@marklewismd

7 days ago

Cheers, chills, and a standing ovation when RASolute 302 showed unprecedented survival on daraxonrasib for patients with progressive pancreatic cancer

Seldom do you sense you’re witnessing a historic moment in cancer care but this feels like ras targeting has arrived

#ASCO26