Saksham @sgdescent - Twitter Profile

Failures like this are in the nature of products with ML baked in. Even if new grads don’t have business sense (which they do?), there are other systems in place to make sure no product ships with this intention. It happens, and they will fix it. NL search is hard.

Ankit Jxa

@kingofknowwhere

26 days ago

This is what happens when you hire new grads with zero business understanding to do ML products. if cosine similarity >= 0.85 show result (doesn't do Entity vs misspellings vs brand name check) 😭

sgdescent retweeted

Ishaan Watts

@IshaanWatts18

28 days ago

Spending billions to train the "best" base model? You might be optimizing the wrong thing! 🎯 We show that controlling sharpness during mid-training leads to over 35% less forgetting after fine-tuning / quantization... even when the base model itself gets worse. 🧵 Takeaways for pretraining: - Use SAM (Sharpness-Aware-Minimization) in the final steps (~10%) - Try much higher learning rates (yes, even ~10× larger) 1/9

Saksham @sgdescent

about 2 months ago

@jonghyunc_ Seems super cool Andy!

sgdescent retweeted

Bhutani

@justbhutani

about 2 months ago

We are hiring in SF in robotics. Folks who wanna live and work end to end - we call you to reach out to us.

Saksham @sgdescent

about 2 months ago

One of my really goated friends @SCSatCMU, @Dogged_Raj, is looking for an ML research/engineering internship for the summer: https://t.co/QMM1ISmq2s. His offer got rescinded due to unfortunate circumstances. Hire him!

sgdescent retweeted

Zhihao Jia

@JiaZhihao

2 months ago

The MLSys’26 program is live! Check out the accepted papers: https://t.co/PKTMF2pOt2 This year marks several exciting firsts: • 28 industry track papers bridging MLSys research & real-world deployment • Our inaugural competition track featuring AWS Trainium, Google Graph Scheduling, and NVIDIA FlashInfer AI Kernel contests Early registration deadline: April 1 — don’t miss it! See you in Seattle this May🌲

Saksham @sgdescent

2 months ago

Gemini and the Autocompletion model @cursor_ai both use some really funny data. I wish people could identify source of janky data, and there was some effort in removing that :) https://t.co/3hB9FrXJ6k

sgdescent retweeted

Pratyush Maini

@pratyushmaini

3 months ago

If I had to compress my PhD into one idea, it is this "The data a model sees early in training leaves an imprint on its representations that is very hard to undo later" This thread runs through - Rephrasing the Web - Safety Pretraining - TOFU This is the Finetuner’s Fallacy🧵

sgdescent retweeted

Aflah 🍉🕊️ @Aflah02101

3 months ago

Hi Everyone I'm looking for interesting research scientist/engineer roles around LLM pretraining, studying behavior of agentic systems and just doing cool LLM research overall. If my work interests you and you're hiring feel free to DM/email/reply here!

sgdescent retweeted

Bhutani

@justbhutani

3 months ago

If you believe the future of autonomy will be built on hardware like this, we should talk. We’re building real-world AI systems and robots at JustJust AI — and we’re hiring. Email me at bhutani@justjust.ai Hiring in San Francisco/Gurgaon and Shenzhen https://t.co/9XtPo1IY00

Saksham @sgdescent

3 months ago

@aramesh27 @SCSatCMU Thanks for the great discussions!

Saksham @sgdescent

3 months ago

Started an ML sys reading group with friends @SCSatCMU. Systolic arrays are so cool!

Saksham @sgdescent

3 months ago

@_stevenkolawole @SCSatCMU Yes please. Feel free to dm me!

Saksham @sgdescent

3 months ago

Why do Einops even exist

sgdescent retweeted

Peng Qi

@qi2peng2

4 months ago

My team at @Uniphore is hiring summer research interns for 2026! If you are a highly motivated graduate student in the US, interested in working on Language Agents (tool use, agent design, agent evaluation, agent optimization, etc.) or small models (efficient training, efficient serving, data synthesis techniques, RL), you should consider us! We are a team with collective research experience from Stanford, Google DeepMind, AWS AI, Apple, and more. We just published 3 papers at ICLR this year, check them out: 1. WARC-Bench (GUI agents): https://t.co/RuOumMvBTi 2. PolySkill (agent skill induction): https://t.co/RDL5ShoueJ 3. EvoPresent (presentation generation from academic papers): https://t.co/is2e0Guhzg If you are interested, please send your resume and a brief note about your research interests to [email protected], and we will be in touch shortly!

Saksham @sgdescent

4 months ago

She is interested in exploring Vision, Planning, and Control, but is open to doing anything cool with robots! RT's appreciated <3

Saksham @sgdescent

4 months ago

Twitter, do your thing. My girlfriend, who is an MS robotics student @SCSatCMU, is looking for a summer internship! She is interested in working at the intersection of Robotics+ML, was an intern at @Google, and did her undergrad at @IITKanpur

sgdescent retweeted

Vedant Agarwal

@V_Agarwal1

4 months ago

We’re quietly assembling a small, elite team at @InceptLabsAI to work on long-horizon AI research that actually matters. Roles: AI/ML researchers (strong preference for people already doing frontier-level work). If you’ve built something real (papers, open-source, strong internal research, shipped systems), we’d love to talk. Stealth mode • best-in-class resources • equity that reflects the risk/reward. If this sounds like your next chapter, reach out → DMs open 🚀
