Musharraff Ibrahim @Sharrafff - Twitter Profile

Musharraff Ibrahim @Sharrafff

about 3 hours ago

@nzevh_ What software are you using to design and simulate this?

1

0

27

Sharrafff retweeted

Xudong Han

@Xudong07452910

1 day ago

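These days, academic prestige alone is no longer the entry ticket to a top AI lab!

I recently read a hard-hitting ML interview retrospective. The author received offers from DeepMind and several other top AI companies, and the article makes a very realistic observation:

Even if you have several first-author papers at top AI conferences, your résumé only gets you into the interview room.

In the actual interviews, many interviewers do not spend long on the details of your papers. They care more about whether you can write a Transformer backward pass in limited time, whether you can explain the basic math clearly, and whether you can solve algorithm problems by hand on the spot.

Behind this, the author lays out a harsh industry logic: interviews for top AI researchers often screen not for your research ceiling but for your engineering, math, and coding floor.

So even top PhDs get anxious before interviews, and they too have to grind practice problems, do mock interviews, and shore up fundamentals. Academic results prove you have potential, but the interview process has to confirm that you can deliver reliably.

This is rather counterintuitive: doing research is like art, but job hunting is like engineering. Papers, ideas, and creativity certainly matter, but to actually get in the door you still have to pass a screening process that is highly standardized, highly specific, and even a bit like the gaokao (China's college entrance exam).

The article's warning about startup stock options is also realistic: don't just listen to the valuation story. Taxes, liquidity, exercise costs, and exit uncertainty can all leave a wide gap between paper wealth and real returns.

In today's AI industry, don't expect to coast through on your past academic record.

If you want to get into a top lab, it is best to treat the interview as an engineering project and prepare early: practice problems, derive formulas, review your papers, do mock interviews, and fill the gaps one by one.

https://t.co/3XK93q9TQe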

29

1K

139

2K

117K

Sharrafff retweeted

The Babylon Bee

@TheBabylonBee

1 day ago

552

37K

4K

789

1M

Sharrafff retweeted

DROID

@droidbuilds

3 days ago

"mom, how did we get so poor?" "your father had Claude Max, ChatGPT Pro, Cursor Pro and shipped absolutely nothing"

295

14K

933

1K

695K

Musharraff Ibrahim @Sharrafff

1 day ago

@PaladinPolecat @ReviewsPossum This whore right here.

1

0

15

Musharraff Ibrahim @Sharrafff

1 day ago

@__danielhart @Tshepisoo10 @BafanaBafana Shooters gon shoot. 😂💔

0

1

0

1K

Sharrafff retweeted

Charles 🎉 Frye

@charles_irl

2 days ago

Last fall, we shared our deep dive on FA4 internals. But we didn't stop at grokking the kernel. Since then, we've been developing improvements for inference performance and upstreaming them. This blog post explains those contributions. https://t.co/xzDNHdq3Zw

2

192

27

142

16K

Musharraff Ibrahim @Sharrafff

3 days ago

@caliphate494 @sin4ch @_DeejustDee 🙏.

0

18

Musharraff Ibrahim @Sharrafff

3 days ago

@staysaasy Great doom porn, you have a talent for these things.

0

6

0

2K

Sharrafff retweeted

Elon Musk

@elonmusk

4 days ago

Tesla AI chip design engineering reviews are so great! Team is awesome. Our AI6 chip might set a record for most amount of usable intelligence from a wafer when factoring in yield.

5K

76K

8K

2K

15M

Musharraff Ibrahim @Sharrafff

4 days ago

@MoraKing1788 you are welcome. here we pay rent annually, like $3k to $4k a year for a two-bedroom apartment in a major city. there are downsides tho, like a non-existent public transport system and garbage electricity supply.

0

2

0

5

Musharraff Ibrahim @Sharrafff

4 days ago

@Brandgrowthai @HTMLmerchant @Awamaridiii @Chiivie_ A remote job isn't a career path?

0

1

0

58

Sharrafff retweeted

Meryem Arik

@MeryemArik9

5 days ago

Another banger blog from @finn_fergus and the Doubleword inference lab. Have a read if you’re interested in speculative decoding.

https://t.co/zklzKgEClS

1

11

2

11

1K

Sharrafff retweeted

Jino Rohit

@jino_rohit

4 days ago

ive been seeing gshard come up a lot when training MOEs across multiple gpus. going to give it a read

1

40

1

10

1K

Sharrafff retweeted

Xiangpeng Hao @MOVNTDQ

5 days ago

A system programmer’s guide to LLM inference, hope you like it! https://t.co/uYwmHzOTGT

3

188

14

216

8K

Sharrafff retweeted

Adam Holter

@AdamHoltererer

5 days ago

Personal update: I’ve decided to leave OpenAI. Not that I ever worked there. But it just looks like everyone else is doing it, so I thought I'd hop on the bandwagon. In other news, I've decided to join @AnthropicAI to work on AGI for the benefit of Claude. I don't think they realize that I've decided to join, and to be honest, I don't think my decision carries much weight with them, since I wasn't offered a job there. But the decision stands.

80

3K

115

143

239K

Sharrafff retweeted

Aadi Kulshrestha

@MankyDankyBanky

about 2 months ago

I trained a 12M parameter LLM on my own ML framework using a Rust backend and CUDA kernels for flash attention, AdamW, and more. Wrote the full transformer architecture, and BPE tokenizer from scratch.

The framework features:
- Custom CUDA kernels (Flash Attention, fused LayerNorm, fused GELU) for 3x increased throughput
- Automatic WebGPU fallback for non-NVIDIA devices
- TypeScript API with Rust compute backend
- One npm install to get started, prebuilt binaries for every platform

Try out the model for yourself: https://t.co/TB2itlmCVT

Built with @_reesechong. Check out the repos and blog if you want to learn more.

Shoutout to @modal for the compute credits allowing me to train on 2 A100 GPUs without going broke cc @sundeep @GavinSherry

131

4K

257

4K

809K

Musharraff Ibrahim @Sharrafff

6 days ago

@mohitwt_ which project is this?

1

0

99

Sharrafff retweeted

Aritra

@ariG23498

15 days ago

For more than 6 months (on and off) I have been trying to get up to speed with GPU/TPU kernel development. IMHO, profiling should be the starting point for learning this topic. You profile, you question, you look for answers, and in the process you read and imbibe.

I set out on a journey to do just the same. I began profiling gemma4 and was quickly humbled by the amount of information at my disposal: the profiler table with huge GEMM names, the profiler trace with too many CPU rows.

To make my life easier, I stepped back and profiled a basic matrix multiplication and addition operation (the weights and bias interaction, as one might see it). The profiler artifacts were simple enough to reason about and think through.

In this blog post, I document my journey and in the process uncover how one should profile and what one should look at! I hope this gives beginners (like me) a starting point for their kernel development and optimization journey.

PS: This is a big blog post, bookmark it and come back to it when you have the time (good weekend read?)

16

379

41

411

46K
