David Wang @_dcw02 - Twitter Profile

Added a fun lil widget to the LLM Engineer's Almanac -- a "Token Timing Simulator" so you can get a visceral feel for what a benchmark perf number means. Here's @_dcw02's latest work with @zhijianliu_'s DFlash technique in @sgl_project -- ~1k TPS! https://t.co/iUJm984dq0

3

101

13

51

18K

David Wang

@_dcw02

28 days ago

gpu poor? inframaxx
gpu rich? also inframaxx
gpu middle class? believe it or not, still inframaxx

2

15

0

1

1K

David Wang

@_dcw02

about 1 month ago

tried kimi k2.6, really shows the literary heritage gap

0

3

0

313

David Wang

@_dcw02

about 1 month ago

my goat @AhanGupta13 !!!

PyTorch

@PyTorch

about 1 month ago

Want to train LLMs on longer contexts without re-engineering your entire systems stack? Introducing AutoSP — the first compiler-based solution that automatically optimizes LLM training for long contexts. Under the hood, AutoSP applies a series of compiler passes that trigger sequence parallelism, paired with a curated activation-checkpointing scheme tailored for long-context training. It's integrated directly into DeepSpeed, so enabling long-context training is just a config change away. No more rewiring your stack to push context lengths. Read the blog to learn more 🖇️ https://t.co/TMjWfsO8fy ✍ @AhanGupta13, Zhihao W., Neel Dani, @toh_tana, Tunji Ruwase, @_Minjia_Zhang_ #PyTorch #DeepSpeed #AutoSP #OpenSourceAI

3

118

21

60

17K

1

4

0

764

David Wang

@_dcw02

about 1 month ago

@tenderizzation Sent from my iPhone

0

1

0

77

David Wang

@_dcw02

about 1 month ago

as AI ascends, so too may we descend back to the asm mines

Charles 🎉 Frye

@charles_irl

about 1 month ago

https://t.co/A0LexEUZG1 "Everything is open source if you can read assembly" -- @_dcw02

0

32

3

11

4K

1

9

2

3

3K

David Wang

@_dcw02

about 2 months ago

@menhguin I have a pretty good guess what rationalists think of Omelas

0

4

0

441

David Wang

@_dcw02

about 2 months ago

@vikhyatk just drink more?

0

1

0

143

David Wang

@_dcw02

3 months ago

@ekzhang1 hope you feel better soon!

0

1

0

180

_dcw02 retweeted

Zhijian Liu

@zhijianliu_

3 months ago

DFlash⚡ meets OpenClaw🦞 = FlashClaw
Same Claw. >4X faster or cheaper.
DFlash support for Qwen3.5 is live, outperforming native MTP by up to 2.3X. More to come! 🔥

12

199

40

158

22K

David Wang

@_dcw02

3 months ago

@vikhyatk @realmcore_ blind (codex) leading the blind (me)

1

0

33

David Wang

@_dcw02

4 months ago

@Dorialexander @AmpCode we have instructions to set up opencode with our glm5 endpoint here: https://t.co/M0u0SjWGWd :)

0

2

0

1

196

_dcw02 retweeted

Robert Clausecker

@FUZxxl

4 months ago

@lisperati Just write assembly code. That has always been allowed.

0

10

1

0

3K

4 months ago

The GLM models by @Zai_org have been a gamechanger for me. I was reluctant to embrace coding agents before I could run the models myself. Now, with GLM-5, I have a top-quality self-hosted intelligence endpoint tightly integrated into my engineering work. https://t.co/95XSG31K0P

1

53

2

29

6K

0

7

1

2

2K
