mason thompson @PDF_sorcerer - Twitter Profile

mason thompson @PDF_sorcerer

about 12 hours ago

@satyanadella too long, won’t read.

0

7

mason thompson @PDF_sorcerer

14 days ago

@atmoio @mrsChoocher Seasonal models

0

7

mason thompson @PDF_sorcerer

3 months ago

And just to follow up - your launcher isn’t accepting a custom path for reinstall: c:\star_citizen. Doesn’t save on commit; resets back to its default path.

0

1

0

9

mason thompson @PDF_sorcerer

3 months ago

@CloudImperium curious if error 8004 means anything to you? Deleted and reinstalled, over 1TB free on the ssd. Rebooted and all that. Error hits so quick it feels like it’s not querying my PC. Any ideas? Should I update my gpu drivers?

1

0

6

PDF_sorcerer retweeted

Kevin Gallagher

@KevG163

5 months ago

Vin Scully and Hank Stram's CBS broadcast call of the #49ers' game-winning 89-yard touchdown drive, culminating in "The Catch," to defeat the #Cowboys in the 1981 NFC Championship at Candlestick. The ensuing kickoff and Dallas's final possession are included. January 10, 1982

83

1K

235

394

86K

mason thompson @PDF_sorcerer

5 months ago

@TimTeaFan Have my downvote.

0

1

0

21

PDF_sorcerer retweeted

Dominique Youx ☠️ @Sariel_Luna

7 months ago

@atrupar

4

478

76

25

16K

mason thompson @PDF_sorcerer

9 months ago

@LinkedIn fix your garbage app, and your garbage iOS safari ux for refugees who refuse to use your garbage app. Let your PMs do work.

PDF_sorcerer's tweet photo: https://t.co/zoWKCAwIFJ

0

24

mason thompson @PDF_sorcerer

9 months ago

@maxhbain howdy - I’m a jack of all trades product manager who likes getting into the weeds. Something appears to have changed with whisperx in the last 48-72 hours; I can’t for the life of me get transcription + diarization and labeling running; suspect pants update and lost.

0

22

mason thompson @PDF_sorcerer

9 months ago

@neatprompts @Hesamation People who want to learn should start from scratch. False sense of security starting with someone else’s repo; too much gamification and min/maxing obfuscate the science and limitations behind this tech.

0

2

0

770

PDF_sorcerer retweeted

Gavin Newsom

@GavinNewsom

9 months ago

This aged well.

4K

80K

17K

3K

2M

mason thompson @PDF_sorcerer

9 months ago

@Python_Dv The diagram is confusing. Top concern: what do you call the overlapping region?

0

39

PDF_sorcerer retweeted

λux

@novasarc01

10 months ago

this is the most comprehensive and in-depth blog to understand vLLM. must read if you are into inference and ML systems and also helpful for beginners who want to contribute to vLLM. thank you aleksa!!

novasarc01's tweet photo: https://t.co/jHXaJ620Kr

4

812

94

748

55K

PDF_sorcerer retweeted

Aleksa Gordić (水平问题)

@gordic_aleksa

10 months ago

New in-depth blog post - "Inside vLLM: Anatomy of a High-Throughput LLM Inference System". Probably the most in depth explanation of how LLM inference engines and vLLM in particular work!

Took me a while to get this level of understanding of the codebase and then to write up this one - i quickly realized i underestimated the effort. 😅 It could have easily been a book/booklet (lol).

I covered:

* Basics of inference engine flow (input/output request processing, scheduling, paged attention, continuous batching)

* "Advanced" stuff: chunked prefill, prefix caching, guided decoding (grammar-constrained FSM), speculative decoding, disaggregated P/D

* Scaling up: going from smaller LMs that can be hosted on a single GPU all the way to trillion+ params (via TP/PP/SP) -> multi-GPU, multi-node setup

* Serving the model on the web: going from offline deployment to multiple API servers, load balancing, DP coordinator, multiple engines setup :)

* Measuring perf of inference systems (latency (ttft, itl, e2e, tpot), throughput) and GPU perf roofline model

Lots of examples, lots of visuals!

---

I realize i've been silent on social - many of you noticed and thanks for reaching out! :) --> I'm so back! lots of things happened.

Also, in general, I'm a bit sick of superficial content, it really is an equivalent of junk food (h/t @karpathy).

I want to do the best/deepest technical work of my life over the next years and write much more in depth (high quality organic food ;)) so I might not be as frequent around here as i used to be (? we'll see). I'll make it a goal to share a few paper summaries a week or stuff that's relevant / in the zeitgeist.

If you have any topics that happened over the past few weeks/months drop it down in the comments i might focus on some of those in my next posts.

---

Huge thank you to @Hyperstackcloud for giving me an H100 node to run some of the experiments and analysis that i needed to write this up. The team there led by Christopher Starkey is amazing!

Also a big thank you to Nick Hill (who did a very thorough review of the post - basically a code review lol; Nick's a core vLLM contributor and principal SWE at RedHat) and to my friends Kyle Krannen (NVIDIA Dynamo), @marksaroufim (PyTorch), and @ashVaswani (goat) for taking the time during weekend when they didn't have to!

63

3K

400

3K

324K

mason thompson @PDF_sorcerer

10 months ago

@LinkedIn I noticed I can’t select more than single paragraphs in articles on mobile iOS app. Not saying your product manager wrote a user story to remove that functionality, but regardless I’m using safari on mobile as a result. Lmk when you can restore functionality - thanks!

0

10

mason thompson @PDF_sorcerer

10 months ago

Hey @theallinpod do you have a list of Besties-only episodes (JCal, Chamath, Friedberg & Sacks), including those where a Bestie is absent?

0

1

0

18

mason thompson @PDF_sorcerer

10 months ago

@Jaquatech @Hesamation Legend, thank you.

0

29

mason thompson @PDF_sorcerer

10 months ago

@Sumanth_077 Nice! Do you have the roadmap for an intuitive and useful windows 11 start menu? What about folders with thousands of files that load quickly?

0

48

mason thompson @PDF_sorcerer

10 months ago

@CloudImperium food for thought: it’d be nice to disable MFD casts as a global general setting. They’re a nice feature; team should be proud - but I prefer MFD panels. Nice to have for crafting: upgrade cockpits to support additional MFDs as well.

0

3

PDF_sorcerer retweeted