Peter Rudenko 🇺🇦 @peter_rud - Twitter Profile

peter_rud retweeted

adafruit industries

@adafruit

about 2 months ago

Free book: A Guide to CubeSat Mission and Bus Design by Frances Zhu https://t.co/aqym9BGdiO

0

11

2

8

1K

peter_rud retweeted

sadernoheart

@sadernoheart

3 months ago

I just spent the last couple of days trying to derive the orbital mechanics for the Artemis II mission. Wrote an article explaining everything you need to know (You'll realize that this is a HUGE leap for humanity): https://t.co/mkILstlDFG

6

89

16

53

7K

peter_rud retweeted

Roman Sheremeta 🇺🇸🇺🇦

@rshereme

4 months ago

This video breaks my heart. It was taken at the beginning of the russian invasion in 2022. A Ukrainian boy was separated from his family at the border. Russia is a terrorist state.

343

9K

2K

233

132K

peter_rud retweeted

Daractenus @Daractenus

6 months ago

Kyiv endured 10h of this today, over 40 missiles and 500 drones. Nearly a third of the city without power and heating. All because tens of millions of Russians have nothing else to live for, nothing to aspire to, no plans or dreams but to kill and die for their fucking tsar.

422

12K

3K

363

420K

peter_rud retweeted

Olena Halushka

@OlenaHalushka

6 months ago

Not the first time I've seen people spoiled by the safety of democracy casually argue that russian occupation would change nothing in their lives. Under russian occupation, you don't vote; you disappear. Raped, persecuted, tortured, thrown into basement prisons, mobilized to fight russia's wars of conquest. Your children taken, your language banned. Yes. Exactly the same life, ffs. russian occupation is not peace.

44

3K

999

113

143K

peter_rud retweeted

Денис Казанський

@den_kazansky

6 months ago

Russia attacked a grocery store in Zaporizhia. Many civilians were wounded, including children. Whenever Putin faces difficulty on the front line, he begins killing innocent people in Ukraine with particular brutality. His army retreated in Kupyansk. Now he's taking revenge on civilians again.

5

787

397

14

20K

Peter Rudenko 🇺🇦 @peter_rud

6 months ago

@Fidias0 Yes, absolutely

0

7

peter_rud retweeted

Luigi Cruz

@luigifcruz

7 months ago

I’ve built a prototype installer for the Stelline development image that lets you try it locally with no extra infrastructure. It’s powered by Docker and Jupyter Notebook, and includes scripts to simulate an observatory I/Q streaming network. Once installed, you can reuse the built-in Stelline operators (Transport, Beamformer, Correlator, etc.) or develop your own Holoscan operators and plug them into the pipeline. The installer was originally built for the DGX Spark, but it should work on any machine with an NVIDIA GPU. A ConnectX card is recommended for networking operators, but not required if you don’t plan to run them. Preview: https://t.co/4gqJXt5feY

0

8

1

2

902

peter_rud retweeted

Aleksa Gordić (水平问题)

@gordic_aleksa

9 months ago

New in-depth blog post time: "Inside NVIDIA GPUs: Anatomy of high performance matmul kernels". If you want to deeply understand how one writes state of the art matmul kernels in CUDA read along.

(Remember matmul is the single most important operation that transformers execute both during training and inference. Most of NVIDIA compute is spent on it. Gaining 1% in efficiency translates to massive savings in the order of many nuclear reactors :P)

I, yet again, realized i underestimated the effort. 😅 Here is one more booklet (lol). 47 figures!

I covered:

* The fundamentals of the GPU architecture with an emphasis on the memory hierarchy, building mental models for GMEM, SMEM, and L1/L2, and then connecting them to the CUDA programming model. Along the way we also looked at the "speed of light," how it's bounded by power, with hardware reality leaking into our model.

* PTX/SASS, and how to steer the compiler into generating what we actually want (is that loop being unrolled, are we using vectorized loads like LDG.128, etc.). I've annotated one PTX/SASS example for a simple matmul kernel in excruciating detail. Even if you're new to compilers you should find this useful.

(i actually found various inefficiencies in both compilers - fun!)

* Many core concepts such as tile/wave quantization, occupancy, ILP (instruction-level parallelism), roofline model, etc. Also building intuition around fundamental equivalences: dot product as a sum of partial outer products, why square tiles are the right shape for high arithmetic intensity, etc.

* The warp tiling method - which is near SOTA assuming you can't use tensor cores, TMA, async mem instructions, and bf16. Just maximizing GPU's performance using nothing but CUDA cores, registers and shared memory.

* Finally, we step into Hopper (H100): TMA, swizzling, tensor cores and the wgmma instruction, async load/store pipelines, scheduling policies like Hilbert curves, clusters with TMA multicast, faster PTX barriers, and more.

As always lots of examples, lots of visuals. This is the first time i could see warp tiling kernel and be like "oh i get it completely". I just needed my mental image transformed into an actual image.

A few years ago I was really inspired by @Si_Boehm's excellent blog post on how matmul works, but I also found it had several errors, some unclear explanations, and it was quite outdated. Building on @pranjalssh amazing work (who did a great job building sota kernels for H100) and my own research, this is the final result.

---

Again a huge thank you to @Hyperstackcloud (GPU cloud) for giving me an H100 (PCIe) node to run some of the experiments and analysis that i needed to write this up.

Also a big thank you to my friends Aroun (who did a very thorough review of the post; Aroun's doing cool GPU/AI stuff at Magic and was previously GPU architect at Apple and Imagine, he's one of the best GPU people i know and we worked together on llm.c w/ @karpathy) and the amazing @marksaroufim! (PyTorch) for taking the time during weekend when they didn't have to. :)

50

3K

389

3K

282K

peter_rud retweeted

himanshu

@himanshustwts

9 months ago

quite an interesting read

6

1K

109

1K

73K

peter_rud retweeted

olexander scherba🇺🇦

@olex_scherba

10 months ago

11 years ago, 🇺🇦 Olena Kulish & Volodymyr Alyokhin got executed mafia-style in #Donbas. “Guilty” of supplying 🇺🇦 soldiers with food. He was in IT. She was a popular radio host & animal rights activist. Russians killed her 6 dogs too. Just for the fun of it. #RussiaUkraineWar

230

8K

3K

210

199K

peter_rud retweeted

Alina Sarnatska

@ASarnatska

about 1 year ago

Today, russia killed these children. They were 8, 12, and 17. A russian missile hit the Martyniuk family home in a small town in central Ukraine last night. Father Ihor and mother Olena are in the hospital. She’s in critical condition.

27

578

313

10

49K

peter_rud retweeted

Tymofiy Mylovanov

@Mylovanov

about 1 year ago

Ukrainian journalist Viktoriia Roshchyna was tortured to death in Russian captivity, The Guardian reports.

Her body was returned without eyes, brain, or larynx. Burn marks on her feet. Stab wounds. Broken rib. Signs of strangulation. 1/

1K

39K

11K

3K

3M

Peter Rudenko 🇺🇦 @peter_rud

about 1 year ago

Ukrainian astrophysicist Olena Kompaniiets explores light and darkness under the shadow of the war https://t.co/ffpRApBmvF

0

63

peter_rud retweeted

Bihus.Info

@bihusinfo

about 1 year ago

“russia really wants peace”

Sumy, Ukraine. A Christian holiday before Easter.

66

5K

2K

128

135K

peter_rud retweeted

Ihor Lachenkov

@igorlachenkov

about 1 year ago

14 people killed, including 6 children: Russia launched a ballistic strike on a residential area of Kryvyi Rih near a children's playground. Those who talk about peace are showing their true intentions to the world. Russia wants nothing but war.