Vadim Markovtsev @vadimlearning - Twitter Profile

8 days ago

@JordanNanos 2/2
- ~100 perf metrics / rank / step to detect problems as early as possible on the application level, including NCCL ops bandwidth control.
- Distributed fault sim with ~30 different problems to test recovering from.

0

2

0

20

Vadim Markovtsev @vadimlearning

8 days ago

@JordanNanos Hey Jordan, we intentionally skipped many technical details from the report. Regarding fault tolerance: 1/2
- We do in-pod recovery. There is no in-process recovery yet, so we lose a few min to reinit.
- Checkpoint locally when we break (you call it just-in-time).

0

3

0

29

Vadim Markovtsev @vadimlearning

almost 3 years ago

@eisokant @BenTheEgg @poolsideai Will be happy to talk about our modifications to adapt it to training LLMs with bf16!

1

0

168

vadimlearning retweeted

Eiso Kant

@eisokant

almost 3 years ago

@BenTheEgg thank you from @poolsideai for your reversed mode code from Reformer! @vadimlearning worked on it and it led to 25% memory reduction with only a bit more data needed to converge. Next time you're in Paris, dinner is on us!

1

2

1

543

Vadim Markovtsev @vadimlearning

about 3 years ago

@viglovikov We are building one.

0

43

Vadim Markovtsev @vadimlearning

over 3 years ago

GitHub API sends a "gollum" notification when somebody edits a wiki page. Does anyone know why there is a LOTR reference here?

2

1

0

706

Vadim Markovtsev @vadimlearning

over 3 years ago

I am pleased to announce that I will speak in the Python devroom at @fosdem about our experience at @athenian doing low- and high-level Python performance optimizations. I will cover some CPython internals, Cython with C++20, and arena allocations with mimalloc.

0

4

1

0

492

Vadim Markovtsev @vadimlearning

over 3 years ago

@TUXEDOComputers how about fixing BIOS of InfinityBook Pro Gen7 to repair waking from deep sleep? https://t.co/DGHXi2HmP7

1

0

Vadim Markovtsev @vadimlearning

over 3 years ago

I understood CPython arenas thanks to this post: https://t.co/Q1DiPGex3y

0

1

0

Vadim Markovtsev @vadimlearning

almost 4 years ago

+1 https://t.co/FCmHKemoNE

0

1

0

Vadim Markovtsev @vadimlearning

almost 4 years ago

Cool projects from the latest LinkedIn connections: https://t.co/RQSGrrMIb6 and https://t.co/3PILwlsqwn

0

1

0

1

0

Vadim Markovtsev @vadimlearning

about 4 years ago

Wrote a blog post about supercharged asyncpg: https://t.co/uaqPZnf6Su

0

11

2

0

vadimlearning retweeted

Eiso Kant

@eisokant

about 4 years ago

@jasoncwarner @ncantunes @devleadership_ Here's the full write up, with examples and links to the episode 👇 https://t.co/FmSBr1N3n2

0

5

1

0

Vadim Markovtsev @vadimlearning

about 4 years ago

Everybody who thinks that Python should ditch reference counting, read Chris Lattner's praise https://t.co/AzGOU203u4

1

2

0

vadimlearning retweeted

Justin Duke @jmduke

about 4 years ago

congratulations to this google docs PM who is singlehandedly responsible for engendering more developer goodwill than any other individual at Alphabet in the past five years

jmduke's tweet photo: https://t.co/1mOoMMNy1b

33

4K

393

218

0

Vadim Markovtsev @vadimlearning

about 4 years ago

Big thank you to @xkcd for granting permission to use the remixed image in my paywalled blog post https://t.co/OP2Zl5zOAZ

vadimlearning's tweet photo: https://t.co/5oCl3HJyhQ

0

1

0

vadimlearning retweeted

Lou Marvin Caraig @LMCaraig

about 4 years ago

It’s been a while since my last blog post, in the meantime I changed both UI and domain, and added support for comments! And this is my first post in the new blog! I hope you’ll like it! https://t.co/PzuFw4KdsA #devops #monitoring #kubernetes #prometheus

0

3

4

1

0

Vadim Markovtsev @vadimlearning

about 4 years ago

My new blog post is out! I focus on CI optimizations this time.

Athenian @athenian

about 4 years ago

"My Continuous Integration Takes Too Much Time. How Do I Fix It?" - @vadimlearning explains how to solve this very common issue... so you don't have to twiddle your thumbs (or read r/programming) while you wait for CI checks to finish! 👇 https://t.co/9ZsCZmVCji