Keshi Dai @daikeshi - Twitter Profile

daikeshi retweeted

about 2 years ago

I probably have some credibility as a person who has worked on @TensorFlow and @PyTorch both (also, caffe / @onnxai / distbelief / a few others that never saw the light), so here are my two cents: (1) Speed doesn't really matter today as long as it is not particularly bad. Under the hood, many are just calling optimized CUDA code anyway. (2) For ultimate speed, especially in the LLM field, many are writing their own runtime engines, and not vanilla framework. For example vLLM, by @zhuohan123 @woosuk_k et al, is a great open source solution. (Lepton builds its own too, and is probably the fastest not-a-framework llm engine right now) (3) TensorFlow and PyTorch are successful in their own terms. For me, TF taught me how to think like a system architect, and PyTorch taught me how to prioritize users' needs over optimization. (4) Simply listing a table showing performance is relatively useless, because every use case come with their own needs. (4.1) For example, if you are running ads and feed recommendation systems, then 1% performance matters, and you pretty much need to write even your own FusedGemmAndReluAndMyWeirdOpsAfterThatOp to make things fast. (4.2) For research though, the flexibility was of ultimate importance, and losing 20-30%, sometimes even 100% speed, is fine - you gain from reducing people hours. (4.3) I want to express thanks to @soumithchintala and the FAIR team, because we had long-lasting arguments back at Meta: I was serving all ads/feed needs so 4.1 was my only priority, and those arguments were a window into the 4.2 land. In retrospect, those was some of the most rewarding experiences in my career. (5) People hours are important for a simple reason: salaries are increasing every year, and NVidia is reducing the dollar cost per TFlops very quickly. (6) A side note... Keras PyTorch wrapper exhibits a universal 2x overhead over native PyTorch. This is... not good. A wrapper abstraction shouldn't bring such a big overhead. (7) Ah the old good war of framework days. I feel nostalgic to have been part of it, and I am excited to move beyond frameworks and to build a truly AI-native cloud at @LeptonAI .

13

453

65

301

128K

daikeshi retweeted

Spotify Engineering

@SpotifyEng

over 2 years ago

Build a robust Ray platform ✅. Make its developer experience frictionless ✅. Done, but how did we do it? #MachineLearning engineers, @daikeshi and @davidxia_, will be giving you the inside scoop at #RaySummit on 9/19. Register to watch it live here: https://t.co/IH08h6LwC0

SpotifyEng's tweet photo. Build a robust Ray platform ✅. Make its developer experience frictionless ✅. Done, but how did we do it? #MachineLearning engineers, @daikeshi and @davidxia_, will be giving you the inside scoop at #RaySummit on 9/19. Register to watch it live here: https://t.co/IH08h6LwC0

0

12

5

0

7K

Keshi Dai @daikeshi

over 2 years ago

@omnific9 Oh nice! I’ll be there too! Haven’t seen you for years!

1

0

60

daikeshi retweeted

Robert Nishihara

@robertnishihara

about 3 years ago

We've built a ton of #LLM applications recently. Reasoning about performance & feasibility is painful without reference points. Here are the reference points we use to anchor our intuition (inspired by @JeffDean's "Numbers every engineer should know"). https://t.co/586bDcAgJ8

robertnishihara's tweet photo. We've built a ton of #LLM applications recently. Reasoning about performance & feasibility is painful without reference points.

Here are the reference points we use to anchor our intuition (inspired by @JeffDean's "Numbers every engineer should know").

https://t.co/586bDcAgJ8 https://t.co/2aXWyXJCgw

12

377

86

264

81K

Who to follow

Richard Whitcomb

@rwhitcomb

Digitizing smell @osmo_labs 👃past: @nvidia, @spotify, @twitter, @bluefinlabs

Marc Tollin

@marctollin

Product @Spotify : https://t.co/zbUr7lOFOv. Twin Dad.

David Xia

@davidxia_

I write code and sometimes prose.

Keshi Dai @daikeshi

about 3 years ago

Llama, Alpaca, Vicuña, or Guanaco

0

1

0

119

Keshi Dai @daikeshi

about 3 years ago

If Brighton is a “no-frills" ski resort, Deer Valley is completely on the opposite with all the amazing services. The vibes are so different, but with the Ikon pass, you can access both!

0

197

daikeshi retweeted

NASA Webb Telescope

@NASAWebb

about 3 years ago

Stars: always making a dramatic exit! 🌟 Webb’s powerful infrared eye has captured never-before-seen detail of Cassiopeia A (Cas A). 11,000 light-years away, it is the remnant of a massive star that exploded about 340 years ago: https://t.co/LLQsFQJwVQ

NASAWebb's tweet photo. Stars: always making a dramatic exit! 🌟

Webb’s powerful infrared eye has captured never-before-seen detail of Cassiopeia A (Cas A). 11,000 light-years away, it is the remnant of a massive star that exploded about 340 years ago: https://t.co/LLQsFQJwVQ https://t.co/xqlGFzhYoy

261

18K

3K

332

3M

Keshi Dai @daikeshi

about 3 years ago

#坂本龍一 R.I.P https://t.co/RbfMZq6jhw

0

93

Keshi Dai @daikeshi

about 3 years ago

4DX movie experience was too much. Why did I get pushed in the back when two guys were fighting on the screen. I was just watching them.

0

119

Keshi Dai @daikeshi

about 3 years ago

This is very true. We start learning world models through hearing, vision, and other perceptions even at a very early stage of life before we know any languages.

Michael C. Frank @mcxfrank

about 3 years ago

Factor 2: multi-modal grounding. Human language input is (often) grounded in one or more perceptual modalities, especially for young children. This grounding connects language to rich information for world models that can be used for broader reasoning.

1

69

0

3

11K

0

166

daikeshi retweeted

Jay Hack

@mathemagic1an

over 3 years ago

My thoughts on Toolformer IMO the most important paper in the past few weeks. https://t.co/4IDciigbkc Teach an LLM to use tools, like a calculator or search engine, in a *self-supervised manner* Interesting hack to resolve many blind spots of current LLMs Here's how 👇

mathemagic1an's tweet photo. My thoughts on Toolformer

IMO the most important paper in the past few weeks.

https://t.co/4IDciigbkc

Teach an LLM to use tools, like a calculator or search engine, in a *self-supervised manner*

Interesting hack to resolve many blind spots of current LLMs

Here's how 👇 https://t.co/zXveKAfQDa

39

2K

349

1K

542K

daikeshi retweeted

Richard Liaw @richliaw

over 3 years ago

I'm so excited about this blog post from @DivitaVohra, @daikeshi, @davidxia_ at @SpotifyEng about their experiences leveraging Ray for Spotify's next generation ML platform -- it aligns so well with the vision of @raydistributed. https://t.co/bOgKViG9vM 1/n

1

43

9

6K

Keshi Dai @daikeshi

over 3 years ago

This is what we have been working on over the past year. I'm super proud of this!

Divita Vohra @DivitaVohra

over 3 years ago

I’m so proud of the work we’re doing at Spotify, and even more proud of the team I get to grow with through it all. Check out how we’re unleashing ML innovation at Spotify! @daikeshi @davidxia_ @pchandarr BIG shoutout to the @anyscalecompute team for being awesome partners.

1

30

8

1

7K

1

15

1

0

990

daikeshi retweeted

François Chollet

@fchollet

over 3 years ago

Trust is your most important asset. It's easy to destroy, hard to rebuild. In periods of fast change and extreme uncertainty, focus on preserving trust.

12

462

64

53

0

daikeshi retweeted

Vala Afshar

@ValaAfshar

over 4 years ago

An octopus changing colors in her sleep may be an indication she is dreaming

127

13K

2K

354

0

Keshi Dai @daikeshi

over 3 years ago

I had to explicitly ask VS Code to go down 5 levels to fix my imports "python.analysis.packageIndexDepths": [["", 5],]

0

1

0

daikeshi retweeted

Josh Baer @jbx____

almost 4 years ago

We're a few years into our ML Platform journey at Spotify, building infrastructure powering Machine Learning that makes Spotify "magic" for many of our users. And I'm thrilled to have a Product Manager opening on one of our cornerstone teams focusing o…https://t.co/H5LinM7h8x

0

3

2

0

Keshi Dai @daikeshi

almost 4 years ago

😱

Karen Hao

@_KarenHao

almost 4 years ago

The other four confirmed basic information like their names before hanging up. One man, upon hearing why we had his information, sighed in resignation: “We are all running naked,” he said, using popular Chinese slang for a lack of privacy.

3

247

29

7

0

1

0