Bas Büller @BasBuller - Twitter Profile

3 months ago

Hey all, I will be at GTC next week talking about all the work my team and I did on large-scale MoE training in JAX on GPUs! We decided early on to have a fully dropless training stack to avoid token dropping. (1/7)

2

103

11

55

16K

Bas Büller @BasBuller

8 months ago

@__tinygrad__ Tinygrad will use thunderkittens as a backend?

0

44

BasBuller retweeted

Dwarak

@DwaraknathG

8 months ago

I am hiring highly skilled performance engineers for my team! You will be working on optimising pretraining for models >100B params on O(1000s) of GPUs, and hardware-aligned architecture design. We are cooking a lot of very exciting projects and I can safely say you will have a lot of fun! Link in thread. <3

14

454

44

239

67K

Bas Büller @BasBuller

12 months ago

@BlackHC My thoughts

0

43

Dropped Out of Grad School (DOGS) // assume replies are "yes, and" // maybe we could try writing good software? // we are going to win

Bas Büller @BasBuller

12 months ago

@levelsio Get that 5090 if it’s available. More FLOPS, more RAM.

0

14

Bas Büller @BasBuller

12 months ago

@levelsio VCs really are their customers

0

1

0

13

BasBuller retweeted

martin_casado

@martin_casado

about 1 year ago

Knuth shows us the way. Again:

12

185

18

50

18K

BasBuller retweeted

Benjamin F Spector

@bfspector

about 1 year ago

(1/5) We’ve never enjoyed watching people chop Llamas into tiny pieces. So, we’re excited to be releasing our Low-Latency-Llama Megakernel! We run the whole forward pass in a single kernel. Megakernels are faster & more humane. Here’s how to treat your Llamas ethically: (Joint with @jordanjuravsky, @stuart_sul, @OwenDugan, @dylan__lim, @realDanFu, @simran_s_arora, and @HazyResearch)

33

877

142

523

384K

Bas Büller @BasBuller

about 1 year ago

@dhh Same here

0

18

Bas Büller @BasBuller

about 1 year ago

@JVal48103 Yes

0

4

Bas Büller @BasBuller

about 1 year ago

@yacineMTB Human powered ones ;)

0

1

0

8

Bas Büller @BasBuller

about 1 year ago

@_rchaves_ Just Python, pretty sure it will be more convenient

0

43

Bas Büller @BasBuller

about 1 year ago

@wordgrammer Hmm, I think it’s mostly that Python is easy to use as the frontend of your DSL. Python calling out to native code is really simple because of CPython and a simple and relatively unassuming runtime. For these reasons I’d love to see more LISP as it does better on those aspects.

0

21

BasBuller retweeted

Lucas Beyer (bl16)

@giffmana

about 1 year ago

@andersonbcdefg For some absurd reason 95% of programmers only go for frameworks and are scared of anything below. They somehow don't see that frameworks usually enshittify the whole thing!

7

204

3

11

9K

Bas Büller @BasBuller

about 1 year ago

@andersonbcdefg Call it shiny toy syndrome. This is also a general pattern I find with HF code. It looks simple at first glance, but try to do anything serious with it and the spaghetti code hits you.

0

100

Bas Büller @BasBuller

about 1 year ago

@tsoding Tsoding lives both in the future and the past

0

27

Bas Büller @BasBuller

about 1 year ago

@_xjdr Hold my beer, seems like I’ve got new plans for tonight.

0

206

Bas Büller @BasBuller

about 1 year ago

@ludwigABAP Synergies between model and CLI will probably become the differentiating factor. Claude 3.7 supposedly is already better in the Claude CLI than in Cursor, and that will only get worse.

0

1

0

62

Bas Büller @BasBuller

about 1 year ago

@abacaj Although I agree Gemini is in a league of its own right now, the ease with which people switched to Gemini can also happen in reverse. But Google’s massive infra lead is what I’d be afraid of if I were one of its competitors, now that they understand how to build SOTA models.

0

34

BasBuller retweeted

martin_casado

@martin_casado

about 1 year ago

I think AI is going to usher in a golden age of infra, not obviate it. It's just so clear that good CS fundamentals result in better AI-built systems. Vibe coding works better with type safety, languages where syntax maps closely to semantics, referential transparency, tight scoping etc. These approaches have never been widely adopted in CS because they are hard for humans, and in particular novices. But they're not hard for AI. And they map so much better from natural language descriptions.

55

884

43

315

82K
