Divya Makkar

Verified account

@_DivyaMakkar

@LumaLabsAI | cs @uwaterloo, prev reasoning @cohere

Joined May 2024

272 Following

479 Followers

18 Posts

13 days ago

@baaadas @LumaLabsAI It was great working with you! Excited to see what's next!

0

1

0

0

396

25 days ago

@_pavidhiman Woah, this is so cool!

0

1

0

0

72

about 2 months ago

@_rajanagarwal @ProximalHQ very cool!

0

1

0

0

121

3 months ago

@_pavidhiman so cool!

0

1

0

0

100

3 months ago

@AahaanMaini @socraticainfo 🔥🔥🔥

0

2

0

0

187

4 months ago

@adiprasadd @modal @hackwithtrees @Waymo @TencentGlobal @kyanyang_ @kevinjosethomas @shreybirmiwal congrats!

1

1

0

0

108

4 months ago

@aarush @Meta Congrats Aarush!

0

1

0

0

272

5 months ago

@ElijahKurien Congrats Elijah!

0

1

0

0

151

5 months ago

@chinmayjindal_ @GroqInc @nvidia cfbr

0

3

0

0

4K

6 months ago

@emm4x3 @xai @angelafeliciawu @stevenzhang Amazing!! Congrats!

0

0

0

0

50

9 months ago

@adiprasadd 🔥🔥

0

1

0

0

71

9 months ago

@LeoMq06 @rox_ai @windsurf @cohere So fire Leo

0

1

0

0

77

9 months ago

I worked on this with @AdityaMakkar000 and @chinmayjindal_. Take a look at our blog to see how you can go from zero-to-one scaling a transformer from scratch. GitHub: https://t.co/pngCWGjAj0 Blog: https://t.co/N2hmijqwXp

0

48

1

47

3K

9 months ago

I spent the past few months building JAXformer: One of the first open source guides on how to scale modern transformers in JAX. Trained entirely on TPUs, it supports distributed ML, Ray tokenization, MoE, n-D parallelism and end-to-end inference. Here’s how to do it:

_DivyaMakkar's tweet photo: https://t.co/0suvNXm7HF

22

816

68

757

73K

9 months ago

We trained until we hit a 3.28 val loss (inspired by nanoGPT speedrun) over 26,100 steps with 6.5 billion tokens.

_DivyaMakkar's tweet photo: https://t.co/5eRr0ZY4Ee

1

27

0

5

4K

about 1 year ago

@_pavidhiman @PragmaClin @vanar_robots 🔥🔥🔥

0

1

0

0

143
