gabriel teston

Verified account

@GabrielTeston

Solving language @Google Search

Joined December 2023

82 Following

56 Followers

40 Posts

Pinned Tweet

7 months ago

Training LLMs across multiple datacenters is hard. 🛑 Synchronization demands often cause massive slowdowns as we scale up. If you're at @NeurIPSConf, come see how we tackle this! Our work, "Scaling Laws for DiLoCo," shows how DiLoCo relaxes synchronization without compromising model quality, allowing training to scale incredibly well. Come chat with me and @NovaFallen8: 🗓️ Thu, Dec 4 ⏰ 11 AM – 2 PM PST 📍 Exhibit Hall C,D,E, #811 #NeurIPS2025 #LLMs #DistributedTraining #ScalingLaws

Zachary Charles

@MatharyCharles

over 1 year ago

We just put out a key step for making distributed training work at larger and larger models: Scaling Laws for DiLoCo TL;DR: We can do LLM training across datacenters in a way that scales incredibly well to larger and larger models!

12

378

79

243

143K

0

10

3

6

6K

7 months ago

It is today! ⏰ 11 AM – 2 PM PST 📍 Exhibit Hall C,D,E, #811

7 months ago

Training LLMs across multiple datacenters is hard. 🛑 Synchronization demands often cause massive slowdowns as we scale up. If you're at @NeurIPSConf, come see how we tackle this! Our work, "Scaling Laws for DiLoCo," shows how DiLoCo relaxes synchronization without compromising model quality, allowing training to scale incredibly well. Come chat with me and @NovaFallen8: 🗓️ Thu, Dec 4 ⏰ 11 AM – 2 PM PST 📍 Exhibit Hall C,D,E, #811 #NeurIPS2025 #LLMs #DistributedTraining #ScalingLaws

0

10

3

6

6K

0

0

0

0

48

7 months ago

Heading to @NeurIPSConf in San Diego. I’ve got some DiLoCo stickers to give away! 👾 ❤️ Come check out our poster. 🗓️ Thu, Dec 4 ⏰ 11 AM – 2 PM PST 📍 Exhibit Hall C,D,E, #811 #NeurIPS2025

0

5

1

0

604

GabrielTeston retweeted

7 months ago

Attending @NeurIPSConf and interested in distributed, modular, and/or open AI? Hadn't seen someone put together a list of poster presentations in this area so took it upon myself to thread out who I'm excited to talk to next week🧵

5

48

5

31

5K

7 months ago

@yacinelearning 😭

0

1

0

0

17

7 months ago

@yacinelearning Are you going to NeurIPS boss?

1

1

0

0

46

8 months ago

@eliebakouch @Ar_Douillard should we start calling it “planets” instead of “islands” already?

0

2

0

0

54

GabrielTeston retweeted

Arthur Douillard

8 months ago

We have TPUs in space. I have a DiLoCo implem running on TPUs. Cosmic Distributed Learning when?

9

58

3

5

8K

8 months ago

@gabriel1 @nikitabier @gabriel @elonmusk Would it be like a UFC, can I challenge you after? (If you win)

0

0

0

0

57

8 months ago

@yacinelearning @DanAdvantage Living my best life 🙏

1

2

0

0

24

8 months ago

@DanAdvantage @yacinelearning One of the best. Btw I was here when I got the news about the promotion

1

3

0

0

51

8 months ago

@yacinelearning @DanAdvantage Thanks boss

1

2

0

0

22

8 months ago

@Ar_Douillard Super deserved!!!

0

1

0

0

90

8 months ago

I would say probably get a full time role as RS/RE, maybe spend some time abroad and connect to more smart people 🤓

@yacinelearning

8 months ago

what are your 5-year goals, folks

42

269

7

45

84K

1

5

0

0

721

8 months ago

@eliebakouch @yacinelearning Nous is looking for a dj for their after party…🤭

1

6

0

0

284

8 months ago

@Ar_Douillard Huge!

0

1

0

0

54

8 months ago

@riseofreh @VictorTaelin What the hell is this?

1

3

0

1

169

8 months ago

@eliebakouch 😮

0

1

0

0

78

8 months ago

Want to learn how to train models across the world, with 400x fewer bits exchanged and a huge latency tolerance? 🌎 I’ll be presenting our work on how to efficiently scale distributed training at @COLM_conf. 🗓️ TODAY: Tuesday, 11:00 - 13:00 📍 Room 710 #COLM2025

0

10

2

3

4K
