Roberto Barroso Luque

2 months ago

FireworksAI_HQ's tweet photo. At 30 trillion tokens a day, we're the largest inference provider outside the frontier labs' own APIs.

We're hiring.

Come build the infrastructure behind the top AI workloads: https://t.co/bUweMxMq9L

📸 @Wing VC's 2026 Enterprise Tech 30 list (thanks @WingVC @ericnewcomer) https://t.co/7ui4nHFHlE

BobBarroso retweeted

Dmytro Dzhulgakov

@dzhulgakov

2 months ago

dzhulgakov's tweet photo. Kimi has been the most popular model on Fireworks, both out of the box and as a fine-tuning base (including Composer 2)

Now Kimi K2.6 is live with huge jumps (10+%) in coding, long-running agents and deep research

Try it out for inference or fine-tuning! https://t.co/50aAtSNGJe

BobBarroso retweeted

3 months ago

FireworksAI_HQ's tweet photo. Fireworks Training is now in preview.

You can now full-parameter fine-tune Kimi K2.5 (1T params, 256k context) with custom loss functions (GRPO, DRO, DAPO, or bring your own) on managed infra.

@genspark_ai built their proprietary model stack in four weeks. @vercel hit 93% error-free generation with RFT. @cursor_ai runs their RL rollout fleet on Fireworks.

Full-parameter from 8B to 1T. Multi-LoRA serving. Managed or bring your own training loop.
Your model is your product. Your data is your moat.
https://t.co/kyz7HzihC1

BobBarroso retweeted

3 months ago

FireworksAI_HQ's tweet photo. Training trillion-parameter MoEs is an infra problem disguised as a modeling problem.

So we built the infra solution. Cursor used it to train Composer 2. Now it's available for Kimi K2.5, Qwen3.5 397B, MiniMax M2.5, and more:

→Fused RL loss (~2x faster PPO)
→MXFP8 expert kernels on Blackwell
→Composable 4D parallelism
→1M+ token context training validated

Here's how it all works ↓
https://t.co/PA20I8EFaD

BobBarroso retweeted

3 months ago

Most teams don’t hit a fine-tuning bottleneck because they picked the wrong algorithm. They hit the bottleneck because the loop around training is too slow:
- data plumbing
- eval setup
- deployment friction
- too much guesswork between runs

over 3 years ago

@gonzalo__nunez retweet to amplify some quality science please

bioRxiv Neuroscience @biorxiv_neursci

about 4 years ago

People of the virtual sphere, check out the recently published "Shared and specialized coding across posterior cortical areas for dynamic navigation decisions", quality work from the Harvey Lab's @ShihYiTseng1 @selmaanchettih @CharlotteArlt! https://t.co/Aqt0xxew5B

BobBarroso retweeted

over 4 years ago

Cognitive experience alters cortical involvement in navigation decisions https://t.co/DA49zmcAps #biorxiv_neursci

Charlotte Arlt @CharlotteArlt

over 4 years ago

Could not be happier for @CharlotteArlt and the Harvey Lab @harvardmed. Some quality science I had the fortune of working on is out as a preprint now! Check it out!

over 4 years ago

Excited to share some of my postdoc work as a @biorxivpreprint from the Harvey lab @harvardmed. We show that mice rely on different cortical areas for navigation decisions, depending on what other complex tasks they have previously learned. Training history matters!

BobBarroso retweeted

Chi Hack Night @chihacknight

over 4 years ago

Dig into digital influence operations with us *tonight* @ 7pm CST. @BobBarroso & @nick__tallant of Omelas will discuss the map they've created of the online information environment to detect digital influence campaigns used to undermine democratic values. https://t.co/WshtIrm84F

almost 5 years ago

Interested in information operations and geopolitics? Would recommend subscribing to the Omelas Newsletter https://t.co/yRhb6rEbWP

over 5 years ago

Also big shout out to @_mcnutt_ who instructed us through this journey.

over 5 years ago

People of the virtual world, check out some quality work from the students of @UChicagoCAPP using data viz to understand a multitude of relevant policy issues. https://t.co/IZUCiwLBKx
