Sid Sharma @phylera14 - Twitter Profile

Pinned Tweet

about 2 months ago

huge respect for justin bieber switching from autoregressive to diffusion-based text generation while headlining coachella. most artists just soundcheck. justin swapped out his LLM provider backstage... you can see him here (real photo!) spinning up mercury 2 from @_inception_ai on a dedicated instance, watching tokens materialize in parallel instead of one at a time like some kind of animal, refreshing his p99 latency graphs and whispering "discrete diffusion" to himself before performing for 100,000+ people. most performers have a vocal warmup routine. justin's is curl -X POST https://t.co/2c2yfycASz. justin ships nothing at 2500ms. neither should you.

0

11

1

792

Sid Sharma

@phylera14

about 5 hours ago

@cline Give Mercury 2 from @_inception_ai a try

0

28

Sid Sharma

@phylera14

about 9 hours ago

@amiruci 🤝

0

1

0

18

Sid Sharma

@phylera14

about 9 hours ago

@tuhinone 🤝

0

1

0

19

Sid Sharma

@phylera14

about 9 hours ago

@DannieHerz 🤝

0

1

0

18

Sid Sharma

@phylera14

about 13 hours ago

Mercury 2 is now available on @baseten! If you're building a multi-agent system, coding tool, voice application, or anything where you're currently routing all traffic to a single expensive LLM, Mercury is worth testing. Same intelligence tier as Haiku and Flash, 5-7x faster, sub-500ms to first token. We’re running a limited batch of free POCs. If you want to see the numbers on your own traffic, start here: https://t.co/2XIciEkh6K

Baseten

@baseten

about 14 hours ago

https://t.co/HGVpRPEhBo

1

37

5

9

13K

0

8

0

1

365

Sid Sharma

@phylera14

13 days ago

@hu_yifei 🕺

0

1

0

28

phylera14 retweeted

David

@DavidSHolz

15 days ago

Most researchers agree that autoregression is best when memory bandwidth is cheap and diffusion is best when FLOPS are cheap. They also admit the future of compute is all FLOPS because memory scaling is hard and scaling FLOPS is easy. So why not go all in on diffusion????

76

1K

75

622

235K

Sid Sharma

@phylera14

15 days ago

@lennysan @_inception_ai

0

36

phylera14 retweeted

Stefano Ermon

@StefanoErmon

15 days ago

@DavidSHolz That’s exactly the bet we’re making at @_inception_ai. We’re already matching speed-optimized models from frontier labs on quality, while being faster and more cost efficient. That gap will only widen as we continue to scale.

2

112

9

30

9K

Sid Sharma

@phylera14

16 days ago

🔗 https://t.co/FkdKXsAm6T

0

69

Sid Sharma

@phylera14

16 days ago

DM to skip the line

3

7

1

1K

phylera14 retweeted

Kelly Greer

@kellyjgreer

18 days ago

going to start threading the autoregressive model killer candidates emerging 1/ @_inception_ai diffusion LLM Mercury 2 rips 4x the tokens per second vs autoregressive LLMs (@StefanoErmon & @phylera14)

1

2

1

2

1K

Sid Sharma

@phylera14

24 days ago

@FeinbergVlad Getting the crowd warm before the big bang tomorrow I see 😉

0

1

0

545

Sid Sharma

@phylera14

24 days ago

@samlambert

0

26

Sid Sharma

@phylera14

29 days ago

Your company's two highest-paid inference engineers just landed in Maui for the company offsite

1

3

0

111

Sid Sharma

@phylera14

30 days ago

Inception is hiring a Head of Product. This is a hands-on role for a technical product lead who wants to help build the next generation of LLMs. You'd work directly with S-tier AI researchers at the frontier of model architecture, inference, and enterprise deployment. We're one of the only AI labs where the product is live in production with enterprises and AI-native companies today - and the valuation is at a stage where your equity has real upside (not financial advice). The bar is high. The role is not a walk in the park. But if you’ve been watching the frontier AI labs from the sidelines and waiting for the seat where you can help build foundational AI infrastructure before the category is obvious, this is it. DM me. Bay Area only. https://t.co/Uj6zvC8zJ2

2

14

2

5

2K

Sid Sharma

@phylera14

about 1 month ago

The best AI agents in production aren't one model. They're 5-10 specialized subagents running in parallel, each matched to the right task/cost/speed tradeoff. @augmentcode's architecture is one of the cleanest examples of this shift. We wrote up how they do it.

Inception

@_inception_ai

about 1 month ago

@augmentcode rebuilt their context compaction layer around Mercury 2. 82% latency cut. 90% cost cut. Comparable quality to Opus 4.7. Running in production today. "We took a counter-intuitive bet. We decoupled summarization entirely, offloading it to Mercury 2 as a dedicated subagent. Mercury 2 is the highly efficient engine powering our most critical workflows." -@RustagiAnkur & @jm1234567890, Members of Technical Staff at Augment Code The subagent layer needs the most efficient model. Full methodology and eval setup in the writeup. https://t.co/LPVTdaMjli

2

81

15

42

28K

1

2

1

2

182
