Baseten @Baseten - Twitter Profile

Pinned Tweet

28 days ago

Intelligence should be defined by the people closest to the work. Intelligence should be owned by all of us. Let’s build a many model future!

Tuhin Srivastava

@tuhinone

28 days ago

https://t.co/YPONx4IZSz

23

536

93

580

249K

3

24

3

8

10K

Baseten

@baseten

1 minute ago

The longer the context, the more memory your LLM needs. We introduce research techniques to compress that memory 200x on the fly without changing the base model.

Charlie O'Neill

@oneill_c

6 minutes ago

1/ You can shrink a language model's KV cache by 200×, in a single forward pass, and it still answers correctly. At 256k context that's 36 GiB of cache down to ~360 MiB, with no change to the base model. Here's how we did it 👇
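
For readers who want to see where a figure like 36 GiB at 256k context can come from, here is a minimal sketch of the usual KV cache size arithmetic. The thread does not state the model shape, so the layer count, KV head count, head dimension, and 16-bit storage below are hypothetical values chosen only so the arithmetic lands on 36 GiB. Treating "256k" as 262,144 tokens is also an assumption.

```python
GIB = 1024 ** 3


def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Size of an uncompressed KV cache for a single sequence.

    Each layer stores one key and one value vector per token,
    each of length kv_heads * head_dim.
    """
    values_per_token = 2 * layers * kv_heads * head_dim  # 2 = keys + values
    return values_per_token * bytes_per_value * tokens


# Hypothetical model shape (an assumption, not taken from the thread).
full_cache = kv_cache_bytes(
    tokens=256 * 1024,  # assumed meaning of "256k context"
    layers=36,
    kv_heads=8,
    head_dim=128,
    bytes_per_value=2,  # 16-bit storage, also an assumption
)

print(f"uncompressed KV cache: {full_cache / GIB:.1f} GiB")
```

The cache grows linearly with the token count, which is why long contexts dominate memory once the model weights are fixed.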

https://t.co/He1ucvxGyf

1

6

0

1

62

0

1

0

3

Baseten

@baseten

about 22 hours ago

https://t.co/6um1x3Km4y

0

3

0

148

Baseten

@baseten

about 22 hours ago

Baseten is live on the Respan Gateway. Congratulations to the @RespanAI team on their Gateway launch as they bring observability, evals, and routing to agents. Try Baseten Model APIs now on Respan.

https://t.co/cfQgoSSnlI

2

12

5

2

685

baseten retweeted

Sarah Sachs

@sarahmsachs

2 days ago

Model selection isn't just a fancy term for "looking at benchmarks". If you're just auto-updating and going off twitter vibes, you're not really adding any value to your business or your customers. To do this well, it means you need to deeply understand your use cases, how much value your customers ascribe to a problem, how much margin you want to make on that product, and how much time you want to invest into growing that margin. Come hear me rant more on June 25 https://t.co/4GI8G8XFGW

1

23

1

3

2K

Baseten

@baseten

2 days ago

@thatsjonsense @sarahmsachs @GammaApp @NotionHQ Sign up here: https://t.co/DzE0q3Y4Lz

0

2

0

1K

Baseten

@baseten

2 days ago

Join Charlie for a conversation with @thatsjonsense and @sarahmsachs on how @GammaApp and @NotionHQ think about model selection on June 25th.

Charlie O'Neill

@oneill_c

2 days ago

Working in the Training team at Baseten, I often see companies agonize over which model to use. So many people worry about how to keep up with benchmarks and new releases. But with post-training and specialization, and as we see a rising tide in the intelligence of many open-source models, what really matters is your learning signal. Do you have the right user metrics to say whether a model is doing poorly or well at your task, and to use that to learn and hillclimb the task? If you want to learn more, I’m moderating a panel on June 25th in SF at 6 PM with Gamma co-founder Jon Noronha (@thatsjonsense) and Notion AI lead Sarah Sachs (@sarahmsachs) on model selection in a multi-model landscape.

5

64

4

28

14K

4

15

0

1

2K

Baseten

@baseten

2 days ago

@NotionHQ @oneill_c @sarahmsachs 👀

0

119

Baseten

@baseten

5 days ago

Check it out here: https://t.co/WpZDxyvNHO

0

6

0

706

Baseten

@baseten

5 days ago

GLM 5.1 now achieves 160+ TPS and <2-second TTFT on Baseten. Ideal for agentic workloads that need high throughput and low latency.

https://t.co/2rbF4DCJ5O

8

91

2

15

6K

Baseten

@baseten

6 days ago

Read our full write-up: https://t.co/KM7eD6Y1no

0

6

0

381

Baseten

@baseten

6 days ago

Are you tired of waiting 17 minutes for an AI agent to finish a code change? As an agent’s context grows, standard transformer attention can turn long runs into a bottleneck. @NVIDIAAI Nemotron 3 Ultra addresses this with a hybrid architecture that replaces several attention-heavy layers with Mamba layers. This makes long-context inference far more efficient.

In benchmarked settings, this means:
→ step 300 runs as fast as step 3
→ up to 5x higher throughput
→ up to 30% lower cost

Today, Nemotron 3 Ultra, Nemotron 3.5 ASR, and Nemotron 3.5 Content Safety are available on Baseten for production AI teams.

NVIDIA

@nvidia

6 days ago

Introducing NVIDIA Nemotron 3 Ultra. A frontier smart open model built for long-running agents that need to plan, reason, use tools and keep working across complex coding, research and enterprise workflows. Up to 5x faster inference and up to 30% lower cost for agentic tasks. Learn more: https://t.co/h9XLqqYPFf

120

2K

268

448

222K

2

19

0

3

1K

baseten retweeted

Tuhin Srivastava

@tuhinone

8 days ago

Today we're announcing MAI-Thinking-1 with Microsoft and it will be available on Baseten soon. Microsoft built something genuinely different here: a commercial-grade thinking model trained on clean data with no distillation from third-party models and designed to be fine-tuned by the enterprises using it. Microsoft AI guarantees 100% eyes-off on post-training data and Baseten will handle the fine-tuning and deployment at scale. The future isn't one model. It's many models, each owned by the businesses that shaped them and MAI-Thinking-1 is a big step in that direction. https://t.co/8w9k4jwrgq

10

328

24

81

38K

baseten retweeted

Dannie Herzberg

@DannieHerz

6 days ago

I’m thrilled to welcome Gabe Stern to Baseten to lead Legal. Gabe is the whole package: deeply experienced, sharp, highly trusted, and commercially minded. We first got to work together at Slack, where he was an exceptional partner and played a critical role through Slack's hyper-growth & IPO. I’m personally very happy to be reunited with Gabe, and even happier that Baseten gets to benefit from his judgment, partnership, and instincts. Welcome, Gabe!

1

26

1

4K

baseten retweeted

Tuhin Srivastava

@tuhinone

6 days ago

The next wave of AI companies will be built on fast, reliable infrastructure, and the trust to deploy it in production. Gabe has helped iconic technology companies scale through this exact phase. I'm excited to welcome him to Baseten as our General Counsel.

0

33

3

0

3K

Baseten

@baseten

6 days ago

https://t.co/5cVo3BlesH

0

2

0

243

Baseten

@baseten

6 days ago

We are excited to welcome Gabe Stern as General Counsel. Welcome, Gabe!

2

14

0

2

7K

Baseten

@baseten

6 days ago

Agents append to their own context. But attention is quadratic, so 2x context = 4x work per step. Nemotron 3 Ultra swaps most attention for Mamba, so state is fixed-size and compute cost is linear. That means 5x faster inference that's 30% cheaper.
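
The scaling claim above can be illustrated with a toy cost model. This is a sketch under simplifying assumptions, not a description of how Nemotron 3 Ultra is implemented: it counts token-pair interactions for full self-attention over the whole context and one update per token for a fixed-size recurrent state, and it ignores constants, KV caching, and the attention layers a hybrid model still keeps. The function names are hypothetical.

```python
def attention_work(context_tokens):
    """Toy cost of full self-attention: every token attends to every token."""
    return context_tokens ** 2


def fixed_state_work(context_tokens):
    """Toy cost of a fixed-size state model: one state update per token."""
    return context_tokens


for n in (1_000, 2_000, 4_000):
    print(
        f"context={n:>5}  "
        f"attention={attention_work(n):>12,}  "
        f"fixed_state={fixed_state_work(n):>6,}"
    )

# Doubling the context quadruples the attention term but only doubles the other.
print(attention_work(2_000) / attention_work(1_000))
print(fixed_state_work(2_000) / fixed_state_work(1_000))
```

Under this model, the gap between the two widens as an agent keeps appending to its own context, which is the effect the tweet describes.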

Rachel Rapp

@rachelrapp

6 days ago

https://t.co/UHjLV8fqBl

5

23

3

7

3K

3

32

4

12

2K

Baseten

@baseten

7 days ago
