brian stevens

about 1 year ago

Introducing llm-d on stage at Red Hat Summit was truly a privilege ...

about 1 year ago

LLM inference is too slow, too expensive, and too hard to scale. 🚨 Introducing llm-d, a Kubernetes-native distributed inference framework, to change that—using vLLM (@vllm_project), smart scheduling, and disaggregated compute. Here’s how it works—and how you can use it today:

addvin retweeted

17 days ago

🇬🇧 London, June 10. @vllm_project & @_llm_d_ Inference Meetup, hosted by Red Hat AI, @nvidia, and @SteliaAI. Talks on vLLM updates, speculative decoding, llm-d in production, AI safety, and more. Plus food, drinks, and the people building this stuff. https://t.co/QC4d1yKPbc

addvin retweeted

about 1 month ago

Llama 70B as a cloud endpoint costs far more than Llama 8B. For teams where a smaller model meets the quality bar, that gap is hard to ignore. And with INT4 quantization: 4x smaller, 2x faster, less than 1% accuracy loss. The right model isn't always the biggest one. https://t.co/23IHcSmDkk

addvin retweeted

Open Source Contributor, Technologist and CEO @RedHat

5 months ago

Calling Boston area startups building with AI. 🤙

We're kicking off 2026 with the first event in a new monthly, in person hackathon series hosted by @RedHat and @IBM in Boston’s Seaport District.

This one day hackathon is designed specifically for local startups that want to move faster from idea to working prototype. Instead of a fixed theme, you bring a real AI problem your team is actively facing. We help you build a proof of concept using open source, enterprise ready templates from https://t.co/byiy4bdAZa, including MCP Server, AI Agent, and UI templates.

What you will get:
⚡ Rapid prototyping without boilerplate
🧠 Hands on guidance from Red Hat AI architects
🤝 Connections with other Boston based AI startups and ecosystem partners

If you are a Boston startup looking to turn an AI challenge into something real, this is for you. Event details are shared after registration.

Register now: https://t.co/K9WJmqfBjt

addvin retweeted

SemiAnalysis

@SemiAnalysis_

7 months ago

The @RedHat_AI team contributes a lot to vLLM and does amazing work for the open-source community. Great to see vLLM performing so well compared to TRT-LLM on H200! vLLM comes pretty close to B200, with the @NVIDIAAI team working on closing the gap for GPTOSS within the next couple of updates.

addvin retweeted

7 months ago

InferenceMAX, vLLM TPU, compressed-tensors, MoE support via transformers, DeepSeek-OCR, and more. Here’s what’s new in the @vllm_project community over the past two weeks:

addvin retweeted

8 months ago

4 tracks. 12 sessions. 1 day of learning. Join us on Oct. 16 for Red Hat AI Day of Learning, a free virtual event for developers, engineers & practitioners.

Tracks:
⚡ Fast & efficient inference
🎯 Model customization
🤖 Agentic AI
🌐 Scaling AI over hybrid cloud

Sessions include:
· Intro to vLLM and how to get started
· Model optimization with LLM Compressor
· Lossless LLM inference acceleration w/ Speculators
· End-to-end model customization
· Synthetic data generation and data processing
· Continual learning of LLMs with Training Hub
· Build open source agentic AI solutions
· Intro to Model Context Protocol (MCP)
· Intro to Llama Stack
· Intro to distributed inference
· Distributed inference with llm-d
· Scaling AI Infrastructure

👉 Register free: https://t.co/47t6ts4A4c

addvin retweeted

9 months ago

Qwen3-Next dropped yesterday and you can run it with Red Hat AI today. ✅ Day-zero support in vLLM ✅ Day-one deployment with Red Hat AI Step-by-step guide: https://t.co/ZjLJyfmMJm The future of AI is open.

10 months ago

@charles_irl @modal_labs @mgoin_ @vllm_project @shariqmobin Well done!

addvin retweeted

about 1 year ago

Thanks to the @lmcache team for joining forces with Red Hat on llm-d! llm-d is a new open source project for scalable, efficient distributed LLM inference with @vllm_project. Learn more about our collaboration here: https://t.co/bGLTJoHGmi

Mark Collier 柯理怀 @sparkycollier

about 1 year ago

@RedHat_AI Adding a shoutout to the @IBMResearch team working jointly with AMD team on contributing Triton attention kernels in vLLM v1 that improved decode throughput by 3x on Llama and Granite models.

addvin retweeted

about 1 year ago

Really excited to see the emergence of llm-d @addvin ! Inference is the biggest workload in human history and the open source tools need to keep evolving to serve it

addvin retweeted

NVIDIA AI Developer

@NVIDIAAIDev

about 1 year ago

The llm-d project is a major step forward for the #opensource AI ecosystem, and we are proud to be one of the founding contributors, reflecting our commitment to collaboration as a catalyst for innovation in generative AI.

As generative and agentic AI continue to evolve, scalable, high-performance inference will be critical to unlocking their full potential.

That’s why we’re partnering with @RedHat and other contributors to grow the llm-d community and accelerate its capabilities—powered by our contributions, including innovations from NVIDIA Dynamo such as NIXL.

🔗 Explore and contribute on GitHub: https://t.co/U7OgK2PgMl
📰 Read the launch blog: https://t.co/u8Nyhxj2w2
🎙️ Hear from NVIDIA’s VP of Engineering & AI Frameworks, Ujval Kapasi → https://t.co/cfA8hlTWeT

about 1 year ago

And it was great to see the Red Hat and Google effort announced by my friend, the brilliant Amin Vahdat.

Woosuk Kwon

@woosuk_k

about 1 year ago

Huge congrats to all the @googlecloud and @RedHat_AI team members who drove this effort!

about 1 year ago

@RedHat_AI @vllm_project What an amazing release!

addvin retweeted

over 1 year ago

DeepSeek’s Open Source Week drops A LOT of exciting goodies! We’re hosting vLLM Office Hours tomorrow—learn what they are, how they integrate with vLLM, & ask questions!

Date: Thursday, Feb 27
Time: 2PM ET / 11AM PT

Register: https://t.co/zTjNvaFusp #DeepSeek #AI

addvin retweeted

Matt Hicks

@matthicksj

over 1 year ago

At @RedHat, we believe the future of AI is open. That's why I'm incredibly excited about our acquisition of @NeuralMagic. Together, we're furthering our commitment to our customers and the open source community to deliver on the future of AI—and that starts today.

over 1 year ago

Today it became official: Neural Magic is now a part of Red Hat.

Red Hat @RedHat

over 1 year ago

Today, Red Hat completed the acquisition of @NeuralMagic, a pioneer in software and algorithms that accelerate #GenAI inference workloads. Read how we are accelerating our vision for #AI’s future: https://t.co/PkGfC48tAt.

addvin retweeted