Red Hat AI @RedHat_AI - Twitter Profile

Pinned Tweet

16 days ago

🇬🇧 London, June 10. @vllm_project & @_llm_d_ Inference Meetup, hosted by Red Hat AI, @nvidia, and @SteliaAI. Talks on vLLM updates, speculative decoding, llm-d in production, AI safety, and more. Plus food, drinks, and the people building this stuff. https://t.co/QC4d1yKPbc

1

14

5

2

6K

RedHat_AI retweeted

vLLM

@vllm_project

about 6 hours ago

Congrats to the @googlegemma team on the Gemma 4 12B launch 🎉 Day-0 support on vLLM is ready to go. It's an encoder-free unified multimodal model — text, image, audio, and video all project straight into the LLM's embedding space, no separate vision or audio towers. 256K context, built-in thinking, native tool calling. Reasoning + tool parsers (`gemma4`), vision, and audio all served through the OpenAI-compatible API. 🔗 Recipe: https://t.co/MGJcoQkwzz

vllm_project's tweet photo. Congrats to the @googlegemma team on the Gemma 4 12B launch 🎉 Day-0 support on vLLM is ready to go.

It's an encoder-free unified multimodal model — text, image, audio, and video all project straight into the LLM's embedding space, no separate vision or audio towers. 256K context, built-in thinking, native tool calling.

Reasoning + tool parsers (`gemma4`), vision, and audio all served through the OpenAI-compatible API.

🔗 Recipe: https://t.co/MGJcoQkwzz

6

190

15

46

8K

Red Hat AI

@RedHat_AI

about 8 hours ago

@cedricclyburn @vllm_project @DeepLearningAI Great things are not built overnight. Worth the wait. 🙌

0

3

0

18

Red Hat AI

@RedHat_AI

about 8 hours ago

This one has been in the works for a while. @cedricclyburn teaching LLM inference, compression, and benchmarking with @vllm_project -- free course with @DeepLearningAI. Proud of this one.

DeepLearning.AI

@DeepLearningAI

about 8 hours ago

New short course: Fast & Efficient LLM Inference with vLLM, built in partnership with @RedHat and taught by @cedricclyburn. Learn to quantize an open-source LLM, serve it with vLLM, and benchmark your deployment across speed, cost, and accuracy. Free to enroll: https://t.co/czVwJBnLZ6

6

140

27

100

9K

3

29

7

6

2K

Who to follow

clem 🤗

@ClementDelangue

Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform for AI builders

Dhruv Batra

@DhruvBatra_

Co-founder & Chief Scientist @yutori_ai. Prev: Senior Director leading FAIR Embodied AI @MetaAI and Professor @GeorgiaTech.

Databricks AI Research

@DbrxMosaicAI

We remove the barriers to state-of-the-art generative AI model development and make data + AI available to all.

RedHat_AI retweeted

Cedric Clyburn

@cedricclyburn

about 8 hours ago

@RedHat_AI @vllm_project @DeepLearningAI this course may be on @vllm_project but all the content around LLM fundamentals and compression is applicable to ANY model serving engine. ENJOY :D

0

4

2

1

203

Red Hat AI

@RedHat_AI

about 9 hours ago

Just in: @NVIDIA is giving away a DGX Spark to a lucky meetup participant in London next week. See you there!

Red Hat AI

@RedHat_AI

2 days ago

🇬🇧 London, June 10. @vllm_project & @_llm_d_ Inference Meetup hosted by Red Hat AI, @nvidia, and @SteliaAI at Sustainable Ventures, County Hall. On the agenda: vLLM project update, speculative decoding, llm-d in production, and AI safety evaluation. https://t.co/QC4d1yKPbc

2

14

3

1

2K

0

11

3

0

976

RedHat_AI retweeted

Poolside

@poolsideai

1 day ago

Love seeing the work @RedHat_AI and @vllm_project are doing to make Laguna XS.2 easier to run. Red Hat AI trained a DFlash speculator: a 0.6B drafter that predicts 8 tokens per pass, with Laguna verifying the output. So builders get faster generation without changing output quality. With vLLM support and FP8/NVFP4/INT4 checkpoints through LLM Compressor, it’s also easier to tune for different latency, memory, and hardware constraints. Grateful for the team building the infra that makes open models easier to use, serve, and improve!

2

25

4

3

2K

Red Hat AI

@RedHat_AI

1 day ago

🇹🇷 Istanbul, 17 June. @vllm_project & @_llm_d_ meetup hosted by Red Hat AI, @nvidia, and BeyondGuard at İTÜ Taşkışla. On the agenda: vLLM project update, distributed inference, speculative decoding, securing vLLM in production, live demos, and more. https://t.co/1ZidCjwPdS

0

27

2

6

2K

Red Hat AI

@RedHat_AI

2 days ago

🇬🇧 London, June 10. @vllm_project & @_llm_d_ Inference Meetup hosted by Red Hat AI, @nvidia, and @SteliaAI at Sustainable Ventures, County Hall. On the agenda: vLLM project update, speculative decoding, llm-d in production, and AI safety evaluation. https://t.co/QC4d1yKPbc

2

14

3

1

2K

Red Hat AI

@RedHat_AI

2 days ago

Here's a 6.5 minute vLLM v0.21.0 update by @mgoin_. ⚠️ transformers v4 deprecated. Migrate to v5. 🧠 Spec decode now respects thinking budget for reasoning models ⚡ NVFP4 KV cache + ModelOpt W4A16 support 🔴 DeepSeek V4 on AMD/ROCm + pipeline parallelism 📦 Docker image 2.5 GB smaller 🔄 Disaggregated serving: bi-directional P/D KV transfers ...and much more. v0.22 is already out: https://t.co/xcPapV8dPX. Breaking it down next week during our regular vLLM office hours: https://t.co/KkF6DfGs1q @vllm_project

3

61

11

33

6K

RedHat_AI retweeted

Red Hat @RedHat

3 days ago

Red Hat and @NVIDIA are integrating NVIDIA OpenShell into the full-stack @RedHat_AI platform. The work brings oversight and policy to the infrastructure level, while contributing to the open source OpenShell project to standardize how agents are governed on enterprise platforms. Learn more: https://t.co/NiGNNWDIWF

RedHat's tweet photo. Red Hat and @NVIDIA are integrating NVIDIA OpenShell into the full-stack @RedHat_AI platform.

The work brings oversight and policy to the infrastructure level, while contributing to the open source OpenShell project to standardize how agents are governed on enterprise platforms.

Learn more: https://t.co/NiGNNWDIWF

4

132

23

19

8K

RedHat_AI retweeted

Michael Goin @mgoin_

4 days ago

We’ve open sourced all aspects of our stack for training SOTA speculators like DFlash, and keep publishing our own checkpoints validating it for everyone to benefit and learn from. Check out the latest ones for Laguna!

0

21

5

10

3K

RedHat_AI retweeted

vLLM

@vllm_project

4 days ago

vLLM v0.22.0 is out! 459 commits from 230 contributors (63 new). 🎉 Highlights: DeepSeek V4 hardening (NVFP4 fused MoE, full+piecewise CUDA graph, ROCm support), experimental Rust frontend in-Tree, batch-invariant Cutlass FP8 (28.9% lower e2e latency), Model Runner V2 advances, multi-tier KV cache offloading. Thread 👇

vllm_project's tweet photo. vLLM v0.22.0 is out! 459 commits from 230 contributors (63 new). 🎉

Highlights: DeepSeek V4 hardening (NVFP4 fused MoE, full+piecewise CUDA graph, ROCm support), experimental Rust frontend in-Tree, batch-invariant Cutlass FP8 (28.9% lower e2e latency), Model Runner V2 advances, multi-tier KV cache offloading.

Thread 👇

14

508

53

120

54K

Red Hat AI

@RedHat_AI

4 days ago

Blog: https://t.co/dH6JkpbC3p DFlash speculator: https://t.co/4zHIM4m2c3 FP8: https://t.co/dCEe7OIltc NVFP4: https://t.co/Ca49ziXvl3 INT4: https://t.co/mWHRGsKn8E Compress your own LLM: https://t.co/FMZqtkQGkl Train your own speculator: https://t.co/HklxENIiLu

0

15

1

8

750

Red Hat AI

@RedHat_AI

4 days ago

Laguna XS.2 from @poolsideai is a 33B MoE built for agentic coding. Red Hat AI trained a DFlash speculator for it: 0.6B drafter, 8 tokens per pass, no quality loss. FP8, NVFP4, and INT4 checkpoints via LLM Compressor. Models in comments. Speedup with @vllm_project:

6

56

13

23

20K

Red Hat AI

@RedHat_AI

5 days ago

Speculators v0.5.0 just dropped with 3 big updates: - DFlash training support. Draft all tokens in one pass via block diffusion - Unified online/offline training powered by @vllm_project's hidden states extraction system - Docs & tutorials overhaul for faster onboarding https://t.co/tCMZIrgQf2

0

48

7

12

4K

RedHat_AI retweeted

Red Hat AI

@RedHat_AI

6 days ago

[vLLM Office Hours #50] GenAI with vLLM on Intel CPUs - May 28, 2026 https://t.co/QaNIromANe

0

22

4

3

882

Red Hat AI

@RedHat_AI

6 days ago

[vLLM Office Hours #50] GenAI with vLLM on Intel CPUs - May 28, 2026 https://t.co/QaNIromANe

0

22

4

3

882

Red Hat AI

@RedHat_AI

6 days ago

Alternately, join live here on X or via YouTube Live: https://t.co/wHqvLiCWK9

0

3

1

2

2K

Red Hat AI

@RedHat_AI

6 days ago

Can you run Gen AI workloads without GPUs with @vllm_project? Join vLLM Office Hours today at 2PM ET to learn what’s new in vLLM v0.21.0 from @mgoin_, followed by a deep dive from Intel on running AI use cases on Intel Xeon CPUs. Get a 🗓️ invite: https://t.co/X8hAHYR3rl

RedHat_AI's tweet photo. Can you run Gen AI workloads without GPUs with @vllm_project?

Join vLLM Office Hours today at 2PM ET to learn what’s new in vLLM v0.21.0 from @mgoin_, followed by a deep dive from Intel on running AI use cases on Intel Xeon CPUs.

Get a 🗓️ invite: https://t.co/X8hAHYR3rl https://t.co/OfNSejdrfY

2

31

3

8

2K

RedHat_AI retweeted

Red Hat @RedHat

7 days ago

Red Hat supports any model, on any accelerator, in any cloud. Now, we’re adding "any agent" to that list with @RedHat_AI 3.4. Control your AI journey with a foundation built for scale and security: https://t.co/SFALA9WpV7 #RHSummit

RedHat's tweet photo. Red Hat supports any model, on any accelerator, in any cloud. Now, we’re adding "any agent" to that list with @RedHat_AI 3.4. Control your AI journey with a foundation built for scale and security: https://t.co/SFALA9WpV7 #RHSummit https://t.co/iDDOdYK1Lv

1

24

6

2

2K

Red Hat AI

@RedHat_AI

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users