Shaoting Feng @shaoting_feng - Twitter Profile

9 months ago

Join us at SIGCOMM 2025 (https://t.co/KS9k6dPZ98) for our full-day tutorial on LMCache, an intelligent caching middleware that makes LLM inference faster & cheaper!

📅 Sept 8, 2025
8:45 AM – 6:00 PM (Portugal Time / WEST)
= 12:45 AM – 10:00 AM (PDT)

What you’ll learn:
🔹 KV-cache offloading & reuse for LLMs
🔹 Cutting GPU memory + compute costs
🔹 Real-world integrations with vLLM & beyond

✅ Register here: https://t.co/MxVRQcariy

#SIGCOMM2025 #LMCache #LLM #vLLM


shaoting_feng retweeted

Siddhant Ray @siddhantrayyy

11 months ago

With RAG and agents becoming ubiquitous in LLM systems, tuning quality and performance JOINTLY is essential to achieve the best LLM quality-of-experience.

Our paper at SOSP this year addresses this exact tradeoff! 🔥 https://t.co/OCR7IVVcl2


shaoting_feng retweeted

LMCache Lab

@lmcache

11 months ago

🚨 LMCache now turbocharges multimodal models in vLLM!

By caching image-token KV pairs, repeated images now get a ~100% cache hit rate, cutting latency from 18s to ~1s.

Works out of the box.

Check the blog: https://t.co/WUiCF7adRN
Try it 👉 https://t.co/JaKbQCXFd3

#vLLM #MLLM #AIinfra #LMCache


shaoting_feng retweeted

LMCache Lab

@lmcache

11 months ago

Perks of building LMCache with us 😋 WE DO NOT ONLY FUEL YOUR LLMs


shaoting_feng retweeted

LMCache Lab

@lmcache

about 1 year ago

Our open-source LLM cluster deployment solution is 10x faster than the SOTA OSS solution. Check out the vLLM Production Stack! 🤩🤩🤩

Since Jan 2025, vLLM Production Stack has been the reference open-source vLLM inference cluster solution, with advanced KV cache offloading and K8s-native support. Today, our benchmarks show that it is:

✅ 10x faster than the SOTA OSS solution (AIBrix) in multi-turn chat
✅ More stable after setup

Reproduce it yourself:
📝 Blog post and benchmark: https://t.co/S2SJNmKfnC
🔗 GitHub repo: https://t.co/6RSJK1AZJx
📺 30s demo: https://t.co/T2PHba6Yqu

#vLLM #LLM #GenAI #OpenSource #Inference #AI


shaoting_feng retweeted

LMCache Lab

@lmcache

over 1 year ago

🚀 The LMCache docs website is now live! 🎉

Whether you're new to LLMs or a pro, our docs cover your needs!

📚 Getting Started guides
🔍 Small examples
👨‍💻 Code documentation

Boost your LLM deployment today! Check out our blog post: https://t.co/rZok91jWKy

