Chien-Yu Lin

@cylinbao

PhD student in UW CSE

Seattle

Joined May 2019

66 Following

75 Followers

4 Posts

Chien-Yu Lin @cylinbao

22 days ago

@YichuanM Wow, clever and neat!

260

Chien-Yu Lin @cylinbao

about 1 year ago

@yi_xin_dong @tqchenml Congrats! It’s super cool work!

159

cylinbao retweeted

Shanli Xing @shanli_xing

over 1 year ago

🚀Meet flashinfer.sampling—our sorting-free GPU kernels for lightning-fast #LLM sampling. Our implementation achieves over 50% reduction in sampling time. Blog post: https://t.co/R780Rth03x

180

31K

cylinbao retweeted

Zihao Ye @ye_combinator

over 1 year ago

We are excited to announce FlashInfer v0.2! Core contributions of this release include:
- Block/Vector Sparse (Paged) Attention on FlashAttention-3
- JIT compilation for customized attention variants
- Fused Multi-head Latent Attention (MLA) decoding kernel
- Lots of bug fixes and improvements involving CUDAGraph compatibility, RMSNorm/RoPE numerical issues, etc.

Blog post: https://t.co/tMBFmCfAc0