🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference!
Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection
💡 With optimized design for modern hardware, NSA speeds up inference while reducing pre-training costs—without compromising performance. It matches or outperforms Full Attention models on general benchmarks, long-context tasks, and instruction-based reasoning.
📖 For more details, check out our paper here: https://t.co/HJiqzwnUV7
CONCORD SHUT DOWN!
On my last stream, I gave Concord 6 months before it was shut down.
It happened much, much faster than that.
AAA publishers are bleeding money and cutting costs aggressively. This is why you need to PUSH on Ubisoft this year.
2 Years to stop the spread. Play your back catalog.
Black Myth: Wukong has sold 10 million copies across all platforms.
(Data as of 21:00 Beijing time, August 23, 2024)
Thanks to all players worldwide for your support and love.
Have a great gaming weekend!
#BlackMythWukong