i just beat @GoogleDeepMind's turboquant
introducing Shard. 10x KV cache compression on Llama-3.1-8B. no measurable quality loss
- 10x @ 8K context, 11.2x @ 32K
- NIAH recall 1.000 across 4K-32K
- LongBench Δ ≈ 0 vs FP16
turboquant tops out at 4-6x at the same quality. we doubled it.
read more: https://t.co/PAV5WdAzN6
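for a sense of scale, here is a rough sketch of what those ratios mean in memory. the layer, KV-head and head-dim numbers are Llama-3.1-8B's public config; reading "8K" and "32K" as 8192 and 32768 tokens, batch size 1, and the ratio applying to the whole cache are assumptions, and the helper names are made up for illustration. this is not Shard's code, just the arithmetic.

```python
# Llama-3.1-8B attention shape (public model config).
NUM_LAYERS = 32
NUM_KV_HEADS = 8       # grouped-query attention: 8 KV heads, not 32
HEAD_DIM = 128
FP16_BYTES = 2


def kv_cache_bytes(num_tokens: int, batch_size: int = 1) -> int:
    """FP16 KV cache size: one key and one value vector per layer and KV head."""
    per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM * FP16_BYTES
    return per_token * num_tokens * batch_size


def compressed_mib(num_tokens: int, ratio: float) -> float:
    """Cache size in MiB after dividing the FP16 baseline by `ratio`."""
    return kv_cache_bytes(num_tokens) / ratio / 2**20


# Assumption: "8K" = 8192 tokens, "32K" = 32768 tokens, batch size 1.
for tokens, ratio in [(8192, 10.0), (32768, 11.2)]:
    fp16_mib = kv_cache_bytes(tokens) / 2**20
    print(f"{tokens:>6} tokens: {fp16_mib:.0f} MiB FP16 -> "
          f"{compressed_mib(tokens, ratio):.1f} MiB at {ratio}x")
```

the FP16 baseline works out to 128 KiB per token, so 1 GiB at 8192 tokens and 4 GiB at 32768, before any compression.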
@kirrithan
CAPTCHAs are supposed to prove you’re human… so I trained a robot to solve them.
The setup:
→ SO-101 robotic arm (@huggingface)
→ Camera feedback loop
→ Teleoperated demonstrations
→ Fine-tuned π0 VLA policy (@physical_int)
What should I teach this robot to do next?
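For anyone curious how the pieces in the setup fit together, here is a minimal sketch of a camera feedback loop. Everything in it is hypothetical: `run_episode`, `read_frame`, `policy`, `send_action` and `is_done` are placeholder names, not the actual SO-101, LeRobot or π0 APIs, and the real policy works on images and language rather than plain lists.

```python
from typing import Callable, Sequence

Frame = Sequence[float]    # stand-in for a camera image (hypothetical)
Action = Sequence[float]   # stand-in for arm joint targets (hypothetical)


def run_episode(
    read_frame: Callable[[], Frame],
    policy: Callable[[Frame], Action],
    send_action: Callable[[Action], None],
    is_done: Callable[[Frame], bool],
    max_steps: int = 200,
) -> int:
    """Closed loop: look, decide, move, repeat.

    Returns the number of actions sent before the task looked done,
    or max_steps if it never did.
    """
    for step in range(max_steps):
        frame = read_frame()             # camera feedback
        if is_done(frame):               # e.g. the CAPTCHA shows as solved
            return step
        send_action(policy(frame))       # fine-tuned policy picks the next move
    return max_steps
```

The point of the loop is that the policy sees a fresh frame before every action, so it can correct for a missed click instead of replaying a fixed trajectory.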