Sharing our latest work (StreamMA): exploring how to make multi-agent reasoning faster, more accurate, and cheaper. Hop on HuggingFace for an upvote, GitHub for a star 🤗:
HuggingFace: https://t.co/BgajgGFziE
Project: https://t.co/5qWvp3o1Ek
GitHub: https://t.co/sZrXjL97CT
Hard numbers: ① +7.3pp avg over 8 benchmarks across math / science / code (Claude Opus 4.6-high); ② 26.9× wall-clock speedup at A=64, S=64 (83% of the theoretical bound); ③ Stream×4 at half the price ($2.75 vs $5.46) beats Serial×16 (90.9% vs 89.4%).
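A quick back-of-the-envelope check on ② and ③, using only the rounded figures quoted above, so everything derived here is approximate. The variable names are illustrative and not taken from the StreamMA code.

```python
# Figures copied from the post (already rounded there).
speedup = 26.9            # wall-clock speedup at A=64, S=64
fraction_of_bound = 0.83  # "83% of the theoretical bound"
stream_cost, serial_cost = 2.75, 5.46  # USD, Stream x4 vs Serial x16
stream_acc, serial_acc = 90.9, 89.4    # accuracy in percent

# If 26.9x is 83% of the bound, the bound itself is roughly 32.4x.
implied_bound = speedup / fraction_of_bound

# Stream x4 costs about half of Serial x16 (ratio close to 0.50).
cost_ratio = stream_cost / serial_cost

# Accuracy gap in percentage points (about 1.5 pp in favor of Stream x4).
accuracy_gap_pp = stream_acc - serial_acc

print(f"implied bound: {implied_bound:.1f}x")
print(f"cost ratio:    {cost_ratio:.2f}")
print(f"accuracy gap:  {accuracy_gap_pp:.1f} pp")
```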
🤩🎬 HoloCine is here!
The first open-source multi-shot long video model, generating minute-long cinematic narratives as stunning as Sora 2.
Watch the demo ↓
Introducing GSM8K-V: Can vision-language models solve grade school math when problems are shown visually instead of as text? 🧮👁️
We converted all 1,319 GSM8K problems into comic-style multi-image sequences (5,343 images total).
The results are surprising! 🧵
What if LLMs already had the right answer—but erased it before finishing? 🤯
New work on diffusion LLMs (dLLMs) uncovers temporal oscillation: correct answers often appear mid-denoising, only to vanish in later steps.
Two fixes that harness temporal consistency:
- Temporal Self-Consistency Voting → training-free decoding that aggregates stable predictions across steps
- Temporal Consistency Reinforcement → post-training with Temporal Semantic Entropy (TSE) as a reward for semantic stability
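To make the two ideas concrete, here is a minimal sketch, not the authors' implementation. It assumes you already have the answer decoded at each denoising step as a string. The names `temporal_vote`, `temporal_semantic_entropy`, `weight_fn`, and `canonicalize` are hypothetical. Exact match after light normalization stands in for real semantic equivalence, and the optional step weighting is an assumption as well.

```python
import math
from collections import Counter, defaultdict
from typing import Callable, Optional, Sequence


def canonicalize(answer: str) -> str:
    """Stand-in for semantic equivalence: trim and lowercase (an assumption)."""
    return answer.strip().lower()


def temporal_vote(
    step_answers: Sequence[Optional[str]],
    weight_fn: Optional[Callable[[int, int], float]] = None,
) -> Optional[str]:
    """Pick the answer with the largest total weight across denoising steps.

    step_answers[t] is the answer decoded at step t, or None if no answer
    could be parsed. With weight_fn=None every step counts equally.
    """
    totals = defaultdict(float)
    n = len(step_answers)
    for t, answer in enumerate(step_answers):
        if answer is None:
            continue  # skip steps with no parseable answer
        weight = 1.0 if weight_fn is None else weight_fn(t, n)
        totals[canonicalize(answer)] += weight
    if not totals:
        return None
    return max(totals, key=totals.get)


def temporal_semantic_entropy(step_answers: Sequence[str]) -> float:
    """Entropy (in nats) of the answer distribution across steps.

    0.0 means every step agrees; larger values mean more oscillation.
    """
    clusters = Counter(canonicalize(a) for a in step_answers)
    total = sum(clusters.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log(total / c) for c in clusters.values())


# Toy trajectory: the right answer shows up mid-denoising, then gets overwritten.
trajectory = ["12", "18", "18", "18", "7"]
print(temporal_vote(trajectory))              # "18", although the last step says "7"
print(temporal_semantic_entropy(trajectory))  # > 0, the trajectory is unstable
```

In this toy case, plain final-step decoding would return "7", while voting recovers "18". The entropy is the kind of stability signal a reward could be built from. How it is actually turned into a reward is not specified in the post.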
Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models
"Our work here reveals a critical phenomenon, temporal oscillation, where correct answers often emerge in the middle process, but are overwritten in later denoising steps. To address this issue, we introduce two complementary methods that exploit temporal consistency: 1) Temporal Self-Consistency Voting, a training-free, test-time decoding strategy that aggregates predictions across denoising steps to select the most consistent output; and 2) a post-training method termed Temporal Consistency Reinforcement, which uses Temporal Semantic Entropy (TSE), a measure of semantic stability across intermediate predictions, as a reward signal to encourage stable generations."