1/ Thrilled to introduce T³: a corpus for RAG over reasoning tasks, built from thinking traces.
We show that, surprisingly, RAG can improve reasoning, given the right corpus.
RAG with Transformed Thinking Traces (T³) achieves gains of up to 43.9% on AIME 2025-2026.
🔗 https://t.co/9GPxKnszte 🧵
Excited to announce that our paper “From Noise to Order: Learning to Rank via Denoising Diffusion” has been accepted to #ICTIR2026! 🎉📚
📄 Paper: https://t.co/7yIdF2VBJ1
💻 Code: https://t.co/JR5xpDLtQc
Grateful that my PhD thesis was recognized as one of the top dissertations in the 2026 Faculty of Mathematics Doctoral Prize at @UWaterloo! 🎉
And it is always especially nice to hear kind words from your PhD supervisor @claclarke . I guess that feeling never really goes away, even after you graduate. 😊
https://t.co/P6huzVj0Y9
Happy to share that our @icmlconf paper "Measuring Agents in Production" received an Oral Presentation spot! 🌟
https://t.co/Gwiy1LFP1S
See you all in Seoul! 🇰🇷
Excited to share: MAP has been accepted as an 🌟 ICML Spotlight 🌟
We hope MAP can provide data-driven insights that help the community work on the many under-explored research directions around agent systems!
Huge thanks & congrats to my amazing co-authors. See you all in Seoul! 🫡
Excited to share that MAP has been selected for ✨ICML Oral✨
We look forward to sharing the insights in the paper with the community.
And many, many thanks to everyone who participated in our study ❤️ MAP wouldn't have been possible without your contributions to open science.
There are two layers of contamination control: (1) the trace corpus was built before AIME 2025 and 2026 were released, and (2) we additionally ran full decontamination against all eval benchmarks using 13-gram Jaccard similarity.
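For readers unfamiliar with n-gram decontamination, here is a minimal sketch of what a 13-gram Jaccard filter could look like. It is not the authors' pipeline: the whitespace tokenization, the lowercasing, the `threshold` value, and the function names are all assumptions made for illustration, since the post does not specify them.

```python
def ngrams(text, n=13):
    """Return the set of word-level n-grams in `text`.

    Assumption: lowercase + whitespace tokenization (the post does not
    say how text was tokenized).
    """
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def decontaminate(traces, benchmark_items, n=13, threshold=0.5):
    """Keep only traces whose n-gram Jaccard similarity to every
    benchmark item stays below `threshold`.

    Assumption: `threshold=0.5` is a placeholder, not a value from the post.
    """
    bench_sets = [ngrams(item, n) for item in benchmark_items]
    kept = []
    for trace in traces:
        grams = ngrams(trace, n)
        if all(jaccard(grams, b) < threshold for b in bench_sets):
            kept.append(trace)
    return kept
```

A trace that copies a benchmark problem verbatim scores 1.0 and is dropped, while a trace with no shared 13-gram scores 0.0 and is kept.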
On AIME 2025–2026 specifically, we see a relative gain of up to +56% on Gemini-2.5-Flash (53.3 → 83.3) and +7.6% on GPT-5.
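To make the "relative gain" figure concrete, the short sketch below recomputes it from the two Gemini-2.5-Flash scores quoted above. The function name is hypothetical; only the numbers 53.3 and 83.3 come from the post.

```python
def relative_gain(baseline, improved):
    """Relative improvement over the baseline, in percent."""
    return (improved - baseline) / baseline * 100.0


# Scores quoted in the post for Gemini-2.5-Flash on AIME 2025-2026.
gain = relative_gain(53.3, 83.3)
print(f"{gain:.1f}% relative gain")  # 30 points absolute on a 53.3 baseline
```

The absolute improvement is 30 points; dividing by the 53.3 baseline gives roughly 56%, which matches the rounded figure in the post.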
6/ Code, corpora, prompts — all open:
🔗 https://t.co/0IOwa8U2T1
Transformed corpora available on Hugging Face.
Thanks to my amazing coauthors @wenjie_ma, @sewon__min, and @matei_zaharia 🙏
5/ Interestingly, RAG over T³ can be cheaper than No RAG.
Retrieved reasoning shifts work from expensive output tokens to cheap input tokens — the model thinks less and reads more.
Think less. Retrieve thinking. 🧠
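The cost argument above can be illustrated with a back-of-the-envelope sketch. Every number here is hypothetical: the per-million-token prices and the token counts are placeholders chosen only to show the mechanism, not measurements from the paper or real provider pricing.

```python
def request_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m):
    """Cost of one request given per-million-token input and output prices."""
    return (input_tokens * price_in_per_m
            + output_tokens * price_out_per_m) / 1_000_000


# Hypothetical prices: output tokens cost several times more than input tokens.
PRICE_IN_PER_M = 0.30
PRICE_OUT_PER_M = 2.50

# Hypothetical token counts: retrieval adds input tokens (the retrieved
# traces) but shortens the model's own reasoning output.
no_rag = request_cost(500, 12_000, PRICE_IN_PER_M, PRICE_OUT_PER_M)
with_rag = request_cost(6_500, 8_000, PRICE_IN_PER_M, PRICE_OUT_PER_M)

print(f"No RAG:   ${no_rag:.5f}")
print(f"With RAG: ${with_rag:.5f}")
print("RAG is cheaper" if with_rag < no_rag else "RAG is more expensive")
```

Under these assumptions, RAG comes out cheaper whenever the extra input tokens times the input price cost less than the saved output tokens times the output price.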