[3/3] Big thanks to @zijian42chen for presenting, @xueguang_ma and the BrowseComp-Plus team for the dataset, and @lintool for the guidance throughout.
https://t.co/8vmSyvGmV7
💻 https://t.co/rzRlBko7sH
🖼️ https://t.co/SGw16TMHay
[1/3] Our agentic search work is at #ACL2026! I can't make it to San Diego in person, but luckily @zijian42chen is presenting the poster. Go say hi!
The one-line takeaway: Rerank before you reason. 🧵
[2/3] Thinking more isn't always how LLM search agents should spend their tokens. A light reranking step beats extra reasoning: better accuracy, faster runs, and lower total token cost (measured with effective token cost).
Huge thanks to @zijian42chen, @xueguang_ma and the rest of the BrowseComp-Plus team for their valuable dataset, and to my supervisor @lintool for all the guidance along the way.
Would love to hear your thoughts and feedback on this work.
How should LLM search agents spend their reasoning tokens?
Thinking more isn't always the answer ...
Reranking often is.
🧵 New paper: https://t.co/oZDWTQ0Z2N
Reranking before reasoning can significantly reduce token usage in deep search agents. We introduce effective token cost to measure the true cost tradeoff across different settings.
1/7 Exciting news! Our new paper is out: Lighting the Way for BRIGHT: Reproducible Baselines with Anserini, Pyserini, and RankLLM
Paper: https://t.co/yjYAJPfzlu
It's about retrieval, RAG, and why a twist on BM25 might matter more than you think. 🧵
Thrilled to be heading to #SIGIR2025 with RankLLM! I'll also be stepping in for a few other exciting projects from the #UniversityOfWaterloo team.
If you'll be there, let's chat!
Bookmark of all UW papers available in the images below:
5/5
We hope RankLLM becomes a go-to community resource for reproducible LLM-based ranking research. Check it out, give it a ⭐, contribute, and let us know what you build with it!
https://t.co/T7IN1iXnoQ
https://t.co/O42F2lSaDl
#SIGIR2025
4/5
Huge thanks to my amazing collaborators: @rpradeep42, Andre Slavescu, Ryan Nguyen, Andrew Xu, Zijian Chen, Yilin Zhang, Yidi Chen, Jasper Xian, and @lintool!
We are also incredibly grateful to all the generous open-source contributors who made this possible!
More explainable diagnostics for LLM battle outcomes, now at scale!
Work done with @UWaterloo folks @Ushivani3, @beirmug, @rpradeep42 and @lintool!
Check out full details in our arXiv preprint:
https://t.co/WK9cimDcEz
Voters in LLM arenas choose battle winners, but don't always explain!
Building on recent work from our group, The Great Nugget Recall, we bring nugget evaluation to 7K battles from the SearchArena dataset!
We also extend to query type breakdowns and multilingual analysis, identifying where preference inversions arise (e.g., ambiguous or incomplete queries).