Does retrieval help RAG or did the LLM already memorize the answer? Too often, the overlap between RAG corpora and what LLMs “know” is unclear.
Better RAG evaluation needs tighter alignment between NLP and IR.
That's why for RAG 2026 we are using @nvidia's ClimbMix corpus.
What kind of metrics do you think accurately measure retrieval for agentic search?
Interested? Don’t forget to:
1️⃣ Sign up for TREC: https://t.co/uqDO2Vucrd
2️⃣ Join our mailing list: https://t.co/90jtWJdi6S
3️⃣ Join us on SIGIR Slack!
Stay tuned for more updates soon!
Search is no longer just a ranked list... LLM agents can now query, inspect, reformulate, and decide when to stop.
At TREC RAG 2026, we’re introducing new metrics for agentic search: evaluating not only final results, but the search process itself.
Stay tuned!
🤨 Is your agent confused about what to build because it says there aren’t any guidelines?
Now your agent has no more excuses - track guidelines for TREC RAG 2026 are out.
And yes, they’re available via SKILLz.
Tell your agents to showcase your agentic search system!
As we always say: progress is best through collaboration.
This is still a work in progress, and we’d love your feedback.
Guidelines can be found here: https://t.co/mb5GxgTOIz
In the meantime, what can you expect in the coming weeks?
We will be providing ClimbMix “projected” relevance assessments for the RAG 24/25 topics, along with baseline RAG systems, so you can begin evaluating your RAG systems.
Interested? Play around with the SKILLz here:
https://t.co/I4R5XxS4tP
What else can you do in the meantime? 🫵
1️⃣ Sign up for TREC (https://t.co/uqDO2Vucrd)!
2️⃣ Join our mailing list (https://t.co/90jtWJdi6S)
3️⃣ Join us on SIGIR Slack!
So... how well can your agent do?
TREC RAG is returning for 2026!
This year’s iteration is special because agents can join the fun… but what might agent-first community evaluation look like? 🧵
We believe in progress through collaboration.
Please test out the SKILLz ℹ️ and give us your thoughts and feedback!
Your insights will play a pivotal role in shaping the TREC 2026 RAG track!
TREC RAG 2025 official retrieval baselines are available now!
Time to start generating those answers and submitting them to eval base before August 17th!
Let the games begin! You have less than a month left to submit.