In the final post of the Adaptive RAG series, we explore how to treat selective retrieval as a core, learned skill, moving from passive observation to active, intelligent decision-making.
https://t.co/MyjupeCBOS
SAILRec: Steering LLM Attention to Dual-Side Semantically Aligned Collaborative Embeddings for Recommendation
Proposes an LLM-based recommender that improves the use of injected collaborative embeddings through dual-side semantic alignment.
📝 https://t.co/LLXCIqYmo5
ANN Search: Recall What Matters
Argues that Recall@k overstates the cost of approximation in nearest neighbor search and proposes 1/Ratio@k, a judge-free, hyperparameter-free quality measure that tracks downstream task quality more faithfully.
📝 https://t.co/zudyptX0hr
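To make the contrast between the two measures concrete, here is a minimal sketch. The paper's exact definition of 1/Ratio@k is not given in this digest, so the code assumes the common reading: Ratio@k is the mean, over ranks, of the returned distance divided by the true distance at that rank, and the score is its reciprocal. The function names and toy numbers are illustrative only.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int], k: int) -> float:
    """Fraction of the true top-k ids that the approximate search returned."""
    exact = set(exact_ids[:k])
    return sum(1 for i in approx_ids[:k] if i in exact) / k


def inverse_ratio_at_k(approx_dists: Sequence[float],
                       exact_dists: Sequence[float], k: int) -> float:
    """Reciprocal of the mean rank-wise distance ratio (assumed definition)."""
    pairs = list(zip(approx_dists[:k], exact_dists[:k]))
    if not pairs or any(e <= 0 for _, e in pairs):
        raise ValueError("need k >= 1 and strictly positive exact distances")
    ratio = sum(a / e for a, e in pairs) / len(pairs)
    return 1.0 / ratio


# Toy case: half of the true neighbors are missed, but the substitutes
# are only 1% farther away than the neighbors they replace.
exact_ids, exact_dists = [1, 2, 3, 4], [1.0, 2.0, 3.0, 4.0]
approx_ids, approx_dists = [1, 2, 5, 6], [1.0, 2.0, 3.03, 4.04]

recall = recall_at_k(approx_ids, exact_ids, k=4)
inv_ratio = inverse_ratio_at_k(approx_dists, exact_dists, k=4)
print(recall, round(inv_ratio, 3))  # 0.5 0.995
```

In this toy case Recall@4 drops to 0.5 while the distance-based score stays near 1, which is the kind of gap the summary describes.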
Cartridges at Scale: Training Modular KV Caches over Large Document Collections
@mhardalov et al. at Amazon present a training framework for scalable multi-cartridge learning that distills document collections into reusable KV caches.
📝 https://t.co/rCNEm9MaSZ
ARBOR: Online Process Rewards via a Reusable Rubric Buffer for Search Agents
Alibaba presents a reusable process-reward framework that maintains a shared rubric memory to supervise the search process.
📝 https://t.co/wXdiXGc6dj
Attention Calibration for Position-Fair Dense Information Retrieval
Introduces an inference-time attention calibration method with a tunable strength coefficient to reduce positional bias in dense retrieval.
📝 https://t.co/cNmqglKzuQ
👨🏽‍💻 https://t.co/2mdJThQC62
Do Neural Retrievers Prefer Certain Documents? Evidence of Learned Relevance Priors
Shows that supervised dense retrievers implicitly learn a query-independent relevance prior from annotation biases, making relevant but niche docs harder to retrieve.
📝 https://t.co/rZyuez4Far
LLM-Assisted Reranking to Operationalize Nuanced Objectives in Recommender Systems
Investigates how LLM-based reranking of news recommendations can amplify exposure to extreme or conspiratorial political content.
📝 https://t.co/vCUSHxveaj
Slipstream: Locality-Aware Graph Index Construction for Streaming Approximate Nearest Neighbor Search
Speeds up insertions in graph indexes for streaming nearest neighbor search by reusing candidates from previous insertions.
📝 https://t.co/adYrwih5Wd
👨🏽‍💻 https://t.co/lJs69BgTeF
VirtualMLE: A Virtual ML Engineer that Optimizes Sequential Recommenders
Introduces an LLM-agent framework that tunes sequential recommenders through a closed loop of execution, reflection, and memory.
📝 https://t.co/qLfklrD9Ir
👨🏽‍💻 https://t.co/lHQdTEW9rw
Structures Facilitate Retrieve, Rerank, and Generate
Integrates document structural information across retrieval, reranking, and generation for document-grounded dialogue systems in both Chinese and English.
📝 https://t.co/yjcLth3HtO
Can LLM Rerankers Predict Their Own Ranking Performance?
@Shictyu et al. introduce reranker-internal query performance prediction, show that self-consistency is well-calibrated while verbalized confidence is overconfident, and propose two methods to fix the overconfidence.
📝 https://t.co/yWtA14EBbB
Skill Is Not Document: A Query-Conditional Benchmark and Two-Stage Retriever for LLM Agent Skill Routing
Tencent introduces a bilingual benchmark and two-stage retriever for agent skill routing.
📝 https://t.co/LS7cDMxuHk
Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories
Introduces a benchmark for span-level error localization in deep-research agent trajectories, and a claim-centric auditing framework that tracks agent claims.
📝 https://t.co/WmSYtD6y9W
TVIR: Building Deep Research Agents Towards Text-Visual Interleaved Report Generation
Introduces a hierarchical multi-agent framework for deep research reports that interleave text with semantically grounded charts and images.
📝 https://t.co/JQV7RGEcoq
👨🏽‍💻 https://t.co/jt7ywU8uJX
When Is 0.1% Enough? Analyzing the Combined Effects of Dimensionality Reduction and Quantization on Text Embedding Compression
Systematically studies combining dimensionality reduction and quantization for text embeddings.
📝 https://t.co/NhyGj2PxEy
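As a rough illustration of how the two compression steps stack, the sketch below truncates a vector to its leading dimensions and then keeps one sign bit per remaining dimension. Both choices are assumptions made for illustration: the paper may study other reducers (for example PCA) and other quantizers, and the helper names here are hypothetical.

```python
from typing import Sequence


def compress(vec: Sequence[float], keep_dims: int) -> int:
    """Truncate to the first keep_dims values, then pack their sign bits into an int."""
    bits = 0
    for x in vec[:keep_dims]:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming_similarity(a: int, b: int, keep_dims: int) -> float:
    """Share of matching bits between two packed codes of keep_dims bits."""
    return 1.0 - bin(a ^ b).count("1") / keep_dims


def size_fraction(orig_dims: int, keep_dims: int, orig_bits: int = 32) -> float:
    """Compressed size relative to the original float vectors (float32 assumed)."""
    return keep_dims / (orig_dims * orig_bits)


query = [0.5, -0.2, 0.1, -0.9, 0.3, 0.7, -0.4, 0.2]
doc = [0.4, 0.3, 0.2, -0.1, -0.6, 0.1, 0.5, -0.3]

q_code = compress(query, keep_dims=4)   # sign bits 1010
d_code = compress(doc, keep_dims=4)     # sign bits 1110
similarity = hamming_similarity(q_code, d_code, keep_dims=4)
fraction = size_fraction(orig_dims=8, keep_dims=4)
print(similarity, fraction)  # 0.75 0.015625
```

The two savings multiply: halving the dimensions and going from 32 bits to 1 bit per dimension leaves 1/64 of the original size in this toy setting.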
FineVerify: Scaling Test-Time Compute with Fine-Grained Self-Verification for Agentic Search
Decomposes each question into checkable sub-questions, verifies sampled candidates against each one, and selects the highest-scoring answer.
📝 https://t.co/JEgIuiquhF
👨🏽‍💻 https://t.co/c8KU9HxL0N
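The selection step described above can be sketched as follows. This is not the paper's implementation: in the real system the decomposition and the verification would presumably be model calls, while here the verifier is a hypothetical keyword check and the sub-questions are reduced to the facts a correct answer should contain.

```python
from typing import Callable, Dict, Sequence, Tuple

# (sub_question, candidate) -> score in [0, 1]; a stand-in for a model-based verifier.
Verifier = Callable[[str, str], float]


def select_answer(candidates: Sequence[str], sub_questions: Sequence[str],
                  verify: Verifier) -> Tuple[str, Dict[str, float]]:
    """Score each candidate by its mean verification score and return the best one."""
    if not candidates or not sub_questions:
        raise ValueError("need at least one candidate and one sub-question")
    scores = {
        c: sum(verify(q, c) for q in sub_questions) / len(sub_questions)
        for c in candidates
    }
    best = max(candidates, key=lambda c: scores[c])
    return best, scores


def keyword_verifier(sub_question: str, candidate: str) -> float:
    """Toy check: the candidate passes if it mentions the expected fact."""
    return 1.0 if sub_question.lower() in candidate.lower() else 0.0


sub_questions = ["Paris", "1889"]  # toy stand-ins for checkable sub-questions
candidates = [
    "Eiffel Tower, Paris, completed 1889",
    "Eiffel Tower, Lyon, completed 1889",
    "Big Ben, London",
]
best, scores = select_answer(candidates, sub_questions, keyword_verifier)
print(best)  # Eiffel Tower, Paris, completed 1889
```

Scoring per sub-question rather than per whole answer is what makes the verification fine-grained: a candidate that gets one fact right and one wrong still receives partial credit and can be ranked against the others.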
OCC-RAG: Optimal Cognitive Core for Faithful Question Answering
Introduces a family of small language models for faithful context-grounded QA, mid-trained on a synthetic corpus of multi-hop examples.
📝 https://t.co/3cLfGcFITg
👨🏽‍💻 https://t.co/xkRnsd2HhA
🤗 https://t.co/ErnNUU7oT1