VLDB’25: We will be presenting this work in the Data Management on Novel Hardware session this afternoon. Join us at the talk and the poster session afterwards!
For the first time, we show that GPU-accelerated database systems can be both faster AND cheaper than their CPU counterparts, with a proof-of-concept on Microsoft SQL Server in Azure running TPC-H 1TB with a single A100/H100! Our VLDB'25 paper, Scaling GPU-Accelerated Databases beyond GPU Memory Size ( https://t.co/D9nlrGn7fG), introduces query processing optimizations that mitigate CPU-GPU interconnect limitations, the primary barrier to scaling GPU-accelerated databases beyond available GPU memory. Our techniques enable efficient processing of datasets 10x larger than GPU memory while preserving the performance benefits that make GPU acceleration compelling for analytical workloads.
We will present this work at VLDB'25 in the Abbey room, Wednesday 9/3 from 3:45-5:15 PM. Join us to learn more about the details!
For the first time, we show that GPU-accelerated database systems can be both faster AND cheaper than their CPU counterparts, with a proof-of-concept on Microsoft SQL Server in Azure running TPC-H 1TB with a single A100/H100! Our VLDB'25 paper, Scaling GPU-Accelerated Databases beyond GPU Memory Size ( https://t.co/D9nlrGn7fG), introduces query processing optimizations that mitigate CPU-GPU interconnect limitations, the primary barrier to scaling GPU-accelerated databases beyond available GPU memory. Our techniques enable efficient processing of datasets 10x larger than GPU memory while preserving the performance benefits that make GPU acceleration compelling for analytical workloads.
We will present this work at VLDB'25 in the Abbey room, Wednesday 9/3 from 3:45-5:15 PM. Join us to learn more about the details!
For the first time, we show that GPU-accelerated database systems can be both faster AND cheaper than their CPU counterparts, with a proof-of-concept on Microsoft SQL Server in Azure running TPC-H 1TB with a single A100/H100! Our VLDB'25 paper, Scaling GPU-Accelerated Databases beyond GPU Memory Size ( https://t.co/D9nlrGn7fG), introduces query processing optimizations that mitigate CPU-GPU interconnect limitations, the primary barrier to scaling GPU-accelerated databases beyond available GPU memory. Our techniques enable efficient processing of datasets 10x larger than GPU memory while preserving the performance benefits that make GPU acceleration compelling for analytical workloads.
We will present this work at VLDB'25 in the Abbey room, Wednesday 9/3 from 3:45-5:15 PM. Join us to learn more about the details!
Recently I was invited to give a talk on Extensible Query Optimizers in Practice for the NWDS talk series at Univerisity of Washtington. The talk provides an overview of basics for query optimization and search in Volcano/Cascades frameworks. The video is here https://t.co/uobqdWbYUE #Database #UW #SQL
We will be presenting our work on Retrieval with Learned Similarities at WWW 2025 tomorrow! Come to our session and check out our poster this Friday afternoon as well! @Lunarmony#www2025#thewebconf2025#recsys#vectorsearch
Thrilled to announce the publication of our book "Extensible Query Optimizers in Practice" in Foundations and Trends® in Databases! You can download it for free until February 17 at https://t.co/dXG0EOconk. The book delves deep into the architecture of extensible query optimizers, focusing on the Volcano/Cascades framework used by several database systems including Microsoft SQL Server. We cover essential aspects of query optimization, including plan search, transformation rules, cost estimation, and plan management. We hope this book serves as a valuable resource for students, practitioners, and anyone interested in the internals of query optimization. We (Surajit Chaudhuri, @VivekNarasayya , and I) would love to hear back from you for opportunities to improve the monograph for clarity. Please let us know of any technical errors you detect – we would be especially grateful for such input. In addition to our collaborators in MSR data systems group and optimizer gurus in Azure Data, special thanks to Joe Hellerstein (@joe_hellerstein), Jignesh Patel (@pateljm), Andy Pavlo (@andy_pavlo) and their students at UC Berkeley and CMU for their thoughtful feedback from using (earlier versions of) this book in their classes. Finally, a condensed version will be available on arXiv soon for quick reference — stay tuned!
Excited to share our recent work on Retrieval with Learned Similarities is accepted as an oral presentation (155 out of 2062 submissions) at WWW'25! We show both the theoretical expressiveness of Mixture-of-Logits and its empirical superiority on various retrieval tasks. This work provides strong justifications to move from dot products to learned similarities for retrieval. And it opens up new research opportunities on supporting efficient vector search with learned similarities. Check out our repo at https://t.co/hZz4QM1FXD! #TheWebConf25 #www2025 @TheWebConf
🧵(1/4) Interested in the next generation retrieval paradigm or making your recommendations, search, or RAG applications 20-30% better? Check out Retrieval with Learned Similarities, a collaboration with Microsoft Research, accepted as an oral presentation (155 out of 2062 submissions) at #TheWebConf25!
Excited to share that our paper RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval has received the best paper award at the ENSLP workshop@NeurIPS! This work will be presented at 1:30PM today. Kudos to all my colleagues and our interns! #NeurIPS2024
Why use learned similarities/MoL over embeddings in retrieval?
1/ MoL matches MIPS in latency, utilizing GPU compute at no extra cost. https://t.co/nVsx5kWeZY arxiv:2306.04039 2/ strong gains; +27%/+24% in HR@Meta/LinkedIn. arxiv:2407.13218
3/ MoL are universal approximators, enabling retrieval with universal learned similarities. https://t.co/Hr5BqP3n6v
Efficient Retrieval with Learned Similarities
Introduces Mixture-of-Logits (MoL) as a universal approximator for learned similarity functions in retrieval tasks, proposing efficient techniques for approximate top-K retrieval.
📝https://t.co/QN8UicT8NG