Workload compression is crucial for database optimization—powering knob tuning, indexing, MV selection, and more. But most prior work targets OLAP with query-only methods.
In our new VLDB paper, we present SCompression, a time-sliced workload compression technique for OLTP. The key idea: don’t just choose which queries to keep—preserve concurrency and transaction context so the compressed trace still reflects real system behavior.
On a knob-tuning task across real-world and benchmark workloads, SCompression accelerates tuning by up to 40× with only ~5% performance impact.
Paper: https://t.co/nghkvQaVmA
This work was led by Baoqing Cai, a visiting PhD student in our group, who will present it at VLDB soon. Happy to discuss, online or in person!
Honored to receive a Google ML & Systems Junior Faculty Award! Grateful for @Google's support as our young lab advances AI- and GPU-driven database systems. This support and recognition will help us turn new ideas into open-source tools and a faster, smarter data infrastructure. Huge thanks to my students and collaborators—their creativity and hard work made this possible.
https://t.co/ntsCYIz21b
#GoogleResearch #MLSystems #GPUDatabases #AI
Thrilled to share that our project "Accelerating Large-Scale Data Analytics with Multi-GPU Resource Redistribution", in collaboration with @TalatiNishil, has been selected for the @nvidia Academic Grant Program (https://t.co/BblEAV8YNU)! 🚀
Shout-out to @NVIDIAAIDev for their generous allocation of A100 GPU hours! These resources will supercharge our research on GPU-accelerated databases and help us push the boundaries of large-scale data analytics.
#NVIDIAGrant #GPU #A100 #DataAnalytics #GPUDatabases
GPU memory capacity and I/O bandwidth limitations have been major bottlenecks for GPU-accelerated databases. We tackle this challenge in our latest @VLDBconf 2025 paper: https://t.co/ATzbQgtQL6
We present a novel resource-sharing method for multi-GPU systems that enables GPU acceleration on datasets that far exceed GPU memory capacity, without relying on GPU-side data caching. The main idea is to repurpose underutilized I/O resources from AI workloads to accelerate data movement for data analytics. Our approach delivers over 2x better price-performance than CPU-based solutions. This work is led by Yichao Yuan (@UMichCSE PhD student) in collaboration with @TalatiNishil.
We will be presenting this at VLDB 2025 in London later this summer. Hope to see you there!
Buckle up because we're crashing into the new year with my annual database retrospective: License change blowbacks! @databricks vs. @SnowflakeDB gangwar! @DuckDB shotgun weddings! Buying a college quarterback with database money for your new lover! https://t.co/NnFHGElFNy
The @SIGMODConf 2023 conference program is now posted online. We have a packed program. Early registration deadline is May 1st. Looking forward to seeing you all in person in Seattle in June. https://t.co/jGthb6BGGA
I'm recruiting PhD students for my lab @UMichCSE on applied ML for databases, starting Fall 2023. My research and industry experience convince me that database automation and simplification are more important than ever, and ML/AI can play a big role. Please apply and reach out!
@tech31842 @UMichCSE Sure! Maybe you can start with this summary blog from @andy_pavlo: https://t.co/AiuTOHlIEq
Some early and recent work from @MSFTResearch:
https://t.co/PeMNVeXgyx
https://t.co/7MDT9yjFEn
Or just reference my PhD thesis on self-driving databases :)
https://t.co/rTxZ9umiRz
Personal update: I'll be joining the University of Michigan, Ann Arbor in the Division of Computer Science and Engineering @UMichCSE as an Assistant Professor in Fall 2023! Thrilled to work on exciting database research with everyone at @UMich!
Just to add that, as part of this transition, I recently joined @databricks to work on the Delta Lake/Lakehouse. I've moved to Ann Arbor, but will visit the DB SF/SV office periodically. Looking forward to hanging out and collaborating with friends in both A2 and the Bay Area :)
It's official! I'm on the job market!
My research is at the intersection of databases and machine learning. I've spent the last five years working on the NoisePage self-driving DBMS project at CMU.
Please find my C.V. and research/teaching statements at: https://t.co/57pqnXEMbJ
.@CMUDB Intro to Database Systems 2021 ➜ Lecture #22: Distributed OLTP Database Systems
Video: https://t.co/hGxpbEOVo2
Slides: https://t.co/ekTsK1xhoH
Commit protocol, replication, and CAP theorem.
.@CMUDB Intro to Database Systems 2021 ➜ Lecture #21: Introduction to Distributed Databases
Video: https://t.co/HMDOyAB0WZ
Slides: https://t.co/GKOXgDY0gT
Architecture and partitioning.