How did Google build the world’s most scalable database?
Spanner is an incredibly impressive system: the first data store to provide globally distributed transactions at scale. However, when it was first built, it was hard to use, offering only key-value semantics. This paper tells the story of how Spanner evolved into a full database, with support for SQL and all the features application developers expect, at a scale at which they had never been offered before.
Like any database, Spanner compiles a SQL query into a physical query plan, then optimizes it. However, to run queries at massive scale, it needs new distributed operators. The most important of these is distributed union, which ships a subquery to each shard of the underlying data and concatenates the results. This is a building block for performing distributed aggregations or joins over sharded tables. To make Spanner work, a distributed union has to be inserted above every table in a query plan. Because a distributed union is expensive, the optimizer works hard to push operations (particularly filters) into the union, so that each shard does the work locally and returns fewer rows. Moreover, joins are aggressively rewritten to minimize the number and size of the distributed unions performed.
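To make the idea concrete, here is a minimal sketch of a distributed union and of what pushing a filter into it buys. It is a toy model, not Spanner's implementation: shards are plain in-memory lists, and the names `distributed_union`, `scan`, and `filtered_scan` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, int]
Shard = List[Row]  # assumption: a shard is just an in-memory list of rows


def distributed_union(
    shards: Iterable[Shard], subquery: Callable[[Shard], List[Row]]
) -> List[Row]:
    """Run the same subquery against every shard and concatenate the results."""
    results: List[Row] = []
    for shard in shards:
        # In a real system this would be a remote call; here it is a local one.
        results.extend(subquery(shard))
    return results


def scan(shard: Shard) -> List[Row]:
    """Subquery that returns every row of the shard."""
    return list(shard)


def filtered_scan(min_amount: int) -> Callable[[Shard], List[Row]]:
    """Build a subquery that filters rows on the shard before returning them."""
    def run(shard: Shard) -> List[Row]:
        return [row for row in shard if row["amount"] >= min_amount]
    return run


shards = [
    [{"id": 1, "amount": 5}, {"id": 2, "amount": 50}],
    [{"id": 3, "amount": 70}, {"id": 4, "amount": 1}],
]

# Filter above the union: every row is shipped back, then filtered centrally.
shipped_all = distributed_union(shards, scan)
filtered_above = [row for row in shipped_all if row["amount"] >= 40]

# Filter pushed into the union: each shard returns only the matching rows.
filtered_below = distributed_union(shards, filtered_scan(40))
```

Both plans produce the same answer, but the second one ships two rows across the (simulated) network instead of four. That difference is the reason the filter belongs inside the union.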
Spanner optimizes the performance of distributed queries using a coprocessor framework: each remote call is addressed not to a particular server, but to a range of data. This gives the runtime leeway to execute each query in the most efficient manner, routing each subquery request to the nearest replica that can serve the request. It also gives the runtime freedom to filter which shards are queried based on the requested keys, so shards don’t receive irrelevant requests. Moreover, it allows transparent masking of transient failures, as any subquery is automatically served from an available replica, even if other replicas are offline.
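The routing behavior described above can be sketched as follows. This is an illustrative model under loud assumptions: integer key ranges, a fixed per-replica latency, and an `online` flag stand in for Spanner's real metadata, and the names `Shard`, `Replica`, `shards_for_range`, `pick_replica`, and `route` are all hypothetical. It also ignores whether a replica is sufficiently up to date to serve the read.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Replica:
    name: str
    latency_ms: int       # assumption: a static distance measure
    online: bool = True


@dataclass
class Shard:
    start: int            # inclusive lower key bound
    end: int              # exclusive upper key bound
    replicas: List[Replica]


def shards_for_range(shards: List[Shard], lo: int, hi: int) -> List[Shard]:
    """Keep only shards whose key range overlaps the requested range [lo, hi)."""
    return [s for s in shards if s.start < hi and lo < s.end]


def pick_replica(shard: Shard) -> Optional[Replica]:
    """Choose the lowest-latency replica that is currently online, if any."""
    live = [r for r in shard.replicas if r.online]
    return min(live, key=lambda r: r.latency_ms) if live else None


def route(shards: List[Shard], lo: int, hi: int) -> List[Tuple[Tuple[int, int], str]]:
    """Address a call to a key range and resolve it to concrete replicas."""
    plan = []
    for shard in shards_for_range(shards, lo, hi):
        replica = pick_replica(shard)
        if replica is None:
            raise RuntimeError(f"no replica available for [{shard.start}, {shard.end})")
        plan.append(((shard.start, shard.end), replica.name))
    return plan


shards = [
    Shard(0, 100, [Replica("us-a", 5), Replica("eu-a", 80)]),
    Shard(100, 200, [Replica("us-b", 5, online=False), Replica("eu-b", 80)]),
    Shard(200, 300, [Replica("us-c", 5), Replica("eu-c", 80)]),
]

# The caller names a key range, never a server.
plan = route(shards, 50, 150)
```

Here the third shard is never contacted because its keys fall outside the requested range, and the second shard is served by `eu-b` because the closer replica `us-b` is offline. The caller sees neither decision, which is the point of addressing calls to data ranges rather than servers.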
What are the main takeaways? First, SQL semantics help adoption. Spanner always allowed users to reliably store and query data at scale, but adding SQL made it easier to write faster queries (thanks to the optimizer) and more complex queries. Second, even with full SQL support, using databases at scale is hard. The optimizer makes it much easier to write performant queries, but even then there are many pitfalls, and the paper makes it clear that, as of 2017, the Spanner team still needed to work with internal customers to make sure their queries wouldn't hit scaling bottlenecks.