Distinguished Lecture Series

David F. Bacon

Google, New York City Office

02 March 2023, 04:15pm

“Evolution of the Storage Engine for Spanner, an Exabyte-scale Database System”

Piloty Building S2|02 room C110

Abstract:

I'll describe the design of Spanner's new storage engine, Ressi, which replaced untyped sorted string tables (inherited from Bigtable) with a strongly typed SQL-native representation. Live migration of 6 exabytes of data and multiple billion-user products to the new engine posed unique challenges. Sound methodology from experimental computer science was the key to its success.

The simplicity and power of declarative queries combined with strongly consistent transactional semantics have scaled to many thousands of machines running an aggregate of over 2 billion queries per second for some of the largest applications in the world. While challenges emerge as we continue to scale, I argue that the dominant obstacle to achieving zettabyte-scale databases is in experimental methodology rather than in the underlying technical problems themselves.

Bio:

David F. Bacon leads Google’s Spanner storage engine team, responsible for over 70% of the total fleet-wide cost of Spanner. His current work includes compression, RAM efficiency, ASIC support for databases, protection against “mercurial cores”, and tools for predicting fine-grained impact of software and hardware changes.

Prior to Google, he worked at IBM Research on programming language design, optimization, and hardware synthesis. He was named an ACM Fellow for pioneering work on real-time garbage collection.

He holds a Ph.D. from UC Berkeley, and his thesis work on optimizing virtual functions is used in most modern C++ and Java compilers. He has published over 80 papers and holds 29 patents.