Spanner: Google's Globally-Distributed Database
Many others and I have spent the last few years building a large-scale storage system that manages data across all of Google's datacenters. It underlies Google's advertising system, among other products. We'll be presenting a paper describing the system (with 26 co-authors!) at OSDI 2012 next month, and we've now put up a web page with a link to the PDF of the final version.
Feedback is welcome, of course.
Here's the abstract of the paper:
Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: non-blocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner.
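To give a rough feel for what a time API that exposes clock uncertainty could look like, here is a minimal Python sketch. It is an illustration, not Spanner's implementation: the names `TTInterval`, `UncertainClock`, and `commit_wait` are hypothetical, and the fixed error bound `epsilon` is a simplifying assumption. The idea is that the clock returns an interval guaranteed to contain the true time, and a writer picks a timestamp and then waits until that timestamp is definitely in the past before making its write visible.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TTInterval:
    """An interval assumed to contain the true current time."""
    earliest: float
    latest: float


class UncertainClock:
    """Hypothetical clock that reports an interval instead of a single instant.

    `epsilon` is an assumed, fixed bound on clock error in seconds.
    """

    def __init__(self, epsilon: float, source: Callable[[], float] = time.monotonic):
        self.epsilon = epsilon
        self.source = source

    def now(self) -> TTInterval:
        t = self.source()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True only if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t: float) -> bool:
        """True only if t has definitely not arrived yet."""
        return self.now().latest < t


def commit_wait(clock: UncertainClock,
                sleep: Callable[[float], None] = time.sleep,
                poll: float = 0.001) -> float:
    """Pick a commit timestamp, then block until it is definitely in the past."""
    s = clock.now().latest        # no earlier than the true current time
    while not clock.after(s):     # wait out the uncertainty window
        sleep(poll)
    return s
```

With `epsilon` set to 5 ms, `commit_wait(UncertainClock(0.005))` blocks for roughly 10 ms (about twice the uncertainty) before returning. In this sketch, the tighter the clock uncertainty, the shorter the wait.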