|Jeff Dean (Google)|
Over the past several years, we have built a collection of systems and tools that simplify the storing and processing of large-scale data sets, and the construction of heavily-used public services based on these data sets. These systems are intended to work well in Google's computational environment, which consists of large numbers of commodity machines connected by commodity networking hardware. Our systems handle issues like storage reliability and availability in the face of machine failures, and our processing tools make it relatively easy to write robust computations that run reliably and efficiently on thousands of machines. In this talk I'll highlight some of the systems we have built, and discuss some challenges and future directions for new systems.