We recently hosted Data @Scale, our invitation-only technical conference for engineers working on large-scale storage systems and analytics. For the second time, we hosted the event in Seattle, where the community of engineers working on high-scale systems is growing. We had a goal of bringing these vibrant engineers together to openly discuss challenges and collaborate on the development of new solutions when building services that serve millions — or even billions — of people. And we did just that by hearing a wide range of topics from speakers.
We had speakers from Dropbox, Facebook, Google, Microsoft, Qumulo, Tableau, Twitter, and the University of Wisconsin-Madison. They spoke about visual data analysis, exabyte storage systems, batch and streaming data processing, and key-value stores. While the day touched on a variety of topics, one thing was the same: All the speakers were excited to share their insights on building and operating systems in practice.
The videos and talk descriptions from the event are posted below. If you're interested in joining the next event or have topic ideas for next time, please join the @Scale Facebook page.
Dropbox, James Cowling — “Dropbox's Exabyte Storage System”
We started the day with James Cowling from Dropbox talking about how the company's exabyte-scale system protects all of its documents. Instead of describing the system architecture, which he wrote about in a blog post, he talked about what it takes to preserve data in the face of all kinds of faults — including system components, processes, and human error. He drove home the point that basic data redundancy techniques are just a starting point; meaningful reliability at scale comes from simplicity of design and relentless verification. He also shared a fascinating story of how the team moved 2 EB of data from S3 into their new store over a short period of time. The insights shared in this talk are widely applicable to a variety of high-scale systems that have high availability constraints.
Qumulo, Molly Brown — “Qumulo Core: Using Analytics to Make Data Visible at Scale”
Next up, Molly Brown from Qumulo spoke about analytics that are built intrinsically into a storage system. Typically, questions like “Where did my storage capacity go?” and “When will I run out of storage space and why?” are hard to answer in any large-scale system because analytics are often an afterthought and not well integrated. This can lead to increases in operational costs and get in the way of fully utilizing system resources. Molly took us through how Qumulo's storage system solves this problem.
Facebook, Jay Tang — “Presto Raptor: MPP Shared-Nothing Database on Flash”
Before lunch, Jay Tang from Facebook took us behind the scenes of a new storage system, Raptor, which is being built to run interactive queries over trillions of rows within seconds. Raptor is a columnar store on flash that is designed to fit natively with the popular open source distributed SQL engine Presto. Its multilayered storage architecture allows it to leverage flash for high performance while utilizing a disk-based backing system for durability and restoration from node failure. The system has been written in a way that it can be extended with other backing systems in the future.
The University of Wisconsin-Madison, Dr. Remzi Arpaci-Dusseau — “Your Storage is Broken: Lessons from Studying Modern Databases and Key-Value Stores”
After lunch, Dr. Remzi Arpaci-Dusseau from the University of Wisconsin-Madison showed us how the simplest of things, like updating a file on a disk using a file system, is subtly complex and highly error-prone. File systems differ in semantics and the handling of edge cases. Through automated tools, he and his students have uncovered hidden assumptions in highly deployed systems, like Git, on specific file system behaviors that are not always correct. Assumptions can be about atomicity, failure handling, crash consistency guarantees, and more. This talk took us through the nuts and bolts of the lowest levels of storage systems.
Twitter, Boaz Avital — "Providing Flexible Consistency Levels with Manhattan"
From the nuts and bolts of a storage system, we went to a conceptual talk about consistency models and trade-offs with availability. The topic was expertly handled by Boaz Avital from Twitter, who took us on a tour of the Manhattan distributed key-value store. With Manhattan, Twitter achieved higher levels of consistency alongside eventual consistency in the same store. The best part about this talk was that Boaz beautifully and easily separated these concepts from the implementation of the storage system itself, thus making the material applicable to other systems that have to maintain a distributed state.
Microsoft, Sriram Rao — “Scale Out Resource Management: YARN++”
We then changed gears to a topic that is relevant to many: how to manage resources within a large-scale compute cluster in a way that optimizes system resources, all while balancing the competing needs of the various processing jobs. Microsoft’s Big Data team has embraced Apache YARN and is building around YARN as the resource manager in its clusters. In this talk, Sriram Rao described the team's work in building a YARN-based, scale-out architecture. The system is self-configuring and tolerates failures of subcomponents while providing continued availability. This work has been contributed back to Apache YARN and ships with various Hadoop distributions.
Google, Frances Perry — "Apache Beam: A Unified and Open Model for Batch and Streaming Data Processing”
After a short break, Frances Perry from Google spoke about Apache Beam. Through deft animations, she showed attendees how the seemingly hard problem of managing batch and streaming data sets within a common framework and system can be solved with a unified API. She framed the problem around a set of constraints and requirements on latency, completeness, and cost. This system handles both batch and streaming use cases and neatly separates properties of the data from runtime characteristics, allowing pipelines to be portable across multiple runtime environments.
Tableau, Pawel Terlecki — “Data Integration for Visual Analysis in Tableau”
In a talk that moved up the stack closer to the end user, Pawel Terlecki from Tableau returned to @Scale this year to talk about a different problem in data visualization: Users have to spend an inordinate amount of time preparing data that resides in diverse sources so it can be visualized. Pawel talked about the challenge in enabling analysis of data models defined over disparate data sources.
Facebook, Kestutis Patiejunas and Amisha Jaiswal — “Facebook's Disaggregated Storage and Compute for Map/Reduce”
To wrap up this year's Data @Scale, we returned to a classic systems engineering talk. Kestutis Patiejunas and Amisha Jaiswal from Facebook argued that disaggregation of storage and compute provides efficiency through gains in flexibility, latency, and availability by using RS encoding of data warehouse data. As part of the talk, they described warm storage, a new storage system that recently was deployed at Facebook. There was a lively discussion at the end of the talk about purpose-built hardware designs optimized for specific workloads versus generic one-size-fits-all.
A big thank-you to all the speakers who presented and to all the attendees. We look forward to an even bigger and better Data @Scale next time, and we hope to see you there. We also encourage everyone to post questions, comments, and follow-ups on the @Scale Facebook page.