Invited Speakers

  • Jeff Dean (Google, TensorFlow)
  • Ameet Talwalkar (UCLA, Spark)
  • Chris Re (Stanford University)
  • Shivaram Venkataraman (UC Berkeley, KeystoneML) / Tomer Kaftan (UW, KeystoneML)
  • John Canny (UC Berkeley, Roofline Design)
  • Andrew Tulloch (Facebook, Caffe)
  • Soumith Chintala (Facebook, Torch)
  • Tianqi Chen (UW, MXNet)
  • John Langford (Microsoft Research, VW)
  • Daniel Crankshaw (UC Berkeley, Clipper)
  • Siddhartha Sen (Microsoft Research, Decision Service)

Speaker Profiles

Jeff Dean (Google, TensorFlow)

Title: Scaling Machine Learning Using TensorFlow

Abstract:

In this talk, I'll highlight some of the ways that the Google Brain team has been tackling large-scale training of neural networks using distributed computation.  I'll outline some of our experiences in using synchronous vs. asynchronous training, improving training performance using combinations of model parallelism and data parallelism, and show how all of these different approaches can be flexibly expressed using TensorFlow.  I'll then outline some of our roadmap for improving TensorFlow performance further, including the upcoming release of TensorFlow XLA, a JIT and ahead-of-time compilation system for generating optimized code for TensorFlow programs for CPUs, GPUs, and other computational devices.
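
As a rough illustration of the two training modes the abstract mentions, here is a framework-agnostic NumPy sketch (not actual TensorFlow code; the worker count, learning rate, and toy least-squares objective are all made up): synchronous training averages the workers' gradients before one update, while asynchronous training lets each worker apply a gradient that may already be stale when it arrives.

    import numpy as np

    def grad(w, X, y):
        """Gradient of the mean squared error 0.5 * ||Xw - y||^2 / n."""
        return X.T @ (X @ w - y) / len(y)

    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(512, 10)), rng.normal(size=512)
    shards = np.array_split(np.arange(512), 4)   # one shard per (hypothetical) worker
    lr, w_sync, w_async = 0.1, np.zeros(10), np.zeros(10)

    for step in range(100):
        # Synchronous data parallelism: average every worker's gradient, then
        # apply a single update to the shared model.
        g = np.mean([grad(w_sync, X[s], y[s]) for s in shards], axis=0)
        w_sync -= lr * g

        # Asynchronous flavour: every worker computes its gradient from the model
        # it read at the start of the round and applies it whenever it finishes,
        # so later updates use gradients that are stale relative to the model.
        stale_grads = [grad(w_async, X[s], y[s]) for s in shards]
        for g_stale in stale_grads:
            w_async -= lr * g_stale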

Bio:

Jeff joined Google in 1999 and is currently a Google Senior Fellow in Google's Research Group, where he co-founded and leads the Google Brain team (g.co/brain), Google's deep learning research team in Mountain View, working on systems for speech recognition, computer vision, language understanding, and various predictive tasks.  He has co-designed/implemented multiple generations of Google's large-scale machine learning systems, the most recent of which, TensorFlow, was open sourced in late 2015.  He is also a co-designer and co-implementor of Google's distributed computing infrastructure, including the MapReduce, BigTable and Spanner systems.  He received a Ph.D. in Computer Science from the University of Washington in 1996.  He is a Fellow of the ACM, a member of the U.S. National Academy of Engineering and the American Academy of Arts and Sciences, and a recipient of the Mark Weiser Award and the ACM Prize in Computing.

_________________________________

Ameet Talwalkar (UCLA, Spark)

Title: Paleo: A Performance Model for Deep Neural Networks

Abstract: 

Although various scalable deep learning software packages have been proposed, it remains unclear how to best leverage parallel and distributed computing infrastructure to accelerate their training and deployment. Moreover, the effectiveness of existing parallel and distributed systems varies widely based on the neural network architecture and dataset under consideration.  In order to efficiently explore the space of scalable deep learning systems and quickly diagnose their effectiveness for a given problem instance, we introduce an analytical performance model called Paleo. Our key observation is that a neural network architecture carries with it a declarative specification of the computational requirements associated with its training and evaluation. By extracting these requirements from a given architecture and mapping them to a specific point within the design space of software, hardware and communication strategies, Paleo can efficiently and accurately model the expected scalability and performance of a putative deep learning system.  We show that Paleo is robust to the choice of network architecture, hardware, software, communication schemes, and parallelization strategies. We further demonstrate its ability to accurately model various recently published scalability results for CNNs such as NiN, Inception and AlexNet.
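
The paper gives the full Paleo model; the snippet below is only a back-of-the-envelope sketch of the same idea, with invented hardware numbers and a crude communication term: count the FLOPs a layer requires and the bytes its gradients occupy, then divide by an assumed device throughput and network bandwidth to estimate per-step time.

    # Hypothetical hardware figures; real values depend on the device and network.
    PEAK_FLOPS = 6e12          # ~6 TFLOP/s single precision
    NET_BANDWIDTH = 1.25e9     # ~10 Gb/s Ethernet, in bytes/s

    def conv_layer_cost(h_out, w_out, k, c_in, c_out, batch):
        """FLOPs for one forward pass and bytes of weights for one conv layer."""
        flops = 2.0 * batch * h_out * w_out * k * k * c_in * c_out
        param_bytes = 4.0 * k * k * c_in * c_out        # float32 weights
        return flops, param_bytes

    def step_time(layers, batch, workers):
        """Crude per-step estimate: compute time plus gradient exchange time."""
        flops = sum(conv_layer_cost(*l, batch=batch)[0] for l in layers)
        grad_bytes = sum(conv_layer_cost(*l, batch=batch)[1] for l in layers)
        compute = 3.0 * flops / PEAK_FLOPS               # ~3x forward cost for fwd+bwd
        comm = 2.0 * grad_bytes / NET_BANDWIDTH if workers > 1 else 0.0
        return compute + comm

    # AlexNet-like first conv layer: 55x55 output, 11x11 kernel, 3 -> 96 channels.
    print(step_time([(55, 55, 11, 3, 96)], batch=128, workers=4))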

Bio:

Ameet Talwalkar is an assistant professor of Computer Science at UCLA and a technical advisor for Databricks. His research addresses scalability and ease-of-use issues in the field of statistical machine learning, with applications in computational genomics. He led the initial development of the MLlib project in Apache Spark, is a co-author of the graduate-level textbook 'Foundations of Machine Learning' (2012, MIT Press), and teaches an award-winning MOOC on edX called 'Distributed Machine Learning with Apache Spark.' Prior to UCLA, he was an NSF post-doctoral fellow in the AMPLab at UC Berkeley. He obtained a B.S. from Yale University and a Ph.D. from the Courant Institute at NYU.

_________________________________

Chris Re (Stanford)

Title: You've been using asynchrony wrong your whole life!

Abstract:

This clickbait-titled talk will describe some of our group's recent work on asynchronous (parallel, lock-free) algorithms for both stochastic gradient descent and Gibbs sampling. This talk describes two technical nuggets:

First, we describe how asynchronous SGD can be viewed as introducing an implicit momentum term, even in deep learning systems; this means one needs to tune the momentum parameter according to the hardware--not only the data and model.

Second, there is a view that statistical algorithms are robust to arbitrary race conditions. We'll show this view is overly optimistic. Asynchronous Gibbs sampling, which is widely used, may actually lead to bias.

This talk is based largely on the following two papers:

  • Asynchrony begets Momentum, with an Application to Deep Learning. Ioannis Mitliagkas, Ce Zhang, Stefan Hadjis, and C. Ré. Allerton 2016.
  • Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling. Chris De Sa, Kunle Olukotun, C. Ré. ICML 2016. Best Paper.
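
A toy Python sketch of the first point (this is not the paper's analysis; the staleness and step size are made up): when each update is computed from parameters read several steps earlier, the iterates overshoot and oscillate, much as heavy-ball momentum does when its coefficient is set too high.

    from collections import deque

    def run(staleness, lr=0.1, steps=200):
        """Gradient descent on f(w) = 0.5 * w^2 with gradients `staleness` steps old."""
        w = 2.0
        history = deque([w] * staleness, maxlen=max(staleness, 1))
        trajectory = []
        for _ in range(steps):
            w_read = history[0] if staleness > 0 else w   # parameters the worker read
            w -= lr * w_read                              # gradient of 0.5*w^2 is w
            if staleness > 0:
                history.append(w)
            trajectory.append(w)
        return trajectory

    # With stale gradients the iterates overshoot zero and oscillate instead of
    # decaying monotonically.
    print(run(staleness=0)[-5:])
    print(run(staleness=4)[-5:])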

Bio:

Christopher (Chris) Ré is an assistant professor in the Department of Computer Science at Stanford University in the InfoLab who is affiliated with the Statistical Machine Learning Group, Pervasive Parallelism Lab, and Stanford AI Lab. The goal of his work is to enable users and developers to build applications that more deeply understand and exploit data. His contributions span database theory, database systems, and machine learning, and his work has won best paper at a premier venue in each area, respectively, at PODS 2012, SIGMOD 2014, and ICML 2016. In addition, work from his group has been incorporated into major scientific and humanitarian efforts, including the IceCube neutrino detector, PaleoDeepDive and MEMEX in the fight against human trafficking, and into commercial products from major web and enterprise companies. He received a SIGMOD Dissertation Award in 2010, an NSF CAREER Award in 2011, an Alfred P. Sloan Fellowship in 2013, a Moore Data Driven Investigator Award in 2014, the VLDB Early Career Award in 2015, the MacArthur Foundation Fellowship in 2015, and an Okawa Research Grant in 2016.

_________________________________

Shivaram Venkataraman (UC Berkeley) and Tomer Kaftan (University of Washington)

Title: Optimizing Large-Scale Machine Learning Pipelines With KeystoneML

Abstract: 

Modern machine learning applications typically combine multiple steps of domain-specific and general-purpose processing with high resource requirements. KeystoneML is a system developed at the AMPLab that captures and optimizes end-to-end large-scale machine learning applications in a distributed environment. In this talk, we will first introduce our high-level API for ML operators that makes it easier to express ML pipelines. We will then discuss the KeystoneML optimizer that automatically selects the appropriate implementations for operators and performs memory management to improve performance. Finally, we present some case studies on using KeystoneML and discuss work in progress.
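
KeystoneML's actual API is written in Scala on top of Spark; purely to illustrate the pipeline-of-operators idea from the abstract, here is a small Python sketch with hypothetical stage names, in which each stage is fit once on the training data and the fitted stages are then chained into a single end-to-end transformer.

    class Pipeline:
        """Chain of stages; each stage is a function data -> transform function."""
        def __init__(self, stages):
            self.stages = stages
            self.fitted = []

        def fit(self, data):
            for stage in self.stages:
                transform = stage(data)                # "fit" the stage on current data
                data = [transform(x) for x in data]    # feed transformed data downstream
                self.fitted.append(transform)
            return self

        def apply(self, x):
            for transform in self.fitted:
                x = transform(x)
            return x

    # Hypothetical stages: a stateless tokenizer, then a truncator whose length
    # cap is a statistic learned from the training data during fit().
    def tokenizer(_data):
        return lambda text: text.lower().split()

    def length_capper(data):
        max_len = max(len(tokens) for tokens in data)
        return lambda tokens: tokens[:max_len]

    pipe = Pipeline([tokenizer, length_capper]).fit(["A short doc", "Another training doc here"])
    print(pipe.apply("A NEW unseen document with extra words"))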

Bios:

Shivaram Venkataraman is a PhD candidate at the University of California, Berkeley and works with Mike Franklin and Ion Stoica at the AMPLab. His research interests are in designing systems and algorithms for large-scale machine learning, and he is a committer on the Apache Spark project. Before coming to Berkeley, he completed his M.S. at the University of Illinois, Urbana-Champaign and worked at Google.

Tomer Kaftan is a first-year PhD student at the University of Washington, working with Magdalena Balazinska and Alvin Cheung. His research interests are in machine learning systems, distributed systems, and query optimization. He is currently exploring how to optimize workloads for scientific image analysis. Previously, Tomer was a staff engineer in UC Berkeley's AMPLab, working on systems for large-scale machine learning. Tomer received his Bachelor's degree in EECS from UC Berkeley. He is also a recipient of an NSF Graduate Research Fellowship.

_________________________________

John Canny (UC Berkeley and Google Research)

Title: Optimizing Machine Learning and Deep Learning

Abstract:

Internet portals today process and store close to a petabyte of data per day. With the advent of self-driving vehicles, sensors and drones, this data volume will increase a million-fold. Dealing with this data volume requires innovation and *optimization* of the data pipeline at all levels.  This talk will describe the BIDData stack, a machine- and deep-learning toolkit that is fully hardware- and network-optimized. At the single-machine level, BIDMach leverages GPUs and is competitive with other systems running on mid-to-large clusters. At the network level, BIDMach includes a communication protocol called Kylix that is optimized for commodity networks, and is naturally fault-tolerant.
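
Roofline design bounds an operator's attainable throughput by both the device's peak compute rate and its memory bandwidth. The sketch below (with hypothetical device numbers) shows why a dense matrix multiply ends up compute-bound while a sparse matrix-vector multiply is bandwidth-bound.

    # Hypothetical device: ~6 TFLOP/s peak compute, ~300 GB/s memory bandwidth.
    PEAK_FLOPS = 6e12
    MEM_BW = 3e11

    def attainable_flops(arithmetic_intensity):
        """Roofline bound: min(peak compute, intensity * memory bandwidth)."""
        return min(PEAK_FLOPS, arithmetic_intensity * MEM_BW)

    # Dense 4096x4096 matrix multiply: ~2n^3 FLOPs over ~3 * n^2 * 4 bytes moved,
    # so its arithmetic intensity is high and it is compute-bound.
    n = 4096
    gemm_intensity = 2 * n**3 / (3 * n * n * 4)
    # Sparse matrix-vector multiply: roughly 2 FLOPs per 8-byte nonzero (value +
    # index), so it is bandwidth-bound.
    spmv_intensity = 2 / 8

    print(attainable_flops(gemm_intensity) / 1e12, "TFLOP/s for dense GEMM")
    print(attainable_flops(spmv_intensity) / 1e9, "GFLOP/s for SpMV")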

Bio:

John Canny is a professor in computer science at UC Berkeley. He is an ACM Doctoral Dissertation Award winner and a Packard Fellow. He has worked in computer vision, robotics, machine learning and human-computer interaction. In addition to teaching and research at Berkeley, he has spent about half his time since 2002 designing and deploying machine learning systems in industry, including stints at Yahoo, eBay, Quantcast, Microsoft and Google. His research is on next-generation tools for machine learning and deep learning, probing the limits of speed and scale. He is developing a suite of machine learning tools (BIDData) that uses roofline design against the limits of compute hardware and networks.

_________________________________

Andrew Tulloch (Facebook, Caffe)

Bio:

I'm a research engineer at Facebook, working on the Applied Machine Learning team that helps power the wide range of AI/ML applications at Facebook. At Facebook, I've worked on the large-scale event prediction models powering ads and News Feed ranking, the computer vision models powering image understanding, our on-device models for real-time style transfer, and many other machine learning projects. I'm a contributor to several deep learning frameworks, including Torch, Caffe and Caffe2. Before Facebook, I obtained a master's degree in mathematics from the University of Cambridge, and a bachelor's degree in mathematics from the University of Sydney.

_________________________________

Soumith Chintala (Facebook, Torch)

Bio: 

Soumith Chintala maintains the Torch deep learning framework, and works on deep learning at Facebook AI Research. 

_________________________________

Tianqi Chen (University of Washington)

Bio:

Tianqi is a PhD student at the University of Washington, working on machine learning and systems. He has built several popular, widely adopted learning systems, including XGBoost and MXNet. He is a recipient of a Google PhD Fellowship in Machine Learning.

_________________________________

John Langford (Microsoft Research, VW)

Bio: 

John Langford is a machine learning research scientist, a field which he says "is shifting from an academic discipline to an industrial tool". He is the author of the weblog hunch.net and the principal developer of Vowpal Wabbit. John works at Microsoft Research New York, of which he was one of the founding members, and was previously affiliated with Yahoo! Research, Toyota Technological Institute, and IBM's Watson Research Center. He studied Physics and Computer Science at the California Institute of Technology, earning a double bachelor's degree in 1997, and received his Ph.D. in Computer Science from Carnegie Mellon University in 2002. He was the program co-chair for the 2012 International Conference on Machine Learning.

_________________________________

Dan Crankshaw (UC Berkeley, Clipper)

Bio: 

Dan is a graduate student in the UC Berkeley RISELab and an alumnus of the AMPLab. He researches systems and techniques for serving and deploying machine learning, with a particular emphasis on low-latency and interactive applications.

_________________________________

Siddhartha Sen (Microsoft Research, Decision Service)

Bio:

Siddhartha Sen is a Researcher at Microsoft Research in New York City, and previously a researcher at the MSR Silicon Valley lab. He designs and builds distributed systems that use novel data structures and algorithms to deliver new functionality or unprecedented performance. Some of his data structures have been incorporated into undergraduate textbooks and curricula. Recently, he has been using online machine learning to optimize decisions in a variety of settings, including systems infrastructure. Siddhartha received his BS and MEng degrees in computer science and mathematics from MIT in 2004, after which he spent three years at Microsoft building a network load balancer for Windows Server. He received his PhD from Princeton University in 2013. Siddhartha received the first Google Fellowship in Fault-Tolerant Computing in 2009 and the best student paper award at PODC in 2012.