Venue
ASPLOS 2016, Sunday, April 3rd (afternoon), Atlanta, GA
Abstract
Performance per watt is no longer simply a desired goal, but rather a first-class optimization metric when programming mobile devices or high-performance computing systems. In designing for user experience, developers must take power efficiency and thermal constraints into consideration. Mobile computing is leading the shift toward exploiting heterogeneous computing as an effective way of optimizing for energy. This tutorial presents the Qualcomm Symphony SDK, which allows developers to exploit the full capabilities of mobile platforms. Symphony is designed and optimized for concurrent and heterogeneous execution on multi-core CPUs and specialized cores (GPUs and DSPs). Distributing computation across multiple heterogeneous compute resources improves performance and reduces power consumption. Symphony eases the burden of expressing concurrent computation by providing both a high-level API constructed around parallel programming patterns and intuitive building blocks (tasks, dependencies, and groups) to allow fine-grained decomposition of applications. In addition, Symphony introduces a novel set of power management and affinity APIs that allow programmers to express the quality of service in their applications.
Tutorial Scope and Objective
The scope of this tutorial is programming systems for heterogeneous computing platforms, as exemplified by mobile computers. The tutorial will briefly survey the landscape of existing frameworks to motivate the design and development of the Symphony SDK. It will focus primarily on the Symphony APIs for power and heterogeneous programming, and on the enabling runtime technology that delivers the programmer-specified end-user experience.
Topics to be Covered
Challenges of Mobile Computing
Challenges of Heterogeneous Programming
Landscape of Programming Systems for Heterogeneous Computing
Qualcomm Symphony SDK Overview
High-level Pattern APIs
Power and Affinity APIs
Tasks, Dependencies, and Groups
Power-Aware Symphony Runtime
Symphony Memory Management
Case Studies
Target Audience
The tutorial is targeted at researchers and engineers from both academia and industry who are
grappling with heterogeneous computing systems with their diverse programmability/performance/power characteristics
newcomers to the areas of programming systems for heterogeneous computing or mobile computing and are looking to familiarize themselves with a state-of-the-art SDK for heterogeneous computing
The tutorial is also targeted at architects of new platforms for emerging technologies such as advanced driver assistance systems, connected cameras, and drones, who will benefit from the detailed discussion on designing and programming high-performance, low-power heterogeneous computing systems.
The primary objective of the tutorial is to educate attendees on how to use heterogeneous programming to effectively address performance issues arising from power and thermal constraints. Attendees should also come away understanding the tradeoffs between different heterogeneous computing frameworks, where the Symphony SDK and runtime system fit in the software stack, and how to use the Symphony SDK effectively to address performance issues related to power and thermal constraints.
Biography
Arun Raman is a Staff Engineer at Qualcomm Research Silicon Valley. At Qualcomm, he has led the development of several new heterogeneous parallel computing features in the Symphony SDK, and has filed thirteen patents in this area. Previously, Arun worked at Intel Labs for two years on the design and development of a HW/SW co-designed microprocessor based on binary translation. Arun received his PhD from Princeton University in 2012, where he developed compilers, runtime systems, and architectures for multicore. Arun has written twelve peer-reviewed publications and two book chapters on these topics.
Calin Cascaval is a Sr. Director at Qualcomm Research Silicon Valley. He leads the parallel and heterogeneous computing initiatives at Qualcomm. His team developed Qualcomm Symphony, a framework for power-aware computing, which orchestrates execution across different heterogeneous cores on Snapdragon platforms. He also led the development of the first fully concurrent mobile browser (Zoomm) and other high-performance libraries. Previously, he worked at the IBM TJ Watson Research Center on systems software, programming models, and compilers for large-scale parallel systems projects, including Blue Gene and PERCS. Calin has a PhD in Computer Science from the University of Illinois at Urbana-Champaign. He has more than 50 peer-reviewed publications and 23 awarded patents.