TRACK A: Introduction to Apache Spark Workshop
The Introduction to Apache Spark Workshop teaches the core Spark APIs. The session features hands-on technical exercises that get developers up to speed on using Spark for data exploration, analysis, and building big data applications.
The integrated lecture and lab format covers the following topics:
Overview of Big Data and Spark
Installing Spark Locally
Using Spark’s Core APIs in Scala, Java, & Python
Building Spark Applications
Deploying on a Big Data Cluster
Building Applications for Multiple Platforms
TRACK B: Advanced Apache Spark Workshop
The Advanced Apache Spark Workshop covers advanced topics in Spark architecture and tuning, along with each of Spark’s high-level libraries (including the latest features). After the lunch break, attendees can work through labs on each of the libraries.
Some familiarity with Spark or MapReduce is expected, as this workshop will not cover basic Spark programming.
Topics covered include:
Advanced Spark Internals and Tuning – Reynold Xin
Spark SQL – Michael Armbrust
Spark Streaming – Tathagata Das
MLlib – Xiangrui Meng
GraphX – Ankur Dave
Building Applications for Multiple Platforms – Pat McDonough