Building Recommender Systems w/ Apache Spark 2.x Workshop

This workshop will cover the following topics:

  • Understand Spark architecture and execution model
  • Learn structured data processing with Spark SQL, DataFrames and Datasets
  • Apply powerful Spark SQL functions and user-defined functions (UDFs)
  • Perform stream processing with Spark Structured Streaming
  • Apply Spark MLlib to build Recommender Systems

Prerequisite:

Workshop Materials

    • tutorials.zip

Lecture Notes

  • Part 1
  • Part 2

Import Databricks Notebooks

    • Log in to Databricks - https://community.cloud.databricks.com/login.html
    • Click on the "Workspace" icon on the left-hand vertical bar
    • Next to the "Workspace" label at the top of the column, click on the arrow
    • Select the "Import" option
    • Make sure the "Import from" option is "File"
    • Click on the gray box labeled "Drop file here to upload or click to select." to bring up the file browser
    • In the file browser, navigate to where the "qconsf.dbc" file was downloaded and select it
    • Click on the "Import" button to complete the import process
    • You should see a folder called "qconsf" in Databricks

Import Data to Databricks

  • Click on the "Data" icon on the left-hand vertical bar
  • Click on the "Add Data" button on the slide-out panel
  • Make sure "Data Source" is "Upload File" (first option)
  • Enter "qcon" in the text box after "/FileStore/tables"
  • Click on the "browse" link at the end of the "Drop files to upload, or browse" statement
  • In the popup file browser, navigate to the folder where you unzipped tutorials.zip
  • Go into the data folder and select all the files in it
  • Click on the "Open" button of the popup file browser
  • Sit back and wait for the upload to complete
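
Once the upload finishes, it may help to confirm from a notebook cell that the files landed where the steps above put them. The following is a minimal sketch, not part of the workshop materials: it assumes a Python notebook on Databricks, where the `dbutils` object is provided automatically, and the helper names `uploaded_path` and `list_uploaded_files` are hypothetical, as is the file name "example.csv".

```python
import posixpath

# Values taken from the upload steps above.
UPLOAD_ROOT = "/FileStore/tables"
TARGET_DIR = "qcon"


def uploaded_path(filename=""):
    """Build the DBFS path of an uploaded file (hypothetical helper).

    With no filename, returns the upload folder itself.
    """
    return posixpath.join(UPLOAD_ROOT, TARGET_DIR, filename)


def list_uploaded_files(dbutils):
    """Return the names of the files in the upload folder.

    `dbutils` is the utility object that Databricks notebooks provide;
    it is passed in explicitly so this function has no hidden globals.
    """
    return [info.name for info in dbutils.fs.ls(uploaded_path())]
```

In a notebook cell, calling `list_uploaded_files(dbutils)` should return the names of the files selected from the data folder, and `uploaded_path("example.csv")` gives the path to pass to a Spark reader for a file with that (placeholder) name.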