Data Mining

Data Mining 2021-22

This is the page of the lab section of the Data Mining course for A.Y. 2021-22.

If you are looking for the main course website, click here.

Lab Sessions

Dec 16th: Lab on GNNs

Colab notebook:

The notebook from the lab can be found here.

Getting started:

  1. Download the files youchoose-buys.dat and youchoose-clicks.dat from here.

  2. Upload them to your Google Drive and remember the path, we will need it ;).

  3. Create a Colab Notebook and change the runtime to GPU.
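Once the files are on Drive, the steps above can be checked with a short sketch like the one below. It assumes the files are comma-separated text, and `DATA_DIR` is a hypothetical folder name: replace it with the path you noted in step 2. The helpers `mount_drive` and `peek` are illustrative names, not part of the lab notebook.

```python
import csv
from itertools import islice
from pathlib import Path

# Hypothetical location: adjust to wherever you uploaded the two files.
DATA_DIR = Path("/content/drive/MyDrive/data-mining")


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running inside Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True


def peek(path, n=5):
    """Return the first n rows of a comma-separated file as lists of strings."""
    with open(path, newline="") as handle:
        return list(islice(csv.reader(handle), n))


if __name__ == "__main__":
    # Outside Colab the mount is skipped and nothing is printed.
    if mount_drive():
        for name in ("youchoose-buys.dat", "youchoose-clicks.dat"):
            print(name, peek(DATA_DIR / name, n=3))
```

If `peek` raises `FileNotFoundError`, the path in `DATA_DIR` most likely does not match where the files were uploaded.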


Dec 9th: Lab on PyTorch

Colab notebooks:

  1. PyTorch Basics and Feedforward Neural Networks: here

  2. CNN and Image Classification: here

You may also find useful a notebook on LSTM and Text Classification.

The notebooks are inspired by Simone Scardapane's lectures on NN and cool content on Analytics Vidhya.


Oct 28th: Lab on Spark

You can find a commented version of the lab notebook here.

Getting started:

  1. Install Spark on your system:

      1. Install on Windows.

      2. Install on Linux.

  2. Install pyspark: pip install pyspark.

  3. Install Jupyter Notebook (it'll be easier to follow the lab).

  4. Download James Joyce's Ulysses here (we will use it for examples).
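To check that the setup above works, a word count over the downloaded text is a convenient smoke test. The sketch below is only one plausible example, not necessarily what the lab notebook does; the file name `ulysses.txt` and the helper names `tokenize` and `top_words` are assumptions.

```python
import re


def tokenize(line):
    """Lowercase a line and split it into runs of ASCII letters."""
    return re.findall(r"[a-z]+", line.lower())


def top_words(path, n=10):
    """Return the n most frequent words in a text file, counted with Spark."""
    # Imported here so that tokenize can be used even without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .appName("ulysses-wordcount")
        .getOrCreate()
    )
    try:
        counts = (
            spark.sparkContext.textFile(path)
            .flatMap(tokenize)                  # one record per word
            .map(lambda word: (word, 1))        # pair each word with a count of 1
            .reduceByKey(lambda a, b: a + b)    # sum the counts per word
        )
        # Sort by descending count and bring only the top n back to the driver.
        return counts.takeOrdered(n, key=lambda pair: -pair[1])
    finally:
        spark.stop()
```

In a notebook cell, calling `top_words("ulysses.txt")` (with the assumed file name replaced by your own) should return a list of `(word, count)` pairs if Spark and pyspark are installed correctly.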

Office Hours

Friday 15-16:30 (email me first).

Due to the current pandemic situation, office hours are suspended.

Should you need anything, just send me an email and we'll arrange an online meeting.