Instructor: Xiangliang Zhang (Xiangliang.Zhang@kaust.edu.sa)
Meeting: Sunday/Tuesday 10:15 – 11:45AM; Zoom: https://kaust.zoom.us/j/9777029647
Teaching Assistants: Shichao Pei (Shichao.Pei@kaust.edu.sa), Qiang Yang (Qiang.Yang@kaust.edu.sa)
Office Hours: Xiangliang Zhang: Tuesday 3:00-5:00PM (by appointment); Zoom: https://kaust.zoom.us/j/9777029647
Prerequisites: Probability, Statistics, Linear Algebra, Data Analytics and Machine Learning
Materials: Readings that accompany the course content will be assigned through the Blackboard system. Slides will be posted online.
Grading: The overall grade will be determined based on the following scheme:
Total score = Homework (30%) + Quiz (30%) + Project (40%)
(see Assignments and Project for more information).
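As a quick illustration of the weighting above, the following minimal sketch computes a total score from three component scores. It assumes each component is reported on a 0-100 scale, which the syllabus does not state; the function name `total_score` and the example student are hypothetical. With 80 on homework, 90 on quizzes, and 85 on projects, the weighted sum is 24 + 27 + 34 = 85.

```python
# Weights taken from the grading scheme: Homework 30%, Quiz 30%, Project 40%.
WEIGHTS = {"homework": 0.30, "quiz": 0.30, "project": 0.40}


def total_score(scores):
    """Return the weighted total for component scores.

    Assumption: every component score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student, for illustration only.
example = total_score({"homework": 80, "quiz": 90, "project": 85})
print(round(example, 2))
```

How the two projects are combined into the single project component is not specified here, so the sketch treats the project score as one number.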
Introduction (Slides)
Text Data Mining (weeks 1-7)
Word embedding: Neural Network Language Model and Word2vec (Slides part 1) (part 2)
GloVe, FastText, TextCNN, TextRNN (Slides) (LSTM preparation slides)
Seq2seq, Attention (Slides)
Transformer, BERT (Slides)
Graph Data Mining (weeks 8-12)
Recommendation Systems (weeks 13-14) (Slides)
Papers will be assigned for reading under each topic.
In-class quizzes will evaluate understanding of the assigned papers on topic modeling, word embedding, and graph embedding.
Homework will include take-home questions about the assigned papers and an exercise on recommendation systems (https://github.com/wubinzzu/NeuRec).
There are two projects to complete in CS 340. The first is on text data mining, to be completed by week 9.
The second is on graph data mining, to be completed by week 15.
Both projects should target real data mining problems, using what you have learned in the course.
Requirements:
Project reports should be submitted in the format of a research paper.
Project presentations should be given in class.
Projects will be evaluated on: technical quality (30) + significance (30) + novelty/impact (20) + report/presentation (20)