Information

Course: Introduction to Big Data Algorithms

Instructors: Emilio Cruciani, Aditi Dudeja

Teaching Assistant: Antonis Skarlatos

When: Tuesdays 12:00-13:30; Thursdays 10:00-11:00, from 05/03 to 27/06

Where: Room T02

Office Hours: By Appointment (by email)

Description

Modern datasets are massive. With more than 500 million tweets and over 8.5 billion Google searches per day, how can one make sense of all the data that is generated? This course will present an introduction to advanced algorithmic tools for the analysis of such large datasets.

Exams

The exam will be oral. Algorithms and proofs will be asked. Questions similar to the assignments will be asked. Submitting the assignments gives bonus points.

Assignments

Assignments will be given every Tuesday and can be found at this link. If you have answers about them or find mistakes, please email any of the instructors. Collaboration is allowed, but write-up should be done individually. Assignments must be sent before the next Tuesday to Antonis at antonis.skarlatos@plus.ac.at

Syllabus (Tentative)

5/3/2024: Course Introduction: Big Data, Pitfalls; Review of Probability: Probability Space, Union Bound, Independence, Conditional Probability

7/3/2024: Tutorial

12/3/2024: Review of Probability: Law of Total Probability, Random Variables, Expectation, Concentration Bounds (Markov, Chebyshev, Chernoff)

14/3/2024: Tutorial

19/3/2024: Graph Sparsification: Karger's Min-Cut Algorithm

21/3/2024: NO LECTURE

EASTER BREAK

9/4/2024: Cut Sparsification: Benczur-Karger's Algorithm

11/4/2024: Tutorial

16/4/2024: Cut Sparsification: Benczur-Karger's Algorithm

18/4/2024: Tutorial

23/4/2024: Finding Similar Items; Preliminaries: Metrics; Euclidean, Jaccard, Hamming, Edit, Cosine distances

25/4/2024 (Room T.03): Tutorial

30/4/2024: Hash functions (uniform, k-wise independent, universal); Shingles; Minhashing

2/5/2024: Tutorial

7/5/2024: Finding Similar Items: Locality Sensitive Hashing

9/5/2024: NO LECTURE

14/5/2024: 

16/5/2024: Tutorial

21/5/2024: NO LECTURE

23/5/2024: Tutorial

28/5/2024: 

30/5/2024: NO LECTURE

4/6/2024: 

6/6/2024: Tutorial

11/6/2024: 

13/6/2024: Tutorial

18/6/2024: 

20/6/2024: Tutorial

25/6/2024: 

27/6/2024: Tutorial

References

Books: