Deep Learning Seminars

PhD in Data Science, 3 CFU

Organizers: Fabrizio Silvestri, Simone Scardapane

The seminars will cover several advanced topics in deep learning: meta-learning (i.e., “learning to learn”), continual learning (i.e., learning from a continuous stream of tasks), and data engineering for deep learning (i.e., preparing data for use in deep learning pipelines).

Attendance

The course is mandatory for all students of the PhD in Data Science, and open to all students of related PhD programs.

💻 Zoom link (*** Note that the zoom link has changed ***): https://uniroma1.zoom.us/j/81268470005?pwd=dDVTMHNpN2FtbG5UMXNhQnhhT3dJdz09

Exam

Students are asked to select a recent paper on the topics of the course and present it in front of an audience. See this shared document for instructions.

Timetable (tentative)

20 April 2022 (9-13) - Meta-learning

Speaker: Prof. Fabrizio Silvestri (Sapienza University)

[Slides] [Zoom Recording. Passcode: semin@rs2022]

Abstract: Learning is usually seen as a method to extract patterns from data and learn associations between these patterns and dependent variables (labels, responses, etc.). The input to a machine learning algorithm is usually a set of data points X along with labels Y, and the goal is to learn a function f: X -> Y that associates a label with each data sample in X. Meta-learning is a fundamental paradigm shift in which the input to the algorithm is not a set of data points but, instead, a set of "tasks". The goal is to learn how to efficiently learn a model for a new task, starting from an existing model that has been trained on previous tasks. In this lecture, we will review the basics of meta-learning, serving as an introduction for students who have never encountered the topic before.
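The task-based training the abstract describes can be sketched as a first-order, MAML-style loop on toy 1-D regression tasks. This is an illustrative sketch, not material from the lecture; the function names and the toy task distribution are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, x, y):
    """Squared-error loss of the linear model y_hat = w * x, and its gradient in w."""
    err = w * x - y
    return np.mean(err ** 2), 2 * np.mean(err * x)

def sample_task():
    """A 'task' here is 1-D linear regression with a task-specific slope,
    split into a support set (for adaptation) and a query set (for the meta-update)."""
    slope = rng.uniform(-2.0, 2.0)
    x = rng.uniform(-1.0, 1.0, size=20)
    return x[:10], slope * x[:10], x[10:], slope * x[10:]

w = 0.0                   # meta-parameters (a single scalar weight, for simplicity)
alpha, beta = 0.1, 0.01   # inner- and outer-loop learning rates

for step in range(500):
    xs, ys, xq, yq = sample_task()
    # Inner loop: adapt the meta-parameters to this task on its support set.
    _, g = loss_and_grad(w, xs, ys)
    w_adapted = w - alpha * g
    # Outer loop (first-order approximation): update the meta-parameters
    # using the adapted model's gradient on the query set.
    _, g_meta = loss_and_grad(w_adapted, xq, yq)
    w = w - beta * g_meta
```

The key difference from ordinary training is visible in the loop body: each step consumes a whole task (support + query split), not a batch of labeled points, and the outer update rewards meta-parameters from which one inner gradient step already performs well.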

21 April 2022 (9-13) - Introduction to Continual Learning

Speaker: Dr. Vincenzo Lomonaco (Pisa University)

[Slides] [ContinualAI Wiki] [Zoom Recording. Passcode: semin@rs2022]

Abstract: Learning continually from non-stationary data streams is a long-standing goal and a challenging problem in machine learning research. Naively fine-tuning prediction models only on the newly available data often incurs Catastrophic Forgetting or Interference: a sudden erasure of all the previously acquired knowledge. On the other hand, re-training prediction models from scratch on the accumulated data is not only inefficient but possibly unsustainable in the long term, especially where fast, frequent model updates are necessary. In this lecture, we will discuss recent progress and trends in making machines learn continually through architectural, regularization and replay approaches.
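Of the three families mentioned above, replay is the simplest to sketch: keep a small memory of past examples and mix them into every new batch, so gradients on new data cannot silently overwrite old knowledge. The sketch below (illustrative, not from the lecture; the class name and reservoir-sampling choice are this example's assumptions) shows such a memory:

```python
import random

class ReplayBuffer:
    """Fixed-size memory filled by reservoir sampling, so that every example
    seen so far has an equal probability of being retained."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = []
        self.n_seen = 0

    def add(self, example):
        """Observe one example from the stream, possibly storing it."""
        self.n_seen += 1
        if len(self.memory) < self.capacity:
            self.memory.append(example)
        else:
            # Keep the new example with probability capacity / n_seen,
            # replacing a uniformly chosen stored one.
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.memory[j] = example

    def sample(self, k):
        """Draw up to k stored examples to mix into the current training batch."""
        return random.sample(self.memory, min(k, len(self.memory)))
```

During training on a new task, each gradient step would then use the current batch concatenated with `buffer.sample(k)`, interleaving old and new data.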

27/28 April 2022 (9-13) - Working with data in industrial machine learning applications

Speaker: Dr. Andreas Damianou (Spotify)

[Slides]

[1st day Zoom Recording. Passcode: H5$sicA2]
[2nd day Zoom Recording. Passcode: cY?gHE05]

Abstract: Data is a crucial aspect of today's machine learning workflows. Over the last few years, machine learning (ML) methods such as deep learning have become increasingly efficient at exploiting large volumes of data. This, in turn, has contributed to the numerous remarkable successes of modern ML. However, the practical application of ML in industry comes with a variety of problems and considerations, often related to data. Indeed, real-world data are incomplete, noisy, sensitive, biased and ever-changing, making it hard to train reliable ML models on them. At the same time, productionizing an ML model also means productionizing the associated data pipelines, and issues like data versioning and publishing come into play. In this lecture series, I will give an overview of the important role of data in an industrial ML setting, ranging from raw data collection to practical feature transformation and engineering. Further, we will dive deep into a variety of data-related issues (and solutions) for ML, ranging from data cleaning and versioning to scalability, bias and privacy.
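One concrete instance of "productionizing the associated data pipelines" is making feature transformations reproducible at serving time: fit the transformation's parameters on training data only, then serialize them alongside the model so the serving path applies exactly the same mapping. A minimal sketch (the function names and JSON format are this example's assumptions, not anything prescribed in the lecture):

```python
import json
import math

def fit_standardizer(values):
    """Estimate mean/std of a numeric feature from the *training* split only."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    # Guard against zero variance so transform() never divides by zero.
    return {"mean": mean, "std": math.sqrt(var) or 1.0}

def transform(value, params):
    """Apply the fitted standardization; used identically in training and serving."""
    return (value - params["mean"]) / params["std"]

# Fit on training data, then serialize the parameters so the serving
# pipeline can restore and apply the exact same transformation.
params = fit_standardizer([3.0, 5.0, 7.0])
blob = json.dumps(params)          # would be stored next to the model artifact
restored = json.loads(blob)
```

Versioning this serialized blob together with the model is one simple defense against training/serving skew: a model is never paired with transformation parameters from a different data snapshot.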