Deep Learning for Speech and Language
Language and speech technologies are evolving rapidly thanks to current advances in Artificial Intelligence. Applications such as machine translation or speech recognition can be tackled from a neural perspective with novel architectures that combine convolutional (CNN) and/or recurrent (RNN) models with attention mechanisms. This course overviews the state of the art in Deep Learning for speech and language and introduces the programming skills and techniques required to train these systems.
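As a taste of the attention mechanisms mentioned above, here is a minimal NumPy sketch of scaled dot-product attention, the building block used in many of these architectures. All names and dimensions below are illustrative, not taken from the course material:

```python
import numpy as np

def scaled_dot_product_attention(queries, keys, values):
    """Minimal scaled dot-product attention sketch (illustrative only)."""
    d_k = queries.shape[-1]
    # Similarity of each query to each key, scaled by sqrt of key dimension
    scores = queries @ keys.T / np.sqrt(d_k)
    # Softmax over the keys axis (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted average of the values
    return weights @ values, weights

# Toy example: 2 queries attending over 3 key/value pairs of dimension 4
rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4))
k = rng.standard_normal((3, 4))
v = rng.standard_normal((3, 4))
context, attn = scaled_dot_product_attention(q, k, v)
print(context.shape, attn.shape)  # (2, 4) (2, 3)
```

Each row of the attention matrix sums to one, so the output for each query is a convex combination of the value vectors.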
DLSL Lectures (2018)
Lecture 1.1: Training a Multi-Layer Perceptron
Training
Instructor: Xavier Giró-i-Nieto
Lecture 1.2: CNN, RNN and Attention models
Architectures
Instructor: Xavier Giró-i-Nieto
Lecture 1.3: Embeddings
Natural Language Processing
Instructor: Antonio Bonafonte
Lecture 2.1: Language Models
Natural Language Processing
Instructor: Marta R. Costa-Jussà
Lecture 2.2: Neural Machine Translation
Natural Language Processing
Instructor: Marta R. Costa-Jussà
Lecture 2.3: Sequence to Sequence (Seq2Seq)
Natural Language Processing
Instructor: Marta R. Costa-Jussà
Lecture 2.4: Language and Vision
Multimodal Learning
Instructor: Xavier Giró-i-Nieto
Lecture 3.1: Speech Recognition
Speech Processing
Instructor: José A. R. Fonollosa
Lecture 3.2: Speaker Recognition
Speech Processing
Instructor: Javier Hernando
Lecture 4.1: Text to Speech
Speech Processing
Instructor: Antonio Bonafonte
Lecture 4.2: Speech to Speech
Speech Processing
Instructor: Santiago Pascual
Lecture 4.3: Audio and Vision
Speech Processing
Instructor: Xavier Giró-i-Nieto