Mirco Ravanelli - PhD
University of Montreal
Montreal Institute for Learning Algorithms (MILA)
 


Short Bio:
I received my master's degree in Telecommunications Engineering (full marks and honours) from the University of Trento, Italy in 2011. I then joined the SHINE research group (led by Prof. Maurizio Omologo) of the Bruno Kessler Foundation (FBK), contributing to some projects on distant-talking speech recognition in noisy and reverberant environments, such as DIRHA and DOMHOS. 
In 2013 I was a visiting researcher at the International Computer Science Institute (University of California, Berkeley), working on deep neural networks for large-vocabulary speech recognition in the context of the IARPA BABEL project (led by Prof. Nelson Morgan).

I received my PhD (with cum laude distinction) in Information and Communication Technology from the University of Trento in December 2017. During my PhD I worked on “deep learning for distant speech recognition”, with a particular focus on recurrent and cooperative neural networks (see my PhD thesis here). As part of my PhD, I spent six months in the MILA lab of the University of Montreal.

Since January 2018, I have been a post-doc researcher at the University of Montreal, working on deep learning for speech recognition in the Mila Lab under the supervision of Prof. Yoshua Bengio.


Research Interests:
  •  Deep Learning
  •  Speech Recognition
  •  Recurrent Neural Networks
  •  Cooperative Networks of DNNs
  •  Unsupervised Learning
  •  Distant-talking (Far-Field) speech recognition
  •  Multi-Microphone signal processing
  •  Robust Acoustic Scene Analysis
  •  Speaker Recognition

Research Overview:

Building computers that understand speech represents a crucial step towards easy-to-use human-machine interfaces. During the last decade, much research has been devoted to improving Automatic Speech Recognition (ASR) technologies, resulting in several popular applications ranging from web search to car control and radiological reporting, to name just a few.

Unfortunately, most state-of-the-art systems provide satisfactory performance only in close-talking scenarios, where the user is forced to speak very close to a microphone-equipped device. Considering the growing interest in speech recognition and the progressive use of this technology in everyday life, it is easy to predict that future users will prefer not to hold or wear any device to access speech recognition services. This calls for technologies able to cope with distant-talking interactions, even in challenging acoustic environments.

A challenging but worthwhile scenario is represented by far-field speech recognition in the domestic environment, where users might prefer to freely interact with their home appliances without wearing or even holding any microphone-equipped device.

To improve current distant-talking ASR systems, a promising approach is the use of Deep Neural Networks (DNNs). In particular, designing a proper DNN paradigm for a multi-channel far-field scenario can potentially help in overcoming the major limitations of current distant-talking technologies.

To reach this ambitious goal, my efforts are focused not only on studying proper neural network architectures, but also on devising novel learning algorithms and training strategies that are better suited to distant-talking speech recognition.