Mirco Ravanelli - PhD Candidate
Fondazione Bruno Kessler (FBK)
University of Trento
 


Short Bio:
I received my master's degree in Telecommunications Engineering (full marks and honours) from the University of Trento, Italy, in 2011. I then joined the SHINE research group (led by Prof. Maurizio Omologo) of the Bruno Kessler Foundation (FBK), contributing to projects on distant-talking speech recognition in noisy and reverberant environments, such as DIRHA and DOMHOS.
In 2013 I was a visiting researcher at the International Computer Science Institute (University of California, Berkeley), working on deep neural networks for large-vocabulary speech recognition in the context of the IARPA BABEL project (led by Prof. Nelson Morgan).
I also collaborated with the Audio and Multimedia Research Group of ICSI, contributing to the IARPA Aladdin project.

As part of my PhD, I recently spent six months at the MILA lab, led by Prof. Yoshua Bengio.

My PhD focuses primarily on Deep Neural Networks for Distant-Talking (Far-Field) Speech Recognition, with particular emphasis on the domestic environment.


Research Interests:
  •  Distant-talking (Far-Field) speech recognition
  •  Deep Neural Networks (DNNs)
  •  Multi-Microphone signal processing
  •  Robust Acoustic Scene Analysis

Research Overview:

Building computers that understand speech is a crucial step towards easy-to-use human-machine interfaces. During the last decade, much research has been devoted to improving Automatic Speech Recognition (ASR) technologies, resulting in several popular applications ranging from web search to car control and radiological reporting, to name just a few.

Unfortunately, most state-of-the-art systems provide satisfactory performance only in close-talking scenarios, where the user is forced to speak very close to a microphone-equipped device. Given the growing interest in speech recognition and the increasing use of this technology in everyday life, it is easy to predict that users will want to access speech recognition services without holding or wearing any device. This requires technologies able to cope with distant-talking interactions, even in challenging acoustic environments.

A challenging but worthwhile scenario is far-field speech recognition in the domestic environment, where users might prefer to interact freely with their home appliances without wearing or even holding any microphone-equipped device.

A promising approach to improving current distant-talking ASR systems is the use of Deep Neural Networks (DNNs). In particular, designing a proper DNN paradigm for a multi-channel far-field scenario can potentially help overcome the major limitations of current distant-talking technologies.

To reach this ambitious goal, my efforts are focused not only on studying suitable neural network architectures, but also on devising novel learning algorithms and training strategies that are better suited to distant-talking speech recognition.