Noise-robust Automatic Speech Recognition

I have created a stand-alone MATLAB demo of the noise-robust Automatic Speech Recognition (ASR) techniques I have worked on since 2007. All these techniques rely on finding a sparse linear combination of noise-free speech exemplars, which is then used either to estimate the clean speech or to perform exemplar-based ASR.

Highlights:
  • AURORA-2 noise-robust digit recognition
  • Supports multiple methods: Sparse Imputation, Feature Enhancement (FE), Sparse Classification (SC), Hybrid SC/FE (SCFE)
  • Includes a MATLAB word-based ASR system
  • GPU acceleration
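
To give a feel for the sparse-combination idea, here is a minimal sketch in plain Python (not the MATLAB demo itself). It assumes non-negative features, a KL-divergence cost with an L1 sparsity penalty, and multiplicative updates; the function names, the `sparsity` value and the toy exemplars are all made up for illustration.

```python
def sparse_activations(observation, exemplars, sparsity=0.01, iterations=200):
    """Find non-negative weights so that sum_k weights[k] * exemplars[k]
    approximates the observation, with an L1 penalty favouring few exemplars.
    Assumption: KL-divergence multiplicative updates, chosen for illustration."""
    eps = 1e-12
    dims = range(len(observation))
    weights = [1.0] * len(exemplars)
    exemplar_sums = [sum(e) for e in exemplars]
    for _ in range(iterations):
        # Current approximation of the observation
        approx = [sum(w * e[d] for w, e in zip(weights, exemplars)) + eps
                  for d in dims]
        ratio = [observation[d] / approx[d] for d in dims]
        # Multiplicative update; weights stay non-negative by construction
        weights = [
            w * sum(e[d] * ratio[d] for d in dims) / (s + sparsity)
            for w, e, s in zip(weights, exemplars, exemplar_sums)
        ]
    return weights


def reconstruct(weights, exemplars):
    """Weighted sum of exemplars, i.e. the estimate built from the dictionary."""
    return [sum(w * e[d] for w, e in zip(weights, exemplars))
            for d in range(len(exemplars[0]))]


# Toy dictionary of three hypothetical exemplars with four feature dimensions
exemplars = [
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.5],
]
# Observation built from exemplars 0 and 2 only
observation = [2.0, 0.0, 0.5, 1.25]

weights = sparse_activations(observation, exemplars)
estimate = reconstruct(weights, exemplars)
```

In this toy case the weight of the unused second exemplar is driven towards zero. In a noise-robust setting the dictionary would typically also hold noise exemplars, and only the speech part of the reconstruction would be kept; that step is not shown here.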

Missing data mask estimation

Missing data mask estimation is the process of estimating which spectro-temporal regions in a spectrographic representation of noisy speech remain (relatively) uncorrupted by noise. I have compiled an Octave/MATLAB package for SVM-based (machine-learning) missing data mask estimation.

Highlights:
  • Machine learning approach based on a training set of noisy speech with known underlying speech and noise
  • The SVM classifier exploits many different features, such as long-term noise floors and harmonicity
  • The package supports hyperparameter optimization, training and evaluation
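
When the underlying speech and noise are known, as in the training set mentioned above, a reference ("oracle") mask can be derived by thresholding the local SNR in each time-frequency cell. The sketch below, in plain Python, shows one common way to do this; the 0 dB threshold, the function name and the toy energies are assumptions for illustration, and the package may label its training data differently.

```python
import math


def oracle_mask(speech_energy, noise_energy, snr_threshold_db=0.0):
    """Label each time-frequency cell 1 (reliable) when its local SNR exceeds
    the threshold, else 0 (unreliable). Inputs are same-sized lists of rows
    holding non-negative energies. The threshold is an assumed value."""
    eps = 1e-12  # avoids log(0) and division by zero
    mask = []
    for speech_row, noise_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(speech_row, noise_row):
            local_snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if local_snr_db > snr_threshold_db else 0)
        mask.append(row)
    return mask


# Toy 2x2 "spectrograms" (hypothetical energies)
speech = [[10.0, 1.0], [0.1, 5.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]
mask = oracle_mask(speech, noise)
```

Masks like this can serve as training targets for a classifier that must then predict the mask from the noisy speech alone.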


Audio event detection

Top-ranking contribution to the AASP acoustic event detection challenge. You can grab the code here.


Weakly supervised learning of acoustic units

I have created a stand-alone MATLAB demo of the research on using Non-negative Matrix Factorisation (NMF) to learn acoustic units (such as words, phrases, or acoustic events) from only weakly annotated material: the audio samples are annotated only with tags indicating the presence of a word or event, without segmentation or temporal ordering.
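
One way to picture the weak supervision is to stack the tag indicators on top of the acoustic features of each sample and factorise the combined matrix, so that each learned component ties an acoustic pattern to a tag. The plain-Python sketch below does this with standard Euclidean multiplicative NMF updates; it is not the demo's code, and the helper names, the toy data and the choice of cost function are assumptions for illustration.

```python
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def squared_error(v, w, h):
    """Squared Frobenius distance between v and the product w * h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0):
    """Euclidean NMF with multiplicative updates: v is approximated by w * h."""
    rng = random.Random(seed)
    eps = 1e-12
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in v]
    h = [[rng.random() + 0.1 for _ in v[0]] for _ in range(rank)]
    for _ in range(iterations):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(len(h[0]))] for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(rank)] for i in range(len(w))]
    return w, h


# Each column is one audio sample. Tags only say which units occur in it.
tags = [
    [1.0, 0.0, 1.0, 1.0],  # tag A present?
    [0.0, 1.0, 1.0, 0.0],  # tag B present?
]
# Toy acoustic features (hypothetical): unit A excites rows 0 and 2,
# unit B excites rows 1 and 3.
acoustics = [
    [1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
]
stacked = tags + acoustics  # tag rows on top, acoustic rows below

w, h = nmf(stacked, rank=2)
tag_part, acoustic_part = w[:len(tags)], w[len(tags):]
```

After training, `acoustic_part` holds the learned acoustic patterns and `tag_part` indicates which tag each pattern is associated with. For unseen audio, one could estimate the activations from the acoustic rows only and multiply them by `tag_part` to predict which tags are present.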