Home

I am a speech recognition researcher and engineer with significant expertise in signal processing and statistical (machine) learning, as applied to large vocabulary continuous speech recognition (LVCSR) and to other information processing applications. These include environmental sensing, such as vehicular traffic density estimation based on cumulative roadside acoustics, and multi-modal biometric classifier fusion.


I have also designed and developed a highly parallelized large vocabulary speech acoustic model training library and recognition engines from the ground up. This LVCSR library supports a suite of massive-scale (upwards of 1000 hours of speech) statistical speech algorithms and adaptations: decision-tree-based triphone state clustering and parameter tying, forward-backward training with pruning for speed, Linear Discriminant Analysis (LDA) transformed acoustic models, frequency warping, and Maximum Accept and Reject (MARS) discriminative training. It also includes a one-pass, time-synchronous, beam-pruned Viterbi decoder with a bigram language model, supporting LVCSR vocabularies of up to 40,000 words.



Contact: vivektyagiibm _at_ gmail _dot_ com

Google Scholar Page

Affiliations
IBM Research India  
Swiss Federal Institute of Technology, Lausanne (EPFL)
Idiap Research Institute
Eurecom Research Institute
Indian Institute of Technology, Kanpur (IIT Kanpur)


Awards and Professional Recognition

Co-winner of the International Speech Communication Association (ISCA) Best Paper Award for the period 2006-2008, for the journal article “Automatic speech recognition and speech variability: A review”, Speech Communication, Volume 49, Issues 10-11 (October-November 2007).

IEEE Senior Member, Signal Processing Society.

Patents

“SYSTEM AND COMPUTER PROGRAM PRODUCT FOR PROTECTING AUDIO CONTENT”, Vivek Tyagi et al., USPTO Patent number: 7978853, Issued Jul 12, 2011.

“METHOD FOR PROTECTING AUDIO CONTENT”, Vivek Tyagi et al., USPTO Patent number: 7974411, Issued Jul 5, 2011.

“VEHICULAR TRAFFIC DENSITY ESTIMATION USING CUMULATIVE ROADSIDE ACOUSTICS”, Vivek Tyagi et al., US Patent Application # 20120188102, July 26, 2012.


PUBLICATIONS


Recent Research Report


Vivek Tyagi, "Fepstrum Features: Design and Application to Conversational Speech Recognition", IBM Research Report No. RI 11009, 6th June 2011


Journals


Vivek Tyagi, Shivkumar Kalyanaraman, Raghu Krishnapuram, "Vehicular Traffic Density State Estimation Based on Cumulative Road Acoustics", to appear in IEEE Transactions on Intelligent Transportation Systems, 2011.

Vivek Tyagi, Herve Bourlard, Christian Wellekens, "On variable-scale piecewise stationary spectral analysis of speech signals for ASR", Speech Communication, Vol. 48 (2006), pages 1182–1191.

Vivek Tyagi, Christian Wellekens, Dirk Slock, "Least squares filtering of speech signals for robust ASR", Speech Communication, Vol. 48 (2006), pages 1528–1544.

M. Benzeghiba, R. De Mori, O. Deroo, S. Dupont, T. Erbes, D. Jouvet, L. Fissore, P. Laface, A. Mertins, C. Ris, R. Rose, V. Tyagi, C. Wellekens, "Automatic speech recognition and speech variability: A review", Speech Communication, Vol. 49 (2007), pages 763–786.

Peer-Reviewed Conferences

Speech Recognition

V. Tyagi, “Tandem Processing of Fepstrum Features,” In the Proc. of Interspeech, 2008, Brisbane, Australia.


V. Tyagi, “Maximum Accept and Reject (MARS) training of HMM-GMM speech recognition systems,” In the Proc. of Interspeech, 2008, Brisbane, Australia.

V. Tyagi, “Fepstrum: An improved modulation spectrum for ASR,” In the Proc. of Interspeech, 2007, Antwerp, Belgium.

V. Tyagi and C. Wellekens, “Fepstrum and Carrier Signal decomposition of Speech Signals through Homomorphic Filtering,” In the special session “Dealing with intrinsic speech variabilities in ASR”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2006, Toulouse, France.

V. Tyagi and C. Wellekens, “Fepstrum Representation of Speech,” In the Proc. of IEEE Automatic Speech Recognition and Understanding (ASRU), November 2005, Puerto Rico, USA.

V. Tyagi, H. Bourlard and C. Wellekens, “On Variable-Scale Piecewise Stationary Spectral Analysis of Speech Signals for ASR,” In the Proc. of Eurospeech, September 2005, Lisbon, Portugal.


V. Tyagi and C. Wellekens, “On Desensitizing the Mel-Cepstrum to Spurious Spectral Components for Robust Speech Recognition,” In the Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2005, Philadelphia, USA.

V. Tyagi, I. McCowan, H. Bourlard, H. Misra, “Mel-Cepstrum Modulation Spectrum (MCMS) Features for Robust ASR,” In the Proc. of IEEE Automatic Speech Recognition and Understanding (ASRU), December 2003, St. Thomas, US Virgin Islands.

V. Tyagi, I. McCowan, H. Bourlard, H. Misra, “On Factorizing Spectral Dynamics for Robust Speech Recognition,” In the Proc. of Eurospeech, Sept. 2003, Geneva, Switzerland.

H. Misra, H. Bourlard, V. Tyagi, “Entropy-Based Multi-Stream Combination,” In the Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2003, Hong Kong.



Biometrics/Traffic (IBM Smarter Planet Research Theme)

V. Tyagi, H. Prasad, T. Faruquie, L. V. Subramaniam, N. Ratha, “Fusing Biographical and Biometric Classifiers for Improved Person Identification”, to appear in the Proc. of IEEE International Conference on Pattern Recognition (ICPR), Nov. 2012, Japan.

V. Tyagi and N. Ratha, “Biometric Score Fusion Through Discriminative Training”, In the Proc. of IEEE Computer Vision and Pattern Recognition (CVPR) Biometrics Workshop, 2011, Boulder, Colorado, USA.

B. Srivastava, T. Huan, W. Shang, U. Nambiar, V. Tyagi, S. Kalyanaraman, “Towards a Sustainable Services Ecosystem for Traffic Management,” In the Proc. of Service Research and Innovation Institute (SRII) Global Conference, 2011, San Jose, USA.

J. Basak, K. Kate, V. Tyagi, N. Ratha, “A Gradient descent approach for multi-modal Biometric Identification”, In the Proc. of IEEE International Conference on Pattern Recognition (ICPR), 2010, Istanbul, Turkey.

K. Kate, J. Basak, N. Ratha, V. Tyagi, “QPLC: A novel multimodal biometrics score fusion method”, In the Proc. of IEEE Computer Vision and Pattern Recognition (CVPR) Biometrics Workshop, 2010, USA.