I am a speech recognition researcher and engineer with expertise in signal processing and statistical (machine) learning, applied to large vocabulary continuous speech recognition (LVCSR) and to other information processing applications. These include environmental sensing, such as vehicular traffic density estimation based on cumulative roadside acoustics, and multi-modal biometric classifier fusion.
I have also designed and developed, from the ground up, a highly parallelized library for large vocabulary acoustic model training, together with recognition engines. This LVCSR library supports a suite of statistical speech algorithms and adaptations at massive scale (upwards of 1000 hours of speech): decision-tree-based triphone state clustering and parameter tying, forward-backward training with pruning for speed, Linear Discriminant Analysis (LDA) transformed acoustic models, frequency warping, and Maximum Accept and Reject (MARS) discriminative training. It also includes a one-pass, time-synchronous, beam-pruned Viterbi decoder with a bigram LM that supports vocabularies of up to 40,000 words.
Vivek Tyagi, Shivkumar Kalyanaraman, Raghu Krishnapuram, "Vehicular Traffic Density State Estimation Based on Cumulative Road Acoustics", to appear in IEEE Trans. on Intelligent Transportation Systems, 2011.
Vivek Tyagi, Herve Bourlard, Christian Wellekens, "On variable-scale piecewise stationary spectral analysis of speech signals for ASR", Speech Communication, Vol. 48 (2006), pages 1182–1191.
Vivek Tyagi, Christian Wellekens, Dirk Slock, "Least squares filtering of speech signals for robust ASR", Speech Communication, Vol. 48 (2006), pages 1528–1544.
M. Benzeghiba, R. De Mori, O. Deroo, S. Dupont, T. Erbes, D. Jouvet, L. Fissore, P. Laface, A. Mertins, C. Ris, R. Rose, V. Tyagi, C. Wellekens, "Automatic speech recognition and speech variability: A review", Speech Communication, Vol. 49 (2007), pages 763–786.
Contact: vivektyagiibm _at_ gmail _dot_ com
Google Scholar Page
IBM Research India
Swiss Federal Institute of Technology, Lausanne (EPFL)
Idiap Research Institute
Eurecom Research Institute
Indian Institute of Technology, Kanpur (IIT Kanpur)
Awards and Professional Recognition
Co-winner of the International Speech Communication Association (ISCA) Best Paper Award for the period 2006-2008, for the journal article “Automatic speech recognition and speech variability: A review”, Speech Communication, Volume 49, Issues 10-11 (October-November 2007).
IEEE Senior Member, Signal Processing Society.
“SYSTEM AND COMPUTER PROGRAM PRODUCT FOR PROTECTING AUDIO CONTENT”, Vivek Tyagi et al., USPTO Patent number: 7978853, Issued Jul 12, 2011.
“METHOD FOR PROTECTING AUDIO CONTENT”, Vivek Tyagi et al., USPTO Patent number: 7974411, Issued Jul 5, 2011.
“VEHICULAR TRAFFIC DENSITY ESTIMATION USING CUMULATIVE ROADSIDE ACOUSTICS”, Vivek Tyagi et al., US Patent Application # 20120188102, July 26, 2012.
Recent Research Report
Vivek Tyagi, "Fepstrum Features: Design and Application to Conversational Speech Recognition", IBM Research Report No. RI 11009, 6th June 2011.
Peer Reviewed Conferences
V. Tyagi, “Tandem Processing of Fepstrum Features,” In the Proc. of Interspeech, 2008, Brisbane, Australia.
V. Tyagi, “Maximum Accept and Reject (MARS) training of HMM-GMM speech recognition systems,” In the Proc. of Interspeech, 2008, Brisbane, Australia.
V. Tyagi, “Fepstrum: An improved modulation spectrum for ASR,” In the Proc. of Interspeech, 2007, Antwerp, Belgium.
V. Tyagi and C. Wellekens, “Fepstrum and Carrier Signal decomposition of Speech Signals through Homomorphic Filtering,” In the special session, “Dealing with intrinsic speech variabilities in ASR”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2006, Toulouse, France.
V. Tyagi and C. Wellekens, “Fepstrum Representation of Speech,” In the Proc. of IEEE Automatic Speech Recognition and Understanding (ASRU), November 2005, Puerto Rico, USA.
V. Tyagi, H. Bourlard and C. Wellekens, “On Variable-Scale Piecewise Stationary Spectral Analysis of Speech Signals for ASR,” In the Proc. of Interspeech, 2005, Lisbon, Portugal.
V. Tyagi and C. Wellekens, “On Desensitizing the Mel-Cepstrum to Spurious Spectral Components for Robust Speech Recognition,” In the Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March 2005, Philadelphia, USA.
V. Tyagi, I. McCowan, H. Bourlard, H. Misra, “Mel-Cepstrum Modulation Spectrum (MCMS) Features for Robust ASR,” In the Proc. of IEEE Automatic Speech Recognition and Understanding (ASRU), December 2003, St. Thomas, US Virgin Islands.
V. Tyagi, I. McCowan, H. Bourlard, H. Misra, “On Factorizing Spectral Dynamics for Robust Speech Recognition,” In the Proc. of EUROSPEECH, Sept. 2003, Geneva, Switzerland.
H. Misra, H. Bourlard, V. Tyagi, “Entropy-Based Multi-Stream Combination,” In the Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2003, Hong Kong.
Biometrics/Traffic (IBM Smarter Planet Research Theme)
V. Tyagi, H. Prasad, T. Faruquie, L. V. Subramaniam, N. Ratha, “Fusing Biographical and Biometric Classifiers for Improved Person Identification”, to appear in the Proc. of IEEE International Conference on Pattern Recognition (ICPR), Nov. 2012, Japan.
V. Tyagi and N. Ratha, “Biometric Score Fusion Through Discriminative Training”, In the Proc. of IEEE Computer Vision and Pattern Recognition (CVPR) Biometrics Workshop, 2011, Boulder, Colorado, USA.
B. Srivastava, T. Huan, W. Shang, U. Nambiar, V. Tyagi, S. Kalyanaraman, “Towards a Sustainable Services Ecosystem for Traffic Management,” In the Proc. of Service Research and Innovation Institute (SRII) Global Conference, 2011, San Jose, USA.
J. Basak, K. Kate, V. Tyagi, N. Ratha, “A Gradient descent approach for multi-modal Biometric Identification”, In the Proc. of IEEE International Conference on Pattern Recognition (ICPR), 2010, Istanbul, Turkey.
K. Kate, J. Basak, N. Ratha, V. Tyagi, “QPLC: A novel multimodal biometrics score fusion method”, In the Proc. of IEEE Computer Vision and Pattern Recognition (CVPR) Biometrics Workshop, 2010, USA.