Tara Sainath

Biography

I received my PhD in Electrical Engineering and Computer Science from MIT in 2009. My PhD work focused on acoustic modeling for noise-robust speech recognition. After my PhD, I spent 5 years in the Speech and Language Algorithms group at IBM T.J. Watson Research Center before joining Google Research. I served as a Program Chair for ICLR in 2017 and 2018. I have also co-organized numerous special sessions and workshops, including at Interspeech 2010, ICML 2013, Interspeech 2016 and ICML 2017. In addition, I am a member of the IEEE Speech and Language Processing Technical Committee (SLTC) and an Associate Editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing. My research interests are mainly in acoustic modeling, including deep neural networks, sparse representations and adaptation methods.

Invited Talks

Publications

2020

2019

2018

2017

2016

2015

2014

  • T. N. Sainath, B. Kingsbury, G. Saon, H. Soltau, A. Mohamed, G. Dahl and B. Ramabhadran, "Deep Convolutional Neural Networks for Large-Scale Speech Tasks," in Elsevier, Special Issue in Deep Learning, November 2014.
  • I. Chung, T. N. Sainath, B. Ramabhadran, M. Picheny, J. Gunnels, V. Austel, U. Chaudhari and B. Kingsbury, "Parallel Deep Neural Network Training for Big Data on Blue Gene/Q," in Proc. of the International Conference on High Performance Computing, Networking, Storage and Analysis, November 2014.
  • T. N. Sainath, V. Peddinti, B. Kingsbury, P. Fousek, D. Nahamoo and B. Ramabhadran, "Deep Scattering Spectra with Deep Neural Networks for LVCSR Tasks," in Proc. Interspeech, September 2014.
  • T. N. Sainath, I. Chung, B. Ramabhadran, M. Picheny, J. Gunnels, B. Kingsbury, G. Saon, V. Austel and U. Chaudhari, "Parallel Deep Neural Network Training for LVCSR using Blue Gene/Q," in Proc. Interspeech, September 2014.
  • T. N. Sainath, B. Kingsbury, A. Mohamed, G. Saon and B. Ramabhadran, "Improvements to Filterbank and Delta Learning within a Deep Neural Network Framework," in Proc. ICASSP, May 2014.
  • V. Peddinti, T. N. Sainath, S. Maymon, B. Ramabhadran, D. Nahamoo, V. Goel, "Deep Scattering Spectrum with Deep Neural Networks," in Proc. ICASSP, May 2014.
  • P. Huang, H. Avron, T. N. Sainath, V. Sindhwani and B. Ramabhadran, "Kernel Methods Match Deep Neural Networks on TIMIT: Scalable Learning in High-Dimensional Random Fourier Spaces," in Proc. ICASSP, May 2014. [Best Student Paper Award]
  • H. Soltau, G. Saon and T. N. Sainath, "Joint Training of Convolutional and Non-Convolutional Neural Networks," in Proc. ICASSP, May 2014.

2013

  • T. N. Sainath, B. Kingsbury, A. Mohamed and B. Ramabhadran, "Learning Filter Banks within a Deep Neural Network Framework," in Proc. ASRU, December 2013.
  • T. N. Sainath, L. Horesh, B. Kingsbury, A. Aravkin and B. Ramabhadran, "Accelerating Hessian-Free Optimization for Deep Neural Networks by Implicit Preconditioning and Sampling," in Proc. ASRU, December 2013.
  • T. N. Sainath, B. Kingsbury, A. Mohamed, G. Dahl, G. Saon, H. Soltau, T. Beran, A. Aravkin and B. Ramabhadran, "Improvements to Deep Convolutional Neural Networks for LVCSR," in Proc. ASRU, December 2013.
  • T. N. Sainath, B. Kingsbury, H. Soltau and B. Ramabhadran, "Optimization Techniques to Improve Training Speed of Deep Neural Networks for Large Speech Tasks," in Transactions on Audio, Speech and Language Processing, November 2013.
  • T. N. Sainath, A. Mohamed, B. Kingsbury and B. Ramabhadran, "Deep Convolutional Neural Networks for LVCSR," in Proc. ICASSP, May 2013.
  • T. N. Sainath, B. Kingsbury, V. Sindhwani, E. Arisoy and B. Ramabhadran, "Low-Rank Matrix Factorization for Deep Neural Network Training with High-Dimensional Output Targets," in Proc. ICASSP, May 2013.
  • G. Dahl, T. N. Sainath and G. Hinton, "Improving Deep Neural Networks for LVCSR using Rectified Linear Units and Dropout," in Proc. ICASSP, May 2013.
  • R. Prabhavalkar, T. N. Sainath, D. Nahamoo, B. Ramabhadran and D. Kanevsky, "An Evaluation Of Posterior Modeling Techniques for Phonetic Recognition," in Proc. ICASSP, May 2013.
  • J. Cui, X. Cui, B. Ramabhadran, J. Kim, B. Kingsbury, J. Mamou, L. Mangu, M. Picheny, T. N. Sainath, A. Sethy, "Developing Speech Recognition Systems for Corpus Indexing Under the IARPA Babel Program," in Proc. ICASSP, May 2013.

2012

  • T. N. Sainath, B. Kingsbury and B. Ramabhadran, "Improving Training Time of Deep Belief Networks Through Hybrid Pre-Training And Larger Batch Sizes," in Proc. NIPS Workshop on Log-linear Models, December 2012.
  • G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, and B. Kingsbury, "Deep Neural Networks for Acoustic Modeling in Speech Recognition," in IEEE Signal Processing Magazine, 29, November 2012.
  • T. N. Sainath, B. Ramabhadran, D. Nahamoo, D. Kanevsky, D. Van Compernolle, K. Demuynck, J. F. Gemmeke, J. R. Bellegarda, S. Sundaram, "Exemplar-Based Processing for Speech Recognition," in IEEE Signal Processing Magazine, 29, November 2012.
  • T. N. Sainath, D. Nahamoo, B. Ramabhadran and D. Kanevsky, "Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks," in Proc. Interspeech, September 2012.
  • B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable Minimum Bayes Risk Training of Deep Neural Network Acoustic Models Using Distributed Hessian-free Optimization," in Proc. Interspeech, September 2012.
  • E. Arisoy, T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Deep Neural Network Language Models," in Proc. NAACL, June 2012.
  • T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Auto-Encoder Bottleneck Features Using Deep Belief Networks," in Proc. ICASSP, March 2012.
  • C. Plahl, T. N. Sainath, B. Ramabhadran and D. Nahamoo, "Improved Pre-Training of Deep Belief Networks Using Sparse Encoding Symmetric Machines," in Proc. ICASSP, March 2012.
  • N. Itoh, T. N. Sainath, D. Jiang, J. Zhou and B. Ramabhadran, "N-best Entropy Based Data Selection for Acoustic Modeling," in Proc. ICASSP, March 2012.

2011

  • T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak and A. Mohamed, "Making Deep Belief Networks Effective for Large Vocabulary Continuous Speech Recognition," in Proc. ASRU, December 2011.
  • T. N. Sainath, D. Nahamoo, D. Kanevsky, B. Ramabhadran and P. M. Shah, "A Convex Hull Approach to Sparse Representations for Exemplar-Based Speech Recognition," in Proc. ASRU, December 2011.
  • T. N. Sainath, B. Ramabhadran, M. Picheny, D. Nahamoo and D. Kanevsky, "Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR," in IEEE Transactions on Speech and Audio Processing, November 2011.
  • T. N. Sainath, B. Ramabhadran, D. Nahamoo and D. Kanevsky, "Reducing Computational Complexities of Exemplar-Based Sparse Representations With Applications to Large Vocabulary Speech Recognition," in Proc. Interspeech, August 2011.
  • D. Kanevsky, D. Nahamoo, T. N. Sainath and B. Ramabhadran, "Convergence of Line Search A-Function Methods," in Proc. Interspeech, August 2011.
  • T. N. Sainath, D. Nahamoo, D. Kanevsky, B. Ramabhadran and P. M. Shah, "A Convex Hull Approach to Sparse Representations for Exemplar-Based Speech Recognition," Technical Report, Speech and Language Algorithm Group, IBM, April 2011.
  • T. N. Sainath, D. Nahamoo, D. Kanevsky, B. Ramabhadran and P. M. Shah, "Exemplar-Based Sparse Representation Phone Identification Features," in Proc. ICASSP, May 2011.
  • A. Mohamed, T. N. Sainath, G. Dahl, B. Ramabhadran, G. Hinton and M. Picheny, "Deep Belief Networks using Discriminative Features for Phone Recognition," in Proc. ICASSP, May 2011.
  • D. Kanevsky, D. Nahamoo, T. N. Sainath, B. Ramabhadran and P. A. Olsen, "A-Functions: A Generalization of Extended Baum-Welch Transformations to Convex Optimization," in Proc. ICASSP, May 2011.
  • B. Zhang, A. Sethy, T. N. Sainath and B. Ramabhadran, "Application Specific Loss Minimization Using Gradient Boosting," in Proc. ICASSP, May 2011.

2010

  • T. N. Sainath, B. Ramabhadran, D. Nahamoo, D. Kanevsky and A. Sethy, "Exemplar-Based Sparse Representation Features for Speech Recognition," in Proc. Interspeech, September 2010.
  • T. N. Sainath, S. Maskey, D. Kanevsky, B. Ramabhadran, D. Nahamoo and J. Hirschberg, "Sparse Representations for Text Categorization," in Proc. Interspeech, September 2010.
  • V. Goel, T. N. Sainath, B. Ramabhadran, P. A. Olsen, D. Nahamoo and D. Kanevsky, "Incorporating Sparse Representation Phone Identification Features in Automatic Speech Recognition Using Exponential Families," in Proc. Interspeech, September 2010.
  • D. Kanevsky, T. N. Sainath, B. Ramabhadran and D. Nahamoo, "An Analysis of Sparseness and Regularization in Exemplar-Based Methods for Speech Classification," in Proc. Interspeech, September 2010.
  • A. Sethy, T. N. Sainath, B. Ramabhadran and D. Kanevsky, "Data Selection for Language Modeling Using Sparse Representations," in Proc. Interspeech, September 2010.
  • D. Kanevsky, A. Carmi, L. Horesh, P. Gurfil, B. Ramabhadran and T. N. Sainath, "Kalman Filtering for Compressed Sensing," in Proc. Information Fusion, Edinburgh, UK, July 2010.
  • T. N. Sainath, D. Nahamoo, B. Ramabhadran and D. Kanevsky, "Sparse Representation Phone Identification Features for Speech Recognition," Technical Report, Speech and Language Algorithm Group, IBM, April 2010.
  • S. Teller, M. Walter, M. Antone, A. Correa, R. Davis, L. Fletcher, E. Frazzoli, J. Glass, J. How, A. Huang, J. Jeon, S. Karaman, B. Luders, N. Roy and T. N. Sainath, "A Voice-Commandable Robotic Forklift Working Alongside Humans in Minimally-Prepared Outdoor Environments," in Proc. ICRA, Anchorage, Alaska, May 2010.
  • T. N. Sainath, A. Carmi, D. Kanevsky and B. Ramabhadran, "Bayesian Compressive Sensing for Phonetic Classification," in Proc. ICASSP, Dallas, Texas, March 2010.
  • A. Carmi, T. N. Sainath, P. Gurfil, D. Kanevsky, D. Nahamoo and B. Ramabhadran, "The Use Of Isometric Transformations and Bayesian Estimation In Compressive Sensing for fMRI Classification," in Proc. ICASSP, Dallas, Texas, March 2010.

2009

2008

2007

2006

Theses