Tara Sainath


I received my PhD in Electrical Engineering and Computer Science from MIT in 2009. The main focus of my PhD work was acoustic modeling for noise-robust speech recognition. After my PhD, I spent five years in the Speech and Language Algorithms group at IBM T.J. Watson Research Center before joining Google Research. I served as a Program Chair for ICLR in 2017 and 2018. I have also co-organized numerous special sessions and workshops, including at Interspeech 2010, ICML 2013, Interspeech 2016 and ICML 2017. In addition, I am a member of the IEEE Speech and Language Processing Technical Committee (SLTC) and an Associate Editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing. My research interests are mainly in acoustic modeling, including deep neural networks, sparse representations and adaptation methods.

Invited Talks



  • Y. He, T. N. Sainath, R. Prabhavalkar, I. McGraw, R. Alvarez, D. Zhao, D. Rybach, A. Kannan, Y. Wu, R. Pang, Q. Liang, D. Bhatia, Y. Shangguan, B. Li, G. Pundak, K. Sim, T. Bagby, S. Chang, K. Rao, A. Gruenstein, "Streaming End-to-end Speech Recognition For Mobile Devices," in Proc. ICASSP, 2019.
  • J. Heymann, M. Bacchiani and T. N. Sainath, "Performance of mask based statistical beamforming in a smart home scenario," in Proc. ICASSP, 2018.
  • S. Chang, B. Li, G. Simko, T. N. Sainath, A. Tripathi, A. Oord, O. Vinyals, "Temporal Modeling Using Dilated Convolution and Gating For Voice Activity Detection," in Proc. ICASSP, 2018.
  • C. Kim, T. N. Sainath, A. Narayanan, A. Misra, R. Nongpiur and M. Bacchiani, "Spectral Distortion Model for Training Phase-Sensitive Deep Neural Networks for Far-field Speech Recognition," in Proc. ICASSP, 2018.


  • B. Li, T. N. Sainath, J. Caroselli, A. Narayanan, M. Bacchiani, A. Misra, I. Shafran, H. Sak, G. Pundak, K. Chin, K. Sim, R. J. Weiss, K. W. Wilson, E. Variani, C. Kim, O. Siohan, M. Weintraub, E. McDermott, R. Rose and M. Shannon, "Acoustic Modeling for Google Home," in Proc. Interspeech, 2017.
  • T. N. Sainath, R. J. Weiss, K. W. Wilson, B. Li, A. Narayanan, E. Variani, M. Bacchiani, I. Shafran, A. Senior, K. Chin, A. Misra and C. Kim, "Raw Multichannel Processing Using Deep Neural Networks," chapter in New Era for Robust Speech Recognition: Exploiting Deep Learning, 2017.






    • T. N. Sainath, B. Ramabhadran, D. Nahamoo, D. Kanevsky, D. Van Compernolle, K. Demuynck, J. F. Gemmeke, J. R. Bellegarda, S. Sundaram, "Exemplar-Based Processing for Speech Recognition," in IEEE Signal Processing Magazine, 29, November 2012.


    • D. Kanevsky, D. Nahamoo, T. N. Sainath and B. Ramabhadran, "Convergence of Line Search A-Function Methods," in Proc. Interspeech, August 2011.
    • D. Kanevsky, D. Nahamoo, T. N. Sainath, B. Ramabhadran and P. A. Olsen, "A-Functions: A Generalization of Extended Baum-Welch Transformations to Convex Optimization," in Proc. ICASSP, May 2011.