Tara Sainath


Biography

I received my PhD in Electrical Engineering and Computer Science from MIT in 2009. My PhD work focused on acoustic modeling for noise-robust speech recognition. After my PhD, I spent 5 years in the Speech and Language Algorithms group at IBM T.J. Watson Research Center before joining Google Research. I co-organized a special session on Sparse Representations at Interspeech 2010 in Japan and organized a special session on Deep Learning at ICML 2013 in Atlanta. In addition, I am a staff reporter for the IEEE Speech and Language Processing Technical Committee (SLTC) Newsletter. My research interests are mainly in acoustic modeling, including deep neural networks, sparse representations, and adaptation methods.

Invited Talks

Publications

2017

  • T. N. Sainath, V. Peddinti, O. Siohan and A. Narayanan, "Annealed F-smoothing as a Mechanism to Speed up Neural Network Training," to appear in Proc. Interspeech, 2017.
  • R. Prabhavalkar, T. N. Sainath, B. Li, K. Rao and N. Jaitly, "An Analysis of "Attention" in Sequence-to-Sequence Models," to appear in Proc. Interspeech, 2017.
  • R. Prabhavalkar, K. Rao, T. N. Sainath, B. Li, L. Johnson and N. Jaitly, "A Comparison of Sequence-to-Sequence Models for Speech Recognition," to appear in Proc. Interspeech, 2017.
  • G. Pundak and T. N. Sainath, "Highway-LSTM and Recurrent Highway Networks for Speech Recognition," to appear in Proc. Interspeech, 2017.
  • B. Li, T. N. Sainath, J. Caroselli, A. Narayanan, M. Bacchiani, A. Misra, I. Shafran, H. Sak, G. Pundak, K. Chin, K. Sim, R. J. Weiss, K. W. Wilson, E. Variani, C. Kim, O. Siohan, M. Weintraub, E. McDermott, R. Rose and M. Shannon, "Acoustic Modeling for Google Home," to appear in Proc. Interspeech, 2017.
  • B. Li and T. N. Sainath, "Reducing the Computational Complexity of Two-Dimensional LSTMs," to appear in Proc. Interspeech, 2017.
  • S. Chang, B. Li, T. N. Sainath, G. Simko and C. Parada, "Endpoint Detection using Grid Long Short-term Memory Networks for Streaming Speech Recognition," to appear in Proc. Interspeech, 2017.
  • C. Kim, A. Misra, K. Chin, T. Hughes, A. Narayanan, T. N. Sainath and M. Bacchiani, "Generation of Simulated Utterances in Virtual Rooms to Train Deep Neural Networks for Far-field Speech Recognition in Google Home," to appear in Proc. Interspeech, 2017.
  • T. N. Sainath, R. J. Weiss, K. W. Wilson, B. Li, A. Narayanan, E. Variani, M. Bacchiani, I. Shafran, A. Senior, K. Chin, A. Misra and C. Kim, "Raw Multichannel Processing Using Deep Neural Networks," chapter in New Era for Robust Speech Recognition: Exploiting Deep Learning, 2017.

2016

  • R. Zazo, T. N. Sainath, G. Simko and C. Parada, "Feature Learning with Raw-Waveform CLDNNs for Voice Activity Detection," in Proc. Interspeech, 2016.

2015

2014

2013

2012

  • T. N. Sainath, B. Ramabhadran, D. Nahamoo, D. Kanevsky, D. Van Compernolle, K. Demuynck, J. F. Gemmeke, J. R. Bellegarda, S. Sundaram, "Exemplar-Based Processing for Speech Recognition," in IEEE Signal Processing Magazine, 29, November 2012.

2011

  • D. Kanevsky, D. Nahamoo, T. N. Sainath and B. Ramabhadran, "Convergence of Line Search A-Function Methods," in Proc. Interspeech, August 2011.
  • D. Kanevsky, D. Nahamoo, T. N. Sainath, B. Ramabhadran and P. A. Olsen, "A-Functions: A Generalization of Extended Baum-Welch Transformations to Convex Optimization," in Proc. ICASSP, May 2011.

2010

2009

2008

2007

2006

Theses