At IBM since 1995, I have made significant contributions to the ViaVoice line of products, focusing on acoustic modeling. I have served as the Principal Investigator on two major international projects: the NSF-sponsored Multilingual Access to Large Spoken Archives (MALACH) project, which developed algorithms for transcribing elderly, accented speech, and the EU-sponsored TC-STAR project, which developed algorithms for recognizing EU parliamentary speeches. I was also the lead at IBM for the Spoken-Term Detection evaluation in 2006. I am currently responsible for acoustic and language modeling research for both commercial and government projects, ranging from voice search and transcription tasks to spoken term detection in multiple languages and expressive synthesis.

Here is the Google Scholar link to my publications.

Recent Publications

2012

T. N. Sainath, B. Kingsbury and B. Ramabhadran, "Improving Training Time of Deep Belief Networks Through Hybrid Pre-Training And Larger Batch Sizes," in Proc. NIPS Workshop on Log-linear Models, Dec. 2012.
T. N. Sainath, B. Ramabhadran, D. Nahamoo, D. Kanevsky, D. Van Compernolle, K. Demuynck, J. F. Gemmeke, J. R. Bellegarda, S. Sundaram, "Exemplar-Based Processing for Speech Recognition," in IEEE Signal Processing Magazine, vol. 29, November 2012.
T. N. Sainath, D. Nahamoo, B. Ramabhadran and D. Kanevsky, "Enhancing Exemplar-Based Posteriors for Speech Recognition Tasks," in Proc. Interspeech, September 2012.
E. Arisoy, T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Deep Neural Network Language Models," in Proc. NAACL, June 2012.
T. N. Sainath, B. Kingsbury, and B. Ramabhadran, "Auto-Encoder Bottleneck Features Using Deep Belief Networks," in Proc. ICASSP, March 2012.
C. Plahl, T. N. Sainath, B. Ramabhadran and D. Nahamoo, "Improved Pre-Training of Deep Belief Networks Using Sparse Encoding Symmetric Machines," in Proc. ICASSP, March 2012.
N. Itoh, T. N. Sainath, D. Jiang, J. Zhou and B. Ramabhadran, "N-best Entropy Based Data Selection for Acoustic Modeling," in Proc. ICASSP, March 2012.
R. Fernandez, S. Minnis, B. Ramabhadran, "Prediction of F0 contours from Symbolic and Numerical Variables using Continuous Conditional Random Fields," in Proc. ICASSP, March 2012.
K. Audhkhasi, A. Sethy, B. Ramabhadran, S. Narayanan, "Creating Ensemble of Diverse Maximum Entropy Models," in Proc. ICASSP, March 2012.
A. Rosenberg, R. Fernandez, B. Ramabhadran, "Phrase Boundary Assignment from Text in Multiple Domains," in Proc. Interspeech, September 2012.

2011

Ciprian Chelba, Timothy J. Hazen, Bhuvana Ramabhadran, and Murat Saraclar, "Speech Retrieval," Chapter 15 of Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, Gokhan Tur and Renato De Mori (Editors), John Wiley & Sons, 2011.
T. N. Sainath, B. Kingsbury, B. Ramabhadran, P. Fousek, P. Novak and A. Mohamed, "Making Deep Belief Networks Effective for Large Vocabulary Continuous Speech Recognition," in Proc. ASRU, December 2011.
T. N. Sainath, D. Nahamoo, D. Kanevsky, B. Ramabhadran and P. M. Shah, "A Convex Hull Approach to Sparse Representations for Exemplar-Based Speech Recognition," in Proc. ASRU, December 2011.
T. N. Sainath, B. Ramabhadran, M. Picheny, D. Nahamoo and D. Kanevsky, "Exemplar-Based Sparse Representation Features: From TIMIT to LVCSR," in IEEE Transactions on Audio, Speech, and Language Processing, November 2011.
Stanley F. Chen, Abhinav Sethy, Bhuvana Ramabhadran, "Pruning Exponential Language Models," in Proc. ASRU, December 2011.
R. Tachibana, T. Fukuda, U. Chaudhari, B. Ramabhadran, and P. Zhan, "Frame-level AnyBoost for LVCSR with the MMI Criterion," in Proc. ASRU, December 2011.
T. N. Sainath, B. Ramabhadran, D. Nahamoo and D. Kanevsky, "Reducing Computational Complexities of Exemplar-Based Sparse Representations With Applications to Large Vocabulary Speech Recognition," in Proc. Interspeech, August 2011.
D. Kanevsky, D. Nahamoo, T. N. Sainath and B. Ramabhadran, "Convergence of Line Search A-Function Methods," in Proc. Interspeech, August 2011.
T. N. Sainath, D. Nahamoo, D. Kanevsky, B. Ramabhadran and P. M. Shah, "A Convex Hull Approach to Sparse Representations for Exemplar-Based Speech Recognition," Technical Report, Speech and Language Algorithm Group, IBM, April 2011.
T. N. Sainath, D. Nahamoo, D. Kanevsky, B. Ramabhadran and P. M. Shah, "Exemplar-Based Sparse Representation Phone Identification Features," in Proc. ICASSP, May 2011.
A. Mohamed, T. N. Sainath, G. Dahl, B. Ramabhadran, G. Hinton and M. Picheny, "Deep Belief Networks using Discriminative Features for Phone Recognition," in Proc. ICASSP, May 2011.
D. Kanevsky, D. Nahamoo, T. N. Sainath, B. Ramabhadran and P. A. Olsen, "A-Functions: A Generalization of Extended Baum-Welch Transformations to Convex Optimization," in Proc. ICASSP, May 2011.
B. Zhang, A. Sethy, T. N. Sainath and B. Ramabhadran, "Application Specific Loss Minimization Using Gradient Boosting," in Proc. ICASSP, May 2011.