Taro Watanabe

I'm a software engineer at Google, working mainly on statistical machine translation and other areas such as machine learning (curriculum vitae). I received a B.E. and an M.E. from Kyoto University, an M.S. from CMU, and a Ph.D. from Kyoto University. I was previously affiliated with ATR, NTT, and NICT. You can reach me via tarow at google.com.

Book

Talks, Lectures, Tutorials

Activities

Software

  • trance: a transition-based neural network constituent parser
  • cicada: a hypergraph-based machine translation toolkit that supports {string,tree}-to-{string,tree} models
  • expgram: yet-another ngram toolkit with succinct storage
  • pialign: phrasal ITG aligner for phrase table induction
  • lader: latent derivation reorder for pre-reordering of MT input
  • a head-driven transition-based dependency parser

Supervising, Collaborating

  • Interns: Katsuhiko Hayashi (2008, 2010-2012, NAIST), Graham Neubig (2010-2012, Kyoto U.), Isamu Fujiwara (2011, Tottori U.), Daniel Flannery (2011, Kyoto U.), Lemao Liu (2012-2013, Harbin Institute of Tech.), Hidetaka Kamigaito (2013, 2014-2015, Tokyo Institute of Tech.), Hitoshi Otsuki (2014-2015, Kyoto Institute of Tech.)
  • Visiting researchers: Conghui Zhu (2012-2013, Harbin Institute of Tech.)
  • Colleagues: Chooi-Ling Goh (2009, 2011), Akihiro Tamura (2011-2014), Hideya Mino (2013-2015), Youzheng Wu (2014), Shumpei Kubosawa (2014-2015), Lemao Liu (2014-2015)

Publications

2016

  • Yusuke Oda, Taku Kudo, Tetsuji Nakagawa and Taro Watanabe. 2016. Phrase-based Machine Translation using Multiple Reordering Candidates. In COLING 2016. [paper]
  • Graham Neubig and Taro Watanabe. 2016. Optimization for Statistical Machine Translation: A Survey. In Computational Linguistics. [paper]
  • Taro Watanabe. 2016. Advances in Structured Learning by Neural Networks. In Journal of the Japanese Society for Artificial Intelligence (invited paper, in Japanese).

2015

  • Xiaolin Wang, Masao Utiyama, Andrew Finch, Taro Watanabe and Eiichiro Sumita. 2015. Leave-one-out Word Alignment without Garbage Collector Effects. In EMNLP 2015. [paper]
  • Hidetaka Kamigaito, Taro Watanabe, Hiroya Takamura, Manabu Okumura and Eiichiro Sumita. 2015. Hierarchical Back-off Modeling of Hiero Grammar based on Non-parametric Bayesian Model. In EMNLP 2015. [paper]
  • Taro Watanabe and Eiichiro Sumita. 2015. Transition-based Neural Constituent Parsing. In ACL 2015. [paper, software]

2014

  • Hidetaka Kamigaito, Taro Watanabe, Hiroya Takamura and Manabu Okumura. 2014. Unsupervised Word Alignment Using Frequency Constraint in Posterior Regularized EM. In EMNLP 2014. [paper]
  • Hideya Mino, Taro Watanabe and Eiichiro Sumita. 2014. Syntax-Augmented Machine Translation using Syntax-Label Clustering. In EMNLP 2014. [paper]
  • Akihiro Tamura, Taro Watanabe, Eiichiro Sumita, Hiroya Takamura and Manabu Okumura. 2014. Unsupervised Learning of Part-of-Speech in Dependency Trees for Machine Translation. In Information Processing Society of Japan. pp. 1665-1680 (Vol. 55, No. 7). [paper, Specially Selected Paper, Outstanding Paper Award]
  • Youzheng Wu, Taro Watanabe and Chiori Hori. 2014. Recurrent Neural Network-based Tuple Sequence Model for Machine Translation. In COLING 2014. [paper]
  • Akihiro Tamura, Taro Watanabe and Eiichiro Sumita. 2014. Recurrent Neural Networks for Word Alignment Model. In ACL 2014. [paper]

2013

  • Lemao Liu, Tiejun Zhao, Taro Watanabe and Eiichiro Sumita. 2013. Tuning SMT with a Large Number of Features via Online Feature Grouping. In IJCNLP 2013. [paper (note that this is version 2 which corrects bugs in the original paper)]
  • Akihiro Tamura, Taro Watanabe, Eiichiro Sumita, Hiroya Takamura and Manabu Okumura. 2013. Part-of-Speech Induction in Dependency Trees for Statistical Machine Translation. In ACL 2013. [paper]
  • Lemao Liu, Taro Watanabe, Eiichiro Sumita and Tiejun Zhao. 2013. Additive Neural Networks for Statistical Machine Translation. In ACL 2013. [paper]
  • Conghui Zhu, Taro Watanabe, Eiichiro Sumita and Tiejun Zhao. 2013. Hierarchical Phrase Table Combination for Machine Translation. In ACL 2013. [paper]
  • Akihiro Tamura, Taro Watanabe, Eiichiro Sumita, Hiroya Takamura and Manabu Okumura. 2013. Extracting Translation Pairs from Comparable Corpora through Graph-based Label Propagation. In Journal of Natural Language Processing, Vol. 20 (2013), No. 2, pp. 133-160. [paper]
  • Graham Neubig, Taro Watanabe, Shinsuke Mori and Tatsuya Kawahara. 2013. Substring-based Machine Translation. In Machine Translation. March 2013. [link, code]

2012

  • Lemao Liu, Tiejun Zhao, Taro Watanabe, Hailong Cao and Conghui Zhu. 2012. Expected Error Minimization with Ultraconservative Update for SMT. In COLING 2012. [paper]
  • Akihiro Tamura, Taro Watanabe and Eiichiro Sumita. 2012. Bilingual Lexicon Extraction from Comparable Corpora Using Label Propagation. In EMNLP-CoNLL 2012. [paper]
  • Graham Neubig, Taro Watanabe and Shinsuke Mori. 2012. Inducing a Discriminative Parser to Optimize Machine Translation Reordering. In EMNLP-CoNLL 2012. [paper, code]
  • Lemao Liu, Hailong Cao, Taro Watanabe, Tiejun Zhao, Mo Yu and Conghui Zhu. 2012. Locally Training the Log-Linear Model for SMT. In EMNLP-CoNLL 2012. [paper]
  • Katsuhiko Hayashi, Taro Watanabe, Masayuki Asahara and Yuji Matsumoto. 2012. Head-Driven Transition-based Parsing with Top-Down Prediction. In ACL 2012. [paper]
  • Graham Neubig, Taro Watanabe, Shinsuke Mori and Tatsuya Kawahara. 2012. Machine Translation without Words through Substring Alignment. In ACL 2012. [paper, code]
  • Taro Watanabe. 2012. Optimized Online Rank Learning for Machine Translation. In NAACL 2012. [paper, poster, code]
  • Taro Watanabe. 2012. Field of Statistical Machine Translation. In Journal of the Japanese Society for Artificial Intelligence (invited paper, in Japanese). pp. 288-295. Vol. 27 No. 3 May 2012. [paper]
  • Chooi-Ling Goh, Taro Watanabe, and Eiichiro Sumita. 2012. Japanese argument reordering based on dependency structure for statistical machine translation. In IEICE Transactions on Information and Systems, pp. 1668-1675. June 2012. [link]
  • Graham Neubig, Taro Watanabe, Eiichiro Sumita, Shinsuke Mori and Tatsuya Kawahara. 2012. Joint Phrase Alignment and Extraction for Statistical Machine Translation. In Journal of Information Processing, pp. 512-523. April 2012. [link, code, Outstanding Paper Award]

2011

  • Katsuhiko Hayashi, Taro Watanabe, Masayuki Asahara and Yuji Matsumoto. 2011. Third-order Variational Reranking on Packed-Shared Dependency Forests. In Proceedings of EMNLP 2011, Edinburgh, Scotland, UK, July. [paper]
  • Graham Neubig, Taro Watanabe, Eiichiro Sumita, Shinsuke Mori and Tatsuya Kawahara. 2011. An Unsupervised Model for Joint Phrase Alignment and Extraction. In ACL 2011. [paper, code]
  • Taro Watanabe and Eiichiro Sumita. 2011. Machine Translation System Combination by Confusion Forest. In ACL 2011. [paper, slide, poster, code]

2010

  • Keiji Yasuda, Taro Watanabe, Masao Utiyama and Eiichiro Sumita. 2010. System Description of NiCT SMT for NTCIR-8. In NTCIR-8. [paper]
  • Chooi-Ling Goh, Taro Watanabe, Hirofumi Yamamoto, and Eiichiro Sumita. 2010. Constraining a generative word alignment model with discriminative output. In IEICE Transactions on Information and Systems, pp. 1976-1983. July 2010. [link]

2009

  • Katsuhiko Hayashi, Taro Watanabe, Hajime Tsukada, and Hideki Isozaki. 2009. Structural Support Vector Machines for Log-linear approach in Statistical Machine Translation. In Proceedings of IWSLT 2009, Tokyo, Japan, pp. 144-151, Dec. 2009. [paper]
  • Taro Watanabe, Hajime Tsukada, and Hideki Isozaki. 2009. A succinct n-gram language model. In ACL-IJCNLP 2009. pp. 341-344. [paper]

2008

  • Taro Watanabe, Hajime Tsukada and Hideki Isozaki. 2008. NTT SMT System 2008 at NTCIR-7. In NTCIR-7. [paper]
  • Katsuhito Sudoh, Taro Watanabe, Jun Suzuki, Hajime Tsukada and Hideki Isozaki. 2008. NTT Statistical Machine Translation System for IWSLT 2008. In IWSLT 2008. [paper]

2007

  • Taro Watanabe, Jun Suzuki, Katsuhito Sudoh, Hajime Tsukada and Hideki Isozaki. 2007. Larger Feature Set Approach for Machine Translation in IWSLT 2007. In IWSLT 2007. [paper, slide]
  • Taro Watanabe, Jun Suzuki, Hajime Tsukada and Hideki Isozaki. 2007. Online Large-Margin Training for Statistical Machine Translation. In EMNLP-CoNLL 2007 pp. 764-773. [paper, slide]
  • Taro Watanabe, Kenji Imamura, Eiichiro Sumita and Hiroshi G. Okuno. 2007. Statistical machine translation using hierarchical phrase alignment. In Systems and Computers in Japan. Vol. 38, Issue 6. pp. 70-79. June. [link]

2006

  • Taro Watanabe, Jun Suzuki, Hajime Tsukada and Hideki Isozaki. 2006. NTT Statistical Machine Translation for IWSLT 2006. In Proceedings of IWSLT 2006 pp. 95-102. [paper]
  • Taro Watanabe, Hajime Tsukada and Hideki Isozaki. 2006. Left-to-Right Target Generation for Hierarchical Phrase-based Translation. In Proceedings of COLING-ACL 2006 pp.777-784. [paper]
  • Taro Watanabe, Hajime Tsukada and Hideki Isozaki. 2006. NTT System Description for the WMT2006 Shared Task. In Proceedings of NAACL 2006 Workshop on Statistical Machine Translation pp.122-125. [paper, slide]

2005

  • Hajime Tsukada, Taro Watanabe, Jun Suzuki, Hideto Kazawa, Hideki Isozaki. 2005. The NTT Statistical Machine Translation System for IWSLT2005. In Proceedings of IWSLT 2005. [paper]
  • Young-Sook Hwang, Taro Watanabe and Yutaka Sasaki. 2005. Empirical Study of Utilizing Morph-Syntactic Information in SMT. In Proc. of IJCNLP-05 pp.474-485. [paper]

2004

  • Eiichiro Sumita, Yasuhiro Akiba, Takao Doi, Andrew Finch, Kenji Imamura, Hideo Okuma, Michael Paul, Mitsuo Shimohata and Taro Watanabe. 2004. EBMT, SMT, Hybrid and More: ATR Spoken Language Translation System. In Proceedings of IWSLT 2004 pp.13-20. [paper]
  • Andrew Finch, Taro Watanabe, Yasuhiro Akiba, Eiichiro Sumita. 2004. Paraphrasing as Machine Translation. In Journal of Natural Language Processing Vol.11, No.5, pp.87-111. [paper]
  • Ruiqiang Zhang, Gen-ichiro Kikui, Hirofumi Yamamoto, Frank K. Soong, Taro Watanabe, Eiichiro Sumita and Wai Kit Lo. 2004. Improved spoken language translation using n-best speech recognition hypotheses. In INTERSPEECH 2004.
  • Ruiqiang Zhang, Genichiro Kikui, Hirofumi Yamamoto, Frank Soong, Taro Watanabe and Wai Kit Lo. 2004. A Unified Approach in Speech-to-Speech Translation: Integrating Features of Speech recognition and Machine Translation. In COLING 2004 pp.1168-1174. [paper]
  • Richard Zens, Hermann Ney, Taro Watanabe, Eiichiro Sumita. 2004. Reordering Constraints for Phrase-Based Statistical Machine Translation. In COLING 2004 pp.205-211. [paper]
  • Kenji IMAMURA, Hideo OKUMA, Taro WATANABE, Eiichiro SUMITA. 2004. Example-based Machine Translation Based on Syntactic Transfer with Statistical Models. In COLING 2004, Vol.I, pp.99-105. [paper]
  • Taro Watanabe. 2004. Example-based Statistical Machine Translation. Ph.D. thesis, Kyoto University.
  • Taro WATANABE, Kenji IMAMURA, Eiichiro SUMITA, Hiroshi G. OKUNO. 2004. Statistical Machine Translation Using Hierarchical Phrase Alignment. In THE IEICE TRANSACTION ON INFORMATION AND SYSTEMS, PT.2 (JAPANESE EDITION), Vol.J87-D-II, No.4, pp.978-986.

2003

  • Taro Watanabe, Eiichiro Sumita and Hiroshi G. Okuno. 2003. Decoding Algorithms for Statistical Machine Translation Considering Generation Directions. In Information Processing Society of Japan pp. 3202-3210 (Vol. 44, No. 12). [paper]
  • Taro Watanabe and Eiichiro Sumita. 2003. Example-based Decoding for Statistical Machine Translation. In Machine Translation Summit IX. pp. 410-417 New Orleans, Louisiana. [paper, slide]
  • Taro Watanabe and Eiichiro Sumita. 2003. Statistical Machine Translation by Example-based Decoder. In Forum on Information Technology (FIT2003). Japan
  • Taro Watanabe, Eiichiro Sumita and Hiroshi G. Okuno. 2003. Chunk-based Statistical Translation. In 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003). pp. 303-310 Sapporo, Japan. [paper, slide]
  • Eiichiro SUMITA, Yasuhiro AKIBA, Takao DOI, Andrew FINCH, Kenji IMAMURA, Michael PAUL, Mitsuo SHIMOHATA and Taro WATANABE. 2003. A Corpus-Centered Approach to Spoken Language Translation. EACL-2003 pp.171-174. [paper]

2002

  • Taro Watanabe and Eiichiro Sumita. 2002. Statistical Machine Translation Decoder Based on Phrase. In 7th International Conference on Spoken Language Processing (ICSLP 2002) pp. 1889-1892 Denver, Colorado, USA, September
  • Taro Watanabe and Eiichiro Sumita. 2002. Bidirectional Decoding for Statistical Machine Translation. In 19th International Conference on Computational Linguistics (COLING 2002) pp. 1079-1085 Taipei, Taiwan, August. [paper]
  • Yasuhiro Akiba, Taro Watanabe and Eiichiro Sumita. 2002. Using Language and Translation Models to Select the Best among Outputs from Multiple MT Systems. In 19th International Conference on Computational Linguistics (COLING 2002) Taipei, Taiwan, August. [paper]
  • Hideharu Nakajima, Hirofumi Yamamoto and Taro Watanabe. 2002. Language Model Adaptation with Additional Text Generated by Machine Translation. In 19th International Conference on Computational Linguistics (COLING 2002) Taipei, Taiwan, August. [paper]
  • Andrew FINCH, Taro WATANABE and Eiichiro SUMITA. 2002. Paraphrasing by Statistical Machine Translation. In Forum on Information Technology (FIT 2002) (Vol.2) pp.187-188. Japan
  • Taro Watanabe, Mitsuo Shimohata and Eiichiro Sumita. 2002. Statistical Machine Translation on Paraphrased Corpora. In Third International Conference on Language Resources and Evaluation (LREC 2002), pp. 1954-1957 Las Palmas, Canary Islands, Spain, May. [paper]
  • Hideharu Nakajima, Hirofumi Yamamoto, Taro Watanabe. 2002. Language Model Adaptation with Additional Texts Generated by Machine Translation. In the 8th Annual Meeting of NLP pp.283-286 Japan
  • Taro Watanabe, Kenji Imamura and Eiichiro Sumita. 2002. Statistical Machine Translation Based on Hierarchical Phrase Alignment. In 9th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI 2002), pp. 188-198 Keihanna, Japan, March. [paper, slide]

2000

  • Lori Levin, Boris Bartlog, Ariadna Font Llitjos, Donna Gates, Alon Lavie, Dorcas Wallace, Taro Watanabe and Monika Woszczyna. 2000. Lessons Learned from a Task-based Evaluation of Speech-to-Speech Machine Translation. In LREC 2000. [paper]