Nobuhiro Kaji (鍜治 伸裕)


Nobuhiro Kaji is a senior chief researcher at Yahoo! JAPAN Research. Formerly, he was a research associate, research assistant professor, and research associate professor at IIS, the University of Tokyo. He was also a senior researcher at NICT. He received his PhD in information science and technology from the University of Tokyo in 2005. His research interests are in natural language processing and related fields: paraphrase generation and recognition, analysis of sentiment and emotion in textual data, NLP for supporting linguistics, and Web text mining for social analysis. E-mail address is: nkaji(at)

Selected Publications

  • Shumpei Sano, Nobuhiro Kaji and Manabu Sassano. 2016. Prediction of Prospective User Engagement with Intelligent Assistants. In Proceedings of ACL, pages 1203-1212.
  • Tatsuya Iwanari, Naoki Yoshinaga, Nobuhiro Kaji, Toshiharu Nishina, Masashi Toyoda and Masaru Kitsuregawa. 2016. Ordering Concepts based on Common Attribute Intensity. In Proceedings of IJCAI, pages 3747-3753.
  • Nobuhiro Kaji and Masaru Kitsuregawa. 2014. Accurate Word Segmentation and POS Tagging for Japanese Microblogs: Corpus Annotation and Joint Modeling with Lexical Normalization. In Proceedings of EMNLP, pages 99-109.
  • Nobuhiro Kaji and Masaru Kitsuregawa. 2013. Efficient Word Lattice Generation for Joint Word Segmentation and POS Tagging in Japanese. In Proceedings of IJCNLP, pages 153-161.
  • Takayuki Hasegawa, Nobuhiro Kaji, Naoki Yoshinaga and Masashi Toyoda. 2013. Predicting and Eliciting Addressee's Emotion in Online Dialogue. In Proceedings of ACL, pages 964-972.
  • Yohei Takaku, Nobuhiro Kaji, Naoki Yoshinaga and Masashi Toyoda. 2012. Identifying Constant and Unique Relations by using Time-Series Texts. In Proceedings of EMNLP-CoNLL, pages 882-893.
  • Nobuhiro Kaji and Masaru Kitsuregawa. 2011. Splitting Noun Compounds via Monolingual and Bilingual Paraphrasing: A Study on Japanese Katakana Words. In Proceedings of EMNLP, pages 959-969.
  • Nobuhiro Kaji, Yasuhiro Fujiwara, Naoki Yoshinaga and Masaru Kitsuregawa. 2010. Efficient Staggered Decoding for Sequence Labeling. In Proceedings of ACL, pages 485-494.
  • Nobuhiro Kaji and Masaru Kitsuregawa. 2008. Using Hidden Markov Random Fields to Combine Distributional and Pattern-based Word Clustering. In Proceedings of COLING, pages 401-408.
  • Nobuhiro Kaji and Masaru Kitsuregawa. 2007. Building Lexicon for Sentiment Analysis from Massive Collection of HTML Documents. In Proceedings of EMNLP-CoNLL, pages 1075-1083.
  • Nobuhiro Kaji, Masashi Okamoto and Sadao Kurohashi. 2004. Paraphrasing Predicates from Written Language to Spoken Language Using the Web. In Proceedings of HLT-NAACL, pages 241-248.
  • Nobuhiro Kaji, Daisuke Kawahara, Sadao Kurohashi and Satoshi Sato. 2002. Verb Paraphrase based on Case Frame Alignment. In Proceedings of ACL, pages 215-222.
  • The complete list of publications will be available here.

Invited Talks

  • Splitting Katakana Noun Compounds by Paraphrasing and Back-transliteration, 19th Annual Meeting of the Association for Natural Language Processing, Nagoya, Japan, March, 2013.
  • Natural Language Processing for CGM Text: Perspectives and Challenges, ARG SIG-WI2, Kanagawa, Japan, December, 2012.
  • Processing of New Words and Informal Spellings, Challenges to Real World NLP, National Convention of IPSJ, Nagoya, Japan, March, 2012 (joint talk with Ryohei Sasano).
  • Opinion Mining from Text and Machine Learning, JSAI SIG-FPAI, Tokyo, Japan, March, 2009.
  • An Attempt on Sentiment Analysis, Rakuten lab. seminar, Tokyo, Japan, 2006.
  • Automatic Construction of Polarity-tagged Corpus from HTML Documents, Yahoo! Japan lab. seminar, Tokyo, Japan, 2006.

Book Chapters

  • Ryoko Uno, Nobuhiro Kaji, and Masaru Kitsuregawa. Two Boundaries of Onomatopoeia Emerging from the Breadth of a Web Corpus (in Japanese; original title: ウェブコーパスの広がりから現れるオノマトペの二つの境界). Kazuko Shinohara and Ryoko Uno (eds.), Sound Symbolism and Mimetics: Rethinking the relationship between sound and meaning in language, Hituzi Shobo.
  • Nobuhiro Kaji, Daisuke Kawahara, Sadao Kurohashi, and Satoshi Sato. 2013. Paraphrasing Predicates based on Case Frame Alignment. Timothy Baldwin, Francis Bond, Kentaro Inui, Shun Ishizaki, Hiroshi Nakagawa and Akira Shimazu (eds.), Readings in Japanese Natural Language Processing, pages 231-244, CSLI Publications.
  • Sadao Kurohashi, Daisuke Kawahara, Nobuhiro Kaji, and Tomohide Shibata. 2007. Automatic Text Presentation for the Conversational Knowledge Process. Toyoaki Nishida (ed.), Conversational Informatics: An Engineering Approach, pages 201-216, John Wiley & Sons, Ltd.

Software and Linguistic Resources

  • Software
    • HMP: Hidden Markov perceptron for large-scale sequence labeling [link]
  • Linguistic resources
    • Polar phrase dictionary [link]
    • Automatically constructed polarity-tagged corpus [link]

Professional Activities

  • PC member/reviewer (international conferences)
    • EACL (2017)
    • AAAI (2017)
    • The 2nd Workshop on Noisy User-generated Text (W-NUT2016)
    • International Conference on Emerging Databases (EDB2016)
    • SocialNLP (2016)
    • BigComp (2016)
    • Workshop on Data Discretization and Segmentation for Knowledge Discovery (DDS2013)
    • ACL (2016, 2017)
    • NAACL (2012, 2016)
    • EMNLP (2012, 2014, 2015, 2017)
    • COLING (2012, 2014, 2016)
    • WWW (2009-2013, 2015, 2017)
    • ICDM (2008-2013)
    • APWeb (2012-2013)
  • PC (co-)chair (domestic conferences)
    • YANS (2009-2011)
  • PC member (domestic conferences)
    • DBSJ Social Computing Symposium (2014)
    • WebDB Forum (2013, 2015)
    • JSAI SIG-FPAI (2013-2015)
    • ARG WI2 (2012-present)
    • NLP (2012, 2013)
    • IEICE Technical Committee on Thought and Language (2012-present)
    • YANS (2006, 2007)
  • Editorial committee member
    • Transactions of the Japanese Society for Artificial Intelligence (2015-present)
    • Journal of Natural Language Processing (2012-2013)
    • IPSJ Journal/Journal of Information Processing (2011-2015)