Katsuhiko Yamamoto

Ph.D. in Engineering

Research Scientist at AI Lab, CyberAgent, Inc.

My CV is HERE.

  • [NEW] My affiliation has changed to AI Lab, CyberAgent, Inc.

  • GammachirPy: A Python package of the dynamic compressive gammachirp filterbank (Irino & Patterson, 2006) is available on my GitHub page.

E-mail

yamamoto_katsuhiko(at)cyberagent.co.jp

Work Experience

  • 2023-Present Research Scientist at AI Lab, CyberAgent, Inc., Japan

  • 2019-2022 Engineer at Toyota Motor Corporation, Japan

  • 2017-2019 Research Fellow for Young Scientists at Japan Society for the Promotion of Science (JSPS), Japan

Education

  • 2015-2019 Graduate School of Systems Engineering, Wakayama University, Japan

  • 2013-2015 Graduate School of Information Science, Japan Advanced Institute of Science and Technology, Japan

  • 2011-2013 Advanced course, Kobe City College of Technology, Japan

  • 2006-2011 Associate course, Kobe City College of Technology, Japan

Research Topics

  • Speech Intelligibility Prediction Based on Auditory Computational Models (Ph.D. Research Theme)

  • Perceptual characteristics of bone-conducted sound (Master's Research Theme)

  • Semi-scrambling method for speech signals based on phonemic restoration (Master's Research Sub-theme)

  • Measuring the intelligibility of bone-conducted ultrasonic hearing aid in noisy environments (Undergraduate Research Theme)

Published Papers

  • K. Yamamoto, T. Irino, S. Araki, K. Kinoshita, and T. Nakatani, "GEDI: Gammachirp envelope distortion index for predicting intelligibility of enhanced speech," Speech Communication, Vol. 123, pp. 43-58, 2020. [Paper] [Software]

  • K. Yamamoto, T. Irino, S. Araki, K. Kinoshita, and T. Nakatani, “Speech intelligibility prediction using a multi-resolution gammachirp envelope distortion index with common parameters for different noise conditions,” Acoustical Letter, Acoustical Society of Japan, Vol. 41, Issue 1, pp. 396-399. [Paper]

  • K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita, and T. Nakatani, “Speech intelligibility prediction with the dynamic compressive gammachirp filterbank and modulation power spectrum,” Acoustical Science and Technology, Vol. 40, No. 2, pp. 84-92, March 2019. [Paper]

  • K. Yamamoto, Z. Zhu, M. Unoki and N. Aoki, "Study on Semi-Scramble Method for Speech Signals Based on Phonemic Restoration," Journal of Signal Processing, Research Institute of Signal Processing Japan, Vol.18, No.4 Special Issue on Papers Awarded the Student Paper Award at NCSP'14, pp. 205-208, July 2014. [Paper]

  • Z. Zhu, K. Yamamoto, M. Unoki and N. Aoki, "Study on Scramble Method for Speech Signal by Using Random-Bit Shift of Quantization," Journal of Signal Processing, Research Institute of Signal Processing Japan, Vol.18, No.6 Special Issue on Nonlinear Circuits, Communications and Signal Processing, pp. 303-307, November 2014. [Paper]

Conference Activities & Talks

(International)

  • K. Arai, S. Araki, A. Ogawa, K. Kinoshita, T. Nakatani, K. Yamamoto, and T. Irino (2019) "Predicting Speech Intelligibility of Enhanced Speech Using Phone Accuracy of DNN-based ASR Systems," in Proc. Interspeech 2019, pp. 4275-4279. [The overall acceptance rate was about 50%] [Paper]

  • Yamamoto, K., Irino, T., Araki, S., Kinoshita, K., Nakatani, T. (2018) "Speech intelligibility prediction using a multi-resolution gammachirp envelope distortion index with common parameters for different noise conditions." in Proc. Seminar on brain, hearing and speech sciences for universal speech communication/Universal Symposium on Universal Acoustical Communication 2018 (UAC2018), October 2018.

  • Yamamoto, K., Irino, T., Ohashi, N., Araki, S., Kinoshita, K., Nakatani, T. (2018) "Multi-resolution Gammachirp Envelope Distortion Index for Intelligibility Prediction of Noisy Speech." in Proc. Interspeech 2018, pp. 1863-1867. [The overall acceptance rate was about 50%] [Paper] [Software]

  • Yamamoto, K., Irino, T., Matsui, T., Araki, S., Kinoshita, K., Nakatani, T. (2017) "Predicting Speech Intelligibility Using a Gammachirp Envelope Distortion Index Based on the Signal-to-Distortion Ratio." in Proc. Interspeech 2017, pp. 2949-2953, DOI: 10.21437/Interspeech.2017-170 [The overall acceptance rate was about 50%] [Paper] [Software]

  • K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita and T. Nakatani (2016), “Analysis of acoustic features for speech intelligibility prediction models,” in Proceedings of 5th Joint Meeting of the ASA/ASJ, Journal of the Acoustical Society of America, Vol. 140, No. 4, Pt. 2, p. 3114, October 2016.

  • Yamamoto, K., Irino, T., Matsui, T., Araki, S., Kinoshita, K., Nakatani, T. (2016) "Speech Intelligibility Prediction Based on the Envelope Power Spectrum Model with the Dynamic Compressive Gammachirp Auditory Filterbank." in Proc. Interspeech 2016, pp. 2885-2889. [The overall acceptance rate was about 50%] [Paper]

  • K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita and T. Nakatani (2015), “Study on predicting speech intelligibility of enhanced speech sounds using the dynamic compressive gammachirp auditory filterbank and modulation filterbank,” presented at Taiwan/Japan Joint Auditory Research Meeting, National Tsing Hua University, Taiwan, 23–24 October 2015.

  • Yamamoto, K., Zhu, Z., Unoki, M. and Aoki, N. (2014), "Study on Semi-Scramble Method for Speech Signals Based on Phonemic Restoration," 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing, 1PM2-2-1, pp. 201-204, Honolulu, Hawaii, USA, 1-3, March 2014, [Student Paper Award]

  • Zhu, Z., Yamamoto, K., Unoki, M. and Aoki, N. (2014), "Study on scramble method for speech signal by using random-bit shift of quantization," 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing, 1PM1-2-2, pp. 109-102, Honolulu, Hawaii, USA, 1-3, March 2014.

(Domestic, Japan)

  • S. Yoshida, K. Yamamoto, T. Matsui, R. Nisimura and T. Irino (2016), “Speech features for hearing impaired listeners: Analysis of envelope modulations of speech sound by using a hearing impaired listener simulator,” in Proceedings of the 19th Young Researchers Meeting, Kansai Branch of the Acoustical Society of Japan, #44, p. 17, 18 Dec. 2016 (in Japanese).

  • K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita and T. Nakatani (2016), “Investigation of the dcGC-sEPSM for predicting speech intelligibility: characteristics of the reference noise and the effect on the predicting accuracy,” in Proceedings of the 2016 Autumn Meeting, Acoustical Society of Japan, 2-P-44, pp. 663-666, 14–15 Sep. 2016 (in Japanese).

  • K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita and T. Nakatani (2016), “Predicting speech intelligibility for enhanced speech sound: comparison with the result of the listening experiment,” in Proceedings of the 2016 Spring Meeting, Acoustical Society of Japan, 2-P-23, pp. 823-826, 9–11 Mar. 2016 (in Japanese).

  • K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita and T. Nakatani (2016), “An improvement of the predicting method for speech intelligibility using the dynamic compressive gammachirp filterbank,” in Proceedings of the Auditory Research Meeting, Vol.46, No.1, H-2016-9, pp. 35–40, 20–21 Feb. 2016 (in Japanese).

  • K. Yamamoto, T. Irino, T. Matsui, S. Araki, K. Kinoshita and T. Nakatani (2015), “Intelligibilities of enhanced speeches: Can computers predict human’s hearing?” in Proceedings of the 18th Young Researchers Meeting, Kansai Branch of the Acoustical Society of Japan, #42, p. 16, 13 Dec. 2015 (in Japanese), [First Prize of Best Presentation Award].

  • K. Yamamoto, T. Irino, S. Araki, K. Kinoshita and T. Nakatani (2015), “Predicting speech intelligibility using the dynamic compressive gammachirp auditory filterbank for enhanced speech sounds,” in Proceedings of the 2015 Autumn Meeting, Acoustical Society of Japan, 2-P-36, pp. 473-474, 16–18 Sep. 2015 (in Japanese), [Student Presentation Award].

  • Z. Zhu, K. Yamamoto, M. Unoki, N. Aoki (2014), “Study on scramble method for speech signal by using random bit shift of quantization,” in IEICE technical report., Enriched multimedia (EMM), IEICE, Vol. 113, No. 480, pp. 57–62, 7-8, March 2014 (in Japanese).

  • K. Yamamoto, Z. Zhu, M. Unoki, N. Aoki (2013), “Study on Semi-Scramble Method for Speech Signals Based on Phonemic Restoration,” in IEICE technical report., Enriched multimedia (EMM), IEICE, Vol. 113, No. 290, pp. 59–64, 13–14, November 2013 (in Japanese).

  • K. Yamamoto, Z. Zhu, M. Unoki, N. Aoki (2013), “Investigation of Semi-Scramble Method for Speech Signals Based on Phonemic Restoration,” in Proceedings of the JHES 2013, Hokuriku Branch of the IEEJ, G-18, pp. 21-22, September 2013 (in Japanese), [Student Presentation Award].

  • Z. Zhu, K. Yamamoto, M. Unoki, N. Aoki (2013), “Investigation of scramble method for speech signal by using random bit shift of quantization,” in Proceedings of the JHES 2013, Hokuriku Branch of the IEEJ, G-17, 21-22, September 2013 (in Japanese).

  • K. Yamamoto and Y. Nagatani (2012), “Robustness of the bone-conducted ultrasound hearing aid in noisy environments,” in Proceedings of the 15th Young Researchers Meeting, Kansai Branch of the Acoustical Society of Japan, p.7, December 2012 (in Japanese), [Second Prize of Best Presentation Award and Special Award].

  • K. Yamamoto and Y. Nagatani (2012), “Improvement of Intelligibility of Bone-Conducted Ultrasonic Hearing Aid with Microphone Array System,” in Proceedings of the Auditory Research Meeting, Acoustical Society of Japan, Vol.4, pp.537-542, September 2012 (in Japanese).

Software

Awards