Publication list (journal papers / letters)
Journal papers
S. Luan, Y. Wakabayashi, T. Toda. Unequally spaced sound field interpolation for rotation-robust beamforming. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 32, pp. 3185-3199, June 2024.
L.P. Violeta, D. Ma, W.-C. Huang, T. Toda. Pretraining and adaptation techniques for electrolaryngeal speech recognition. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 32, pp. 2777-2789, May 2024.
M. Eshghi, T. Toda. An investigation of fundamental frequency pattern prediction for Japanese electrolaryngeal speech enhancement based on frame-wise phoneme representations. IEEE Access, Vol. 12, pp. 50137-50153, Apr. 2024.
R. Wang, L. Li, T. Toda. Dual-channel target speaker extraction based on conditional variational autoencoder and directional information. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 32, pp. 1968-1979, Mar. 2024.
H. Yamashita, T. Okamoto, R. Takashima, Y. Ohtani, T. Takiguchi, T. Toda, H. Kawai. Fast neural speech waveform generative models with fully-connected layer-based upsampling. IEEE Access, Vol. 12, pp. 31409-31421, Feb. 2024.
C. Xie, T. Toda. Noisy-to-noisy voice conversion under variations of noisy condition. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 31, pp. 3871-3882, Oct. 2023.
R. Yoneyama, Y.-C. Wu, T. Toda. High-fidelity and pitch-controllable neural vocoder based on unified source-filter networks. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 31, pp. 3717-3729, Oct. 2023.
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, H. Kawai. Harmonic-Net: fundamental frequency and speech rate controllable fast neural vocoder. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 31, pp. 1902-1915, May 2023.
W.-C. Huang, S.-W. Yang, T. Hayashi, T. Toda. A comparative study of self-supervised speech representation based voice conversion. IEEE Journal of Selected Topics in Signal Processing, Vol. 16, No. 6, pp. 1308-1318, Oct. 2022.
Y. Yasuda, T. Toda. Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language. IEEE Journal of Selected Topics in Signal Processing, Vol. 16, No. 6, pp. 1319-1328, Oct. 2022.
Y.-C. Wu, P.L. Tobing, K. Yasuhara, N. Matsunaga, Y. Ohtani, T. Toda. A cyclical approach to synthetic and natural speech mismatch refinement of neural post-filter for low-cost text-to-speech system. APSIPA Transactions on Signal and Information Processing, Vol. 11, No. 1, e30, pp. 1-32, Sep. 2022.
T. Okamoto, K. Matsubara, T. Toda, Y. Shiga, H. Kawai. Neural speech-rate conversion with multispeaker WaveNet vocoder. Speech Communication, Vol. 138, pp. 1-12, Mar. 2022.
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, Y. Shiga, H. Kawai. Full-band LPCNet: a real-time neural vocoder for 48 kHz audio with a CPU. IEEE Access, Vol. 9, pp. 94923-94933, July 2021.
A. Ando, T. Mori, S. Kobashikawa, T. Toda. Speech emotion recognition based on listener-dependent emotion perception models. APSIPA Transactions on Signal and Information Processing, Vol. 10, e6, pp. 1-11, Apr. 2021.
Y.-C. Wu, T. Hayashi, P.L. Tobing, K. Kobayashi, T. Toda. Quasi-periodic WaveNet: an autoregressive raw waveform generative model with pitch-dependent dilated convolution neural network. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 29, pp. 1134-1148, Mar. 2021.
Y.-C. Wu, T. Hayashi, T. Okamoto, H. Kawai, T. Toda. Quasi-periodic parallel WaveGAN: a non-autoregressive raw waveform generative model with pitch-dependent dilated convolution neural network. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 29, pp. 792-806, Feb. 2021.
W.-C. Huang, T. Hayashi, Y.-C. Wu, H. Kameoka, T. Toda. Pretraining techniques for sequence-to-sequence voice conversion. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 29, pp. 745-755, Feb. 2021.【IEEE Signal Processing Society Japan Student Best Paper Award (Recipient: Wen-Chin Huang)】
H. Kameoka, W.-C. Huang, K. Tanaka, T. Kaneko, N. Hojo, T. Toda. Many-to-many voice transformer network. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 29, pp. 656-670, Jan. 2021.
P.L. Tobing, Y.-C. Wu, T. Hayashi, K. Kobayashi, T. Toda. An evaluation of voice conversion with neural network spectral mapping models and WaveNet vocoder. APSIPA Transactions on Signal and Information Processing, Vol. 9, e26, pp. 1-14, Nov. 2020.
X. Wang, J. Yamagishi, M. Todisco, H. Delgado, A. Nautsch, N. Evans, M. Sahidullah, V. Vestman, T. Kinnunen, K.A. Lee, L. Juvela, P. Alku, Y.-H. Peng, H.-T. Hwang, Y. Tsao, H.-M. Wang, S. Le Maguer, M. Becker, F. Henderson, R. Clark, Y. Zhang, Q. Wang, Y. Jia, K. Onuma, K. Mushika, T. Kaneda, Y. Jiang, L.-J. Liu, Y.-C. Wu, W.-C. Huang, T. Toda, K. Tanaka, H. Kameoka, I. Steiner, D. Matrouf, J.-F. Bonastre, A. Govender, S. Ronanki, J.-X. Zhang, Z.-H. Ling. ASVspoof 2019: a large-scale public database of synthetic, converted and replayed speech. Computer Speech and Language, Vol. 64, Article 101114, 25 pages, Nov. 2020.
Y.-C. Wu, P.L. Tobing, T. Hayashi, K. Kobayashi, T. Toda. Non-parallel voice conversion system with WaveNet vocoder and collapsed speech suppression. IEEE Access, Vol. 8, No. 1, pp. 62094-62106, Apr. 2020.
大平 茂輝, 清谷 峻也, 伊藤 瑠哉, 岡本 康佑, 谷川 右京, 出口 大輔, 戸田 智基. LMS経由で手書きレポートを返却するWebサービス「かみレポ」の開発・評価. 情報処理学会論文誌:教育とコンピュータ, Vol. 6, No. 1, pp. 52-68, Feb. 2020.
A. Ando, R. Masumura, H. Kamiyama, S. Kobashikawa, Y. Aono, T. Toda. Customer satisfaction estimation in contact center calls based on a hierarchical multi-task model. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 28, No. 1, pp. 715-728, Jan. 2020.
P.L. Tobing, Y.-C. Wu, T. Hayashi, K. Kobayashi, T. Toda. Voice conversion with CycleRNN-based spectral mapping and finely tuned WaveNet vocoder. IEEE Access, Vol. 7, No. 1, pp. 171114-171125, Dec. 2019.
S. Seki, H. Kameoka, L. Li, T. Toda, K. Takeda. Underdetermined source separation based on generalized multichannel variational autoencoder. IEEE Access, Vol. 7, No. 1, pp. 168104-168115, Nov. 2019.
A. Tamamori, T. Hayashi, T. Toda, K. Takeda. Daily activity recognition based on recurrent neural network using multi-modal signals. APSIPA Transactions on Signal and Information Processing, Vol. 7, e21, pp. 1-11, Dec. 2018.
T. Kano, S. Takamichi, S. Sakti, G. Neubig, T. Toda, S. Nakamura. An end-to-end model for cross-lingual transformation of paralinguistic information. Machine Translation, Vol. 32, No. 4, pp. 353-368, Dec. 2018.
S. Seki, T. Toda, K. Takeda. Stereophonic music separation based on non-negative tensor factorization with cepstral distance regularization. IEICE Transactions on Fundamentals, Vol. E101-A, No. 7, pp. 1057-1064, July 2018.
K. Kobayashi, T. Toda, S. Nakamura. Intra-gender statistical singing voice conversion with direct waveform modification using log-spectral differential. Speech Communication, Vol. 99, pp. 211-220, May 2018.
T. Hayashi, M. Nishida, N. Kitaoka, T. Toda, K. Takeda. Daily activity recognition with large-scaled real-life recording datasets based on deep neural network using multi-modal signals. IEICE Transactions on Fundamentals, Vol. E101-A, No. 1, pp. 199-210, Jan. 2018.
P.L. Tobing, K. Kobayashi, T. Toda. Articulatory controllable speech modification based on statistical inversion and production mappings. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 12, pp. 2337-2350, Dec. 2017.
T. Hayashi, S. Watanabe, T. Toda, T. Hori, J. Le Roux, K. Takeda. Duration-controlled LSTM for polyphonic sound event detection. IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 25, No. 11, pp. 2059-2070, Nov. 2017.【IEEE Signal Processing Society Japan Young Author Best Paper Award (Recipient: Tomoki Hayashi)】
K. Tanaka, T. Toda, S. Nakamura. A vibration control method of an electrolarynx based on statistical F0 pattern prediction. IEICE Transactions on Information and Systems, Vol. E100-D, No. 9, pp. 2165-2173, Sep. 2017.
Q. Truong Do, T. Toda, G. Neubig, S. Sakti, S. Nakamura. Preserving word-level emphasis in speech-to-speech translation. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 25, No. 3, pp. 544-556, Mar. 2017.【IEEE Signal Processing Society Japan Student Best Paper Award (Recipient: Quoc Truong Do)】
三浦 明波, Graham Neubig, Sakriani Sakti, 戸田 智基, 中村 哲. 中間言語情報を記憶するピボット翻訳手法. 自然言語処理, Vol. 23, No. 5, pp. 499-528, Dec. 2016.
Y. Oshima, S. Takamichi, T. Toda, G. Neubig, S. Sakti, S. Nakamura. Non-native text-to-speech preserving speaker individuality based on partial correction of prosodic and phonetic characteristics. IEICE Transactions on Information and Systems, Vol. E99-D, No. 12, pp. 3132-3139, Dec. 2016.
K. Kobayashi, T. Toda, T. Nakano, M. Goto, S. Nakamura. Improvements of voice timbre control based on perceived age in singing voice conversion. IEICE Transactions on Information and Systems, Vol. E99-D, No. 11, pp. 2767-2777, Nov. 2016.
T. Hiraoka, G. Neubig, S. Sakti, T. Toda, S. Nakamura. Learning cooperative persuasive dialogue policies using framing. Speech Communication, Vol. 84, pp. 83-96, Nov. 2016.
S. Takamichi, T. Toda, G. Neubig, S. Sakti, S. Nakamura. A statistical sample-based approach to GMM-based voice conversion using tied-covariance acoustic models. IEICE Transactions on Information and Systems, Vol. E99-D, No. 10, pp. 2490-2498, Oct. 2016.
H. Tanaka, S. Sakti, G. Neubig, T. Toda, H. Negoro, H. Iwasaka, S. Nakamura. Teaching social communication skills through human-agent interaction. ACM Transactions on Interactive Intelligent Systems, Vol. 6, No. 2, 23 pages, Aug. 2016.
H. Maki, T. Toda, S. Sakti, G. Neubig, S. Nakamura. Enhancing event-related potentials based on maximum a posteriori estimation with a spatial correlation prior. IEICE Transactions on Information and Systems, Vol. E99-D, No. 6, pp. 1410-1419, June 2016.
S. Takamichi, T. Toda, A.W. Black, G. Neubig, S. Sakti, S. Nakamura. Post-filters to modify the modulation spectrum for statistical parametric speech synthesis. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 24, No. 4, pp. 755-767, Apr. 2016.【IEEE Signal Processing Society Japan Young Author Best Paper Award (Recipient: Shinnosuke Takamichi)】
Z. Wu, P. De Leon, C. Demiroglu, A. Khodabakhsh, S. King, Z.-H. Ling, D. Saito, B. Stewart, T. Toda, M. Wester, J. Yamagishi. Anti-spoofing for text-independent speaker verification: an initial database, comparison of countermeasures, and human performance. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 24, No. 4, pp. 768-783, Apr. 2016.
赤部 晃一, Graham Neubig, Sakriani Sakti, 戸田 智基, 中村 哲. 機械翻訳システムの誤り分析のための誤り箇所選択手法. 自然言語処理, Vol. 23, No. 1, pp. 88-117, Jan. 2016.
水上 雅博, Lasguido Nio, 木付 英士, 野村 敏男, Graham Neubig, 吉野 幸一郎, Sakriani Sakti, 戸田 智基, 中村 哲. 快適度推定に基づく用例ベース対話システム. 人工知能学会論文誌, Vol. 31, No. 1, 12 pages, Jan. 2016.
P. Arthur, G. Neubig, S. Sakti, T. Toda, S. Nakamura. Semantic parsing of ambiguous input through paraphrasing and verification. Transactions of the Association for Computational Linguistics, Vol. 3, pp. 571-584, Dec. 2015.
H. Tanaka, S. Sakti, G. Neubig, T. Toda, S. Nakamura. NOCOA+: multimodal computer-based training for social and communication skills. IEICE Transactions on Information and Systems, Vol. E98-D, No. 8, pp. 1536-1544, Aug. 2015.
K. Kobayashi, T. Toda, H. Doi, T. Nakano, M. Goto, G. Neubig, S. Sakti, S. Nakamura. Voice timbre control based on perceived age in singing voice conversion. IEICE Transactions on Information and Systems, Vol. E97-D, No. 6, pp. 1419-1428, June 2014.
K. Tanaka, T. Toda, G. Neubig, S. Sakti, S. Nakamura. A hybrid approach to electrolaryngeal speech enhancement based on noise reduction and statistical excitation generation. IEICE Transactions on Information and Systems, Vol. E97-D, No. 6, pp. 1429-1437, June 2014.
K. Kubo, S. Sakti, G. Neubig, T. Toda, S. Nakamura. Structured adaptive regularization of weight vectors for a robust grapheme-to-phoneme conversion model. IEICE Transactions on Information and Systems, Vol. E97-D, No. 6, pp. 1468-1476, June 2014.
L. Nio, S. Sakti, G. Neubig, T. Toda, S. Nakamura. Utilizing human-to-human conversation examples for a multi domain chat-oriented dialog system. IEICE Transactions on Information and Systems, Vol. E97-D, No. 6, pp. 1497-1505, June 2014.
S. Takamichi, T. Toda, Y. Shiga, S. Sakti, G. Neubig, S. Nakamura. Parameter generation methods with rich context models for high-quality and flexible text-to-speech synthesis. IEEE Journal of Selected Topics in Signal Processing, Vol. 8, No. 2, pp. 239-250, Apr. 2014.【Telecommunications Advancement Foundation Award, 30th Telecom System Technology Student Award (Recipient: Shinnosuke Takamichi)】【IEEE Kansai Section Student Paper Award (Recipient: Shinnosuke Takamichi)】
H. Doi, T. Toda, K. Nakamura, H. Saruwatari, K. Shikano. Alaryngeal speech enhancement based on one-to-many eigenvoice conversion. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 22, No. 1, pp. 172-183, Jan. 2014.
山内 祐輝, Graham Neubig, Sakriani Sakti, 戸田 智基, 中村 哲. 対話システムにおける用語間の関係性を用いた話題誘導応答文生成. 人工知能学会論文誌, Vol. 29, No. 1, pp. 80-89, Jan. 2014.
T. Toda, M. Nakagiri, K. Shikano. Statistical voice conversion techniques for body-conducted unvoiced speech enhancement. IEEE Transactions on Audio, Speech and Language Processing, Vol. 20, No. 9, pp. 2505-2517, Sep. 2012.
T. Nakamura, K. Sugiura, T. Nagai, N. Iwahashi, T. Toda, H. Okada, T. Omori. Learning novel objects for extended mobile manipulation. Journal of Intelligent and Robotic Systems, Vol. 66, No. 1-2, pp. 187-204, Apr. 2012.
中村 友昭, アッタミミ ムハンマド, 杉浦 孔明, 長井 隆行, 岩橋 直人, 戸田 智基, 岡田 浩之, 大森 隆司. 拡張モバイルマニピュレーションのための新規物体の学習. 日本ロボット学会誌, Vol. 30, No. 2, pp. 213-224, Mar. 2012.
T. Kubo, T. Toda, M. Yoshida, T. Hattori, K. Ikeda. Vowel recognition based on surface electromyography with electrode grid on submental region. Transactions of Japanese Society for Medical and Biological Engineering, Vol. 50, No. 1, pp. 38-46, Feb. 2012.
K. Nakamura, T. Toda, H. Saruwatari, K. Shikano. Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech. Speech Communication, Vol. 54, No. 1, pp. 134-146, Jan. 2012.
H. Doi, K. Nakamura, T. Toda, H. Saruwatari, K. Shikano. Esophageal speech enhancement based on statistical voice conversion with Gaussian mixture models. IEICE Transactions on Information and Systems, Vol. E93-D, No. 9, pp. 2472-2482, Sep. 2010.
Y. Ohtani, T. Toda, H. Saruwatari, K. Shikano. Improvements of the one-to-many eigenvoice conversion system. IEICE Transactions on Information and Systems, Vol. E93-D, No. 9, pp. 2491-2499, Sep. 2010.
K. Nakamura, T. Toda, H. Saruwatari, K. Shikano. Evaluation of extremely small sound source signals used in speaking-aid system with statistical voice conversion. IEICE Transactions on Information and Systems, Vol. E93-D, No. 7, pp. 1909-1917, July 2010.
Y. Ohtani, T. Toda, H. Saruwatari, K. Shikano. Adaptive training for voice conversion based on eigenvoices. IEICE Transactions on Information and Systems, Vol. E93-D, No. 6, pp. 1589-1598, June 2010.
T. Hirahara, M. Otani, S. Shimizu, T. Toda, K. Nakamura, Y. Nakajima, K. Shikano. Silent-speech enhancement using body-conducted vocal-tract resonance signals. Speech Communication, Vol. 52, No. 4, pp. 301-313, Apr. 2010.
V.-A. Tran, G. Bailly, H. Loevenbruck, T. Toda. Improvement to a NAM-captured whisper-to-speech system. Speech Communication, Vol. 52, No. 4, pp. 314-326, Apr. 2010.
J. Yamagishi, T. Nose, H. Zen, Z.-H. Ling, T. Toda, K. Tokuda, S. King, S. Renals. Robust speaker-adaptive HMM-based text-to-speech synthesis. IEEE Transactions on Audio, Speech and Language Processing, Vol. 17, No. 6, pp. 1208-1230, Aug. 2009.
R. Gomez, T. Toda, H. Saruwatari, K. Shikano. Techniques in rapid unsupervised speaker adaptation based on HMM-sufficient statistics. Speech Communication, Vol. 51, No. 1, pp. 42-57, Jan. 2009.
H. Zen, T. Toda, K. Tokuda. The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006. IEICE Transactions on Information and Systems, Vol. E91-D, No. 6, pp. 1764-1773, June 2008.
大谷 大和, 戸田 智基, 猿渡 洋, 鹿野 清宏. STRAIGHT混合励振源を用いた混合正規分布モデルに基づく最尤声質変換法. 電子情報通信学会論文誌, Vol. J91-D, No. 4, pp. 1082-1091, Apr. 2008.
T. Cincarek, T. Toda, H. Saruwatari, K. Shikano. Cost reduction of acoustic modeling for real-environment applications using unsupervised and selective training. IEICE Transactions on Information and Systems, Vol. E91-D, No. 3, pp. 499-507, Mar. 2008.
G. Nagino, M. Shozakai, T. Toda, H. Saruwatari, K. Shikano. Building an effective speech corpus by utilizing statistical multidimensional scaling method. IEICE Transactions on Information and Systems, Vol. E91-D, No. 3, pp. 607-614, Mar. 2008.
T. Toda, A.W. Black, K. Tokuda. Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model. Speech Communication, Vol. 50, No. 3, pp. 215-227, Mar. 2008.【The 2013 EURASIP-ISCA Best Paper Award (Speech Communication Journal)】
T. Toda, A.W. Black, K. Tokuda. Voice conversion based on maximum likelihood estimation of spectral parameter trajectory. IEEE Transactions on Audio, Speech and Language Processing, Vol. 15, No. 8, pp. 2222-2235, Nov. 2007.【IEEE Signal Processing Society 2009 Young Author Best Paper Award】
T. Toda, K. Tokuda. A speech parameter generation algorithm considering global variance for HMM-based speech synthesis. IEICE Transactions on Information and Systems, Vol. E90-D, No. 5, pp. 816-824, May 2007.【Telecommunications Advancement Foundation Award, 23rd Telecom System Technology Award】【IEICE Information and Systems Society Paper Award, FY2007 (paper series)】
中村 圭吾, 戸田 智基, 猿渡 洋, 鹿野 清宏. 肉伝導人工音声の変換に基づく喉頭全摘出者のための音声コミュニケーション支援システム. 電子情報通信学会論文誌, Vol. J90-D, No. 3, pp. 780-787, Mar. 2007.
R. Gomez, T. Toda, H. Saruwatari, K. Shikano. Reducing computation time of the rapid unsupervised speaker adaptation based on HMM-sufficient statistics. IEICE Transactions on Information and Systems, Vol. E90-D, No. 2, pp. 554-561, Feb. 2007.
H. Zen, T. Toda, M. Nakamura, K. Tokuda. Details of the Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005. IEICE Transactions on Information and Systems, Vol. E90-D, No. 1, pp. 325-333, Jan. 2007.【Telecommunications Advancement Foundation Award, 23rd Telecom System Technology Award】【IEICE Information and Systems Society Paper Award, FY2007 (paper series)】
河井 恒, 戸田 智基, 山岸 順一, 平井 俊男, 倪 晋富, 西澤 信行, 津崎 実, 徳田 恵一. 大規模コーパスを用いた音声合成システムXIMERA. 電子情報通信学会論文誌, Vol. J89-D-II, No. 12, pp. 2688-2698, Dec. 2006.
平井 俊男, 河井 恒, 津崎 実, 戸田 智基. 音声合成システムXIMERAにおける日本語合成音の自然性劣化要因の分析. 日本音響学会誌, Vol. 62, No. 11, pp. 767-773, Nov. 2006.
T. Cincarek, T. Toda, H. Saruwatari, K. Shikano. Utterance-based selective training for the automatic creation of task-dependent acoustic models. IEICE Transactions on Information and Systems, Vol. E89-D, No. 3, pp. 962-969, Mar. 2006.
R. Gomez, A. Lee, T. Toda, H. Saruwatari, K. Shikano. Improving rapid unsupervised speaker adaptation based on HMM-sufficient statistics in noisy environments using multi-template models. IEICE Transactions on Information and Systems, Vol. E89-D, No. 3, pp. 998-1005, Mar. 2006.
T. Toda, H. Kawai, M. Tsuzaki, K. Shikano. An evaluation of cost functions sensitively capturing local degradation of naturalness for segment selection in concatenative speech synthesis. Speech Communication, Vol. 48, No. 1, pp. 45-56, Jan. 2006.
K. Adachi, T. Toda, H. Kawanami, H. Saruwatari, K. Shikano. Designing target cost function based on prosody of speech database. IEICE Transactions on Information and Systems, Vol. E88-D, No. 3, pp. 519-524, Mar. 2005.
舛田 剛志, 戸田 智基, 川波 弘道, 猿渡 洋, 鹿野 清宏. 韻律的に多重化した音声データベースの設計と発話速度におけるその評価. 電子情報通信学会論文誌, Vol. J87-D-II, No. 2, pp. 447-455, Feb. 2004.
戸田 智基, 河井 恒, 津崎 実, 鹿野 清宏. 素片接続型日本語テキスト音声合成における音素単位とダイフォン単位に基づく素片選択. 電子情報通信学会論文誌, Vol. J85-D-II, No. 12, pp. 1760-1770, Dec. 2002.
M. Mashimo, T. Toda, H. Kawanami, K. Shikano, N. Campbell. Cross-language voice conversion evaluation using bilingual databases. IPSJ Journal, Vol. 43, No. 7, pp. 2177-2185, July 2002.
戸田 智基, 陸 金林, 猿渡 洋, 鹿野 清宏. 周波数軸伸縮を用いた混合正規分布モデルに基づく声質変換法. 電子情報通信学会論文誌, Vol. J84-D-II, No. 10, pp. 2181-2189, Oct. 2001.【Telecommunications Advancement Foundation Award, 18th Telecom System Technology Student Award】
戸田 智基, 坂野 秀樹, 梶田 将司, 武田 一哉, 板倉 文忠, 鹿野 清宏. 側抑制性重み付けを用いた雑音環境下におけるSTRAIGHT分析合成系の品質改善. 電子情報通信学会論文誌, Vol. J83-D-II, No. 11, pp. 2180-2189, Nov. 2000.
Letters
W.-C. Huang, Y.-C. Wu, T. Toda. Multi-speaker text-to-speech training with speaker anonymized data. IEEE Signal Processing Letters, Vol. 31, pp. 2995-2999, Oct. 2024.
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, H. Kawai. Comparison of real-time multi-speaker neural vocoders on CPUs. Acoustical Science and Technology, Acoustical Letter, Vol. 43, No. 2, pp. 121-124, Mar. 2022.
K. Matsubara, T. Okamoto, R. Takashima, T. Takiguchi, T. Toda, Y. Shiga, H. Kawai. Investigation of training data size for real-time neural vocoders on CPUs. Acoustical Science and Technology, Acoustical Letter, Vol. 42, No. 1, pp. 65-68, Jan. 2021.
T. Okamoto, K. Tachibana, T. Toda, Y. Shiga, H. Kawai. Deep neural network-based power spectrum reconstruction to improve quality of vocoded speech with limited acoustic parameters. Acoustical Science and Technology, Acoustical Letter, Vol. 39, No. 2, pp. 163-166, Mar. 2018.
H. Tanaka, S. Sakti, G. Neubig, T. Toda, S. Nakamura. NOCOA: a computer-based training tool for social and communication skills that exploits non-verbal behaviors. The Journal of Information and Systems in Education (Short Note), Vol. 12, No. 1, pp. 19-26, Apr. 2014.