JSUT (Japanese speech corpus of Saruwatari-lab., University of Tokyo)

The JSUT Collection is a set of Japanese speech corpora that connect speech, song, and audio events. The JSUT corpus is one part of the JSUT Collection.

JSUT コレクションは,声・歌・音声模倣をつなげるための音声コーパスです.このJSUT コーパスは,JSUT コレクションの一部です.


You can download the corpus from here (ver. 1.1, 2.7 GB).

ここ (ver. 1.1, 2.7GB) からダウンロード可能です.

(old ver.: ver.1)


This corpus consists of Japanese text (transcriptions) and reading-style audio. The audio data is sampled at 48 kHz and was recorded in our anechoic room. We recorded the voice of a single native Japanese female speaker. The corpus contains 10 hours of speech, made up of the following data:

      • basic5000 ... covers all daily-use characters (jouyou kanji).
      • utparaphrase512 ... sentences in which a part is replaced with its paraphrase.
      • onomatopee300 ... Japanese onomatopoeia.
      • countersuffix26 ... Japanese counter suffixes
      • loanword128 ... Japanese loanwords (e.g., ググる ['google' as a verb])
      • voiceactress100 ... speech parallel to the Voice Actress Corpus (a free corpus of professional female speakers)
      • travel1000 ... travel-domain corpus
      • precedent130 ... sentences from legal precedents
      • repeat500 ... repeatedly spoken utterances (100 sentences × 5 times)
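As a rough sanity check on the list above, the sketch below adds up the subset sizes and estimates the mean utterance length. It assumes that the number at the end of each subset name is that subset's utterance count (this matches repeat500, which is 100 sentences × 5 repetitions) and it treats the 10-hour figure as approximate. The helper name `utterance_count` is made up for this illustration.

```python
import re

# Subset names as listed above; the trailing digits are assumed to be
# the number of utterances in each subset.
SUBSETS = [
    "basic5000", "utparaphrase512", "onomatopee300", "countersuffix26",
    "loanword128", "voiceactress100", "travel1000", "precedent130",
    "repeat500",
]


def utterance_count(subset_name: str) -> int:
    """Return the count encoded as the trailing digits of a subset name."""
    match = re.search(r"(\d+)$", subset_name)
    if match is None:
        raise ValueError(f"no trailing count in {subset_name!r}")
    return int(match.group(1))


counts = {name: utterance_count(name) for name in SUBSETS}
total_utterances = sum(counts.values())

TOTAL_HOURS = 10  # approximate total stated in the corpus description
mean_seconds = TOTAL_HOURS * 3600 / total_utterances

print(f"total utterances: {total_utterances}")
print(f"mean utterance length: {mean_seconds:.2f} s (approximate)")
```

Under these assumptions the subsets sum to 7,696 utterances, which puts the average utterance at a little under five seconds.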

このコーパスは日本語テキストと読み上げ音声からなります.音声データは48kHzでサンプリングされ,無響室で収録されました.一人の日本語女性話者の音声を収録しました.このコーパスは,10時間の音声 を含み,以下のデータからなります.

      • basic5000 ... 常用漢字の音読み・訓読みを全てカバー
      • utparaphrase512 ... 文の一部を読み替えたもの
      • onomatopee300 ... 日本語オノマトペ
      • countersuffix26 ... 助数詞
      • loanword128 ... 外来語由来の動詞・名詞 (e.g., ググる)
      • voiceactress100 ... 声優統計コーパス (プロ女性声優のフリーコーパス) とのパラ音声
      • travel1000 ... 旅行ドメインのフレーズ
      • precedent130 ... 判例文
      • repeat500 ... 繰り返し発話された音声 (100文 * 5回)
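After downloading, the stated 48 kHz sampling rate and the roughly 10-hour total can be checked with Python's standard `wave` module. This is a minimal sketch, assuming the audio is stored as uncompressed PCM `.wav` files somewhere under a single root directory. The function names and the directory path in the usage note are illustrative and are not part of the corpus.

```python
import wave
from pathlib import Path

EXPECTED_RATE_HZ = 48_000  # sampling rate stated in the corpus description


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnframes() / rate


def check_corpus(root):
    """Scan every .wav file under root.

    Returns (total_hours, mismatched), where mismatched lists the files
    whose sampling rate differs from EXPECTED_RATE_HZ.
    """
    total_seconds = 0.0
    mismatched = []
    for path in sorted(Path(root).rglob("*.wav")):
        rate, seconds = wav_info(path)
        total_seconds += seconds
        if rate != EXPECTED_RATE_HZ:
            mismatched.append(path)
    return total_seconds / 3600, mismatched
```

For example, `hours, bad = check_corpus("path/to/jsut")` (a hypothetical path) should give a value of `hours` close to 10 and an empty `bad` list if the download matches the description.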

Terms of use/使い方

The text data is licensed under CC-BY-SA 4.0 and other licenses. See the LICENCE file for details. The audio data may be used for:

      • Research by academic institutions
      • Non-commercial research, including research conducted within commercial organizations
      • Personal use, including blog posts.

If you want to use the corpus for commercial purposes, please see below. Redistribution is not permitted, but you may upload a part of this corpus (e.g., ~100 audio files) to your webpage or blog. If possible, please let me know when you publish papers, blog posts, or other works that use the corpus. This is very helpful for tracking the contributions of this corpus.

テキストデータは,CC-BY-SA 4.0などにてライセンスされております.詳細は,LICENCEファイルをご覧ください.音声データは,以下の場合に限り使用可能です.

      • アカデミック機関での研究
      • 非商用目的の研究(営利団体での研究も含む)
      • 個人での利用(ブログなどを含む)



      • Ryosuke Sonobe (University of Tokyo) / 園部 良介 (東京大学)
      • Shinnosuke Takamichi (@forthshinji, University of Tokyo) / 高道 慎之介 (東京大学)
      • Hiroshi Saruwatari (University of Tokyo) / 猿渡 洋 (東京大学)

Bold is the main contributor. 太字が主な作成者です.


Please cite this paper. 下記論文を引用してください.

Ryosuke Sonobe, Shinnosuke Takamichi, and Hiroshi Saruwatari, "JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis," arXiv preprint arXiv:1711.00354, 2017.

Terms of commercial use/商用利用

We welcome commercial use. Please contact one of the following email addresses.


      • Keiji Sueishi (TLO of the Univ. of Tokyo) / 居石 圭司 (東大TLO)
        • sueishi [_at_mark_] todaitlo.jp
      • Shinnosuke Takamichi / 高道 慎之介
        • shinnosuke_takamichi [_at_mark_] ipc.i.u-tokyo.ac.jp


Saruwatari Lab, the University of Tokyo / 東京大学 猿渡研究室 http://www.sp.ipc.i.u-tokyo.ac.jp/

JSUT-song corpus: https://sites.google.com/site/shinnosuketakamichi/publication/jsut-song

JSUT-vi corpus: https://sites.google.com/site/shinnosuketakamichi/publication/jsut-vi


Part of this work is supported by the SECOM Science and Technology Foundation.