WELCOME TO MY PAGE

Shintaro Ozaki | 尾﨑慎太郎

I am a first-year Ph.D. student at Nara Institute of Science and Technology (NAIST, Japan 🇯🇵). I am interested in Natural Language Processing (NLP), especially mechanistic interpretability (MI) and multimodal models.

Please get in touch with me at "ozaki.shintaro.ou6 at naist.ac.jp" if you are interested in my research.

Education

April 2026 ~ Present. Ph.D. student at Nara Institute of Science and Technology (NAIST), Nara, JAPAN.
April 2024 ~ March 2026. M.Eng. at Nara Institute of Science and Technology (NAIST), Nara, JAPAN.

Under the supervision of Prof. Taro Watanabe and Assoc. Prof. Hidetaka Kamigaito.

April 2020 ~ March 2024. B.S. at Meiji University, Tokyo, JAPAN.

Under the supervision of Assoc. Prof. Daisaku Yokoyama.

Research Interests

Mechanistic Interpretability (MI), Multimodal LLMs, and Natural Language Processing.

Work Experience

June, 2024 ~ Present: Research Assistant at the Research and Development Center for Large Language Models (LLMC), National Institute of Informatics (NII).

- Affiliated with the interpretability team.
April, 2025 ~ May, 2025: Visiting Researcher at MBZUAI, UAE.
August, 2024 ~ September, 2024: Research Intern at Hitachi R&D.

June, 2024 ~ September, 2024: Research Assistant at the Research and Development Center for Large Language Models (LLMC), National Institute of Informatics (NII).
- Affiliated with the evaluation team.
January, 2024 ~ May, 2024: Research Intern at National Institute of Informatics (NII).

Awards

March 2026. Two papers received Committee Special Awards at NLP2026. [link and link]

March 2026. Excellence Award in NLP2026. [link]

September 2024. Encouragement Award in YANS2024. [Article]

May 2024. Committee Special Awards in the 5th Werewolf competition in the field of NLP. [Article(ja)]

December 2023. Excellence Awards in Mercoin Hackathon sponsored by Mercoin, Mercari Inc. [Article(ja)]

November 2022. Teamwork Awards in "Tekunoko" Hackathon sponsored by SCSK Inc.

Publications

Journal

尾崎慎太郎, 林和樹, 坂井優介, 上垣外英剛, 林克彦, 渡辺太郎. 大規模視覚言語モデルにおける芸術作品の多言語説明生成能力の評価. 自然言語処理 33巻2号. 2026. [TBA]

Peer-reviewed

Kazuki Hayashi, Shintaro Ozaki, Yusuke Sakai, Hidetaka Kamigaito, and Taro Watanabe. Diagnosing Vision Language Models' Perception by Leveraging Human Methods for Color Vision Deficiencies. EACL 2026 (Main). 2026/03. [paper] [arxiv]
Hongyu Sun, Yusuke Sakai, Haruki Sakajo, Shintaro Ozaki, Kazuki Hayashi, Hidetaka Kamigaito, Taro Watanabe. LoCt-Instruct: An Automatic Pipeline for Constructing Datasets of Logical Continuous Instructions. EMNLP 2025 (Main). 2025/11. [paper]

Shintaro Ozaki, Kazuki Hayashi, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito, and Taro Watanabe. BQA: Body Language Question Answering Dataset for Video Large Language Models. ACL 2025 (Main). 2025/08. [arxiv] [paper]
Shintaro Ozaki, Yuta Kato, Siyuan Feng, Masayo Tomita, Kazuki Hayashi, Wataru Hashimoto, Ryoma Obara, Masafumi Oyamada, Katsuhiko Hayashi, Hidetaka Kamigaito, and Taro Watanabe. Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain. BioNLP 2025. 2025/08. [arxiv] [paper]
Shintaro Ozaki, Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, and Taro Watanabe. Towards Cross-Lingual Explanation of Artwork in Large-scale Vision Language Models. NAACL 2025 (Findings). 2025/05. [arxiv] [paper]
Adam Nohejl, Frederikus Hudi, Eunike Andriani Kardinata, Shintaro Ozaki, Maria Angelica Riera Machin, Hongyu Sun, Justin Vasselli, and Taro Watanabe. Beyond Film Subtitles: Is YouTube the Best Approximation of Spoken Vocabulary?. COLING 2025. 2025/01. [arxiv] [paper]
Keito Kudo, Hiroyuki Deguchi, Makoto Morishita, Ryo Fujii, Takumi Ito, Shintaro Ozaki, Koki Natsumi, Kai Sato, Kazuki Yano, Ryosuke Takahashi, Subaru Kimura, Tomomasa Hara, Yusuke Sakai, and Jun Suzuki. Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task. Proceedings of the Ninth Conference on Machine Translation (WMT'24), 2024/11. [paper] [paper]
Takehiro Sato, Shintaro Ozaki, and Daisaku Yokoyama. An Implementation of Werewolf Agent That does not Truly Trust LLMs. The 2nd Workshop of AI Werewolf and Dialog System (AIWolfDial). 2024/09. [arxiv] [paper]

Preprint

Shintaro Ozaki, Kazuki Hayashi, Yusuke Sakai, Jingun Kwon, Hidetaka Kamigaito, Katsuhiko Hayashi, Manabu Okumura, and Taro Watanabe. TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation. 2025/04. [arxiv]
Shintaro Ozaki, Tatsuya Hiraoka, Hiroto Otake, Hiroki Ouchi, Masaru Isonuma, Benjamin Heinzerling, Kentaro Inui, Taro Watanabe, Yusuke Miyao, Yohei Oseki, and Yu Takagi. Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance. 2025/05. [arxiv]

Others

尾崎慎太郎, 橋本航, 上垣外英剛, 林克彦, 渡辺太郎. n-gramに基づく推論モデルの信頼度と較正特性の分析. 第32回言語処理学会年次大会. 2026/03. [paper]

尾崎慎太郎, 平岡達也, 大竹啓永, 大内啓樹, 磯沼大, Benjamin Heinzerling, 乾健太郎, 渡辺太郎, 宮尾祐介, 大関洋平, 髙木優. 大規模言語モデルの潜在言語は一貫しているべきか？. 第32回言語処理学会年次大会. 2026/03. [paper] Committee Special Award.
王略丞, 尾崎慎太郎, 上垣外英剛, 林克彦, Kwon Jingun, 奥村学, 渡辺太郎. 画像生成モデルにおける直喩喩体の生成挙動分析. 第32回言語処理学会年次大会. 2026/03. [paper] Committee Special Award.
加藤優汰, 尾崎慎太郎, 林和樹, 坂井優介, 上垣外英剛, 林克彦, 渡辺太郎. 知識グラフの反復的な探索による画像の詳細な説明文の生成. 第32回言語処理学会年次大会. 2026/03. [paper]

林和樹, 尾崎慎太郎, 神野倫行, 上垣外英剛, 渡辺太郎. Noisy Channel に基づく生成確率による画像生成評価. 第32回言語処理学会年次大会. 2026/03. [paper] Excellence Award.
Shintaro Ozaki, Tatsuya Hiraoka, Hiroto Otake, Hiroki Ouchi, Masaru Isonuma, Benjamin Heinzerling, Kentaro Inui, Taro Watanabe, Yusuke Miyao, Yohei Oseki, Yu Takagi. Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance. Japanese Symposium on Open Large Language Models. 2025/11. [link]
Hiroto Otake, Hiroki Ouchi, Shintaro Ozaki, Tatsuya Hiraoka, Taro Watanabe, Yusuke Miyao, Yohei Oseki, Yu Takagi. Formation of Geospatial Representations in Large Language Models and the Effect of Training Data. Japanese Symposium on Open Large Language Models. 2025/11. [link]

Shintaro Ozaki, Kazuki Hayashi, Hidetaka Kamigaito, and Taro Watanabe. Revealing Socio-Economic Bias in Text-to-Image Models. 第20回NLP若手の会シンポジウム(YANS), 2025/09. [link]
加藤優汰, 尾崎慎太郎, 林和樹, 小原涼馬, 上垣外英剛, 渡辺太郎, 小山田昌史, 林克彦. あなたのLLM、信用できますか？ —ペルソナの観点による分析—. 第20回NLP若手の会シンポジウム(YANS), 2025/09. [link]
大竹啓永, 大内啓樹, 尾崎慎太郎, 平岡達也, 渡辺太郎, 宮尾祐介, 大関洋平, 高木優. 大規模言語モデルにおける地理表現の形成と訓練データの影響. 第39回人工知能学会全国大会. 2025/05. [paper]

尾崎慎太郎, 平岡達也, 大竹啓永, 大内啓樹, 渡辺太郎, 宮尾祐介, 大関洋平, 高木優. 大規模言語モデルにおけるペルソナの役割と内部動作の理解. 第31回言語処理学会年次大会. 2025/03. [paper]
尾崎慎太郎, 加藤優汰, 馮思遠, 富田雅代, 林和樹, 小原涼馬, 小山田昌史, 林克彦, 上垣外英剛, 渡辺太郎. 検索拡張生成による信頼度の影響: 医療分野における分析. 第31回言語処理学会年次大会. 2025/03. [paper]
佐藤岳大, 尾崎慎太郎, 横山大作. 戦略的発話の多様な生成を目指した人狼エージェントの構築. 第31回言語処理学会年次大会. 2025/03. [paper]
尾崎慎太郎, 林和樹, 坂井優介, 上垣外英剛, 林克彦, 渡辺太郎. 大規模視覚言語モデルによる芸術作品の多言語説明生成. 第262回自然言語処理研究会. 2024/12. [paper]
尾崎慎太郎, 林和樹, 大羽美悠, 坂井優介, 上垣外英剛, 渡辺太郎. マルチモーダル大規模言語モデルは非言語コミュニケーションを理解しているか?. 第19回NLP若手の会シンポジウム(YANS), 2024/09. [link] Encouragement Award.
LLM-jp. LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs. [arXiv], 2024/07.
尾崎慎太郎, 横山大作. 国会議事録を使用した政党ごとのスタンスの変遷の分析. pp. 2487-2492. 第30回言語処理学会年次大会, [paper], 2024/03.
尾崎慎太郎, 横山大作. 国会議事録を用いた政党のスタンス分析に向けて. 第18回NLP若手の会シンポジウム(YANS), 2023/08. [link]

Reviewer

5th Workshop on Image/Video/Audio Quality Assessment in Computer Vision, VLM and Diffusion Model, WACV 2026.
ACL SRW 2025
