I am a master's student at Nara Institute of Science and Technology (NAIST, Japan 🇯🇵). I am interested in Natural Language Processing (NLP), especially mechanistic interpretability (MI) and multimodal models.
Please get in touch with me at "ozaki.shintaro.ou6 at naist.ac.jp" if you are interested.
Links: [CV] [Twitter] [Google Scholar]
April, 2024 ~ Present.
M.Eng. at Nara Institute of Science and Technology, Nara. [Link]
Research area: Natural Language Processing.
Supervisors: Prof. Taro Watanabe and Assoc. Prof. Hidetaka Kamigaito.
April, 2020 ~ March, 2024.
B.S. at Meiji University, Tokyo. [Link(ja)]
Research area: Natural Language Processing.
Supervisor: Daisaku Yokoyama.
Research interests: Mechanistic Interpretability (MI), Multimodal LLMs, and Natural Language Processing.
June, 2024 ~ Present: Research Assistant at the Research and Development Center for Large Language Models (LLMC), National Institute of Informatics (NII).
Affiliated with interpretability team.
April, 2025 ~ May, 2025: Visiting Researcher at MBZUAI, UAE.
August, 2024 ~ September, 2024: Research Intern at Hitachi R&D.
June, 2024 ~ September, 2024: Research Assistant at the Research and Development Center for Large Language Models (LLMC), National Institute of Informatics (NII).
Affiliated with evaluation team.
January, 2024 ~ May, 2024: Research Intern at National Institute of Informatics (NII).
September, 2024. Encouragement Award at YANS2024. [Article(ja)]
May, 2024. Jury's Special Award at the 5th Werewolf competition in the field of NLP. [Article(ja)]
December, 2023. Excellence Award at the Mercoin Hackathon sponsored by Mercoin, Mercari Inc. [Article(ja)]
November, 2022. Teamwork Award at the "Tekunoko" Hackathon sponsored by SCSK Inc.
Peer-reviewed
Hongyu Sun, Yusuke Sakai, Haruki Sakajo, Shintaro Ozaki, Kazuki Hayashi, Hidetaka Kamigaito, Taro Watanabe. LoCt-Instruct: An Automatic Pipeline for Constructing Datasets of Logical Continuous Instructions. EMNLP 2025 (Main). 2025/11.
Shintaro Ozaki, Kazuki Hayashi, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito and Taro Watanabe. BQA: Body Language Question Answering Dataset for Video Large Language Models. ACL 2025 (Main). 2025/08. [arxiv] [paper]
Shintaro Ozaki, Yuta Kato, Siyuan Feng, Masayo Tomita, Kazuki Hayashi, Wataru Hashimoto, Ryoma Obara, Masafumi Oyamada, Katsuhiko Hayashi, Hidetaka Kamigaito and Taro Watanabe. Understanding the Impact of Confidence in Retrieval Augmented Generation: A Case Study in the Medical Domain. BioNLP 2025. 2025/08. [arxiv] [paper]
Shintaro Ozaki, Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi and Taro Watanabe. Towards Cross-Lingual Explanation of Artwork in Large-scale Vision Language Models. NAACL 2025 (Findings). 2025/05. [arxiv] [paper]
Adam Nohejl, Frederikus Hudi, Eunike Andriani Kardinata, Shintaro Ozaki, Maria Angelica Riera Machin, Hongyu Sun, Justin Vasselli and Taro Watanabe. Beyond Film Subtitles: Is YouTube the Best Approximation of Spoken Vocabulary?. COLING 2025. 2025/01. [arxiv] [paper]
Keito Kudo, Hiroyuki Deguchi, Makoto Morishita, Ryo Fujii, Takumi Ito, Shintaro Ozaki, Koki Natsumi, Kai Sato, Kazuki Yano, Ryosuke Takahashi, Subaru Kimura, Tomomasa Hara, Yusuke Sakai and Jun Suzuki. Document-level Translation with LLM Reranking: Team-J at WMT 2024 General Translation Task. Proceedings of the Ninth Conference on Machine Translation (WMT'24), 2024/11. [paper] [paper]
Takehiro Sato, Shintaro Ozaki and Daisaku Yokoyama. An Implementation of Werewolf Agent That does not Truly Trust LLMs. The 2nd Workshop of AI Werewolf and Dialog System (AIWolfDial). 2024/09. [arxiv] [paper]
Preprint
Shintaro Ozaki, Kazuki Hayashi, Yusuke Sakai, Jingun Kwon, Hidetaka Kamigaito, Katsuhiko Hayashi, Manabu Okumura, and Taro Watanabe. TextTIGER: Text-based Intelligent Generation with Entity Prompt Refinement for Text-to-Image Generation. 2025/04. [arxiv]
Kazuki Hayashi, Shintaro Ozaki, Yusuke Sakai, Hidetaka Kamigaito, and Taro Watanabe. Diagnosing Vision Language Models' Perception by Leveraging Human Methods for Color Vision Deficiencies. 2025/05. [arxiv]
Shintaro Ozaki, Tatsuya Hiraoka, Hiroto Otake, Hiroki Ouchi, Masaru Isonuma, Benjamin Heinzerling, Kentaro Inui, Taro Watanabe, Yusuke Miyao, Yohei Oseki, and Yu Takagi. Do LLMs Need to Think in One Language? Correlation between Latent Language and Task Performance. 2025/05. [arxiv]
Others
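Shintaro Ozaki, Kazuki Hayashi, Hidetaka Kamigaito, and Taro Watanabe. Revealing Socio-Economic Bias in Text-to-Image Models. The 20th Symposium of the Young Researcher Association for NLP Studies (YANS), 2025/09.
Yuta Kato, Shintaro Ozaki, Kazuki Hayashi, Ryoma Obara, Hidetaka Kamigaito, Taro Watanabe, Masafumi Oyamada, Katsuhiko Hayashi. Can You Trust Your LLM? An Analysis from the Perspective of Personas (in Japanese). The 20th Symposium of the Young Researcher Association for NLP Studies (YANS), 2025/09.
Hiroto Otake, Hiroki Ouchi, Shintaro Ozaki, Tatsuya Hiraoka, Taro Watanabe, Yusuke Miyao, Yohei Oseki, Yu Takagi. Formation of Geographic Representations in Large Language Models and the Influence of Training Data (in Japanese). The 39th Annual Conference of the Japanese Society for Artificial Intelligence. 2025/05. [paper]
Shintaro Ozaki, Tatsuya Hiraoka, Hiroto Otake, Hiroki Ouchi, Taro Watanabe, Yusuke Miyao, Yohei Oseki, Yu Takagi. Understanding the Role and Internal Behavior of Personas in Large Language Models (in Japanese). The 31st Annual Meeting of the Association for Natural Language Processing. 2025/03. [paper]
Shintaro Ozaki, Yuta Kato, Siyuan Feng, Masayo Tomita, Kazuki Hayashi, Ryoma Obara, Masafumi Oyamada, Katsuhiko Hayashi, Hidetaka Kamigaito, Taro Watanabe. The Impact of Confidence in Retrieval-Augmented Generation: An Analysis in the Medical Domain (in Japanese). The 31st Annual Meeting of the Association for Natural Language Processing. 2025/03. [paper]
Takehiro Sato, Shintaro Ozaki, Daisaku Yokoyama. Building a Werewolf Agent Aimed at Diverse Generation of Strategic Utterances (in Japanese). The 31st Annual Meeting of the Association for Natural Language Processing. 2025/03. [paper]
Shintaro Ozaki, Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe. Multilingual Explanation Generation for Artworks with Large-scale Vision Language Models (in Japanese). The 262nd SIG Meeting on Natural Language Processing. 2024/12. [paper]
Shintaro Ozaki, Kazuki Hayashi, Miyu Oba, Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe. Do Multimodal Large Language Models Understand Nonverbal Communication? (in Japanese). The 19th Symposium of the Young Researcher Association for NLP Studies (YANS), 2024/09.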
LLM-jp. LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs. [arXiv], 2024/07.
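Shintaro Ozaki, Daisaku Yokoyama. Analyzing Changes in the Stance of Each Political Party Using National Diet Proceedings (in Japanese). pp. 2487-2492. The 30th Annual Meeting of the Association for Natural Language Processing, [paper], 2024/03.
Shintaro Ozaki, Daisaku Yokoyama. Towards Stance Analysis of Political Parties Using National Diet Proceedings (in Japanese). The 18th Symposium of the Young Researcher Association for NLP Studies (YANS), 2023/08.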
ACL SRW 2025