Koshiro Saito
Master's Student (M1), Computer Science, Institute of Science Tokyo @ Okazaki Lab
Member of Swallow Project - Eval Team.
(Developing LLMs with High Japanese Understanding Ability. Official Website.)
Artificial Text Detection. PUPPET: Easy-to-detect LLM.
Oct. 9, 2025:
🎉🏆 My paper was selected as a spotlight
at the MELT Workshop in conjunction with COLM 2025.
Sep. 17, 2025:
🎉🏆 Received the Special Jury Award
at the YANS 2025 Hackathon!
Sep. 2, 2025:
🎉🏆 My presentation at the NL Workshop received
the “IPSJ Yamashita SIG Research Award 2025.”
Aug. 23, 2025:
I gave a keynote speech at the High School Scientific Conference (HSSC)
hosted by SMAN 10 BEKASI.
Aug. 8, 2025:
🎉 My paper was accepted at the MELT Workshop
in conjunction with COLM 2025.
(Past news can be found here)
Aug. 23, 2025:
I gave a keynote speech at the High School Scientific Conference (HSSC) hosted by SMAN 10 BEKASI.
Oct. 10, 2025:
"Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs"
Koshiro Saito, Sakae Mizuki, Masanari Ohi, Taishi Nakamura, Taihei Shiotani, Koki Maeda, Youmi Ma, Kakeru Hattori, Kazuki Fujii, Takumi Okamoto, Shigeki Ishida, Hiroya Takamura, Rio Yokota, Naoaki Okazaki. The 1st Workshop on Multilingual and Equitable Language Technologies (MELT) (in conjunction with The Second Conference On Language Modeling (COLM))
🎉🏆 Selected as a spotlight! (Ref: Accepted submissions: https://melt-workshop.github.io/papers/.)
Oct. 10, 2025:
"Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models."
Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki. COLM2025.
Sep. 17, 2025:
🎉🏆 Received the Special Jury Award at the YANS 2025 Hackathon!
Sep. 6-7, 2024:
"Easily detectable LLMs without sacrificing their generative capability" [Poster (JP)]
Koshiro Saito [1], Ryuto Koike [1], Masahiro Kaneko [2], Naoaki Okazaki [1] (1: TITECH, 2: MBZUAI)
🎉🏆 Won the ‘Encouragement Award’ and ‘Sponsor Award’ at YANS 2024
Sep. 3, 2024:
"Advantages of Training LLMs on Japanese Text"[Paper (JP), Slides (JP)]
Koshiro Saito [1], Sakae Mizuki [2,1], Masanari Ohi [1], Taishi Nakamura [1], Taihei Shiotani [1], Koki Maeda [1], Youmi Ma [1], Kakeru Hattori [1], Kazuki Fujii [1], Takumi Okamoto [1], Shigeki Ishida [1], Hiroya Takamura [2], Rio Yokota [1], Naoaki Okazaki [1] (1: TITECH, 2: AIST)
🎉🏆 Received an Excellence Award at the 261st NL Research Presentation
🎉🏆 Awarded the “IPSJ Yamashita SIG Research Award 2025.”
Mar. 2025:
"模倣学習による大規模言語モデルの指示チューニング (Instruction Tuning of Large Language Models Through Imitation Learning)".
Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki. The 31st Annual Meeting of the Association of Natural Language Processing (NLP2025), Nagasaki, Japan. (in Japanese).
Mar. 2025:
"新聞記事からつくる 時事と社会に強い日本語LLM (Japanese LLM Powered by News Articles: Strong in Current Events and Social Issues)".
Kakeru Hattori, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Taihei Shiotani, Kokoro Ueki, Takuro Niitsuma, Akira Kawabata, Hideaki Tamori, Youmi Ma, Koki Maeda, Masanari Ohi, Koshiro Saito, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki. The 31st Annual Meeting of the Association of Natural Language Processing (NLP2025), Nagasaki, Japan. (in Japanese).
Mar. 2025:
"Swallowコーパスv2: 教育的な日本語ウェブコーパスの構築 (Swallow Corpus v2: Building an Educational Japanese Web Corpus)".
Kakeru Hattori, Naoaki Okazaki, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Taihei Shiotani, Koshiro Saito, Youmi Ma, Koki Maeda, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura. The 31st Annual Meeting of the Association of Natural Language Processing (NLP2025), Nagasaki, Japan. (in Japanese).
Apr. 2025 - Present:
Research Assistant
Okazaki Lab, Science Tokyo
Apr. 2025 - Present:
Teaching Assistant
Center for DS & AI Education, Science Tokyo
Jul. 2024 - Mar. 2025:
Research Assistant
Okazaki Lab, Tokyo Tech
Feb. 2023 - Jun. 2025:
Internship (R&D)
Cancerscan Inc.
I believe that LLMs are an essential bridge
toward a society in which AI and humans coexist.
Watermark [Reinforcement Learning.] (Currently 🏃)
Is it possible to make models detectable
while keeping their abilities?
Is it possible to tell whether a text is human-written
or AI-generated, and when would that be useful?
Models/Datasets Evaluation (Currently 🏃)
How and what should we evaluate to get 'better' models?
What are 'good' datasets?
Services and business leveraging LLMs.
(Actually, what I am most interested in right now 🔥)
How can we build LLMs so that
everyone on Earth can access them and enjoy their benefits?
Can LLMs serve society in areas
other than language and generation?
Apr. 2025 - Present:
Master of Engineering (Computer Science)
Institute of Science Tokyo
Apr. 2021 - Mar. 2025:
Bachelor of Engineering (Computer Science)
Tokyo Institute of Technology
English
TOEIC: 910
TOEFL-iBT: 97
(MyBest Scores)
EIKEN: Grade Pre-1
Tech
Fundamental Information Technology Engineer Examination