Koshiro Saito
Master's Student (M1), Computer Science, Institute of Science Tokyo @ Okazaki Lab
Member of Swallow Project - Eval Team.
(Developing LLMs with High Japanese Understanding Ability. Official Website.)
Artificial Text Detection. PUPPET: Easy-to-detect LLM.
Oct. 9, 2025: My paper was selected as a spotlight at the MELT Workshop in conjunction with COLM 2025.
Sep. 17, 2025: Received the Special Jury Award at the YANS 2025 Hackathon!
Sep. 2, 2025: My presentation at the NL Workshop received the "IPSJ Yamashita SIG Research Award 2025".
Aug. 23, 2025: I gave a keynote speech at the High School Scientific Conference (HSSC) hosted by SMAN 10 BEKASI.
Aug. 8, 2025: My paper was accepted at the MELT Workshop in conjunction with COLM 2025.
(Past news can be found here)
Aug. 23, 2025:
I gave a keynote speech at the High School Scientific Conference (HSSC) hosted by SMAN 10 BEKASI.
Oct. 10, 2025:
"Why We Build Local Large Language Models: An Observational Analysis from 35 Japanese and Multilingual LLMs"
Koshiro Saito, Sakae Mizuki, Masanari Ohi, Taishi Nakamura, Taihei Shiotani, Koki Maeda, Youmi Ma, Kakeru Hattori, Kazuki Fujii, Takumi Okamoto, Shigeki Ishida, Hiroya Takamura, Rio Yokota, Naoaki Okazaki. The 1st Workshop on Multilingual and Equitable Language Technologies (MELT) (in conjunction with The Second Conference On Language Modeling (COLM))
Selected as a spotlight! (Ref: Accepted submissions: https://melt-workshop.github.io/papers/)
Oct. 10, 2025:
"Building Instruction-Tuning Datasets from Human-Written Instructions with Open-Weight Large Language Models."
Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki. COLM2025.
Sep. 17, 2025:
Received the Special Jury Award at the YANS 2025 Hackathon!
Sep. 6-7, 2024:
"Easily detectable LLMs without sacrificing its generative capability" [Poster (JP)]
Koshiro Saito [1], Ryuto Koike [1], Masahiro Kaneko [2], Naoaki Okazaki [1] (1: TITECH, 2: MBZUAI)
Won the "Encouragement Award" and "Sponsor Award" at YANS 2024
Sep. 3, 2024:
"Advantages of Training LLMs on Japanese Text" [Paper (JP), Slides (JP)]
Koshiro Saito [1], Sakae Mizuki [2,1], Masanari Ohi [1], Taishi Nakamura [1], Taihei Shiotani [1], Koki Maeda [1], Youmi Ma [1], Kakeru Hattori [1], Kazuki Fujii [1], Takumi Okamoto [1], Shigeki Ishida [1], Hiroya Takamura [2], Rio Yokota [1], Naoaki Okazaki [1] (1: TITECH, 2: AIST)
Awarded Excellence at the 261st NL Research Presentation
Awarded the "IPSJ Yamashita SIG Research Award 2025".
Mar. 2025:
"模倣学習による大規模言語モデルの指示チューニング (Instruction Tuning of Large Language Models Through Imitation Learning)".
Youmi Ma, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Hinari Shimada, Taihei Shiotani, Koshiro Saito, Koki Maeda, Kakeru Hattori, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki. The 31st Annual Meeting of the Association of Natural Language Processing (NLP2025), Nagasaki, Japan. (in Japanese).
Mar. 2025:
"ๆฐ่่จไบใใใคใใ ๆไบใจ็คพไผใซๅผทใๆฅๆฌ่ชLLM (Japanese LLM Powered by News Articles: Strong in Current Events and Social Issues)".
Kakeru Hattori, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Taihei Shiotani, Kokoro Ueki, Takuro Niitsuma, Akira Kawabata, Hideaki Tamori, Youmi Ma, Koki Maeda, Masanari Ohi, Koshiro Saito, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura, Naoaki Okazaki. The 31st Annual Meeting of the Association of Natural Language Processing (NLP2025), Nagasaki, Japan. (in Japanese).
Mar. 2025:
"Swallowใณใผใในv2: ๆ่ฒ็ใชๆฅๆฌ่ชใฆใงใใณใผใในใฎๆง็ฏ (Swallow Corpus v2: Building an Educational Japanese Web Corpus)". Kakeru Hattori, Naoaki Okazaki, Sakae Mizuki, Kazuki Fujii, Taishi Nakamura, Masanari Ohi, Taihei Shiotani, Koshiro Saito, Youmi Ma, Koki Maeda, Takumi Okamoto, Shigeki Ishida, Rio Yokota, Hiroya Takamura.ย The 31st Annual Meeting of the Association of Natural Language Processing (NLP2025), Nagasaki, Japan. (in Japanese).ย
Apr. 2025 - Present: Research Assistant, Okazaki Lab @ Science Tokyo
Apr. 2025 - Present: Teaching Assistant, Center for DS & AI Education @ Science Tokyo
Jul. 2024 - Mar. 2025: Research Assistant, Okazaki Lab @ Tokyo Tech
Feb. 2023 - Jun. 2025: Internship (R&D) Canserscan Inc.
I believe that LLMs are the essential bridge to a symbiotic society between AI and humans.
Watermark [Reinforcement Learning] (Currently)
Is it possible to make models detectable while keeping their abilities?
Is it possible to tell whether a text is human-written or AI-generated, and when would that be useful?
Models/Datasets Evaluation (Currently)
How & What to evaluate for 'better' models?
What are 'good' datasets?
Services and businesses leveraging LLMs. (Actually, what I am most interested in right now 🔥)
How can we make LLMs so that everyone on Earth can access and enjoy their benefits?
Can LLMs serve society in any aspect other than Language and Generation?
Apr. 2025 - Present: Master of Engineering (Computer Science) Institute of Science Tokyo
Apr. 2021 - Mar. 2025: Bachelor of Engineering (Computer Science) Tokyo Institute of Technology
English
TOEIC: 910
TOEFL-iBT: 97 (MyBest Scores)
EIKEN: Grade Pre-1
Tech
Fundamental Information Technology Engineer Examination
Jul. 2024: Scholarship by Keyence Foundation