Haruki Konii (M2) and Prof. Tetsuji Ogawa presented our latest research at ACPR 2025, an international pattern-recognition conference held in Gold Coast, Australia.
In collaboration with Prof. Tomomi Sato (School of Nursing, Yokohama City University), Haruki Konii introduced a new pattern-recognition paradigm powered by vision–language models. Prof. Ogawa, presenting on behalf of Michihiro Nakata (M.S. '24), showcased a framework for developing and operating video-based state-monitoring systems for livestock.
Tackling real-world problems with cutting-edge AI on international stages is one of our lab's strengths. If you're ready to take on these challenges, come join us!
References:
Haruki Konii, Teppei Nakano, Mari Wakabayashi, Tomomi Sato, Tetsuji Ogawa, ``Image recognition framework via adaptive class descriptions with vision-language models,'' Proc. the 8th Asian Conference on Pattern Recognition (ACPR 2025), pp.397-411, Nov. 2025.
Michihiro Nakata, Teppei Nakano, Susumu Saito, Tetsuji Ogawa, ``Towards farmers' decision support: Explainable-by-design modeling for calving sign detection in cattle,'' Proc. the 8th Asian Conference on Pattern Recognition (ACPR 2025), pp.427-441, Nov. 2025.
Kaito Kosaki (M1) and Jun Taniguchi (B4) presented their work at APSIPA 2025, an international conference on signal and information processing held in Singapore.
In collaboration with Prof. Tomomi Sato (School of Nursing, Yokohama City University), Kaito Kosaki presented a method for stable detection of eye open/close cues to support communication with children with profound intellectual and multiple disabilities (PIMD), a nursing × AI project that infers affect from subtle ocular movements. In partnership with Daiichikosho Co., Ltd., Jun Taniguchi introduced a system that uses large language models to interpret song lyrics and automatically retrieve/select karaoke background videos aligned with each song's worldview, an innovative entertainment × AI application.
Showcasing practical, human-centered AI on international stages is a hallmark of our lab.
References:
Kaito Kosaki, Teppei Nakano, Mari Wakabayashi, Tomomi Sato, Tetsuji Ogawa, ``Strong eye closure detection in children with profound intellectual and multiple disabilities using robust temporal difference features,'' Proc. the 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA 2025), pp.2477-2482, Oct. 2025.
Tomoki Ariga, Jun Taniguchi, Yosuke Higuchi, Sayaka Toma, Kunihiro Abe, Rie Shigyo, Tetsuji Ogawa, ``Lyric-aware karaoke background video selection using large language models and moment retrieval,'' Proc. the 17th Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA 2025), pp.1492-1497, Oct. 2025.
This fall, one Ph.D. student, one M.S. student, one fourth-year undergraduate, and one non-degree research student joined the Ogawa Lab, making our community even more vibrant.
Across the Kobayashi–Ogawa Labs, we now have 37 student members in total. In addition, eight third-year undergraduates are enrolled in Project Research B in the Ogawa Lab, tackling hands-on topics with real-world datasets.
We're excited to take on new challenges together with colleagues from diverse backgrounds.
Kohei Saijo (Ph.D. Year 3) and Takuma Yabe (M2) presented our lab's work at EUSIPCO 2025 in Palermo, Italy. Prof. Ogawa also served as session chair for a session on speech evaluation.
Using TF-Locoformer as a case study, Kohei Saijo delivered a comprehensive comparison of positional-encoding strategies (ways of representing ordering along the time axis and across the time–frequency plane) for dual-path transformer source-separation models. This careful, bottom-up analysis shows how the choice of positional encoding can unlock model performance. Takuma Yabe reported on the design of qualification tests for crowdsourced subjective audio-quality evaluation, focusing on how to choose voice samples that yield reliable results when large numbers of listeners participate online. The study provides concrete guidance for building trustworthy crowdsourcing pipelines.
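For readers less familiar with the topic: positional encoding is how a transformer, which is otherwise order-agnostic, is told where each frame sits in a sequence. The classic absolute sinusoidal encoding from the original transformer paper is one of the strategies such comparisons typically cover; the NumPy sketch below is an illustrative textbook implementation (not code from the paper), showing how it assigns every position a unique, smoothly varying code.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Classic absolute sinusoidal positional encoding.

    Returns an array of shape (seq_len, d_model) whose rows give each
    position a unique code that a transformer can add to its input
    embeddings to recover ordering information.
    """
    positions = np.arange(seq_len)[:, None]            # (seq_len, 1)
    dims = np.arange(d_model)[None, :]                 # (1, d_model)
    # Each pair of dimensions gets its own wavelength, from 2*pi up to
    # 10000*2*pi, so nearby positions receive similar but distinct codes.
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                   # (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])              # even dims: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])              # odd dims: cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=100, d_model=64)
print(pe.shape)  # (100, 64)
```

In a dual-path separator, an encoding like this can be applied independently along the intra-frame (frequency) path and the inter-frame (time) path; which variant, absolute, relative, or none, works best per path is exactly the kind of question such a comparison addresses.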
References:
Kohei Saijo, Tetsuji Ogawa, ``A comparative study on positional encoding for time-frequency domain dual-path transformer-based source separation models,'' Proc. the 33rd European Signal Processing Conference (EUSIPCO2025), pp.446-450, Sept. 2025.
Takuma Yabe, Moe Yaegashi, Teppei Nakano, Tetsuji Ogawa, ``Necessity of voice sample selection in qualification tests for crowdsourced subjective audio quality evaluation,'' Proc. the 33rd European Signal Processing Conference (EUSIPCO2025), pp.261-265, Sept. 2025.
We held a joint summer camp at the Karuizawa Seminar House with the Kobayashi and Ogawa Labs.
The program featured 11 B4 midterm thesis updates and 6 M2 midterm thesis updates, followed by lively Q&A that sharpened each project's focus. After the sessions, everyone enjoyed tennis, basketball, soccer, and table tennis, creating a great opportunity for cross-year and cross-group interaction. Many thanks to all presenters and to the organizers who planned and ran the event—excellent work!
``Work hard on research, play hard together.'' That rhythm, and the sense of community it builds, is a hallmark of our joint labs.
Yosuke Higuchi (Research Fellow) presented our latest work at INTERSPEECH 2025, a premier conference on speech and language processing held in Amsterdam, the Netherlands.
Our study introduces a method that leverages the strong translation abilities of large language models (LLMs) to enhance end-to-end speech translation. By intelligently integrating an LLM with a direct speech-to-text translation model, we aim to achieve more reliable cross-lingual translations. Prof. Ogawa also served as session chair for a session on speaker recognition.
Reference:
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, ``End-to-end speech translation guided by robust translation capability of large language model,'' Proc. the 26th Annual Conference of the International Speech Communication Association (INTERSPEECH 2025), pp.21-25, Aug. 2025.
Prof. Tetsuji Ogawa delivered an invited tutorial at the 87th Artificial Intelligence Seminar hosted by the National Institute of Advanced Industrial Science and Technology (AIST).
Speaking in the themed session ``AI Technologies and Wind Power,'' Prof. Ogawa introduced predictive maintenance for wind turbines—detecting early signs of failure to ensure safe and reliable operation. The tutorial covered the core concepts of predictive maintenance, practical caveats for deploying anomaly-detection systems in the field, and a concrete case study from an ongoing renewable-energy data utilization collaboration (University of Tokyo, AIST, Chubu University, Waseda University): vibration-signal–based anomaly detection for wind turbines.
The talk underscored how AI and signal processing can contribute to the stable operation of renewable energy and social infrastructure.
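As a concrete, deliberately simplified illustration of the vibration-based anomaly-detection idea mentioned above (not the method presented in the tutorial): treat the RMS energy of healthy-condition vibration frames as a Gaussian-distributed condition indicator, and flag frames that deviate by more than three standard deviations. All signals and parameters below are synthetic assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data (all values here are assumptions, not values from
# the tutorial): 1-second vibration frames sampled at fs = 1000 Hz.
fs = 1000
t = np.arange(fs) / fs
healthy = [np.sin(2 * np.pi * 30 * t) + 0.1 * rng.standard_normal(fs)
           for _ in range(50)]                       # normal operation
faulty = (np.sin(2 * np.pi * 30 * t)
          + 0.8 * np.sin(2 * np.pi * 180 * t)        # injected fault tone
          + 0.1 * rng.standard_normal(fs))

def rms(frame: np.ndarray) -> float:
    """Root-mean-square energy, a classic condition indicator."""
    return float(np.sqrt(np.mean(frame ** 2)))

# Fit a Gaussian to the healthy-frame indicator and flag large deviations.
scores = np.array([rms(f) for f in healthy])
mu, sigma = scores.mean(), scores.std()

def is_anomalous(frame: np.ndarray, k: float = 3.0) -> bool:
    """Flag a frame whose RMS deviates more than k sigma from normal."""
    return abs(rms(frame) - mu) > k * sigma

print(is_anomalous(faulty))  # True: the fault tone raises the RMS sharply
```

Deployed systems use far richer spectral features and learned models, but the underlying structure, modeling "normal" from healthy data and flagging deviations, is the same.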
Reference:
Tetsuji Ogawa, ``Fault precursor detection for predictive maintenance of wind turbines,'' AIST 87th Artificial Intelligence Seminar ``AI Technologies and Wind Power,'' Online, Aug. 2025 (in Japanese).
Takuya Wakayama (M2) and Kengo Miyauchi (M1) visited the floating offshore wind farm under construction off Hibikinada, Fukuoka, along with related facilities.
As part of the Academic Alliance for Renewable-Energy Data Utilization (The University of Tokyo, AIST, Chubu University, Waseda University), we regularly organize study sessions and site visits. This time, the team toured one of Japan’s largest offshore wind turbines and associated infrastructure.
Through discussions with project stakeholders, we gained a field-level perspective on the importance of condition monitoring for offshore turbines and the future of maintenance, including the design of anomaly-detection technologies and effective presentation of diagnostic information to operators. We are grateful to Green Power Investment Corporation for providing this valuable opportunity to experience firsthand how AI and signal processing contribute to the reliable operation of renewable energy and social infrastructure.
We held the results session for Project Research A (Spring semester, B3) in the Ogawa Lab.
Every project was impressively high quality, clearly reflecting each student's interests and strengths. Kudos to all presenters—great work! Below are the topics tackled this semester. The ability to take on such a wide range of themes from the third undergraduate year is one of our lab's hallmarks.
Music / Singing
Vibrato detection toward subjective singing–quality assessment
Speech / Audio
Fundamentals of automatic speech recognition (two students)
Fundamentals of speech synthesis
Fundamentals of source separation
Vision / Image Processing
Object detection with DINO-DETR
Fundamentals of image processing with CNNs
General Machine Learning
Effects of data augmentation in SimCLR
Evaluating large language models for vegetable quality assessment
Classifying monocots vs. dicots (group work, two students)
Comparative study of kernels in SVM
Impact of kernel choice on SVM performance and multi-kernel learning (group work, two students)
Detecting behaviors of previously unseen malware
Fundamentals of statistical hypothesis testing
Python exercises in the ML study group (two students)
Waseda University held its Open Campus on August 1–2.
From the Kobayashi–Ogawa Labs, we presented a hands-on demo of an English speaking-ability assessment being developed by our lab-originated startup, Ecumenopolis Inc. Visitors conversed in English with our speech dialogue agent InteLLA, experiencing firsthand how AI evaluates both what you say (content) and how you say it (delivery/prosody).
It was a great opportunity to see how lab-born research moves directly into society. If this sparked your interest, we’d love to have you join us and explore these ideas together in our lab!
Haruki Konii (M2) and Jun Taniguchi (B4) presented two papers at the 28th Meeting on Image Recognition and Understanding (MIRU 2025), held at the Kyoto International Conference Center.
Haruki Konii, in collaboration with Prof. Tomomi Sato (School of Nursing, Yokohama City University), introduced a new pattern-recognition paradigm leveraging vision–language models, built on adaptive class descriptions. Jun Taniguchi, in partnership with Daiichikosho Co., Ltd., presented a system that interprets song lyrics with large language models and automatically retrieves/selects karaoke background videos aligned with the lyrics, an original entertainment × AI application.
A distinctive strength of our lab is delivering and showcasing diverse applied research, ranging from nursing × AI to entertainment × AI.
The Oto-Gaku Symposium 2025—a research meeting covering the full spectrum of studies on sound—was held at Waseda University's Nishi-Waseda Campus. Prof. Tetsuji Ogawa served on the on-site local organizing team for the first time in a while.
With Motoi Omachi (LINE Yahoo; Kobayashi Lab alumnus) serving as General Chair, the event was held fully in person despite severe room constraints caused by campus renovation. Thanks to the tireless efforts of Omachi-san and the organizing committee, the venue stayed lively from start to finish and the symposium was a great success.
From our lab-originated startup Ecumenopolis Inc., Yoichi Matsuyama delivered an invited talk, ``Developing a Conversational Diagnostic AI Agent that Unleashes Human Potential.'' It was a valuable opportunity to see how lab-born ideas move into entrepreneurship and real-world deployment, then grow further with academic feedback.
In addition, Hiroaki Sato (NHK Science & Technology Research Laboratories) received the IEICE SP Research Encouragement Award for the joint work ``Streaming Automatic Speech Recognition Based on Uncertainty with Evidential Deep Learning,'' which was recognized at the symposium's closing ceremony. Congratulations!
Haruki Konii (M2), Takuya Wakayama (M2), Keisuke Kobayashi (M2), Research Fellow Teppei Nakano, and Prof. Tetsuji Ogawa presented five papers at the Annual Conference of the Japanese Society for Artificial Intelligence (JSAI 2025), held at the Osaka International Convention Center.
All five studies address real social challenges in Japan through applied AI:
Haruki Konii, with JAMSTEC and the Kochi Prefectural Fisheries Research Institute, presented a positive–unlabeled (PU) learning model to narrow candidate fishing grounds, leveraging data from unexplored sea areas to improve accuracy.
Research Fellow Teppei Nakano (on behalf of Michihiro Nakata, M.S. '25), with Farmers Support Inc., reported field insights from operating a video-based calving monitoring system for breeding cattle on real farms.
Keisuke Kobayashi, with Kitasato University, examined how to deploy deep object detection for estrus behavior detection from video in breeding cattle.
Takuya Wakayama, from the Academic Alliance for Renewable-Energy Data Utilization, proposed a feature-learning method that detects fault precursors in wind turbines from vibration signals with high precision and small-data robustness.
Prof. Ogawa (on behalf of Kouta Mochida, M.S. '25), with Yokohama City University, introduced a framework to estimate the affective state of children with severe disabilities while minimizing caregiver burden, combining human-in-the-loop learning and vision–language models.
From fisheries and livestock to renewable energy, nursing, and healthcare, our lab advances AI with real-world impact. If ``real societal problems × AI'' resonates with you, this is a great place to dive in.
We held an orientation for students newly advanced to the Department of Communications & Computer Engineering.
The program had two parts: an alumni talk session followed by group work.
In Part 1, we welcomed Mika Hasegawa (Kobayashi Lab alum; now at NTT DATA) and Kazuki Matsumoto (Ogawa Lab alum; now at Tokyo University of Agriculture and Technology). They shared their current work and research, and how their experiences at Waseda shaped their careers. A long line of questions formed afterward—students actively engaged with the speakers.
Part 2 featured small-group, game-style activities (about 8 students per group) to help everyone connect right after the department assignment. Conversations quickly warmed up around research interests and personal values, helping students get a feel for the department's atmosphere.
We hope the alumni's candid messages, and the new connections made among classmates, become meaningful springboards for your university life and future career choices.
We have launched a new international collaboration with the Egypt-Japan University of Science and Technology (E-JUST), with Prof. Walid Gomaa visiting Waseda for short stays.
With JICA's establishment phase for E-JUST concluding in January 2025, the initiative is entering a new stage: building an academic network that links Japanese and African universities with E-JUST as a hub. As part of this effort, we started a joint project on video-based predictive maintenance for rotating machinery. Centered on E-JUST, the collaboration includes partners in Cameroon and Nigeria, forming an international research framework.
During his May and July visits, Prof. Gomaa delivered talks, led discussions, and conducted data collection in Waseda's mechanical engineering labs. Through this partnership, we aim, modestly but steadily, to advance international cooperation in science and technology and the practical development of predictive-maintenance technologies.
This is also a compelling opportunity for students and researchers interested in international joint research and global partnerships.
We held our joint spring retreat at Waseda University's Kamogawa Seminar House.
Unlike the summer camp, the spring retreat is all about camaraderie—playing hard, day and night. Held a month earlier than usual, the timing right after lab assignments helped everyone quickly connect across years and research themes.
Alongside classics like tennis and soccer, we tried a ``minor'' sport this year: wiffle ball. The ball is wildly tricky to control, but even the seniors had a blast.
Huge thanks to our M1 organizers for planning and running a fantastic event!
Yosuke Higuchi (Research Fellow) and Kohei Saijo (Ph.D. Year 3) presented their work at the 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2025) in Hyderabad, India.
Yosuke Higuchi's study proposes a method that leverages the zero-shot capabilities of instruction-tuned large language models (LLMs) to guide both training and decoding in end-to-end ASR, a cutting-edge fusion of speech recognition and LLM technology. Kohei Saijo also presented results from his internship at Mitsubishi Electric Research Laboratories (MERL). Prof. Ogawa served as session chair for a session on speaker spoofing.
Reference:
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, ``Harnessing the zero-shot power of instruction-tuned large language model for guiding end-to-end speech recognition,'' Proc. 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2025), pp.1-5, April 2025.
Starting this year, our lab has joined the LLM-jp Dialogue Working Group (Higashinaka Group).
In collaboration with Prof. Ryuichiro Higashinaka (Nagoya University), who leads theory-of-mind (ToM) benchmarking, and Assoc. Prof. Shinnosuke Takamichi (Keio University), who leads speech–language data creation, we will advance R&D on Japanese large language models.
Ogawa Lab will spearhead speech–language modeling for full-duplex dialogue, aiming for next-generation interfaces where humans and AI can speak simultaneously and converse naturally.
This year, eight new B4 students (four planning to continue to the M.S. program) joined the Ogawa Lab—our 10th cohort.
With their arrival, the Ogawa Lab now has 19 students: 2 Ph.D., 9 M.S., and 8 B4. Across the Kobayashi–Ogawa Labs, we total 33 students. In addition, 18 B3 students are taking Project Research A in the Ogawa Lab, getting an early start on hands-on research topics.
Even as we grow, our seminars and day-to-day discussions will remain close-knit and interactive. If you're serious about speech processing and applications of machine learning/pattern recognition, we'd love to hear from you—come knock on our door!