by Gabriel Skantze, KTH Royal Institute of Technology
Abstract: To be Announced
Bio: Gabriel Skantze is a Professor at KTH Royal Institute of Technology in Stockholm, Sweden, where he leads several research projects related to speech communication, conversational AI, and human-robot interaction. His research is highly interdisciplinary, encompassing topics such as computational modelling of turn-taking, feedback and gaze in interaction, language learning, and language grounding. He is the former President of SIGDIAL, the ACL special interest group on Discourse and Dialogue. He is also co-founder and Chief Scientist of the company Furhat Robotics.
by Asli Celikyilmaz, Meta Fundamentals AI Research (FAIR)
Abstract: To be Announced
Bio: Asli Celikyilmaz is a Senior Staff Research Scientist/Manager at Meta Fundamentals AI Research (FAIR) and an Affiliate Professor at the University of Washington. Her research focuses on socially intelligent language agents: systems that remember, learn, and reason about beliefs, goals, and perspectives to collaborate with people and other AI. She works across reasoning, alignment, and evaluation to make these capabilities reliable in practice. She serves on the editorial boards of Transactions of the ACL (TACL) as an area editor and the Open Journal of Signal Processing (OJSP) as an associate editor. She has received several "best of" awards, including at NAFIPS 2007, Semantic Computing 2009, CVPR 2019, and EMNLP 2023.
Numerous applications of conversational AI have been developed for counseling, coaching, training, and support systems in the health domain. Recent applications of large language models and visual language models have engaged professionals from all specialties, ranging from imaging diagnostics to operating rooms to psychological interventions. While the opportunities and advantages are being identified in AI and medical journals, the challenges of this type of translational research require a deep analysis of what constitutes a sustainable dialogue between speakers such as doctors, technicians, patients, and nurses in the very broad context of health.
This special session is accepting (preliminary) research and position papers in the area of conversational AI applied to health.
Recent advances in multimodal foundation models are reshaping the landscape of human-machine dialogue. These models, capable of integrating text, speech, vision, and other modalities, are opening up new avenues for applications ranging from education to robotics and the creative industries. While the opportunities and advantages are increasingly evident, the challenges of this translational shift require careful analysis of what constitutes sustainable and effective dialogue in multimodal settings, where machines and humans share common ground, reason, and act in complex environments.
This special session is accepting (preliminary) research and position papers in the area of multimodal dialogue systems with foundation models.