Interactive conversational systems are increasingly positioned as an alternative to traditional graphical user interfaces (GUIs) because they enable more intuitive, human-like exchanges. The growing adoption of conversational agents across a wide range of domains and real-world scenarios has revolutionised the way humans engage with machines, fostering more seamless and natural interactions. This shift is further supported by the prevalence of mobile devices and messaging platforms, which have become integral to daily life and have introduced millions of people to technology-driven conversational experiences. With over half the global population now connected to the Internet and accessibility barriers continuing to decline, these systems are poised to play an even greater role in shaping human-technology interaction.
Although text-based interfaces remain the most common form of conversational user interface (CUI), advances in AI foundation models have paved the way for multimodal CUIs that incorporate voice and visual elements. Integrating auditory and visual cues alongside text-based interaction enhances user engagement, particularly in complex use cases such as healthcare guidance and job interviews. This tutorial will provide an overview of cutting-edge research and established best practices for designing and deploying multimodal CUIs. It will also explore the key research challenges that must be addressed to advance these interfaces further.
The tutorial will also showcase the benefits of employing novel conversational interfaces in the domains of human-AI decision-making, health and well-being, information retrieval, and crowd computing. We will discuss the potential of conversational interfaces to facilitate and mediate people's interactions with AI systems. The tutorial will include interactive elements and discussions, and will provide participants with materials for building conversational interfaces.