Wednesday 2 July, 11am, LT3: Verena Rieser (Google DeepMind): Whose Gold? Re-imagining Alignment for Truly Beneficial AI
Abstract: Human feedback is often the "gold standard" for AI alignment, but what if this "gold" reflects diverse, even contradictory human values? This keynote explores the technical and ethical challenges of building beneficial AI when values conflict -- not just between individuals, but also within them. My talk advocates for a dual expansion of the AI alignment framework: moving beyond a single, monolithic viewpoint to a plurality of perspectives, and transcending narrow safety and engagement metrics to promote comprehensive human well-being.
Bio: Verena Rieser is a Senior Staff Research Scientist at Google DeepMind, where she founded the VOICES team (Voices-of-all in alignment). Her team's mission is to enhance Gemini's safety and usability for diverse communities. Verena has pioneered work in data-driven multimodal Dialogue Systems and Natural Language Generation, encompassing conversational RL agents, faithful data-to-text generation, spoken language understanding, evaluation methodologies, and applications of AI for societal good. Verena previously directed the NLP lab as a full professor at Heriot-Watt University, Edinburgh, and held a Royal Society Leverhulme Senior Research Fellowship. She earned her PhD from Saarland University.
Wednesday 4th June, Robotics Summer School, Postgrad Centre: seminars from Jesse Thomason (USC) and Yonatan Bisk (CMU).
Jesse Thomason (USC): Embracing Language as Grounded Communication
Abstract: Language is not text data; it is a human medium for communication. The larger part of the natural language processing (NLP) community has doubled down on treating digital text as a sufficient approximation of language, scaling datasets and corresponding models to fit that text. In this talk, I’ll highlight some of the ways my lab enables agents and robots to better understand and respond to human communication by considering the grounded context in which that communication occurs, including neurosymbolic multimodal reasoning, natural language dialogue and interaction for lifelong learning, and utilizing NLP technologies on non-text communication.
Bio: I am an Assistant Professor at the University of Southern California where I lead the Grounding Language in Actions, Multimodal Observations, and Robots (GLAMOR) Lab. Our research enables agents and robots to better understand and respond to human language by considering the grounded context in which that language occurs. Previously, I was a postdoctoral researcher at the University of Washington, and I received my PhD from UT Austin. Website: https://jessethomason.com/
Yonatan Bisk (CMU): Why is language an embodied problem?
Abstract: We all speak, gesture, and communicate every day, so it may come as naturally to us as walking or breathing, but it is actually a rather unique ability. We agree on the meaning of words without thinking about it, and we construct novel combinations that others have never heard before, yet they have no trouble understanding us; we would like our computers and robots to one day do the same. So, where did we learn the meaning of words? And how does that help us teach robots to understand language? The goal of this talk is to provide an introduction to thinking about Talking to Robots: what makes it interesting, what makes it hard, and what the next steps are for AI.
Bio: Yonatan Bisk is an assistant professor of computer science at Carnegie Mellon University – Language Technologies Institute and Robotics Institute (courtesy). He received his PhD from the University of Illinois at Urbana-Champaign working on unsupervised Bayesian models of linguistic syntax. He runs the CLAW lab, Connecting Language to Action and the World. His group works on grounded and embodied language and communication, placing perception and interaction as central to how language is learned and understood. He has held appointments at USC's ISI (working on grounding), the University of Washington (for commonsense research), Microsoft Research (for vision+language), and Meta Inc (for Embodied AI). Website: http://www.yonatanbisk.com
Tuesday 22 April 1.30-2.30, National Robotarium Atrium: Dr Steve Cousins, Executive Director of the Stanford Robotics Center: "Bring Robotics to the Real World"
Abstract: We have a long history of robotics in manufacturing, and over the past 20 years a second wave of robotics in warehouses - both controlled environments. As we bring robots into the world to operate in unconstrained environments around the general public, new challenges arise. This talk will review work from Willow Garage with early humanoid robots, and some of the challenges of bringing robots to hotels and hospitals. We will introduce the Stanford Robotics Center, which is looking at the next generation of challenges.
Bio: Dr. Steve Cousins is the Executive Director of the Stanford Robotics Center. He founded Relay Robotics, formerly Savioke, serving as CTO and CEO, where he led the development and deployment of Relay – an autonomous delivery robot that works in human environments to help people. Steve was previously President and CEO of Willow Garage, and is a founding board member of the Open Source Robotics Foundation. He received the IEEE/IFR Award for Invention and Entrepreneurship in Robotics and Automation in 2017. Steve did his PhD in Computer Science at Stanford University under the direction of Terry Winograd, and also holds BS and MS degrees in computer science from Washington University.
Wednesday 15th January, 1pm, Robotarium Seminar room (G83): Dr Oya Celiktutan "Designing Socially Acceptable Interactions with Embodied Agents"
Abstract: As embodied agents, such as robots and virtual agents, gain autonomy and integrate into daily environments to assist and collaborate with humans, their success depends not only on task performance but also on adhering to social norms and expectations. In this talk, Oya will explore the definitions, key concepts, and challenges involved in creating socially acceptable agents. Drawing from her ongoing research, she will present practical examples, including how agents can navigate crowded environments with social awareness, explain their actions to users, and learn to imitate human behaviours.
Bio: Dr Oya Celiktutan is a Reader in AI & Robotics at the Centre for Robotics Research in the Department of Engineering and leads the Social AI & Robotics Laboratory at King's College London. She received a BSc degree in Electronics Engineering from Uludag University, and MSc and PhD degrees in Electrical and Electronics Engineering from Bogazici University, Türkiye. During her doctoral studies, she was a visiting researcher at the National Institute of Applied Sciences of Lyon, France. After completing her PhD, she moved to the United Kingdom and worked as a postdoctoral researcher on several projects at Queen Mary University of London, the University of Cambridge, and Imperial College London. Oya’s research focuses on multimodal machine learning to develop autonomous agents, such as robots and virtual agents, capable of seamlessly interacting with humans. This encompasses tackling challenges in multimodal perception, understanding and forecasting human behaviour, and advancing the navigation, manipulation, and social awareness skills of these agents. Her work has been supported by EPSRC, The Royal Society, and EU Horizon programmes, as well as through industrial collaborations. She received the EPSRC New Investigator Award in 2020. Her team’s research has been recognised with several awards, including the Best Paper Award at IEEE RO-MAN 2022, the NVIDIA CCS Best Student Paper Award Runner-Up at IEEE FG 2021, and the First Place Award and Honourable Mention Award at the ICCV UDIVA Challenge 2021.
Friday 6th December, 3pm, Robotarium Seminar room (G83): Prof Nava Tintarev "How do we make explanations of recommendations beneficial to different users?"
Abstract: Many people recognize the importance of explainable AI. This is also the case for personalized online content, which influences decision-making at individual, business, and societal levels. Filtering and ranking algorithms, such as those used in recommender systems, support these decisions. However, we often lose sight of the purpose of these explanations and whether understanding is an end in itself. This talk addresses why we may want to develop decision-support systems that can explain themselves and how we may assess whether we are successful in this endeavor. The talk will describe some of the state-of-the-art explanations in several domains that help link the mental models of systems and people. However, it is not enough to generate rich and complex explanations; more is required to support effective decision-making. This entails decisions about which information to select to show to people and how to present that information, often depending on the target users and contextual factors.
Bio: Nava Tintarev is a Full Professor of Explainable Artificial Intelligence in the Department of Advanced Computing Sciences (DACS) at Maastricht University, where she is also the Director of Research. She leads or contributes to several projects in the field of human-computer interaction in artificial advice-giving systems, such as recommender systems, specifically developing the state of the art for automatically generated explanations (transparency) and explanation interfaces (recourse and control). Prof. Tintarev has developed and evaluated explanation interfaces in a wide range of domains (music, human resources, legal enforcement, nature conservation, logistics, and online search), machine learning tasks (recommender systems, classification, and search ranking), and modalities (text, graphics, and interactive combinations thereof). She currently represents Maastricht University as a Co-Investigator in the ROBUST consortium, pre-selected for a national (NWO) grant with a total budget of 95M (25M from NWO) to carry out long-term (10-year) research into trustworthy artificial intelligence. She has published over 100 peer-reviewed papers in top human-computer interaction and artificial intelligence journals and conferences such as UMUAI, TiiS, ECAI, IUI, RecSys, and UMAP, including best paper awards at CHI, Hypertext, HCOMP, UMAP, and CHIIR. Webpage: http://navatintarev.com
Weds 27th March 1pm, Robotarium Seminar room (G50): Mohan Sridharan "Back to the Future of Cognition and Control in Robotics"
Abstract: In this talk, I will describe my vision and philosophy for designing an architecture for integrated knowledge representation, reasoning, control, and learning in robotics. I will begin by describing the underlying fundamental representational choices, processing commitments, and cognitive principles and theories that allow us to leverage the complementary strengths of knowledge-based and data-driven methods. I will then illustrate the capabilities of the architecture in realistic simulation environments and on physical robots. I will do so in the context of key visual scene understanding, manipulation, embodied AI, and multiagent collaboration problems.
Bio: Prof. Mohan Sridharan is a Chair in Robot Systems in the School of Informatics at the University of Edinburgh (UK). Prior to his current appointment, he held academic positions at the University of Birmingham (UK), The University of Auckland (NZ), and at Texas Tech University (USA). He received his Ph.D. from The University of Texas at Austin (USA). His research interests include knowledge representation and reasoning, cognitive systems, and interactive learning, as applied to robots and agents collaborating with humans. He is also interested in developing algorithms to promote automation and sustainability in domains such as transportation, agriculture, and climate informatics. Web page: https://homepages.inf.ed.ac.uk/msridhar/
Weds 13th March, 1pm, room 9, National Robotarium, first floor: Sandro Pezzelle (University of Amsterdam): "From Word Representation to Communicative Success: Beyond Image-Text Alignment in Language-and-Vision Modeling"
Abstract: By grounding language into vision, multimodal NLP models have a key advantage over purely textual ones: they can leverage signals in one or both modalities and potentially combine this information in any way required by a given communicative context. This ranges from representing single words taking into account their multimodal semantics [1] to resolving semantically underspecified image descriptions [2] to adapting their way of referring to images to achieve communicative success with a given audience [3]. Moving from word-level semantics to real-life communicative scenarios, I will present work investigating the abilities of current language and vision models to account for and deal with semantic and pragmatic aspects of human multimodal communication. I will argue that these abilities are necessary for models to successfully interact with human speakers.
[1] Pezzelle, S., Takmaz, E., Fernández, R. (2021). Word Representation Learning in Multimodal Pre-Trained Transformers: An Intrinsic Evaluation. TACL.
[2] Pezzelle, S. (2023). Dealing with Semantic Underspecification in Multimodal NLP. ACL 2023.
[3] Takmaz, E., Brandizzi, N., Giulianelli, M., Pezzelle, S. and Fernández, R. (2023). Speaking the Language of Your Listener: Audience-Aware Adaptation via Plug-and-Play Theory of Mind. Findings of ACL 2023.
Bio: Sandro Pezzelle is an Assistant Professor in Responsible AI at the ILLC, Faculty of Science, University of Amsterdam, affiliated with the Dialogue Modelling Group. His research combines Natural Language Processing (NLP), Computer Vision, and Cognitive Science, and focuses on multimodal language understanding and generation, behavioral and mechanistic interpretability, and the cognitive mechanisms underlying human semantics. He has published in ACL, EACL, EMNLP, NAACL, TACL, Cognition, and Cognitive Science. He is a member of the ELLIS society, a faculty member of the ELLIS Amsterdam Unit, and a board member of SIGSEM, the ACL special interest group on computational semantics.
Tues March 12th, 1pm, room CMS01: Stefan Ultes: "Towards Natural Behaviour of Dialogue Systems with Explicit Dialogue Control"
Abstract: The goal of dialogue system researchers has always been to create artefacts that offer natural interaction capabilities and effortless communication through the same means humans use to communicate among themselves. Even though systems like ChatGPT are already very good in form and style, there is more to natural dialogue system behaviour than these LLM-based agents are capable of. I believe that achieving it requires additional control capabilities within a dialogue system. In this talk, I will motivate this with insights from an analysis of communication styles in dialogues. I will then focus on work on learning an explicit dialogue control component through reinforcement learning, optimizing for estimated user satisfaction and thus ultimately improving the perceived naturalness of the interaction. I will finish by arguing that this basic idea is still relevant in the age of LLMs.
Bio: Stefan Ultes is a Full Professor of Natural Language Generation and Dialogue Systems at the Otto-Friedrich-University of Bamberg, Germany, and a member of the executive board of the Bamberg Center for Artificial Intelligence (BaCAI). Previously, he led the speech technology research group at Mercedes-Benz Research & Development in Sindelfingen, Germany, and was a research associate in the spoken dialogue systems group at the University of Cambridge, UK.