Accepted papers

Poster Session 1

Challenges of Data-Driven Simulation of Diverse and Consistent Human Driving Behaviors

Kalle Kujanpää, Daulet Baimukashev, Shibei Zhu, Shoaib Azam, Farzeen Munir, Gokhan Alcan, Ville Kyrki

Building simulation environments for developing and testing autonomous vehicles necessitates that the simulators accurately model the statistical realism of the real-world environment, including the interaction with other vehicles driven by human drivers. To address this requirement, an accurate human behavior model is essential to incorporate the diversity and consistency of human driving behavior. We propose a mathematical framework for designing a data-driven simulation model that simulates human driving behavior more realistically than the currently used physics-based simulation models. Experiments conducted using the NGSIM dataset validate our hypothesis regarding the necessity of considering the complexity, diversity, and consistency of human driving behavior when aiming to develop realistic simulators.

Estimating Expert Prior Knowledge from Optimization Trajectories

Ville Tanskanen, Petrus Mikkola, Arto Klami

Bayesian optimization (BO) is a powerful algorithm for optimizing black-box functions, especially in scenarios where evaluating the function is costly, e.g., due to requiring empirical experimentation. BO inherently models the optimization process as sequential, where each iteration consists of two stages: first, updating beliefs about the objective function, and then addressing the exploration-exploitation dilemma based on these beliefs. Even though this process can be purely autonomous, in many instances humans play a significant role, and there is clear demand for improved tools for human-AI collaboration in black-box optimization. We study a particular new task within this scope: how to recover the prior information a human holds over the unknown function, based solely on how they solve the optimization task. We introduce the problem, provide mathematical machinery for solving it, and demonstrate the method and the problem characteristics on simulated data.
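
As background for the setting this paper studies, the sketch below shows a minimal GP-based Bayesian optimization loop; the RBF kernel, the UCB acquisition rule, and the toy objective are illustrative assumptions, not the authors' setup. The paper's task is the inverse one: given such a query trajectory, infer the prior the human held over the function.

```python
# Minimal Bayesian optimization loop (illustrative only; the paper studies the
# inverse problem of recovering the prior from such a trajectory).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def bo_trajectory(f, bounds, n_iters=15, beta=2.0, seed=0):
    """Run GP-UCB on a 1-D black-box function and return the query trajectory."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(*bounds, size=(2, 1))            # two random initial queries
    y = np.array([f(x[0]) for x in X])
    grid = np.linspace(*bounds, 200).reshape(-1, 1)
    for _ in range(n_iters):
        gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                                      normalize_y=True).fit(X, y)
        mu, sigma = gp.predict(grid, return_std=True)
        x_next = grid[np.argmax(mu + beta * sigma)]  # UCB acquisition
        X = np.vstack([X, x_next])
        y = np.append(y, f(x_next[0]))
    return X, y                                      # the observable trajectory

X, y = bo_trajectory(lambda x: -(x - 0.3) ** 2, bounds=(0.0, 1.0))
```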

Advancing Human Modeling in Autonomous Systems: A Perspective on Multi-Task Pedestrian Behavior Prediction

Farzeen Munir, Tomasz Piotr Kucner

Human behavior modeling is crucial for developing socially-aware autonomous vehicles that can safely navigate complex environments. This is particularly true for pedestrian behavior, which is inherently intricate due to the dynamic and unpredictable nature of human movement in urban settings. We formulate the modeling of pedestrian behavior as a combination of trajectory and intention prediction. Traditional methods have predominantly relied on historical trajectory data, often neglecting vital contextual information such as pedestrian-specific characteristics and environmental factors. Most existing studies treat trajectory prediction and intention prediction as separate tasks, even though they are interrelated and interdependent. Recognizing the intertwined nature of trajectory and intention in pedestrian behavior, our work adopts a holistic approach. We utilize a multi-task learning framework that concurrently predicts pedestrian intentions and trajectories. This involves an encoder-decoder architecture that not only considers past trajectories but also integrates a rich set of behavioral and contextual features. Our methodology's effectiveness is demonstrated through performance in trajectory and intention prediction on two leading datasets: JAAD and PIE. These advancements underscore the significance of comprehensive human behavior modeling in enhancing the safety and efficiency of autonomous vehicles in diverse and unpredictable urban environments.

Low-code LLM: Graphical User Interface over Large Language Models

Yuzhe Cai, Shaoguang Mao, Wenshan Wu, Zehua Wang, Yaobo Liang, Tao Ge, Chenfei Wu, You Wang, Qiang Guan, Ting Song, Yan Xia, Furu Wei, Nan Duan

Effectively utilizing LLMs for complex tasks is challenging, often involving a time-consuming and uncontrollable prompt engineering process. This paper introduces a novel human-LLM interaction framework, $\textbf{Low-code LLM}$. It incorporates six types of simple low-code visual programming interactions, all supported by clicking, dragging, or text editing, to achieve more controllable and stable responses. Through visual interaction with a graphical user interface, users can incorporate their ideas into the workflow without writing trivial prompts. The proposed Low-code LLM framework consists of a Planning LLM that designs a structured planning workflow for complex tasks, which can be correspondingly edited and confirmed by users through low-code visual programming operations, and an Executing LLM that generates responses following the user-confirmed workflow. We highlight three advantages of the low-code LLM: controllable generation results, user-friendly human-LLM interaction, and broadly applicable scenarios. We demonstrate its benefits using four typical applications. By introducing this approach, we aim to bridge the gap between humans and LLMs, enabling more effective and efficient utilization of LLMs for complex tasks. All codes and prompts will be publicly available.
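
To make the kind of structured, user-editable workflow described above concrete, here is a hypothetical sketch; the schema, field names, and steps are invented for illustration and are not the paper's actual format.

```python
# Hypothetical sketch of the kind of structured workflow a Planning LLM might
# emit and a user might edit before an Executing LLM follows it. The field
# names and step contents here are illustrative, not the paper's schema.
workflow = {
    "task": "Write a product launch announcement",
    "steps": [
        {"id": 1, "instruction": "Summarize the product's key features"},
        {"id": 2, "instruction": "Draft the announcement for a general audience"},
        {"id": 3, "instruction": "Add a call to action with the launch date"},
    ],
}

# Low-code editing: the user reorders, removes, or rewrites steps via the GUI;
# here we emulate one such edit programmatically.
workflow["steps"][1]["instruction"] = "Draft the announcement for developers"

# The Executing LLM would then be prompted step by step with the confirmed plan.
prompt = "\n".join(f"Step {s['id']}: {s['instruction']}" for s in workflow["steps"])
print(prompt)
```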

Investigating the Benefits of Nonlinear Action Maps in Data-Driven Teleoperation

Michael Przystupa, Gauthier Gidel, Matthew E. Taylor, Martin Jägersand, Justus Piater, Samuele Tosatto

As robots become more common for both able-bodied individuals and those living with a disability, it is increasingly important that lay people be able to drive multi-degree-of-freedom platforms with low-dimensional controllers. One approach is to use state-conditioned action mapping methods to learn mappings between low-dimensional controllers and high-DOF manipulators -- prior research suggests these mappings can simplify the teleoperation experience for users. Recent works suggest that neural networks predicting a local linear function are superior to the typical end-to-end multi-layer perceptrons because they allow users to more easily undo actions, providing more control over the system. However, local linear models assume actions exist on a linear subspace and may not capture nuanced actions in training data. We observe that the benefit of these mappings stems from their being odd functions with respect to user actions, and propose end-to-end nonlinear action maps which achieve this property. Unfortunately, our experiments show that such modifications offer minimal advantages over previous solutions. We find that nonlinear odd functions behave linearly for most of the control space, suggesting architectural improvements are not the primary factor in data-driven teleoperation. Our results suggest other avenues, such as data augmentation techniques and analysis of human behavior, are necessary for action maps to become practical in real-world applications, such as assistive robotics to improve the quality of life of people living with a disability.

Spatio-Temporal Video Grounding of Human Actions for Human-in-the-Loop Artificial Intelligence

Hans Tiwari, Selen Pehlivan, Jorma Laaksonen

In this paper, we study spatio-temporal video grounding, a recently introduced computer vision task, for human-in-the-loop artificial intelligence setups. We hypothesize an AI expert system capable of giving instructions for building or servicing a technical apparatus. The task of the spatio-temporal video grounding subsystem in the setup is to observe the actions of the human operator, ground the given instructions in the video frames, and provide this information back to the AI system for verification and preparation of the next instruction. Our experiments were carried out with the large HC-STVG dataset of human action videos. The results of the experiments show that our proposed enhancements to the state-of-the-art STCAT architecture provide improved performance, especially in cases where the operated objects are small in their spatial size. Spatio-temporal video grounding will prove to be a necessary building block in future assistive AI systems that relate to human actions and their automated observation.

Towards Student-Centric AI-Supported Learning: Teaching Chatbots to Ask the Right Questions

Lucile Alys Favero, Juan Antonio Pérez-Ortiz, Tanja Käser, Nuria M Oliver

LLM-based chatbots have the potential to profoundly transform education by providing instant and personalized responses to queries on a broad set of topics. However, essential aspects of human learning, such as critical thinking and the development of meta-cognition, remain elusive for today's chatbots. In this paper, we propose Maike, a novel educational chatbot designed to engage students in Socratic dialogues by asking questions rather than providing immediate answers. Grounded in educational psychology theories, this approach promotes critical thinking, purposeful learning, and self-efficacy. Maike includes a Reinforcement Learning-based content planner to personalize the student's learning path, which is dynamically updated based on their responses to Socratic prompts generated by an LLM. The goal is to improve learning outcomes through enhanced motivation and active learning. Maike has the potential to improve educational experiences via self-regulation and dynamic adaptation, particularly in online and remote learning environments.

Detecting and Discouraging Deception via Anomaly Detection

Nitay Alon, Lion Schulz, Peter Dayan, Jeffrey Rosenschein, Joseph M Barnby

Agents with limited computational capacities are susceptible to manipulation by resource-abundant agents. This power imbalance is rooted in theoretical principles; however, though unable to reason explicitly about the deceptive moves of stronger agents, computationally weaker agents might still resist malign effects by comparing observed behaviour with an expected one, and resent if these two do not match. In this work we show how incorporating an Information Theory (IT) inspired anomaly detection mechanism, called XIPOMDP, into the computationally constrained agent's inference processes inhibits deceptive behaviour in mixed-motive games. We illustrate this using agents with various degrees of Theory of Mind (ToM) in the Iterated Ultimatum Game (IUG), concluding that this mechanism indeed impedes more complex agents from engaging in manipulative behaviour and also yields a more egalitarian outcome.
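
For readers unfamiliar with the general idea, the sketch below shows a generic information-theoretic anomaly check of the kind alluded to, comparing observed against expected offer distributions; it is not the paper's XIPOMDP mechanism, and the distributions and threshold are made up.

```python
# Generic information-theoretic anomaly check (NOT the paper's XIPOMDP model):
# the weaker agent compares the offers it observes in an Iterated Ultimatum
# Game with the distribution it expects, and "resents" when surprise is large.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    p, q = np.asarray(p, float) + eps, np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

expected_offers = [0.05, 0.15, 0.40, 0.30, 0.10]   # belief over offer bins (hypothetical)
observed_offers = [0.50, 0.30, 0.15, 0.04, 0.01]   # empirical frequencies of low offers

surprise = kl_divergence(observed_offers, expected_offers)
resent = surprise > 0.5                             # threshold is an arbitrary choice
print(f"KL = {surprise:.2f}, resent = {resent}")
```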

Learning to Synthesize Novel Molecules from Human Feedback

Yujia Guo, Shibei Zhu, Sebastiaan De Peuter, Andrew Howes, Samuel Kaski

Retrosynthesis planning is a critical process in synthetic chemistry involving the recursive deconstruction of a target molecule to discover appropriate reactants. Challenges such as the vastness of the chemical search space often hinder chemists from identifying optimal chemical transformations. While several software tools exist to aid in various stages of retrosynthesis planning, they typically lack the flexibility to adapt to chemists' specific preferences and requirements. We propose a chemist-in-the-loop retrosynthesis planning framework. Our goal is to design an AI assistant that more closely aligns with human goals via Reinforcement Learning from Human Feedback (RLHF). It leverages a pre-trained reward model, further fine-tuned based on human preferences efficiently inferred from noisy feedback through a computational rationality user model. The route generator is guided by Q-values derived from this reward model, aligning the generated routes more closely with user specifications and preferences. This AI assistant not only facilitates more personalized route generation but also offers enhanced decision-making support during route evaluation.

Learning how Humans Learn to Play Board Games with GPT-4IAR

Victor Yeom-Song, Xinlei (Daisy) Lin, Ionatan Kuperwajs, Heiko H. Schütt, Wei Ji Ma, Luigi Acerbi

We present GPT-4IAR, a transformer neural network architecture for modeling and predicting human behavior in the board game four-in-a-row (4IAR). Experiments show that conditioning action predictions on longer histories of previous moves leads to improved accuracy over prior state-of-the-art models, hinting at longer-term strategic biases in human gameplay. Reaction time prediction is also explored, showing promise in capturing meaningful gameplay statistics beyond raw actions. This work ultimately aims to produce a faithful emulator of human cognition to afford detailed investigation into how humans plan and make decisions.
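
To make the idea of history-conditioned move prediction concrete, here is a minimal PyTorch sketch; the 4x9 board encoding (36 cells), hyperparameters, and architecture details are assumptions for illustration and differ from the actual GPT-4IAR model.

```python
# Minimal sketch of a history-conditioned next-move predictor (illustrative,
# not the GPT-4IAR architecture or training setup).
import torch
import torch.nn as nn

class MovePredictor(nn.Module):
    def __init__(self, n_cells=36, d_model=64, n_heads=4, n_layers=2, max_len=36):
        super().__init__()
        self.move_emb = nn.Embedding(n_cells, d_model)   # which cell was played
        self.pos_emb = nn.Embedding(max_len, d_model)    # position in the move history
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_cells)          # logits over the next move

    def forward(self, moves):                            # moves: (batch, history_len)
        seq_len = moves.size(1)
        pos = torch.arange(seq_len, device=moves.device)
        h = self.move_emb(moves) + self.pos_emb(pos)
        causal = torch.triu(torch.full((seq_len, seq_len), float("-inf"),
                                       device=moves.device), diagonal=1)
        h = self.encoder(h, mask=causal)                 # each step attends only to the past
        return self.head(h)                              # (batch, history_len, n_cells)

model = MovePredictor()
history = torch.randint(0, 36, (8, 10))                 # 8 games, 10 moves each
logits = model(history)                                  # a prediction after every prefix
```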

Concept Alignment as a Prerequisite for Value Alignment

Sunayana Rane, Mark K Ho, Ilia Sucholutsky, Thomas L. Griffiths

Value alignment is essential for building AI systems that can safely and reliably interact with people. However, what a person values---and is even capable of valuing---depends on the concepts that they are currently using to understand and evaluate what happens in the world. The dependence of values on concepts means that concept alignment is a prerequisite for value alignment---agents need to align their representation of a situation with that of humans in order to successfully align their values. Here, we formally analyze the concept alignment problem in the inverse reinforcement learning setting, show how neglecting concept alignment can lead to systematic value mis-alignment, and describe an approach that helps minimize such failure modes by jointly reasoning about a person's concepts and values. Additionally, we report experimental results with human participants showing that humans reason about the concepts used by an agent when acting intentionally, in line with our joint reasoning model.

AI-assisted Auction Design

Xiaomei Mi, Jianhong Wang, Samuel Kaski

In auction mechanism design, current research primarily targets automated design with clearly defined objectives, such as maximising revenue, generally excluding human involvement. However, when objectives are not precisely known, these automatic methods fall short. In such cases, we propose to help the auction mechanism designer by introducing an AI assistant which interactively recommends auction rules and shares the same interface in the auction environment with the designer. The cooperating pair of designer and AI assistant can be conceptualised as a `centaur'. In this centaur setting, the interactions can be summarised in three phases: 1) the AI assistant offers recommendations to the designer; 2) the designer accepts or rejects these recommendations based on their implicit user model; 3) the AI assistant updates its beliefs based on the designer's decision. Experiments on repeated Myerson auctions with the designer's tacit preferences show that AI assistance improves performance compared to (simulated) humans designing auctions without assistance.

FARPLS: Eliciting Preference Feedback for Robot Trajectories from Human Users

Hanfang Lyu, Yuanchen Bai, Xin Liang, Ujaan Das, Chuhan Shi, Leiliang Gong, Yingchi LI, Mingfei Sun, Ming Ge, Xiaojuan Ma

Preference-based learning aims to align robot task objectives with human values. One of the most common methods to infer human preferences is pairwise comparison of robot task trajectories. Traditional comparison-based preference labeling systems seldom support labelers in digesting and identifying critical differences between complex trajectories recorded in videos. Our formative study (N = 12) suggests that individuals may overlook non-salient task features and establish biased preference criteria during their preference elicitation process because of partial observations. In addition, they may experience mental fatigue when given many pairs to compare, causing their label quality to deteriorate. To mitigate these issues, we propose FARPLS, a Feature-Augmented Robot trajectory Preference Labeling System. FARPLS highlights potential outliers in a wide variety of task features that matter to humans and extracts the corresponding video keyframes for easy review and comparison. It also dynamically adjusts the labeling order according to users' familiarity, the difficulty of the trajectory pair, and the level of disagreement. At the same time, the system monitors labelers' consistency and provides feedback on labeling progress to keep labelers engaged. A between-subjects study (N = 42, 105 pairs of robot pick-and-place trajectories per person) shows that FARPLS can help users establish preference criteria more easily and notice more relevant details in the presented trajectories than the conventional interface. FARPLS also improves labeling consistency and engagement, mitigating challenges in preference elicitation without significantly raising cognitive load.

From Emotion AI to Cognitive AI

Yante Li, Guoying Zhao, Qianru Xu, Zhaodong Sun

Cognitive computing is recognized as the next era of computing. In order to make hardware and software systems more human-like, emotion AI and cognitive AI, which simulate human intelligence, are at the core of real AI. Currently, the boom of sentiment analysis and affective computing in computer science has driven the rapid development of emotion AI. However, research on cognitive AI has only started in recent years. In this visionary paper, we briefly review the current development of emotion AI, introduce the concept of cognitive AI, and propose the envisioned future of cognitive AI, which intends to let computers think, reason, and make decisions in ways similar to how humans do. Important aspects of cognitive AI, in terms of engagement, regulation, decision making, and discovery, are further discussed. Finally, we propose important directions for constructing future cognitive AI, including data and knowledge mining, multi-modal AI explainability, hybrid AI, and potential ethical challenges.

Shifting Skills Across Users and Generative AI Applications: The Case of Insights in Information Visualization

Giulio Jacucci, Chen He

This position paper reflects on the skills involved in human-computer collaboration in the era of generative AI. As an example, the paper presents a prototype system to investigate how users generate insights with a CO2 visualization aided by either web search or an LLM chatbot. We review how human-computer collaboration has evolved and which frameworks on skills can be helpful to critically examine human-AI collaboration.

Poster Session 2

How Does User Behavior Evolve During Exploratory Visual Analysis?

Sanad Saha, Nischal Aryal, Leilani Battle, Arash Termehchy

Exploratory visual analysis (EVA) is an essential stage of the data science pipeline, where users often lack clear analysis goals at the start and iteratively refine them as they learn more about their data. Accurate models of users' exploration behavior are becoming increasingly vital to developing responsive and personalized tools for exploratory visual analysis. Yet we observe a discrepancy between the static view of human exploration behavior adopted by many computational models versus the dynamic nature of EVA. In this paper, we explore potential parallels between the evolution of users' interactions with visualization tools during data exploration and assumptions made in popular online learning techniques. Through a series of empirical analyses, we seek to answer the question: how might users' exploration behavior evolve in response to what they have learned from the data during EVA? We present our findings and discuss their implications for the future of user modeling for system design.

AI-assistance for Integrating Expert Fairness Decisions in ML Models

Elena Shaw, Ayush Bharti, Pierre-Alexandre Murena, Samuel Kaski

Despite recent attempts in machine learning (ML) to formalize fairness generally, there is growing recognition that fairness is intrinsically tied to context. Data alone cannot create contextually fair ML models and human input is necessary to provide the relevant context. Human input can be implicit -- via modeling assumptions and modeler expertise -- or explicit -- via elicitation and learning. In the latter case, prior works have made various simplifying assumptions, either regarding the expert or the fairness criterion, which cannot account for a given expert's domain knowledge in the fairness-accuracy trade-off. This inability to adapt leads to sub-optimal query strategies and results in less efficient learning. Our method improves on this line of work by specifically accounting for the given expert, freeing us from an a priori definition of fairness, and allowing us to learn a more efficient query strategy based on past interactions with the expert.

Online Learning of Human Constraints from Feedback in Shared Autonomy

Shibei Zhu, Tran Nguyen Le, Samuel Kaski, Ville Kyrki

Real-time collaboration with humans poses challenges due to the different behavior patterns of humans resulting from diverse physical constraints. Existing works typically focus on learning safety constraints for collaboration, or on how to divide and distribute the subtasks between the participating agents to carry out the main task. In contrast, we propose to learn a human constraints model that, in addition, considers the diverse behaviors of different human operators. We consider a type of collaboration in a shared-autonomy fashion, where both a human operator and an assistive robot act simultaneously in the same task space and affect each other's actions. The task of the assistive agent is to augment the skill of humans in performing a shared task by supporting them as much as possible, both in terms of reducing the workload and minimizing the discomfort of the human operator. Therefore, we propose an augmentative assistant agent capable of learning and adapting to human physical constraints, aligning its actions with the ergonomic preferences and limitations of the human operator.

Learning from Human Preferences with a Computationally Rational Model of Human Choice Behavior

Sebastiaan De Peuter, Andrew Howes, Samuel Kaski

Learning from human preferences has allowed us to train AI systems to summarize text, answer questions, and more. These systems learn by asking human evaluators to choose a preferred behavior from a number of options. Models of human choice are then used to infer from these expressed preferences a general utility function that can be used for further training. However, the human choice models commonly used for this purpose are too simple to reproduce well-known biases in human choice behavior. To address this, we propose to use a recent model of human choice grounded in cognitive science for preference learning. We adapt this model for a preference learning setting and introduce surrogates to make inference tractable. Finally, we show experimentally that on simulated biased choices our proposed model produces better inferences than current baselines.
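
For contrast with the cognitively grounded model the paper proposes, the sketch below shows the kind of simple choice model commonly used in preference learning, a Bradley-Terry style logistic model fit by maximum likelihood; the items and comparisons are invented for illustration.

```python
# Conventional baseline for inferring utilities from expressed preferences:
# P(choose A over B) = sigmoid(u(A) - u(B)), fit by gradient ascent on the
# log-likelihood. This is the kind of simple model the paper argues against.
import numpy as np

def fit_utilities(prefs, n_items, lr=0.1, n_steps=2000):
    """prefs: list of (winner, loser) index pairs; returns a utility per item."""
    u = np.zeros(n_items)
    for _ in range(n_steps):
        grad = np.zeros(n_items)
        for w, l in prefs:
            p = 1.0 / (1.0 + np.exp(-(u[w] - u[l])))   # P(winner preferred)
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        u += lr * grad / len(prefs)
    return u - u.mean()                                 # utilities are shift-invariant

prefs = [(0, 1), (0, 2), (1, 2), (0, 1), (2, 1)]        # hypothetical comparisons
print(fit_utilities(prefs, n_items=3).round(2))
```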

The Role of Higher-Order Cognitive Models in Active Learning

Oskar Keurulainen, Gokhan Alcan, Ville Kyrki

Building machines capable of efficiently collaborating with humans has been a longstanding goal in artificial intelligence. Especially in the presence of uncertainties, optimal cooperation often requires that humans and artificial agents model each other's behavior and use these models to infer underlying goals, beliefs or intentions, potentially involving multiple levels of recursion. Empirical evidence for such higher-order cognition in human behavior is also provided by previous works in cognitive science, linguistics, and robotics. We advocate for a new paradigm of active learning from human feedback that utilises humans as active data sources while accounting for their higher levels of agency. In particular, we discuss how an increasing level of agency results in qualitatively different forms of rational communication between an active learning system and a teacher. Additionally, we provide a practical example of active learning using a higher-order cognitive model, accompanied by a computational study that underscores the unique behaviors this model produces.

Generative AI as Collaborative Technology for ASHA Workers in Marginalized Communities

Nimisha Karnatak, Max Van Kleek, Nigel Shadbolt

In India, Accredited Social Health Activists (ASHA) workers are crucial in delivering healthcare to marginalized rural communities. However, their effectiveness is hampered by challenges in accessing medical knowledge and decision-support tools. AI advancements offer potential to enhance ASHA capabilities, but cognitive biases' impact on AI collaboration remains a concern for its adoption. This thematic discourse analysis investigates biases at the human-AI interface from a cognitive science perspective. It uncovers automation and confirmation biases, compounded by difficulties in interpreting complex AI behaviors. Through participatory design, frontline health workers can help in creating tools aligned with their needs. The proposed research, employing qualitative methods and co-design, aims to ensure cultural relevance while addressing frictions, barriers, and training gaps. It seeks to set guidelines for ethical AI integration. This approach strives for better access and dignity, prioritizing empathetic solutions over just technological implementation. The objective is to validate interventions fostering mutual understanding, essential for tackling inherent inequities in the current system.

A Workflow for Computationally Rational Models of Human Behavior

Suyog Chandramouli, Danqing Shi, Aini Putkonen, Sebastiaan De Peuter, Shanshan Zhang, Jussi Jokinen, Andrew Howes, Antti Oulasvirta

Human-AI collaboration is emerging as a domain for facilitating improved creativity, decision-making and problem-solving in interactive settings by combining the strengths of humans and machines in a complementary manner. A key challenge for this is to actively understand, anticipate, and adapt to the human user's goals and capacities in an online manner with minimal disruption to the user. Computational user models that approximate the interaction behavior of the user can be useful in this setting -- the theory of computational rationality provides a promising way to build user models for interaction (Oulasvirta et al., 2022). According to it, human behavior adapts to maximize expected utility in a given situation under the constraints imposed by the environment and finite cognitive resources. Computational models can be instantiated in a reinforcement learning framework, which involves three key steps: i) specifying rewards and cognitive resources, ii) formulating a sequential decision-making problem to control such resources to maximize expected reward, and iii) using methods like deep reinforcement learning to approximate the optimal policy. Despite successes in reproducing human-like adaptive behavior across everyday tasks, these models are complicated to build. The design of psychological assumptions, as well as machine learning-related decisions concerning policy optimization, parameter inference, and model selection, are tangled and full of pitfalls. We lay out a workflow focused on this family of models, with the aim of improving their validity and applicability for collaborative settings.

Differentially Private Probabilistic User Modelling

Amir Sonee, Alex Hämäläinen, Lukas Prediger, Samuel Kaski

Differentiable surrogate user models are a key ingredient in building computationally efficient advanced user models of cognitive backgrounds for interactive collaborative AI scenarios in AI-assisted systems. User models are needed to adapt to the user and the task, and differentiable surrogates efficiently approximate user models, rendering them applicable online in real time. However, this adaptation gives rise to privacy concerns about the underlying user's data. In this paper, we study the problem of privacy-preserving learning of such surrogates; learning from other users' data would obviously be helpful, but with sensitive data it can only be done with privacy protection. We propose a solution based on differentially private meta-learning of latent variable probabilistic models. We demonstrate that the proposed algorithm enables an AI assistant to successfully model users' cognitive behaviours even without direct access to the users' data, with an accuracy within 10\% of the non-private case on average. This will enable trustworthy incorporation of users into probabilistic programming models.
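
As background, the sketch below shows the generic Gaussian mechanism (per-example clipping plus noise) that underlies most differentially private learning; it is not the paper's meta-learning algorithm, and the clip norm and noise multiplier are arbitrary choices.

```python
# Generic Gaussian mechanism for differentially private gradient aggregation
# (illustrative only, not the paper's method).
import numpy as np

def dp_mean_gradient(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))  # bound sensitivity
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)                  # noisy average

grads = [np.random.randn(5) for _ in range(32)]   # stand-in per-user gradients
print(dp_mean_gradient(grads).round(3))
```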

Differentiable User Models

Alex Hämäläinen, Mustafa Mert Çelikok, Samuel Kaski

Probabilistic user modeling is essential for building machine learning systems in the ubiquitous cases with humans in the loop. However, modern advanced user models, often designed as cognitive behavior simulators, are incompatible with modern machine learning pipelines and computationally prohibitive for most practical applications. We address this problem by introducing widely-applicable differentiable surrogates for bypassing this computational bottleneck; the surrogates enable computationally efficient inference with modern cognitive models. We show experimentally that modeling capabilities comparable to the only available solution, existing likelihood-free inference methods, are achievable with a computational cost suitable for online applications. Finally, we demonstrate how AI-assistants can now use cognitive models for online interaction in a menu-search task, which has so far required hours of computation during interaction.

A Statistical Framework for Measuring AI Reliance

Ziyang Guo, Yifan Wu, Jason Hartline, Jessica Hullman

Humans frequently make decisions with the aid of artificially intelligent (AI) systems, with the AI recommending an action and the human retaining control over the final decision. Researchers have identified the critical goal of ensuring that a human has appropriate reliance on an AI to achieve complementary performance. We argue that the current definition of appropriate reliance used in such research lacks formal statistical grounding and can lead to contradictions. We propose a formal definition of reliance, based on statistical decision theory, which separates the concepts of reliance as the probability the decision-maker follows the AI's prediction from challenges a human may face in forming accurate beliefs about the situation. Our definition gives rise to a framework that can be used to guide the design and interpretation of studies on human-AI complementarity and reliance.
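
As a toy illustration of the separation the proposed definition makes, reliance can be measured as the probability of following the AI's recommendation, independently of how accurate the final decisions turn out to be; the records below are fabricated.

```python
# Toy example: reliance = P(human follows AI), kept separate from accuracy.
records = [  # (ai_recommendation, human_decision, ground_truth)
    (1, 1, 1), (0, 0, 1), (1, 0, 0), (1, 1, 0), (0, 0, 0), (1, 1, 1),
]

reliance = sum(h == a for a, h, _ in records) / len(records)       # P(follow AI)
human_accuracy = sum(h == y for _, h, y in records) / len(records)
ai_accuracy = sum(a == y for a, _, y in records) / len(records)

print(f"reliance={reliance:.2f}, human acc={human_accuracy:.2f}, AI acc={ai_accuracy:.2f}")
```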

Hypothesis- and Structure-based prompting for medical and business diagnosis

Juyeon Heo, Kyunghyun Lee, Hyonkeun Joh, Umang Bhatt, Adrian Weller

In real-world scenarios like healthcare and business, tackling many-to-one problems is challenging but crucial. Take medical diagnosis: a patient's chief complaint can be caused by various diseases, yet time and resource constraints make identifying the cause difficult. To tackle these issues, our study introduces Hypothesis-based and Structure-based (HS) prompting, a method designed to enhance the problem-solving capabilities of Large Language Models (LLMs). Our approach starts by efficiently breaking down the problem space using a Mutually Exclusive and Collectively Exhaustive (MECE) framework. Armed with this structure, LLMs generate, prioritize, and validate hypotheses through targeted questioning and data collection. The ability to ask the right questions is crucial for pinpointing the root cause of a problem accurately. We provide an easy-to-follow guide for crafting examples, enabling users to develop tailored HS prompts for specific tasks. We validate our method through diverse case studies in business consulting and medical diagnosis, which are further evaluated by domain experts. Interestingly, adding the single sentence ``You can request one data in each response if needed'' initiates human interaction and improves performance.
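
To give a feel for what an HS-style prompt might look like, here is a hypothetical skeleton; the paper's actual prompt wording and examples differ, and only the final quoted sentence is taken verbatim from the abstract.

```python
# A hypothetical skeleton of an HS-style prompt (illustrative; not the paper's
# actual prompt text except for the final quoted sentence).
hs_prompt = """You are assisting with a diagnosis task.
1. Structure the problem space as a MECE tree of possible causes
   (mutually exclusive, collectively exhaustive).
2. Generate hypotheses for the most likely branches and rank them.
3. Validate each hypothesis by asking one targeted question at a time
   and updating the ranking with the answer.
You can request one data in each response if needed.

Chief complaint: {complaint}
"""

print(hs_prompt.format(complaint="persistent cough for three weeks"))
```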

Exploratory Training: When Annotators Learn About Data

Rajesh Shrestha, Omeed Habibelahian, Arash Termehchy, Paolo Papotti

ML systems often present examples and solicit labels from users to learn a target model, i.e., active learning. However, due to the complexity of the underlying data, users may not initially have a perfect understanding of the effective model and may not know the accurate labeling. For example, a user who is training a model for detecting noisy or abnormal values may not perfectly know the properties of typical and clean values in the data. Users may improve their knowledge about the data and target model as they observe examples during training. As users gradually learn about the data and model, they may revise their labeling strategies. Current systems assume that users always provide correct labels, with potentially a fixed and small chance of annotation mistakes. Nonetheless, if the trainer revises their belief during training, such mistakes become significant and non-stationary. Hence, current systems consume incorrect labels and may learn inaccurate models. In this paper, we build theoretical underpinnings and design algorithms to develop systems that collaborate with users to learn the target model accurately and efficiently. At the core of our proposal, a game-theoretic framework models the joint learning of user and system to reach a desirable eventual stable state, where both user and system share the same belief about the target model. We extensively evaluate our system using user studies over various real-world datasets and show that our algorithms lead to accurate results with a smaller number of interactions compared to existing methods.

Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination

Yang Li, Shao Zhang, Jichen Sun, Wenhao Zhang, Yali Du, Ying Wen, Xinbing Wang, Wei Pan

Achieving coordination between humans and artificial intelligence in scenarios involving previously unencountered humans remains a substantial obstacle within Zero-Shot Human-AI Coordination, which aims to develop AI agents capable of efficiently working alongside previously unknown human teammates. Traditional algorithms have aimed to collaborate with humans by optimizing fixed objectives within a population, fostering diversity in strategies and behaviors. However, these techniques may lead to learning loss and an inability to cooperate with specific strategies within the population, a phenomenon named cooperative incompatibility. To mitigate this issue, we introduce the \textbf{C}ooperative \textbf{O}pen-ended \textbf{LE}arning (\textbf{COLE}) framework, which formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy. We put forth a practical algorithm incorporating insights from game theory and graph theory, e.g., Shapley Value and Centrality. We also show that COLE could effectively overcome the cooperative incompatibility from theoretical and empirical analysis. Subsequently, we created an online Overcooked human-AI experiment platform, the COLE platform, which enables easy customization of questionnaires, model weights, and other aspects. Utilizing the COLE platform, we enlist 130 participants for human experiments. Our findings reveal a preference for our approach over state-of-the-art methods using a variety of subjective metrics. Moreover, objective experimental outcomes in the Overcooked game environment indicate that our method surpasses existing ones when coordinating with previously unencountered AI agents and the human proxy model. Our code and demo are publicly available at https://sites.google.com/view/cole-2023.

Can ChatGPT Read Who You Are?

Erik Derner, Dalibor Kučera, Nuria M Oliver, Jan Zahálka

The interplay of artificial intelligence (AI) and psychology, particularly in personality assessment, represents an important emerging area of research. Accurate personality trait estimation is crucial not only for enhancing personalization in human-computer interaction but also for a wide variety of applications ranging from mental health to education. This paper analyzes the capability of a generic chatbot (ChatGPT) to effectively infer personality traits from short texts. We report the results of a comprehensive user study featuring texts written in Czech by a representative population sample of 155 participants. Their self-assessments based on the Big Five Inventory (BFI) questionnaire serve as the ground truth. We compare the personality trait estimations made by ChatGPT against those by human raters and a dedicated machine learning-based personality analysis system. The results illustrate ChatGPT's competitive performance in inferring personality traits from text. We uncover a 'positivity bias' in ChatGPT's assessments across all personality dimensions. We also explore the impact of prompt composition on accuracy. This work contributes to the understanding of AI capabilities in psychological assessment, highlighting both the potential and limitations of using large language models for personality inference. It underscores the importance of responsible AI development, considering ethical implications such as privacy, consent, autonomy, and bias in AI applications.

Contextual Policies Enable Efficient and Interpretable Inverse Reinforcement Learning for Populations

Ville Tanskanen, Chang Rajani, Perttu Hämäläinen, Christian Guckelsberger, Arto Klami

Inverse reinforcement learning (IRL) methods learn a reward function from expert demonstrations such as human behavior, offering a practical solution for crafting reward functions for complex environments. However, IRL is computationally expensive when applied to large populations of demonstrators, as existing IRL algorithms require solving a separate reinforcement learning (RL) problem for each individual. We propose a new IRL approach that relies on contextual RL, where an optimal policy is learned for multiple contexts.
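
To illustrate the contextual-policy idea in code, a minimal sketch is given below; the architecture, dimensions, and context encoding are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of a contextual policy: one network outputs an action
# distribution from both the state and a per-demonstrator context vector, so a
# separate policy need not be trained for every individual.
import torch
import torch.nn as nn

class ContextualPolicy(nn.Module):
    def __init__(self, state_dim=8, context_dim=3, n_actions=4, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + context_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state, context):
        return self.net(torch.cat([state, context], dim=-1))   # action logits

policy = ContextualPolicy()
state = torch.randn(16, 8)        # batch of states
context = torch.randn(16, 3)      # e.g. parameters describing an individual demonstrator
logits = policy(state, context)   # one forward pass serves the whole population
```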