WP5 Description

(from the proposal)

Objectives

The main goal of WP5 is to promote the study of foundations, techniques, algorithms and tools for allowing autonomous AI agents to decide and learn how to act. The major challenge is integrating data-based methods with model-based methods by learning first-order symbolic models from non-symbolic data, to allow flexible and compositional reasoning and planning in terms of objects and relations. The interest in particular is to learn meaningful dynamic models from data that allow reasoning and explanation. Apart from the specific scientific work within the project itself, one of the most important objectives of this work package is to pave the way for research on the topic of “How an AI agent decides and learns how to act” that is multidisciplinary, involving planning, knowledge representation, synthesis and verification in formal methods, reinforcement learning in non-Markovian models, neuro-symbolic relational methods, and deep learning.

The work is divided into 4 “scientific challenge tasks”, i.e., tasks addressing the 4 main scientific challenges in the theme, plus 2 extra tasks, one on cross-fertilization with industry and one on fostering a scientific community dedicated to this theme. Each scientific challenge task continuously interacts with the latter two, providing input and receiving feedback and challenges, so that the research activities follow a closed-loop approach. Note that the scientific challenges will have strong synergies with WP2, WP3, WP4 and WP6.


Task 5.1: Extended and multi-facet models of the world dynamics and tasks (M1-M36: Task Lead: UOR): To study foundations, techniques, algorithms and tools for handling models of the world and task specifications that are fully realistic, including giving up Markovian assumptions, handling first-order representations, and adopting task specifications based on formalisms used in formal methods. Of particular interest is to study foundations, techniques, algorithms and tools for reasoning about and learning multi-facet models; reasoning and planning on multiple representations of the world; planning and acting with multiple representations; tolerant models and tolerant plans (plans that work in a reference model + variations); learning hierarchical and compositional models; learning hierarchies, i.e., models at different levels of abstraction; learning hierarchical problem solving strategies represented as hierarchical automata and formal grammars.

Task 5.2: Integrating data-based methods with model-based methods in deciding and learning how to act (M1-M36: Task Lead: UPF): To study foundations, techniques, algorithms and tools for integrating data-based methods with model-based methods by learning first-order symbolic models from non-symbolic data, to allow flexible and compositional reasoning and planning in terms of objects and relations. Of particular interest is to study foundations and methods for learning symbolic representations (a) from images by leveraging generative models and languages for graphical representation; (b) from audio data to uncover action models.

Task 5.3: Learning for reasoners and planners, and reasoners and planners for learning (M1-M36: Task Lead: UNIBAS): To study foundations, techniques, algorithms and tools for integrating learning into reasoners and planners and vice versa. On the one hand, this includes learning how to process models to facilitate reasoning and planning, and learning that improves problem solving; on the other hand, it includes adopting reasoners, planners, and symbolic models to drive the learning. Of particular interest is to study foundations and methods for (a) learning procedural control knowledge and reformulating problem representations to improve problem solving; (b) modelling planning problem solving as a learning task.

Task 5.4: Monitoring and controlling to make AI actions trustworthy in the real world (M1-M36: Task Lead: FBK): To study foundations, techniques, algorithms and tools for devising and learning meaningful dynamic models that mix human-understandable fluents with features that are not human-understandable; updating and correcting imperfect models, detecting problems in a model; learning from failures; learning (soft) constraints on the model when the model fails; mixing prior human dynamic knowledge / models with learning from data; grey-box systems; reasoning/planning modulo no theory; semi-structured models. The ultimate aim is to allow for human control and oversight, to make AI-based deliberation and acting trustworthy.

Task 5.5: Synergies Industry, Challenges, Roadmap concerning autonomous actions in AI systems (M1-M36: Task Lead: CNRS-IRIT) See Instrument 3, Section 1.3.2.3 for a description.

Task 5.6: Fostering the AI scientific community on the theme of deciding and learning how to act (M1-M36: Task Lead: RWTH) See Instrument 3, Section 1.3.2.3 for a description.


Deliverables

D5.1: Foundations, techniques, algorithms and tools for allowing autonomous AI agents to decide and learn how to act (report) [M18, v1; M36, v2, UOR] Report on the novel insights, techniques, algorithms and tools developed within the scientific challenge tasks T5.1, T5.2, T5.3, T5.4.

D5.2: Synergies Industry, Challenges, Roadmap concerning autonomous acting in AI systems (report) [M18, v1; M36, v2, UOR]