Mini-Symposium on Optimisation in Machine Learning


June 23, 2023, Ghent, Belgium

Aim and focus

The goal of this mini-symposium is to bring together researchers interested in optimisation in machine learning. It is meant to provide a place to discuss the most recent developments in optimisation for machine learning problems and to explore new research directions in this field.


Program 

Abstract: We consider the problem of optimizing complex macro-averaged performance metrics in multi-label classification, that is, metrics that decompose linearly into a sum of binary classification utilities applied separately to each label. We focus on tasks in which the algorithm is required to predict exactly k labels for each instance, leading to macro-at-k metrics. This constraint couples the otherwise independent binary classification tasks, leading to a much more challenging optimization problem than standard macro-averages. We analyze the problem in two different statistical regimes, namely the expected test utility (ETU) and the population utility (PU) frameworks. The former aims at optimizing the expected performance on a given test set, while the latter optimizes the performance on a population level. For both frameworks we derive optimal prediction rules and practical algorithmic solutions with provable statistical guarantees. Empirical results provide evidence for the competitive performance of the proposed approaches.
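To make the notion of a macro-at-k metric concrete, the following is a minimal sketch of how such a metric could be evaluated. It is not the optimal prediction rule discussed in the talk: it simply selects the k highest-scoring labels per instance as a naive baseline and then averages a binary utility (here F1, chosen as an assumption) over labels. The function names `top_k_predictions`, `binary_f1` and `macro_at_k` are hypothetical, and the data in the usage lines is made up for illustration.

```python
def top_k_predictions(scores, k):
    """Return a 0/1 matrix with exactly k ones per row (the k highest scores)."""
    preds = []
    for row in scores:
        # Rank labels by decreasing score; ties go to the lower label index.
        top = sorted(range(len(row)), key=lambda j: (-row[j], j))[:k]
        chosen = set(top)
        preds.append([1 if j in chosen else 0 for j in range(len(row))])
    return preds


def binary_f1(y_true, y_pred):
    """F1 score for a single label; defined as 0.0 when there are no positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_at_k(y_true, scores, k, utility=binary_f1):
    """Average a binary utility over labels, with exactly k labels predicted per instance."""
    y_pred = top_k_predictions(scores, k)
    n_labels = len(y_true[0])
    per_label = [
        utility([row[j] for row in y_true], [row[j] for row in y_pred])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels


# Illustrative (made-up) data: 3 instances, 3 labels, k = 2.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
scores = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.3], [0.7, 0.6, 0.2]]
result = macro_at_k(y_true, scores, k=2)
```

Because every row must contain exactly k predicted labels, raising the prediction for one label in an instance forces another label out, which is the coupling between the per-label tasks that the abstract refers to.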


Abstract: Due to the steadily increasing relevance of machine learning for practical applications, many of which come with safety requirements, the notion of uncertainty has received increasing attention in machine learning research in the recent past. This talk will address questions and challenges regarding the representation and adequate handling of (predictive) uncertainty in (supervised) machine learning. A specific focus will be put on the distinction between two important types of uncertainty, often referred to as aleatoric and epistemic. In this regard, a recent generalisation of the common (empirical) loss minimisation approach in machine learning will be critically analysed and shown to be fundamentally flawed.


Abstract: Industry and society are increasingly automating processes, which requires solving combinatorial optimisation problems. To find not just optimal solutions, but also 'desirable' solutions for the end user, it is increasingly important to offer AI tools that automatically learn from the user and the environment and that support the constraint modelling in interpretable ways. In this talk I will provide an overview of three different ways in which AI can augment the modelling part of combinatorial optimisation. These include learning from the user (preference learning in VRP), learning from the environment (end-to-end decision-focused learning) and explanation generation, all of which sit at the intersection of learning and reasoning. As part of this work, we are building a modern constraint programming language called CPMpy (http://cpmpy.readthedocs.io) that eases integration of multiple constraint solving paradigms with machine learning and other scientific Python libraries. I will briefly highlight its possibilities beyond the above cases, as well as our larger vision of conversational human-aware technology for optimisation.


Abstract: In recent years, the use of machine learning has gained popularity in solving complex tasks across multiple domains, including robotics, healthcare, biology and finance. However, with the increasing amount of data and complexity of tasks, the notion of uncertainty has become of major importance in machine learning. The ability to represent uncertainty in an efficient and trustworthy way should therefore be considered as a key feature of any machine learning method. This thesis aims to address this challenge by focusing on the development of principled tools to efficiently represent uncertainty in machine learning. In particular, the concept of set-valued prediction is highlighted, which provides end-users with multiple answers instead of a single answer with little guarantee. To this end, probabilistic classification is considered, where the unknown relationship between inputs and classes is assumed to be non-deterministic and expressed by a conditional class distribution. A distinction between several settings in classification is made and different biological applications are considered, including large-scale bacterial species identification using Matrix Assisted Laser Desorption/Ionisation Time-Of-Flight Mass Spectrometry (MALDI-TOF MS) data.
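As a rough illustration of set-valued prediction on top of a probabilistic classifier, the sketch below returns the smallest set of classes whose estimated conditional probabilities add up to a chosen threshold. This is a generic probability-mass rule used here as an assumption, not necessarily the method developed in the thesis; the function name `prediction_set`, the `coverage` parameter and the class names are hypothetical.

```python
def prediction_set(probs, coverage=0.9):
    """Return the smallest set of classes whose total probability reaches `coverage`.

    `probs` maps class names to estimated conditional class probabilities.
    """
    # Rank classes by decreasing probability; ties are broken alphabetically.
    ranked = sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))
    chosen, total = [], 0.0
    for label, p in ranked:
        chosen.append(label)
        total += p
        if total >= coverage:
            break
    return chosen


# Illustrative (made-up) class distribution for a single input.
probs = {"class_a": 0.55, "class_b": 0.40, "class_c": 0.05}
confident = prediction_set(probs, coverage=0.5)
cautious = prediction_set(probs, coverage=0.9)
```

With a low threshold the classifier commits to a single class, while a higher threshold yields a larger set when the probability mass is spread over several classes.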


[UPDATED] Location 

The workshop will be hosted at Campus Coupure, which is in the city center of Ghent. The address is Coupure links 653, 9000 Ghent. The workshop takes place in Auditorium E2, on the first floor of Building E (see campus plan with indicated route in red below). People coming by car can park at the campus. You will need a special coin to leave the parking area; ask the organisers for one.


Organisers

Willem Waegeman, Ghent University, Belgium

Thomas Mortier, Ghent University, Belgium