In recent years, machine learning (ML) has been applied in an increasingly wide variety of settings, promising to analyze huge and heterogeneous volumes of data, improve the accuracy of decisions, and ease human labor. These systems have predominantly been built on the supervised learning paradigm, which relies on the availability of large amounts of data assumed to be reliably labeled by human experts.
However, with this increasing interest has come the realization that the real world is often far from the idealized setting assumed in the supervised paradigm: data can be missing or noisy; supervision can be costly to obtain or of questionable veracity; and human users may fail to adopt ML-based technologies because the underlying models cannot reliably convey their uncertainty.
As a result, increasing attention has recently been devoted to techniques capable of dealing with these issues. These include uncertainty quantification and cautious learning, in which ML models convey their uncertainty to users in order to improve reliability and reduce cognitive biases, as well as weakly supervised learning and its variants: incomplete supervision, where only a subset of the training data is labeled; imprecise supervision, where the training data carries only coarse-grained labels; and inaccurate supervision, where the given labels are not always correct (the three settings are sketched below). At the same time, owing to the inherently multidisciplinary nature of these issues, an increasingly deep dialogue has developed between ML and neighboring scientific fields, such as knowledge and uncertainty representation, human-computer interaction, crowd-sourcing, and active learning.
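To make the distinction between these weak-supervision settings concrete, the following minimal sketch shows how each might be encoded for a toy classification dataset. The array names and encoding conventions are illustrative assumptions rather than a standard API, though the use of -1 for unlabeled examples mirrors the convention of scikit-learn's semi-supervised estimators.

```python
import numpy as np

# Toy dataset of 5 examples and 3 classes (0, 1, 2).
# All encodings below are illustrative assumptions.

# Full supervision: every example has a (presumed correct) label.
y_full = np.array([0, 2, 1, 0, 2])

# Incomplete supervision: only a subset of examples is labeled;
# -1 marks an unlabeled example (as in semi-supervised learning).
y_incomplete = np.array([0, -1, 1, -1, -1])

# Imprecise supervision: labels are coarse-grained, e.g. each example
# comes with a *set* of candidate classes rather than a single class
# (as in partial-label / superset learning).
y_imprecise = [{0}, {1, 2}, {1}, {0, 2}, {0, 1, 2}]

# Inaccurate supervision: every example is labeled, but some labels
# are wrong (label noise); here examples 1 and 3 have been corrupted.
y_inaccurate = np.array([0, 1, 1, 2, 2])
```

Each setting calls for a different family of learners, e.g. semi-supervised methods for incomplete labels, partial-label methods for imprecise labels, and noise-robust losses for inaccurate labels.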
The aim of the WSCL workshop is twofold: to explore how machine learning and related methods can handle weak supervision and provide more cautious, reliable support in the presence of imperfect data; and to encourage broad discussion between ML researchers and experts in neighboring fields whose underlying principles and foundations are central to making ML-based systems work in less-than-perfect real-world settings.