Invited Talks

Akifumi Okuno (RIKEN AIP)

Title: Bregman Hyperlink Regression and Its Expressive Power

Abstract:

A collection of U (\in \mathbb{N}) data vectors is called a U-tuple, and the association strength among the vectors of a tuple is termed the hyperlink weight, which is assumed to be symmetric under permutations of the tuple's entries. We propose Bregman hyperlink regression (BHLR), which learns a user-specified symmetric similarity function so that it predicts a tuple's hyperlink weight from the data vectors stored in the U-tuple. BHLR is based on the Bregman divergence (BD) and encompasses various existing methods such as logistic regression (U=1), Poisson regression (U=1), graph embedding (U=2), matrix factorization (U=2), tensor factorization (U>=2), and their variants equipped with arbitrary BDs. We demonstrate that, regardless of the choice of BD and U \in \mathbb{N}, the proposed BHLR is generally (P-1) robust against distributional misspecification, that is, it asymptotically recovers the underlying true conditional expectation of hyperlink weights given data vectors regardless of the underlying conditional distribution, and (P-2) computationally tractable, that is, it can be computed efficiently by stochastic optimization algorithms using a novel generalized minibatch sampling procedure for hyper-relational data. As the similarity function in BHLR, a Siamese neural network, which applies a kernel function (U=2) or its generalization (U>2) to vector-valued neural networks, is typically employed; we also examine its expressive power, toward a highly expressive hyperlink regression.
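
For concreteness, here is a minimal sketch of the BHLR objective; the symbols x_i, w_{i_1 \cdots i_U}, f_\theta, and \phi are introduced here for illustration and are not necessarily the talk's notation. Writing d_\phi(a, b) = \phi(a) - \phi(b) - \phi'(b)(a - b) for the Bregman divergence generated by a strictly convex function \phi, BHLR fits the symmetric similarity function f_\theta by minimizing the empirical Bregman loss over observed U-tuples,

\hat{\theta} = \arg\min_{\theta} \sum_{(i_1, \dots, i_U)} d_\phi\left( w_{i_1 \cdots i_U}, \, f_\theta(x_{i_1}, \dots, x_{i_U}) \right),

so that, for instance, \phi(t) = t^2 recovers a least-squares-type loss and \phi(t) = t \log t recovers a Poisson/KL-type loss.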

Kazuto Fukuchi (University of Tsukuba / RIKEN AIP)

Title: Faking Fairness via Stealthily Biased Sampling

Abstract:

Fairness has attracted much attention from the machine learning community, since a machine learning algorithm can cause unfair decision-making by unintentionally inheriting a bias in its training dataset. With this concern, much effort has been dedicated to developing tools for auditing the fairness of a machine learning algorithm. In this talk, I will introduce our recent study, in which we investigate a risk associated with such auditing tools. The focus of this study is to raise awareness of the risk that malicious decision-makers can fake the fairness of their machine learning algorithm by abusing the auditing tools and thereby deceive society. The question is whether such fraud by the decision-maker is detectable, so that society can avoid the risk of fake fairness. In this study, we answer this question negatively. We specifically focus on a situation where the decision-maker publishes a benchmark dataset as evidence of his/her fairness and attempts to deceive a person who uses an auditing tool that computes a fairness metric. To assess the (un)detectability of the fraud, we explicitly construct an algorithm, stealthily biased sampling, that can deliberately construct a deceptive benchmark dataset via subsampling. We show that the fraud committed by stealthily biased sampling is indeed difficult to detect, both theoretically and empirically.
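
To make the threat model concrete, here is a rough formalization with notation chosen for illustration (it is not the paper's exact formulation). The full decision history is a set D of records (x, a, \hat{y}) with sensitive attribute a and decision \hat{y}, and the auditor computes a fairness metric such as demographic parity,

\Delta(S) = \left| P_S(\hat{y}=1 \mid a=0) - P_S(\hat{y}=1 \mid a=1) \right|,

on whatever benchmark S the decision-maker publishes. Stealthily biased sampling chooses a subsample S \subset D such that \Delta(S) \approx 0 even though \Delta(D) is large, while keeping S statistically close to an i.i.d. sample from D; it is this closeness that makes the fraud hard to detect.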

Masaaki Imaizumi (The Institute of Statistical Mathematics / RIKEN AIP / JST Presto)

Title: Generalization Analysis for Mechanism of Deep Learning via Nonparametric Statistics

Abstract:

We theoretically investigate an advantage of deep neural networks (DNNs), which empirically perform better than other standard methods. While DNNs have empirically shown higher performance than other methods, understanding the mechanism behind this advantage is still a challenging problem. From the viewpoint of nonparametric statistics, many standard methods are known to attain the optimal error rate in standard settings such as estimating smooth functions, and thus it has not been straightforward to find theoretical advantages of DNNs. Our study fills this gap by extending the class of data-generating processes. We mainly consider the following two points: non-smoothness of functions and intrinsic structures of data distributions. We derive the generalization error of estimators by DNNs with a ReLU activation, and show that the convergence rates of the generalization error can explain an advantage of DNNs over some of the other methods. In addition, our theoretical results provide guidelines for selecting an appropriate number of layers and edges of DNNs. We provide numerical experiments that support the theoretical results.
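
For reference, a minimal sketch of the quantity analyzed, in a typical nonparametric regression setting and with notation introduced here for illustration: given n observations Y_i = f^*(X_i) + \xi_i with noise \xi_i, an estimator \hat{f} realized by a ReLU DNN is evaluated through its generalization error

\mathbb{E}\left[ (\hat{f}(X) - f^*(X))^2 \right],

and the question is how fast this error decays in n depending on the smoothness and structure of the true function f^* and the data distribution.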

Kenta Oono (The University of Tokyo / Preferred Networks Inc.)

Title: Explaining Oversmoothing of Non-linear Graph Neural Networks

Abstract:

A Graph Neural Network (GNN) is a deep learning model for analyzing graph-structured data. Despite their practical popularity, the theory of GNNs is less mature than that of classical deep models such as multi-layer perceptrons, especially regarding their expressive power. For the classical models, there have been numerous studies giving empirical and theoretical justifications for deep and non-linear structures. In contrast, several researchers have reported that GNNs suffer from the oversmoothing phenomenon, in which neither depth nor non-linearity helps to improve (and can even worsen) predictive performance. Several studies have attempted to explain the oversmoothing phenomenon, but most of them restrict their scope to linear GNNs. We give a condition under which GCNs with ReLU activation functions, the most popular variant of GNNs, are provably vulnerable to oversmoothing even in the presence of non-linearity. This talk is based on our recent paper (Oono & Suzuki, 2019).
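
For a rough sense of the result, here is a sketch in notation chosen for this summary (see the paper for the precise statement). Write one GCN layer as X_{l+1} = \mathrm{ReLU}(\hat{A} X_l W_l), where \hat{A} is the normalized adjacency matrix with self-loops, let \mathcal{M} be the "oversmoothed" subspace determined by the connected components and node degrees, let \lambda < 1 be the largest absolute value of the non-unit eigenvalues of \hat{A}, and let s bound the singular values of the weight matrices W_l. The paper shows, roughly, that the distance to \mathcal{M} contracts layer by layer,

d_{\mathcal{M}}(X_{l+1}) \leq s \lambda \, d_{\mathcal{M}}(X_l),

so whenever s \lambda < 1 the representations converge exponentially to \mathcal{M} as the depth grows, even with the ReLU non-linearity.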

[Oono & Suzuki, 2019]: Kenta Oono and Taiji Suzuki. On Asymptotic Behaviors of Graph CNNs from Dynamical Systems Perspective. arXiv preprint arXiv:1905.10947 (2019).