Cause-effect pairs

Challenges in Machine Learning, book in preparation.

Discovering causal relationships from observational data is becoming a hot topic in data science. Does the increasing amount of available data make it easier to detect potential triggers in epidemiology, the social sciences, economics, biology, medicine, and other sciences? The angle we take is that causal discovery algorithms provide putative mechanisms that still need to be challenged by experiments. However, they can help define policies and prioritize experiments in large-scale experimental designs to reduce costs.
In 2013 we conducted a challenge on the problem of cause-effect pairs, which pushed the state of the art considerably, revealing that the joint distribution of two variables can be scrutinized by machine learning algorithms to reveal the possible existence of a "causal mechanism", in the sense that the values of one variable may have been generated from the values of the other (and not the other way around).
The ambition of this book is to provide both tutorial material on the state of the art on cause-effect pairs, put in the context of other research on causal discovery, and a series of advanced readings from articles selected in the proceedings of the NIPS 2013 workshop on causality and the JMLR special topic on large-scale experimental design and the inference of causal mechanisms. Supplemental material includes videos, slides, and code found on the workshop website.

Cause-effect Pairs in Machine Learning
Springer Series on Challenges in Machine Learning (volume TBA)
Isabelle Guyon
Alexander Statnikov
Berna Bakır Batu

Tentative Table of Contents (* pending authors' approval)

Preface. Isabelle Guyon, Alexander Statnikov, Berna Bakır Batu (in preparation)
The preface will provide motivations for the book. It will give a brief historical perspective, review the results of the cause-effect pair challenge, and summarize the contributions of the book.

PART I: Fundamentals

Chapter 1: Causal discovery: the cause-effect pair problem setting. Dominik Janzing. [PDF draft chapter 1, Oct 29, 2018]
Telling cause from effect from observations of just two variables has attracted increasing interest for more than a decade. On the one hand, it defines a well-posed binary classification problem for which it is easy to define a success rate, in contrast to more general causal inference tasks where no straightforward performance criteria exist. On the other hand, it fascinates researchers because solving this elementary task implies statistical asymmetries between cause and effect that were previously unknown. Discussing some real-world and toy examples, I argue that humans seem to have some intuition about these asymmetries, but I also argue that some straightforward ideas to distinguish between cause and effect are flawed. The discussion of the origin of the true asymmetries relates machine learning, philosophy, and physics. For instance, the postulate that P(cause) and P(effect|cause) contain no information about each other (while P(effect) and P(cause|effect) may satisfy some "non-generic" relations) is relevant for semi-supervised learning on the one hand, and related to the thermodynamic arrow of time on the other.

Chapter 2: Evaluating causal discovery: assessment methods. Isabelle Guyon. [Overleaf link, draft chapter 2][PDF Nov 12, 2018]
This chapter addresses the problem of benchmarking causal models or validating particular putative causal relationships, in the limited setting of cause-effect pairs, when empirical "observational" data are available. We do not address experimental validation, e.g. via randomized controlled trials. Our goal is to compare methods which provide a score C(X, Y), called a causation coefficient, rating a pair of variables (X, Y) for being in a potential causal relationship X -> Y. Causation coefficients may be used for various purposes, including prioritizing experiments, which may be costly or risky, or guiding decision makers in domains in which experiments are infeasible or unethical. We provide a methodology to evaluate their reliability. We take three points of view: (1) that of algorithm developers, who must justify the soundness of their method, particularly with respect to identifiability and consistency; (2) that of practitioners, who seek to understand on what basis algorithms make their decisions and evaluate their statistical significance; and (3) that of benchmark organizers, who desire to make fair evaluations to compare methods. We adopt the framework of pattern recognition, in which pairs of variables (X, Y) and their ground-truth causal graph are drawn i.i.d. from a "mother distribution". This leads us to define new notions of probabilistic identifiability, Bayes-optimal causation coefficients, and multi-part statistical tests. These new notions are evaluated on the data of the first cause-effect pair challenge. Other datasets and data generative models are also reviewed.
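As a toy illustration of the kind of evaluation this chapter formalizes, the sketch below draws labeled cause-effect pairs i.i.d. from a simple synthetic "mother distribution" and measures the accuracy of a naive causation coefficient. The coefficient itself (dependence between regressor and squared residuals) is illustrative, not the chapter's method:

```python
import random

random.seed(1)

def residuals(a, b):
    # least-squares residuals of b regressed on a
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    slope = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) \
        / sum((ai - ma) ** 2 for ai in a)
    return [bi - mb - slope * (ai - ma) for ai, bi in zip(a, b)]

def corr(u, v):
    # Pearson correlation coefficient
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    den = (sum((ui - mu) ** 2 for ui in u)
           * sum((vi - mv) ** 2 for vi in v)) ** 0.5
    return num / den

def causation_coefficient(x, y):
    # C(X, Y) > 0 suggests X -> Y: with additive non-Gaussian noise,
    # the regressor and the squared residuals tend to be less
    # dependent in the causal direction than in the anticausal one.
    def dep(a, b):
        r = residuals(a, b)
        return abs(corr([ai ** 2 for ai in a], [ri ** 2 for ri in r]))
    return dep(y, x) - dep(x, y)

# "Mother distribution": each pair is X -> Y or Y -> X with prob. 1/2.
m, correct = 40, 0
for _ in range(m):
    cause = [random.uniform(-1, 1) for _ in range(500)]
    effect = [c + random.uniform(-0.3, 0.3) for c in cause]
    if random.random() < 0.5:
        x, y, label = cause, effect, +1
    else:
        x, y, label = effect, cause, -1
    correct += (causation_coefficient(x, y) > 0) == (label > 0)
accuracy = correct / m
print(f"accuracy over {m} pairs: {accuracy:.2f}")
```

Accuracy against ground-truth labels is only the simplest evaluation criterion; the chapter goes further, e.g. with statistical significance tests and Bayes-optimal coefficients.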

Chapter 3: Filter methods to orient causal arrows. [abandoned]
The cause-effect pair problem can be thought of as an "enhanced" feature selection problem in which determining the significance of the dependency between two variables is replaced by determining the significance of the influence (in a mechanistic sense) of one variable on the other in a particular direction. As in feature selection, the most basic methods are "filters", which make use of a given criterion or "test statistic" to determine this "significance" from first principles, without "looking at the data" at hand. Such criteria may be derived from information-theoretic arguments. This chapter will contrast and compare various such approaches and state the hypotheses made (if any) on the data-generating process. It will present the associated statistical tests when they exist. This chapter can be written more as an overview than a tutorial. It would be nice to attach to it a library of methods implemented e.g. in Python.
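One concrete example of such a filter is the slope-based IGCI criterion of Janzing et al., which scores a direction without fitting any model. The sketch below is an illustrative simplification (uniform reference measure, deterministic monotone mechanism); the variable names and simulation setup are ours:

```python
import math
import random

random.seed(2)

def normalize(v):
    # rescale to [0, 1], as required by the uniform reference measure
    lo, hi = min(v), max(v)
    return [(vi - lo) / (hi - lo) for vi in v]

def slope_score(a, b):
    # IGCI-style estimate of E[log |f'|] for b = f(a), using the
    # empirical slopes between consecutive points sorted by a
    pairs = sorted(zip(a, b))
    total, count = 0.0, 0
    for (a0, b0), (a1, b1) in zip(pairs, pairs[1:]):
        if a1 > a0 and b1 != b0:
            total += math.log(abs(b1 - b0) / (a1 - a0))
            count += 1
    return total / count

x = [random.uniform(0.0, 1.0) for _ in range(2000)]
y = [xi ** 3 for xi in x]          # deterministic monotone mechanism
xn, yn = normalize(x), normalize(y)

s_xy = slope_score(xn, yn)         # approx. E[log 3x^2], negative here
s_yx = slope_score(yn, xn)
# the filter infers X -> Y because s_xy < s_yx, with no learning step
print(s_xy, s_yx)
```

Note that the decision requires no training data at all, only the pair at hand, which is what makes it a "filter" in the sense above.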

Chapter 3 (old 4): Generative models - Olivier Goudet and Diviyan Kalainathan [Overleaf link draft chapter 3][PDF Nov 12, 2018]
Finding the causal direction in the cause-effect pair problem has been addressed in the literature by comparing two alternative generative models, X -> Y and Y -> X. In this chapter, we first define what is meant by generative modeling and what the main assumptions usually invoked in the literature are in this bivariate setting. Then we present the theoretical identifiability problem that arises when considering causal graphs with only two variables. This leads us to present the general ideas used in the literature to perform model selection based on the evaluation of a complexity/fit trade-off. Three main families of methods can be identified: methods making restrictive assumptions on the class of admissible causal mechanisms, methods computing a smooth trade-off between fit and complexity, and methods exploiting independence between cause and mechanism.
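The identifiability issue can be illustrated numerically: for a linear model with Gaussian noise, the two candidate generative directions are indistinguishable, while non-Gaussian noise breaks the symmetry. A minimal sketch, in which the dependence score (correlation between squared regressor and squared residuals) is one crude stand-in for the "independence between cause and mechanism" idea:

```python
import random

random.seed(5)

def residuals(a, b):
    # least-squares residuals of b regressed on a
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    slope = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) \
        / sum((ai - ma) ** 2 for ai in a)
    return [bi - mb - slope * (ai - ma) for ai, bi in zip(a, b)]

def corr(u, v):
    # Pearson correlation coefficient
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    den = (sum((ui - mu) ** 2 for ui in u)
           * sum((vi - mv) ** 2 for vi in v)) ** 0.5
    return num / den

def asymmetry(x, y):
    # difference of regressor/residual dependence between the two
    # candidate generative models Y -> X and X -> Y
    def dep(a, b):
        r = residuals(a, b)
        return abs(corr([ai ** 2 for ai in a], [ri ** 2 for ri in r]))
    return dep(y, x) - dep(x, y)

n = 5000
cause_g = [random.gauss(0, 1) for _ in range(n)]
effect_g = [c + random.gauss(0, 0.5) for c in cause_g]       # Gaussian noise

cause_u = [random.uniform(-1, 1) for _ in range(n)]
effect_u = [c + random.uniform(-0.3, 0.3) for c in cause_u]  # non-Gaussian

gaussian_gap = asymmetry(cause_g, effect_g)  # near 0: unidentifiable
uniform_gap = asymmetry(cause_u, effect_u)   # clearly > 0: X -> Y wins
print(gaussian_gap, uniform_gap)
```

In the linear-Gaussian case both directional models leave residuals independent of the regressor, so no score of this kind can pick a direction; this is exactly why the restrictive-assumption family of methods excludes that model class.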

Chapter 4 (old 5): Discriminant learning machines - Diviyan Kalainathan, Olivier Goudet, Michele Sebag [Overleaf link draft chapter 4][PDF Nov 12, 2018]
The cause-effect pair challenge has, for the first time, formulated the cause-effect problem as a learning problem in which a causation coefficient is trained from data. This can be thought of as a kind of meta-learning. This chapter will present an overview of the contributions in this domain and state the advantages and limitations of these methods, as well as recent theoretical results (learning theory/mother distribution). This chapter will point to code from the winners of the cause-effect pair challenge.
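The flavor of this approach can be conveyed with a deliberately minimal sketch (not an actual challenge entry): each labeled pair is reduced to a hand-crafted feature, and the "causation coefficient" is a threshold on that feature learned from training pairs and evaluated on held-out pairs:

```python
import random

random.seed(3)

def skew(v):
    # sample skewness
    n = len(v)
    m = sum(v) / n
    m2 = sum((vi - m) ** 2 for vi in v) / n
    m3 = sum((vi - m) ** 3 for vi in v) / n
    return m3 / m2 ** 1.5

def feature(x, y):
    # hand-crafted feature: additive skewed noise tends to make the
    # effect's marginal more skewed than the cause's
    return abs(skew(y)) - abs(skew(x))

def draw_pair():
    # synthetic labeled pair: uniform cause, skewed additive noise
    cause = [random.uniform(-1, 1) for _ in range(500)]
    effect = [c + 0.5 * (random.expovariate(1.0) - 1.0) for c in cause]
    if random.random() < 0.5:
        return cause, effect, +1   # label +1: X -> Y
    return effect, cause, -1       # label -1: Y -> X

train = [draw_pair() for _ in range(60)]
test = [draw_pair() for _ in range(40)]

# "train" the causation coefficient: a depth-1 stump on the feature
feats = [(feature(x, y), lbl) for x, y, lbl in train]
best_t, best_acc = 0.0, 0.0
for t, _ in feats:
    acc = sum((f > t) == (lbl > 0) for f, lbl in feats) / len(feats)
    if acc > best_acc:
        best_acc, best_t = acc, t

test_acc = sum((feature(x, y) > best_t) == (lbl > 0)
               for x, y, lbl in test) / len(test)
print(f"train acc {best_acc:.2f}, test acc {test_acc:.2f}")
```

Challenge entries replaced the single feature with hundreds of distribution statistics and the stump with gradient boosting, but the meta-learning structure (labeled pairs in, trained coefficient out) is the same.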

Chapter 5 (old 6): Cause-Effect Pairs in Time Series with a Focus on Econometrics - Sebastiano Cattaruzzo, Nicolas Doremus, and Alessio Moneta [Overleaf link draft chapter 5][PDF Nov 12, 2018]
Long before the cause-effect pair challenge, researchers in econometrics had devised causation coefficients for pairs of time series (Granger causality). This chapter reviews such approaches, including the vector-autoregressive framework and novel direct causal approaches, taking into account contemporaneous causal relationships.
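The basic idea of a pairwise Granger test can be sketched as follows (an illustrative one-lag, zero-mean version; the simulated system is ours): X Granger-causes Y if adding lagged X to an autoregression of Y reduces the residual sum of squares far more than chance would:

```python
import random

random.seed(4)

def granger_score(cause, target):
    # F-like statistic: relative RSS reduction from adding cause_{t-1}
    # to the one-lag autoregression of target (zero-mean, no intercept)
    a, b, t = target[:-1], cause[:-1], target[1:]
    saa = sum(ai * ai for ai in a)
    sbb = sum(bi * bi for bi in b)
    sab = sum(ai * bi for ai, bi in zip(a, b))
    sat = sum(ai * ti for ai, ti in zip(a, t))
    sbt = sum(bi * ti for bi, ti in zip(b, t))
    # restricted model: target_t = c * target_{t-1}
    c = sat / saa
    rss_r = sum((ti - c * ai) ** 2 for ai, ti in zip(a, t))
    # full model: target_t = c1 * target_{t-1} + c2 * cause_{t-1},
    # solved via the 2x2 normal equations (Cramer's rule)
    det = saa * sbb - sab * sab
    c1 = (sat * sbb - sbt * sab) / det
    c2 = (sbt * saa - sat * sab) / det
    rss_f = sum((ti - c1 * ai - c2 * bi) ** 2
                for ai, bi, ti in zip(a, b, t))
    return (rss_r - rss_f) / rss_f * (len(t) - 2)

# simulate: x is autonomous, y is driven by lagged x
x, y = [0.0], [0.0]
for _ in range(1200):
    x_new = 0.7 * x[-1] + random.gauss(0, 1)
    y_new = 0.5 * y[-1] + 0.8 * x[-1] + random.gauss(0, 1)
    x.append(x_new)
    y.append(y_new)

score_xy = granger_score(x, y)   # large: lagged x helps predict y
score_yx = granger_score(y, x)   # small: lagged y does not help x
print(score_xy, score_yx)
```

This temporal notion is about improved predictability, not mechanism, and it misses contemporaneous effects within one sampling period; the vector-autoregressive and direct causal approaches reviewed in the chapter address these limitations.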

Chapter 6 (old 7): Beyond cause-effect pairs - Frederick Eberhardt [PDF Jul 31, 2018]
1) Introduction:
-- what is the generalization of the cause-effect pair challenge? learning graphs, distinguishing between more than just the orientation of the edge, e.g. confounding, feedback cycles, dependence due to sample selection bias, mischaracterization of variables, or relational dependencies
-- to what extent can the ideas of the proposals for the cause-effect pair challenge be generalized to the graph search setting?
2) Taking the winning methods beyond the cause-effect pair challenge
-- a description of the approaches that the winning methods took and how one would apply them to the more general settings
-- a critical commentary on whether these sorts of extensions provide a sensible route for the more general causal discovery questions
-- an indication of subsequent methods in the literature that have tried to take the ideas underlying the winning methods of the cause effect pair challenge and apply them to the more general setting
3) Official generalizations
-- a discussion of methods that were originally motivated by the cause effect pair challenge and have been developed into graph learning algorithms
-- in particular, this will provide some notes of the generalizations involving algorithms that broadly fall in the additive noise framework
4) More complicated cases
-- how to think about cases where the cause effect pair is not so clearly defined
-- -- relational causal modeling
-- -- can a cause and its effect be extended in time?
-- -- how to think of dynamically related cause-effect pairs?
-- -- how do I find my cause and effect pairs in the first place?
5) Construction of cause and effect pairs
-- the cause effect pair challenge assumed that both the causal variables, whether causally related or not, were each well defined
-- how do we come to identify such well defined causal variables in the first place?
-- what happens to the cause-effect pair if it is mischaracterized? Can we tell that it is mischaracterized?

PART II: Selected readings, including reports on the cause-effect pair challenge (*=not confirmed)
Chapter 7: Results of the cause-effect pair challenge. Isabelle Guyon et al.
Chapter 8: Non-linear Causal Inference using Gaussianity Measures. Daniel Hernández-Lobato, Pablo Morales-Mombiela, David Lopez-Paz, Alberto Suárez. JMLR 17(28):1−39, 2016. 
Chapter 9: From dependency to causality: a machine learning approach. Gianluca Bontempi, Maxime Flauder. JMLR 16(Dec):2437−2457, 2015.
Chapter 10:  Pattern-based Causal Feature Extraction, Diogo Moitinho de Almeida:  The method used by the winners of the cause-effect pair challenge.
Chapter 11: Conditional distribution variability measures for causality detection, José A. R. Fonollosa. The method used by the second ranked in the cause-effect pair challenge (round 1) and first ranked in round 2.
Chapter 12: * Experiment Design in Static Models of Dynamic Biological Systems, Karen Sachs, Solomon Itani, Mingyu Chung, Gabriela K. Fragiadakis, Jonathan Fitzgerald, Birgit Schoeberl, Garry P. Nolan, Claire Tomlin.
Chapter 13: Markov Blanket Ranking using Kernel-based Conditional Dependence Measures, Eric V. Strobl, Shyam Visweswaran.

Isabelle Guyon,
Sep 20, 2016, 2:55 AM