Probabilistic graphical models such as Bayesian networks (BNs) represent a successful marriage between graph theory and probability theory, and provide principled approaches to reasoning and decision-making problems involving uncertainty. Applying these methods to real-world problems typically requires building graphical model representations of the problems at hand. One popular approach is to use score-based methods to find a network structure that optimizes a scoring function for given data. Since finding the optimal structure has been shown to be NP-hard, early research mainly focused on developing approximation algorithms, such as greedy hill climbing. Unfortunately, the models found by those methods are not guaranteed to be optimal and, even worse, are of unknown quality.

In recent years, several exact algorithms have been developed that can learn BNs from datasets with dozens of variables. This tutorial will provide a comprehensive overview of the theoretical foundations and algorithmic developments of this important research topic. We will start with a brief review of the basic representation of Bayesian networks and the major approaches to learning them. Then we will dive into the score-based learning approach. We first introduce the major scoring functions for Bayesian network structures and their theoretical properties. We then review the major approximate learning algorithms. We finally introduce several recently developed exact learning algorithms, including dynamic programming, branch and bound, integer linear programming, and admissible heuristic search.
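To give a flavor of the score-based approach covered in the tutorial, the following is a minimal, illustrative sketch (not code from the tutorial itself) of exact structure learning by dynamic programming over variable subsets, in the style of the Silander-Myllymaki algorithm. It uses a decomposable BIC score on binary data; all function names, the data format, and the scoring details are assumptions chosen for this example, and the exponential enumeration is only practical for a handful of variables.

```python
# Illustrative sketch: exact BN structure learning via dynamic programming.
# Assumptions: binary data given as a list of tuples; BIC as the scoring
# function; feasible only for small numbers of variables (3^n subsets).
import math
from itertools import combinations

def bic_local_score(data, child, parents):
    """BIC score of one variable given a candidate parent set (binary data)."""
    n = len(data)
    counts = {}
    for row in data:
        key = tuple(row[p] for p in parents)
        cell = counts.setdefault(key, [0, 0])
        cell[row[child]] += 1
    log_lik = 0.0
    for c0, c1 in counts.values():
        total = c0 + c1
        for c in (c0, c1):
            if c:
                log_lik += c * math.log(c / total)
    num_params = len(counts)  # one free parameter per parent configuration
    return log_lik - 0.5 * math.log(n) * num_params

def learn_optimal_network(data, num_vars):
    """Return (score, parent sets) of a globally optimal network.

    Step 1 finds the best parent set for each variable within every
    candidate set; step 2 runs dynamic programming over subsets, where
    the optimal network on a subset ends with some sink variable whose
    parents come from the rest of the subset.
    """
    variables = list(range(num_vars))
    best = {}  # (var, frozenset(candidates)) -> (score, parent tuple)
    for v in variables:
        others = [u for u in variables if u != v]
        for r in range(len(others) + 1):
            for cand in combinations(others, r):
                cand = frozenset(cand)
                best[(v, cand)] = max(
                    (bic_local_score(data, v, ps), ps)
                    for k in range(len(cand) + 1)
                    for ps in combinations(sorted(cand), k)
                )
    dp = {frozenset(): (0.0, [])}  # subset -> (score, variable ordering)
    for r in range(1, num_vars + 1):
        for sub in combinations(variables, r):
            sub = frozenset(sub)
            dp[sub] = max(
                (dp[sub - {s}][0] + best[(s, sub - {s})][0],
                 dp[sub - {s}][1] + [s])
                for s in sub
            )
    score, order = dp[frozenset(variables)]
    preceding, parents = [], {}
    for v in order:
        parents[v] = best[(v, frozenset(preceding))][1]
        preceding.append(v)
    return score, parents
```

For instance, on data where two binary variables are perfectly correlated, the learned network connects them with a single edge (BIC is score-equivalent, so either direction may be returned). The other exact methods the tutorial covers (branch and bound, integer linear programming, admissible heuristic search) optimize the same decomposable scores but search the space quite differently.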

The tutorial is targeted at students and researchers who are interested in pursuing research in probabilistic graphical models or in applying these methods to real-world problems. The prerequisites are a basic knowledge of probability theory, combinatorial optimization, or machine learning.

The tutorial will take place in the afternoon session on Monday, August 5.