WHI 2016 @ ICML, New York, June 23, 2016


The 2016 Workshop on Human Interpretability in Machine Learning (WHI 2016), held in conjunction with ICML 2016, will bring together researchers who study the interpretability of predictive models, develop interpretable machine learning algorithms, and develop methodology to interpret black-box machine learning models (e.g., post-hoc interpretations). They will exchange ideas on these and allied topics, including:
  • Quantifying and axiomatizing interpretability,
  • Psychology of human concept learning,
  • Rule learning,
  • Symbolic regression,
  • Case-based reasoning,
  • Generalized additive models,
  • Interpretation of black-box models (including deep neural networks),
  • Causality of predictive models,
  • Visual analytics, and
  • Interpretability in reinforcement learning.


Please bring your ID and allow an extra 10 minutes for the check-in process in the lobby.

We have a very nice room in a very nice building, but that also means an extra security check at the lobby. ;)



8:30 AM - 8:45 AM    opening remarks (Been Kim, Dmitry Malioutov, Kush Varshney)

8:45 AM - 9:30 AM    invited talk: Friends Don’t Let Friends Deploy Models They Don’t Understand (Rich Caruana)

9:30 AM - 9:36 AM    finalist spotlight talk: Model-Agnostic Interpretability of Machine Learning (Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin)

9:36 AM - 9:42 AM    finalist spotlight talk: Making Tree Ensembles Interpretable (Satoshi Hara, Kohei Hayashi)

9:42 AM - 9:48 AM    finalist spotlight talk: Using Visual Analytics to Interpret Predictive Machine Learning Models (Josua Krause, Adam Perer, Enrico Bertini)

9:48 AM - 9:54 AM    finalist spotlight talk: EU Regulations on Algorithmic Decision-Making and a "Right to Explanation" (Bryce Goodman, Seth Flaxman)

9:54 AM - 10:00 AM   finalist spotlight talk: Visualizing Dynamics: from t-SNE to SEMI-MDPs (Nir Ben Zrihem, Tom Zahavy, Shie Mannor)

10:00 AM - 10:30 AM  coffee break

10:30 AM - 11:15 AM  invited talk: Simplicity in Human Concept Learning (Jacob Feldman)

11:15 AM - 12:00 PM  poster session

12:00 PM - 1:30 PM   lunch break

1:30 PM - 2:15 PM    invited talk: Counterfactual Inference for Consumer Choice Across Many Products (Susan Athey)

2:15 PM - 3:00 PM    invited talk: Interpretability and Measurement (Hanna Wallach)

3:00 PM - 3:30 PM    break

3:30 PM - 4:15 PM    invited talk: Provenance and Contracts in Machine Learning (Percy Liang)

4:15 PM - 5:00 PM    panel discussion, moderated by Rich Caruana. Panelists: Susan Athey, Klaus-Robert Mueller, Percy Liang, Yisong Yue, Jacob Feldman

Invited Speakers

Title: Friends Don’t Let Friends Deploy Models They Don’t Understand

Deploying unintelligible black-box machine-learned models is risky: high accuracy on a test set is NOT sufficient. Unfortunately, the most accurate models are usually not very intelligible (e.g., random forests, boosted trees, and neural nets), and the most intelligible models are usually less accurate (e.g., linear or logistic regression). This tradeoff limits the accuracy of models that can be deployed in mission-critical applications such as healthcare, where being able to understand, validate, edit, and ultimately trust the learned model is important. We are developing a learning method based on generalized additive models (GAMs) that is as accurate as full-complexity models but as intelligible as linear/logistic regression models. I'll present two case studies where these high-performance generalized additive models (GA2Ms) yield state-of-the-art accuracy on healthcare problems while remaining intelligible. In the pneumonia case study, the intelligible model uncovers surprising patterns in the data that previously prevented other black-box models from being deployed; because the model is intelligible and modular, these patterns can easily be recognized and removed. In the 30-day hospital readmission case study, we show that the same methods scale to large datasets containing hundreds of thousands of patients and thousands of attributes, while remaining intelligible and providing accuracy comparable to the best (unintelligible) machine learning models.
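Caruana's GA2Ms use boosted shape functions and selected pairwise interactions; the sketch below is not his method, only a minimal, hypothetical illustration of the generic GAM idea the abstract relies on: fit one shape function per feature by backfitting (here with crude binned-mean smoothers), so each feature's learned contribution can be inspected on its own.

```python
import numpy as np

def fit_gam(X, y, n_bins=8, n_iter=20):
    """Backfit y ~ intercept + f_1(x_1) + ... + f_p(x_p), where each
    shape function f_j is a crude binned-mean smoother."""
    n, p = X.shape
    intercept = y.mean()
    edges = [np.quantile(X[:, j], np.linspace(0, 1, n_bins + 1)) for j in range(p)]
    shapes = [np.zeros(n_bins) for _ in range(p)]
    contrib = np.zeros((n, p))  # f_j(x_ij) for each sample
    for _ in range(n_iter):
        for j in range(p):
            # partial residual: target minus every contribution except feature j's
            resid = y - intercept - contrib.sum(axis=1) + contrib[:, j]
            idx = np.clip(np.searchsorted(edges[j], X[:, j]) - 1, 0, n_bins - 1)
            for b in range(n_bins):
                mask = idx == b
                if mask.any():
                    shapes[j][b] = resid[mask].mean()
            contrib[:, j] = shapes[j][idx]
    return intercept, edges, shapes

def gam_predict(X, intercept, edges, shapes):
    out = np.full(X.shape[0], intercept)
    for j, (e, s) in enumerate(zip(edges, shapes)):
        out += s[np.clip(np.searchsorted(e, X[:, j]) - 1, 0, len(s) - 1)]
    return out

# toy additive target: each recovered shape function is a plottable 1-D curve
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = 3 * X[:, 0] + X[:, 1] ** 2
model = fit_gam(X, y)
rmse = np.sqrt(np.mean((gam_predict(X, *model) - y) ** 2))
```

The intelligibility payoff is that the model's entire behavior is the sum of per-feature curves, which is what lets surprising patterns (like the ones in the pneumonia study) be seen and edited out.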

Title: Simplicity in human concept learning

The closest parallel to machine learning in human cognition is what psychologists call concept learning: the process by which human learners induce categories from objects they observe. In this talk I will discuss the role of the simplicity principle (Occam's razor) in psychological models of concept learning and categorization. Despite some notice in the early days of concept learning research, for several decades simplicity criteria played very little role in models of human categorization, which were instead dominated by "exemplar models" based on similarity comparisons with numerous stored examples. However, it can be shown that exemplar models overfit training data relative to human learners, in some cases dramatically; that is, they allow categorization hypotheses that are overly complex compared to the human solution. Instead, human learning seems to rely heavily on various kinds of simplicity principles, which "regularize" human induction in a way that makes it both more cognitively tractable and also, perhaps, more effective. Machine learning research may benefit by pursuing closer parallels with human concept learning with regard to complexity minimization and associated computational procedures.
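The overfitting claim can be illustrated with a toy simulation (hypothetical, not Feldman's experiments): an exemplar-style 1-nearest-neighbor learner memorizes noisy training labels perfectly, while a one-threshold rule, the kind of low-complexity hypothesis a simplicity principle favors, generalizes better.

```python
import numpy as np

rng = np.random.default_rng(2)
# one-dimensional category ("large" vs "small" objects) with label noise
x_tr = rng.uniform(0, 1, 60)
y_tr = (x_tr > 0.5).astype(int)
flip = rng.random(60) < 0.15
y_tr[flip] ^= 1                     # 15% of stored exemplars are mislabeled
x_te = rng.uniform(0, 1, 1000)
y_te = (x_te > 0.5).astype(int)     # noise-free test categories

def exemplar(xq):
    # exemplar model: label of the most similar stored example (1-NN)
    return y_tr[np.abs(x_tr[:, None] - xq[None, :]).argmin(axis=0)]

def rule(xq):
    # simple rule: a single threshold, the minimal-complexity hypothesis
    return (xq > 0.5).astype(int)

train_acc_knn = (exemplar(x_tr) == y_tr).mean()  # memorizes the noise
test_acc_knn = (exemplar(x_te) == y_te).mean()
test_acc_rule = (rule(x_te) == y_te).mean()
```

The exemplar learner reaches perfect training accuracy by reproducing every mislabeled example, and pays for it at test time; the regularized rule does not.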

Title: Counterfactual Inference for Consumer Choice Across Many Products

Authors: Susan Athey, David Blei, Robert Donnelly, Francisco Ruiz, and Dustin Tran

In this paper, we develop a model of consumer choice across a large number of products. In contrast to most of the economics and marketing literature, which focuses on choices among a small set of substitutable products in a narrow category, we analyze choices across a large number of products that are not close substitutes for one another, such as different categories of products in a grocery store (e.g. potato chips, lemon-lime soft drinks, or organic apples). Our goal is to make counterfactual inferences about how consumer choices and welfare would change if, for example, prices or product availability change for a category. Our model differs from most economic models in that we model preferences for a large number of products in a single model; we attempt to capture preferences about many products in a lower-dimensional utility function where consumers have preferences about characteristics of products. Our model is designed for settings where the same consumers are observed over time making consumption choices about a large set of products. In our model, a consumer's utility from consuming a product is determined by individual-specific latent preferences for latent product characteristics and an idiosyncratic shock. Some product characteristics and user characteristics may also be observed. We also allow for shocks to utility that are common across a group of users, and vary by product and time period (e.g., date), to incorporate the idea that demand for products may depend on factors such as holidays.  We show how to evaluate the assumptions required for our parameter estimates to have a causal interpretation.
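As a toy, hypothetical sketch (not the authors' model, estimator, or data) of the core idea, the snippet below gives each consumer a utility that is the inner product of latent user preferences with latent product characteristics plus a price term, turns utilities into choice probabilities with a softmax, and then evaluates a counterfactual price change for one product. All parameters here are simulated rather than estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 100, 12, 3

# simulated latent structure (in the paper these would be estimated)
theta = rng.normal(size=(n_users, k))   # user-specific latent preferences
beta = rng.normal(size=(n_items, k))    # latent product characteristics
alpha = -1.0                            # assumed price sensitivity
price = rng.uniform(1, 5, size=n_items)

def choice_probs(price):
    # utility: latent preference match plus price term; softmax over products
    u = theta @ beta.T + alpha * price              # (n_users, n_items)
    e = np.exp(u - u.max(axis=1, keepdims=True))    # stabilized softmax
    return e / e.sum(axis=1, keepdims=True)

base = choice_probs(price)

# counterfactual: raise product 0's price by 20% and recompute choices
cf_price = price.copy()
cf_price[0] *= 1.2
cf = choice_probs(cf_price)
demand_drop = base[:, 0].mean() - cf[:, 0].mean()
```

With a negative price coefficient, the counterfactual demand for the repriced product falls for every user; the substantive contribution of the paper is identifying when such counterfactuals have a causal interpretation, which this sketch does not address.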

Title: Interpretability and Measurement

Title: Provenance and Contracts in Machine Learning

This talk poses two questions. The first question is: Why did the model make a certain prediction? I will discuss the importance of making a prediction via the correct means, which not only provides human interpretability but also more robust generalization. For example, a question answering system should not only answer the question but also justify the answer with the proper provenance. The second question is: How should we reason about a model's behavior? The implicit contract in machine learning is that if the training data looks like the test data, then we will get good generalization. But this contract is often broken in practice. We discuss two alternative contracts: one based on the ability to say "don't know", which allows us to obtain 100% precision when the model is well-specified, and the other based on leveraging conditional independence structure, which allows us to perform unsupervised risk estimation.
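A minimal, hypothetical sketch of the "don't know" contract (all numbers simulated, and not Liang's well-specified-model construction): a classifier abstains whenever its confidence falls below a threshold, trading coverage for precision on the cases it does answer.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = (x + 0.5 * rng.normal(size=n) > 0).astype(int)  # noisy ground truth

# a simple probabilistic model: sigmoid score on the feature
p = 1 / (1 + np.exp(-3 * x))
pred = (p > 0.5).astype(int)
conf = np.maximum(p, 1 - p)     # model confidence in its own prediction

def precision_coverage(tau):
    """Answer only when confidence >= tau; abstain ('don't know') otherwise."""
    answered = conf >= tau
    precision = (pred[answered] == y[answered]).mean()
    return precision, answered.mean()

prec_all, cov_all = precision_coverage(0.0)    # never abstain
prec_sel, cov_sel = precision_coverage(0.95)   # abstain on uncertain cases
```

Refusing the uncertain cases raises precision on the answered subset at the cost of coverage; Liang's point is stronger, that a well-specified model can push this to 100% precision, which this toy does not guarantee.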

List of Papers [download ALL manuscripts (one big PDF)]

  • A Model Explanation System: Latest Updates and Extensions [1606.09517]
    Ryan Turner
  • ACDC: α-Carving Decision Chain for Risk Stratification [1606.05325]
    Yubin Park, Joyce Ho, and Joydeep Ghosh
  • Building an Interpretable Recommender via Loss-Preserving Transformation
    Amit Dhurandhar, Sechan Oh, and Marek Petrik
  • Clustering with a Reject Option: Interactive Clustering as Bayesian Prior Elicitation [1606.05896]
    Akash Srivastava, James Zou, Ryan P. Adams, and Charles Sutton
  • Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals [1606.08063]
    Daizhuo Chen, Samuel P. Fraiberger, Robert Moakler, and Foster Provost
  • EU Regulations on Algorithmic Decision-Making and a "Right to Explanation" [1606.08813]
    Bryce Goodman and Seth Flaxman
  • Explainable Restricted Boltzmann Machines for Collaborative Filtering [1606.07129]
    Behnoush Abdollahi and Olfa Nasraoui
  • Explaining Classification Models Built on High-Dimensional Sparse Data
    Julie Moeyersoms, Brian d'Alessandro, Foster Provost, and David Martens
  • Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? [1606.05589]
    Abhishek Das, Harsh Agrawal, C. Lawrence Zitnick, Devi Parikh, and Dhruv Batra
  • Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models [1606.05320]
    Viktoriya Krakovna and Finale Doshi-Velez
  • Interactive Semantic Featuring for Text Classification [1606.07545]
    Camille Jandot, Patrice Simard, Max Chickering, David Grangier, and Jina Suh
  • Interpretability in Linear Brain Decoding [1606.05672]
    Seyed Mostafa Kia and Andrea Passerini
  • Interpretable Machine Learning Models for the Digital Clock Drawing Test [1606.07163]
    William Souillard-Mandar, Randall Davis, Cynthia Rudin, Rhoda Au, and Dana L. Penney
  • Interpretable Two-level Boolean Rule Learning for Classification [1606.05798]
    Guolong Su, Dennis Wei, Kush R. Varshney, and Dmitry M. Malioutov
  • Interpreting Extracted Rules from Ensemble of Trees: Application to Computer-Aided Diagnosis of Breast MRI [1606.08288]
    Cristina Gallego-Ortiz and Anne L. Martel
  • Learning Interpretable Musical Compositional Rules and Traces [1606.05572]
    Haizi Yu, Lav R. Varshney, Guy E. Garnett, and Ranjitha Kumar
  • Making Tree Ensembles Interpretable [1606.05390]
    Satoshi Hara and Kohei Hayashi
  • Meaningful Models: Utilizing Conceptual Structure to Improve Machine Learning Interpretability [1607.00279]
    Nick Condry
  • Model-Agnostic Interpretability of Machine Learning [1606.05386]
    Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin
  • The Mythos of Model Interpretability [1606.03490]
    Zachary C. Lipton
  • Toward Interpretable Topic Discovery via Anchored Correlation Explanation [1606.07043]
    Kyle Reing, David Kale, Greg Ver Steeg, and Aram Galstyan
  • Using Visual Analytics to Interpret Predictive Machine Learning Models [1606.05685]
    Josua Krause, Adam Perer, and Enrico Bertini
  • Visualizing Dynamics: from t-SNE to SEMI-MDPs [1606.07112]
    Nir Ben Zrihem, Tom Zahavy, and Shie Mannor
  • Visualizing Textual Models with In-Text and Word-as-Pixel Highlighting [1606.06352]
    Abram Handler, Su Lin Blodgett, and Brendan O'Connor

We thank our sponsors for generous support:


Doctors, judges, business executives, and many other people are faced with making critical decisions that can have profound consequences. For example, doctors decide which treatment to administer to patients, judges decide on prison sentences for convicts, and business executives decide to enter new markets and acquire other companies. Such decisions are increasingly being supported by predictive models learned by algorithms from historical data.

The latest trend in machine learning is to use very sophisticated systems involving deep neural networks with many complex layers, kernel methods, and large ensembles of diverse classifiers. While such approaches produce impressive, state-of-the-art prediction accuracies, they give little comfort to decision makers, who must trust their output blindly because very little insight is available about their inner workings and the provenance of how the decision was made.

Therefore, in order for predictions to be adopted, trusted, and safely used by decision makers in mission-critical applications, it is imperative to develop machine learning methods that produce interpretable models with excellent predictive accuracy. It is in this way that machine learning methods can have impact on consequential real-world applications.

Organizing Committee


  • Been Kim, Allen Institute for Artificial Intelligence
  • Dmitry Malioutov, IBM Research
  • Kush R. Varshney, IBM Research

Committee Members:

Important Dates

Submission deadline: May 5, 2016 (extended from May 1, 2016)

Acceptance notification:  May 10, 2016

Workshop: June 23, 2016

Call for Submissions

Authors are invited to submit short papers in the ICML format, up to 4 pages in length, with 1 additional page containing only acknowledgements and references. The review process will be single-blind, so submissions need not be anonymized.

Submission Instructions

We invite submissions of full papers (maximum 4 pages excluding references and acknowledgements) as well as works-in-progress, position papers, and papers describing open problems and challenges. Papers must be formatted using the ICML template and submitted online via:


Accepted papers will be selected for a short oral presentation or poster presentation and published in proceedings overlaid on arXiv. While original contributions are preferred, we also invite submissions of high-quality work that has recently been published in other venues.