Explainable AI in Industry: Practical Challenges and Lessons Learned
(ACM FAT* 2020 Tutorial)
Artificial Intelligence is increasingly playing an integral role in determining our day-to-day experiences. Moreover, with the proliferation of AI-based solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching. The dominant role played by AI models in these domains has led to a growing concern regarding potential bias in these models, and a demand for model transparency and interpretability. In addition, model explainability is a prerequisite for building trust in, and adoption of, AI systems in high-stakes domains requiring reliability and safety, such as healthcare and automated transportation, and in critical industrial applications with significant economic implications, such as predictive maintenance, exploration of natural resources, and climate change modeling.
As a consequence, AI researchers and practitioners have focused their attention on explainable AI to help them better trust and understand models at scale. The challenges for the research community include (i) defining model explainability, (ii) formulating explainability tasks for understanding model behavior and developing solutions for these tasks, and finally (iii) designing measures for evaluating the performance of models in explainability tasks.
In this tutorial, we will first motivate the need for model interpretability and explainability in AI from societal, legal, customer/end-user, and model developer perspectives. [Note: Due to time constraints, we will not focus on techniques/tools for providing explainability as part of AI/ML systems.] Then, we will focus on the real-world application of explainability techniques in industry, wherein we present practical challenges / implications for using explainability techniques effectively and lessons learned from deploying explainable models for several web-scale machine learning and data mining applications. We will present case studies across different companies, spanning application domains such as search and recommendation systems, sales, lending, and fraud detection. Finally, based on our experiences in industry, we will identify open problems and research directions for the research community.
Tutorial Outline and Description
The tutorial will consist of a brief motivation and overview of explainability in AI/ML systems, followed by case studies across different companies, spanning application domains such as search & recommendation systems, hiring, sales, lending, and fraud detection.
We will motivate the need for explainability in AI/ML systems and provide a brief overview of explainability techniques. Due to time constraints, we will not discuss the sections below in detail (please see our KDD'19 tutorial slides).
- Need for Transparency and Explainability in AI
- Model Validation: Validation metrics, such as classification accuracy, are an incomplete description of most real-world tasks.
- Scientific Consistency (beyond statistical consistency)
- Feature Importance
- Model Internals
- Explaining by Examples
- Intrinsic Interpretable Models
- Explaining Model Behavior Globally. A global surrogate model is an interpretable model that is trained to approximate the predictions of a black box model.
- Explaining Model Behavior Locally. Local surrogate models are interpretable models that are used to explain individual predictions of black box machine learning models.
- Example-based explanation methods select particular instances of the dataset to explain the behavior of machine learning models or to explain the underlying data distribution.
- Explaining Model Differences
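The global surrogate idea in the outline above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the tutorial: `black_box` is a hypothetical stand-in for an opaque classifier, the two features (`income`, `debt`) are invented, and the interpretable surrogate is a one-split decision stump. The key point is that the surrogate is trained on the black box's predictions rather than on ground-truth labels, and its quality is reported as fidelity, the fraction of inputs on which it agrees with the black box.

```python
import random

def black_box(x):
    """Hypothetical opaque classifier (illustration only)."""
    income, debt = x
    score = 0.8 * income - 0.5 * debt + 0.1 * income * debt
    return 1 if score > 0.3 else 0

def majority(labels):
    """Majority label of a list of 0/1 labels (0 for an empty list)."""
    return 1 if labels and 2 * sum(labels) >= len(labels) else 0

def fit_stump(X, y, thresholds):
    """Fit a one-split decision stump that best reproduces the labels y."""
    best = None
    for j in range(len(X[0])):
        for t in thresholds:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            ll, rl = majority(left), majority(right)
            # Number of points on which the stump agrees with y.
            agree = sum(yi == ll for yi in left) + sum(yi == rl for yi in right)
            if best is None or agree > best[0]:
                best = (agree, j, t, ll, rl)
    _, j, t, ll, rl = best
    return {"feature": j, "threshold": t, "left": ll, "right": rl}

def predict_stump(stump, x):
    """Apply the stump's single if/else rule to one input."""
    if x[stump["feature"]] <= stump["threshold"]:
        return stump["left"]
    return stump["right"]

rng = random.Random(0)
X_train = [(rng.random(), rng.random()) for _ in range(500)]
# Surrogate targets are the black box's outputs, not ground-truth labels.
y_black_box = [black_box(x) for x in X_train]
stump = fit_stump(X_train, y_black_box, [k / 20 for k in range(1, 20)])

# Fidelity: how often the surrogate agrees with the black box on fresh inputs.
X_test = [(rng.random(), rng.random()) for _ in range(500)]
fidelity = sum(predict_stump(stump, x) == black_box(x) for x in X_test) / len(X_test)
print(stump, round(fidelity, 3))
```

The resulting stump is a single human-readable rule. A low fidelity value would indicate that the rule should not be trusted as a description of the black box.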
Explainability By Design
- Designing explainable models for prediction.
- LIME and its variants such as xLIME
- Feature Importance (Random Forest)
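LIME, listed above, explains an individual prediction by fitting a simple model to the black box's behavior in the neighborhood of that one input. The following is a minimal sketch in that spirit only, not the LIME library's algorithm: it samples Gaussian perturbations around the input, weights them by proximity, and fits a weighted linear model. The function `black_box`, the helper names, and the parameters `scale` and `kernel_width` are all hypothetical choices made for illustration.

```python
import math
import random

def black_box(x):
    """Hypothetical opaque regressor (illustration only)."""
    return x[0] ** 2 + 3.0 * x[1]

def solve(A, b):
    """Solve A @ beta = b for a small system by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def explain_locally(f, x0, n_samples=500, scale=0.1, kernel_width=0.2, seed=0):
    """Fit a proximity-weighted linear model to f around x0.

    Returns (intercept, coefficients); each coefficient is the local
    effect of one feature on the prediction near x0.
    """
    rng = random.Random(seed)
    d = len(x0)
    AtWA = [[0.0] * (d + 1) for _ in range(d + 1)]
    AtWy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [rng.gauss(0.0, scale) for _ in range(d)]   # perturbation around x0
        y = f([xi + zi for xi, zi in zip(x0, z)])       # query the black box
        # Closer samples get larger weights (Gaussian-shaped kernel).
        w = math.exp(-sum(zi * zi for zi in z) / kernel_width ** 2)
        row = [1.0] + z
        # Accumulate the weighted normal equations (A^T W A) beta = A^T W y.
        for i in range(d + 1):
            AtWy[i] += w * row[i] * y
            for j in range(d + 1):
                AtWA[i][j] += w * row[i] * row[j]
    beta = solve(AtWA, AtWy)
    return beta[0], beta[1:]

intercept, coefs = explain_locally(black_box, [1.0, 2.0])
print(round(intercept, 2), [round(c, 2) for c in coefs])
```

Near the input (1.0, 2.0) the coefficients approximate the local slopes of the hypothetical black box, roughly 2 for the first feature and 3 for the second. The same procedure at a different input would generally give different coefficients, which is what distinguishes a local explanation from a global one.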
Evaluation of Explainability
- Coverage / Representativeness
- Human friendliness (concepts easier for humans to understand)
Case Studies (including practical challenges and lessons learned during deployment in industry)
We will present case studies across different companies, spanning application domains such as search & recommendation systems, hiring, sales, lending, and fraud detection.
This tutorial is aimed at attendees with a wide range of interests and backgrounds, including researchers interested in model interpretability and explainability in AI, key regulations / laws, and explainability notions / techniques, as well as practitioners interested in implementing explainable models for web-scale machine learning and data mining applications. We will not assume any prerequisite knowledge, and will present the intuition underlying various explainability notions and techniques to ensure that the material is accessible to all FAT* attendees.
Krishna Gade is the founder and CEO of Fiddler Labs, an enterprise startup building an explainable AI engine to address problems regarding bias, fairness, and transparency in AI. An entrepreneur and engineering leader with strong technical experience in creating scalable platforms and delightful consumer products, Krishna previously held senior engineering leadership roles at Facebook, Pinterest, Twitter, and Microsoft. He has given several invited talks at prominent practitioner forums, including a talk on addressing bias, fairness, and transparency in AI at Strata Data Conference, 2019.
Sahin Cem Geyik has been part of the Careers/Talent AI teams at LinkedIn over the past three years, focusing on personalized and fairness-aware recommendations across several LinkedIn Talent Solutions products. Prior to LinkedIn, he was a research scientist at Turn Inc., an online advertising startup which was later acquired by Amobee, a subsidiary of Singtel. He received his Ph.D. degree in Computer Science from Rensselaer Polytechnic Institute in 2012, and his Bachelor's degree in Computer Engineering from Bogazici University, Istanbul, Turkey, in 2007. Sahin has worked on various research topics in ML, spanning Online Advertising Models and Algorithms, Recommender and Search Systems, Fairness-aware ML, and Explainability. He has also performed extensive research in the Systems domain, which resulted in multiple publications in the Ad-hoc/Sensor Networks and Service-Oriented Architecture fields. Sahin has authored papers in several top-tier conferences and journals such as KDD, WWW, INFOCOM, SIGIR, ICDM, CIKM, IEEE TMC, and IEEE TSC, and has presented his work in multiple external venues.
Krishnaram Kenthapadi is a Principal Scientist at Amazon AWS AI, where he leads the fairness, explainability, and privacy initiatives in the Amazon AI platform. Until recently, he led similar efforts across different LinkedIn applications as part of the LinkedIn AI team, and served as LinkedIn's representative on Microsoft's AI and Ethics in Engineering and Research (AETHER) Advisory Board. He shaped the technical roadmap and led the privacy/modeling efforts for the LinkedIn Salary product, and prior to that, served as the relevance lead for the LinkedIn Careers and Talent Solutions Relevance team, which powers search/recommendation products at the intersection of members, recruiters, and career opportunities. Previously, he was a Researcher at Microsoft Research Silicon Valley, where his work resulted in product impact (and Gold Star / Technology Transfer awards), and several publications/patents. Krishnaram received his Ph.D. in Computer Science from Stanford University in 2006, and his Bachelor's in Computer Science from IIT Madras. He serves regularly on the program committees of KDD, WWW, WSDM, and related conferences, and co-chaired the 2014 ACM Symposium on Computing for Development. He received Microsoft's AI/ML conference (MLADS) distinguished contribution award, NAACL best thematic paper award, CIKM best case studies paper award, SODA best student paper award, and WWW best paper award nomination. He has published 40+ papers with 2500+ citations, and filed 140+ patents (30+ granted). He has presented lectures/tutorials on privacy, fairness, and explainable AI in industry at forums such as KDD '18 '19, WSDM '19, and WWW '19, and instructed a course on AI at Stanford.
Varun Mithal is an AI researcher at LinkedIn, where he works on jobs and hiring recommendations. Prior to joining LinkedIn, he received his PhD in Computer Science from the University of Minnesota-Twin Cities, and his Bachelor's in Computer Science from the Indian Institute of Technology, Kanpur. He has developed several algorithms to identify rare classes and anomalies using unsupervised change detection as well as supervised learning from weak labels. His thesis also explored machine learning models for scientific domains that incorporate physics-based constraints and make them interpretable for domain scientists. He has published 20 papers with 350+ citations. His work has appeared in top-tier data mining conferences and journals such as IEEE TKDE, AAAI, and ICDM.
Ankur Taly is the Head of Data Science at Fiddler Labs, where he is responsible for developing and evangelizing core explainable AI technology. Previously, he was a Staff Research Scientist at Google Brain, where he carried out research in explainable AI and was best known for his contribution to developing and applying Integrated Gradients (220+ citations), a new interpretability algorithm for Deep Networks. His research in this area has resulted in publications at top-tier machine learning conferences (ICML 2017, ACL 2018), and prestigious journals like the American Academy of Ophthalmology (AAO) and Proceedings of the National Academy of Sciences (PNAS). He has also given invited talks at several academic and industrial venues, including UC Berkeley (DREAMS seminar), SRI International, Dagstuhl seminar, and Samsung AI Research. Besides explainable AI, Ankur has a broad research background, and has published 25+ papers in several other areas including Computer Security, Programming Languages, Formal Verification, and Machine Learning. He has served on several conference program committees (PLDI 2014 and 2019, POST 2014, PLAS 2013), taught guest lectures in graduate courses, and instructed a short course on distributed authorization at the FOSAD summer school in 2016. Ankur obtained his Ph.D. in computer science from Stanford University in 2012 and a B.Tech. in CS from IIT Bombay in 2007.