Responsible AI in Industry (Tutorial)
Artificial Intelligence (AI) plays an increasingly integral role in determining our day-to-day experiences. Its applications are no longer limited to search and recommendation systems, such as web search and movie and product recommendations; AI is also being used in decisions and processes that are critical for individuals, businesses, and society. With web-based AI solutions in areas such as hiring, lending, criminal justice, healthcare, and education, the resulting personal and professional implications of AI are far-reaching.
Because many factors play a role in the development and deployment of AI systems, these systems can exhibit different, and sometimes harmful, behaviors. For example, the training data often comes from society and the real world, and thus may reflect society’s biases and discrimination toward minorities and disadvantaged groups. For instance, minorities are known to face higher arrest rates than the majority population for similar behaviors, so building an AI system without compensating for this is likely to only exacerbate this prejudice.
The above concerns highlight the need for regulations, best practices, and practical tools to help data scientists and ML developers build AI systems that are secure, privacy-preserving, transparent, explainable, fair, and accountable – to avoid unintended consequences and compliance challenges that can be harmful to individuals, businesses, and society.
Among these principles, model transparency and explainability are prerequisites for building trust in, and adoption of, AI systems in high-stakes domains requiring reliability and safety, such as healthcare and automated transportation, as well as in critical industrial applications with significant economic implications, such as predictive maintenance, exploration of natural resources, and climate change modeling. Besides explainability, more and more stakeholders are questioning the fairness of their AI systems, as there are plenty of examples that illustrate the consequences of failing to consider fairness, from face recognition working significantly better for white men than for women of color, to automated hiring systems that discriminate against certain groups of people. Incorporating tools for model transparency and fairness makes it easier for data scientists, engineers, and model users to debug models and to achieve important objectives such as ensuring the fairness, reliability, and safety of AI systems.
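As a rough illustration of the kind of fairness check alluded to above (a system working better for one group than for another), the following minimal sketch computes accuracy separately per group and reports the largest gap. The function names, the toy labels, and the group identifiers "a" and "b" are hypothetical and chosen only for illustration; this is not the method of any particular toolkit named in this tutorial.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy}, computed separately for each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def accuracy_gap(per_group):
    """Difference between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


# Toy, made-up data: two hypothetical groups "a" and "b".
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 1, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
gap = accuracy_gap(per_group)
print(per_group, gap)
```

A large gap is only a signal for further investigation; which metric and which groups are appropriate depends on the application and its stakeholders.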
Finally, AI products are often powered by ML models that are trained on sensitive user data. Given sufficient complexity, either in terms of the number of parameters (e.g., deep learning models with several layers) or user-level personalization, it is possible for the model to encode private information of users. In addition, it is often desirable to ensure user privacy during different stages of the ML life-cycle and to protect against different types of bad actors and threat scenarios, necessitating privacy-preserving AI approaches.
In this tutorial, we will present an overview of responsible AI, highlighting model explainability, fairness, and privacy in AI, key regulations/laws, and techniques/tools for understanding web-based AI/ML systems. Then, we will focus on the application of explainability, fairness assessment/unfairness mitigation, and privacy techniques in industry, wherein we present practical challenges/guidelines for using such techniques effectively and lessons learned from deploying models for several web-scale machine learning and data mining applications. We will present case studies across different companies, spanning application domains such as search and recommendation systems, hiring, sales, lending, and fraud detection. We will emphasize that topics related to responsible AI are socio-technical, that is, they are topics at the intersection of society and technology. The underlying challenges cannot be addressed by technologists alone; we need to work together with all key stakeholders (such as customers of a technology, those impacted by a technology, and people with backgrounds in ethics and related disciplines) and take their inputs into account while designing these systems. Finally, based on our experiences in industry, we will identify open problems and research directions for the data mining/machine learning community.
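One widely studied family of privacy-preserving approaches is differential privacy. The sketch below is an assumption-laden illustration, not a description of any system covered in the tutorial: it releases a count with Laplace noise of scale 1/epsilon, which is the standard calibration for a counting query (sensitivity 1). The names private_count and laplace_noise and the toy ages are hypothetical, and a production system should rely on a vetted library, since naive floating-point noise sampling has known weaknesses.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the values satisfying `predicate`.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Toy, made-up data: ages of hypothetical users.
ages = [23, 35, 41, 29, 52, 38, 61, 27]
noisy = private_count(ages, lambda a: a >= 35, epsilon=1.0,
                      rng=random.Random(0))
print(noisy)
```

Smaller values of epsilon add more noise and give stronger privacy; the released count is then less accurate, which is the trade-off practitioners have to tune.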
AAAI Conference on Artificial Intelligence (AAAI 2021)
8:30am – 11:45am Pacific Time on Wednesday, February 3, 2021 [Link for registered attendees]
ACM Conference on Fairness, Accountability, and Transparency (FAccT 2021)
The Web Conference (WWW 2021)
8:00am - 11:30am Pacific Time (17:00-20:30 CET) on Monday, April 12, 2021
International Conference on Machine Learning (ICML 2021)
8:00am - 11:00am Pacific Time on Monday, July 19, 2021 [Link for registered attendees]
AAAI'21 Video Recording (Slideslive link)
AAAI'21, FAccT'21, and WWW'21 Tutorial Slides (Slideshare link)
ICML'21 Tutorial Slides (Slideshare link)
FAccT'21 Video Recording (YouTube link; shorter, 90-minute version)
Krishnaram Kenthapadi is a Principal Scientist at Amazon AWS AI, where he leads the fairness, explainability, and privacy initiatives for the Amazon AI platform. Prior to joining Amazon, he led similar efforts across different LinkedIn applications as part of the LinkedIn AI team, and served as LinkedIn’s representative on Microsoft’s AI and Ethics in Engineering and Research (AETHER) Advisory Board. He shaped the technical roadmap and led the privacy/modeling efforts for the LinkedIn Salary product, and prior to that, served as the relevance lead for the LinkedIn Careers and Talent Solutions Relevance team, which powers search/recommendation products at the intersection of members, recruiters, and career opportunities. Previously, he was a Researcher at Microsoft Research Silicon Valley, where his work resulted in product impact (and Gold Star / Technology Transfer awards) and several publications/patents. Krishnaram received his Ph.D. in Computer Science from Stanford University in 2006, and his Bachelor’s in Computer Science from IIT Madras. He serves regularly on the program committees of KDD, WWW, WSDM, and related conferences, and co-chaired the 2014 ACM Symposium on Computing for Development. His work has been recognized through awards at NAACL, WWW, SODA, CIKM, the ICML AutoML workshop, and Microsoft’s AI/ML conference (MLADS). He has published 50+ papers with 4500+ citations, and has filed 145+ patents (65 granted). He has presented lectures/tutorials on privacy, fairness, and explainable AI in industry at forums such as KDD ’18 ’19, WSDM ’19, WWW ’19 ’20, FAccT ’20, and AAAI ’20, and has taught a course on AI at Stanford.
Ben Packer is a Software Engineer in Research at Google AI, responsible for Fairness and Robustness engagements. He works at the intersection of research and product engagements, conducting research on fairness and robustness as well as implementing these practices directly in various Google products. He has contributed to the Machine Learning Fairness Education effort at Google that has reached thousands of employees and tens of thousands of external developers, has taught a Fairness module to developers across industry as part of Google's CapitalG program, and presented a fairness tutorial at WWW '19. Prior to working at Google, Ben was the Principal Data Scientist at Opower, building large-scale predictive and descriptive models that drove energy efficiency and demand response for millions of households. He received his Ph.D. in Computer Science from the AI lab at Stanford University, specializing in Machine Learning, Probabilistic Graphical Models, and Computer Vision. He received a Bachelor's in Cognitive Science and a Master's in Computer Science from the University of Pennsylvania.
Mehrnoosh Sameki is a senior technical program manager at Microsoft, responsible for leading the product efforts on the open source machine learning interpretability and fairness toolkits (InterpretML and Fairlearn) and their integration within the Azure Machine Learning platform. She is also an adjunct assistant professor at Boston University, School of Computer Science, where she earned her PhD degree in 2017. She has presented at several industry forums (including Microsoft Build) and presented a fairness tutorial at KDD '19.
Nashlie Sephus is the Applied Science Manager for Amazon's Artificial Intelligence (AI) team focusing on fairness and identifying biases in technologies across the company. She formerly led the Amazon Visual Search team in Atlanta, which launched visual search for replacement parts on the Amazon Shopping app in June 2018. This technology resulted from Amazon's acquisition of the Atlanta startup Partpic, where she was the Chief Technology Officer (CTO). Prior to working at Partpic, she received her Ph.D. from the School of Electrical and Computer Engineering at the Georgia Institute of Technology in 2014 and worked for a year with the technical consulting firm Exponent in New York City. Her core research areas were digital signal processing, machine learning, and computer engineering. She received her B.S. in Computer Engineering from Mississippi State University (2007). She has had several internships and research experiences worldwide with companies and institutions such as IBM, Delphi, University of California at Berkeley, GE Research Center, GE Energy, Miller Transporters, and Kwangwoon University in Seoul, South Korea. In 2018, Dr. Sephus founded The Bean Path, a non-profit organization based in Jackson, MS, that assists individuals with technical expertise, equity, and guidance.
Tutorial Outline and Description
The tutorial will consist of two parts: (1) responsible AI foundations, including motivation, definitions, models, algorithms, and tools for explainability, fairness, and privacy in AI/ML systems (1.5 hours); and (2) case studies across different companies, spanning application domains such as search and recommendation systems, hiring, computer vision, cognition tasks including machine translation, lending, and analytics, along with open problems and research directions (1.5 to 2 hours).
Motivation from regulatory, business, and data science perspectives
Fairness-aware ML techniques/tools
Explainable AI techniques/tools
Privacy-preserving ML techniques/tools
Open source and commercial tools for AI explainability, fairness, and privacy (e.g., Amazon SageMaker Clarify and Debugger, Google AI Explainability, Fairness, and What-If tools, Fiddler Explainable AI Engine, Harvard OpenDP, H2O Driverless AI, IBM AI Fairness & Explainability 360, LinkedIn Fairness Toolkit (LiFT), Microsoft Fairlearn and InterpretML)
Case Studies (including practical challenges and lessons learned during deployment in industry)
We will present case studies across different companies, spanning application domains such as search and recommendation systems, hiring, computer vision, cognition tasks including machine translation, lending, and analytics. We will emphasize that topics related to responsible AI are socio-technical, that is, they are topics at the intersection of society and technology. The underlying challenges cannot be addressed by technologists alone; we need to work together with all key stakeholders (such as customers of a technology, those impacted by a technology, and people with backgrounds in ethics and related disciplines) and take their inputs into account while designing these systems. Finally, we will identify open problems and research directions for the data mining/machine learning community.
This tutorial is aimed at attendees with a wide range of interests and backgrounds, including researchers interested in learning about techniques and tools for model explainability, fairness, and privacy in AI, as well as practitioners interested in implementing responsible AI models for web-scale machine learning and data mining applications. We will not assume any prerequisite knowledge, and we will present the intuition underlying various explainability, fairness, and privacy notions and techniques to ensure that the material is accessible to all attendees.
Related Tutorials and Resources
Sara Hajian, Francesco Bonchi, and Carlos Castillo, Algorithmic bias: From discrimination discovery to fairness-aware data mining, KDD Tutorial, 2016.
Solon Barocas and Moritz Hardt, Fairness in machine learning, NeurIPS Tutorial, 2017.
Kate Crawford, The Trouble with Bias, NeurIPS Keynote, 2017.
Arvind Narayanan, 21 fairness definitions and their politics, FAccT Tutorial, 2018.
Sam Corbett-Davies and Sharad Goel, Defining and Designing Fair Algorithms, Tutorials at EC 2018 and ICML 2018.
Ben Hutchinson and Margaret Mitchell, Translation Tutorial: A History of Quantitative Fairness in Testing, FAccT Tutorial, 2019.
Henriette Cramer, Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miroslav Dudík, Hanna Wallach, Sravana Reddy, and Jean Garcia-Gathright, Translation Tutorial: Challenges of incorporating algorithmic fairness into industry practice, FAccT Tutorial, 2019.
Sarah Bird, Ben Hutchinson, Krishnaram Kenthapadi, Emre Kiciman, and Margaret Mitchell, Fairness-Aware Machine Learning: Practical Challenges and Lessons Learned, Tutorials at WSDM 2019, WWW 2019, and KDD 2019.
Krishna Gade, Sahin Cem Geyik, Krishnaram Kenthapadi, Varun Mithal, and Ankur Taly, Explainable AI in Industry, Tutorials at KDD 2019, FAccT 2020, and WWW 2020.
Freddy Lecue, Krishna Gade, Fosca Giannotti, Sahin Geyik, Riccardo Guidotti, Krishnaram Kenthapadi, Pasquale Minervini, Varun Mithal, and Ankur Taly, Explainable AI: Foundations, Industrial Applications, Practical Challenges, and Lessons Learned, AAAI 2020 Tutorial.
Himabindu Lakkaraju, Julius Adebayo, and Sameer Singh, Explaining Machine Learning Predictions: State-of-the-art, Challenges, and Opportunities, Tutorials at NeurIPS 2020 and AAAI 2021.
Freddy Lecue, Pasquale Minervini, Fosca Giannotti, and Riccardo Guidotti, On Explainable AI: From Theory to Motivation, Industrial Applications and Coding Practices, AAAI 2021 Tutorial.
Kamalika Chaudhuri and Anand D. Sarwate, Differentially Private Machine Learning: Theory, Algorithms, and Applications, NeurIPS 2017 Tutorial.
Krishnaram Kenthapadi, Ilya Mironov, and Abhradeep Guha Thakurta, Privacy-preserving Data Mining in Industry, Tutorials at KDD 2018, WSDM 2019, and WWW 2019.
Krishnaram Kenthapadi, Himabindu Lakkaraju, Pradeep Natarajan, and Mehrnoosh Sameki, Model Monitoring in Practice, Tutorials at FAccT 2022 and KDD 2022.