Fairness-Aware Machine Learning: Practical Challenges and Lessons Learned

(KDD 2019 Tutorial)

Overview

Researchers and practitioners from different disciplines have highlighted the ethical and legal challenges posed by the use of machine learned models and data-driven systems, and the potential for such systems to discriminate against certain population groups due to biases in algorithmic decision-making. This tutorial presents an overview of the algorithmic bias / discrimination issues observed over the last few years, the lessons learned, key regulations and laws, and the evolution of techniques for achieving fairness in machine learning systems. We will motivate the need for adopting a "fairness by design" approach (as opposed to viewing algorithmic bias / fairness considerations as an afterthought) when developing machine learning based models and systems for different consumer and enterprise applications. Then, we will focus on the application of fairness-aware machine learning techniques in practice, presenting non-proprietary case studies from different technology companies. Finally, based on our experiences working on fairness in machine learning at companies such as Facebook, Google, LinkedIn, and Microsoft, we will present open problems and research directions for the data mining / machine learning community.

Contributors

Sarah Bird (Microsoft, USA)

Ben Hutchinson (Google, USA)

Krishnaram Kenthapadi (LinkedIn, USA)

Emre Kıcıman (Microsoft, USA)

Margaret Mitchell (Google, USA)

The tutorial will be presented at KDD'19 by:

Sarah Bird (Microsoft, USA)

Sahin Cem Geyik (LinkedIn, USA)

Emre Kıcıman (Microsoft, USA)

Margaret Mitchell (Google, USA)

Mehrnoosh Sameki (Microsoft, USA)

Tutorial Logistics

Venue/Time: 8:00AM - 12:00PM on Sunday, August 4, 2019 in Summit 11 - Ground Level, Egan (please refer to the KDD'19 webpage (schedule) for any last-minute changes)

Tutorial Outline and Description

The tutorial will cover the following topics (please see the outline below for the depth of coverage within each topic).

  • Introduction to algorithmic bias / discrimination
  • Industry best practices
  • Sources of bias in the ML lifecycle
  • Algorithmic techniques for fairness in ML
  • Fairness methods in practice: Challenges and lessons learned
    • Image and related topics
    • Machine translation
    • Conversational agents
    • Web search and other ranking domains
  • Open problems
  • Key takeaways and reflections

Machine learned models and data-driven systems are increasingly being used to make decisions in crucial applications such as lending, hiring, and college admissions, as a result of a confluence of factors: ubiquitous connectivity, the ability to collect, aggregate, and process large amounts of fine-grained data using cloud computing, and easy access to sophisticated machine learning models. There is growing awareness of the ethical and legal challenges posed by the use of such data-driven systems. Researchers and practitioners from different disciplines have highlighted the potential for these systems to discriminate against certain population groups due to biases in algorithmic decision-making. Several studies have shown that the ranked lists produced by a biased machine learning model can result in systematic discrimination and reduced visibility for already disadvantaged groups (e.g., disproportionate association of higher recidivism risk scores with minorities, over/under-representation and racial/gender stereotypes in image search results, and incorporation of gender and other human biases into algorithmic tools). One possible reason is that machine learned prediction models trained on datasets that exhibit existing societal biases end up learning those biases, reinforcing them in their results and potentially even amplifying their effect. In light of the importance of understanding and addressing algorithmic bias in practical applications, this tutorial will be timely and relevant to the data mining / machine learning research and practitioner community.
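To make the "reduced visibility" effect above concrete, the sketch below (our own illustration, not tutorial material) computes position-discounted exposure per group in a ranked list; the ranking, the group assignments, and the DCG-style log discount are illustrative assumptions, not a prescribed metric.

```python
import math

def group_exposure(ranking, group_of):
    """Average position-discounted exposure per group (log discount, as in DCG)."""
    totals, counts = {}, {}
    for rank, item in enumerate(ranking, start=1):
        g = group_of[item]
        totals[g] = totals.get(g, 0.0) + 1.0 / math.log2(rank + 1)
        counts[g] = counts.get(g, 0) + 1
    return {g: totals[g] / counts[g] for g in totals}

# Hypothetical example: items ordered by a (possibly biased) model score.
ranking = ["a", "b", "c", "d", "e", "f"]
group_of = {"a": "g1", "b": "g1", "c": "g1",
            "d": "g2", "e": "g2", "f": "g2"}
print(group_exposure(ranking, group_of))
# g1 occupies the top positions, so its average exposure (~0.71) is much
# higher than g2's (~0.39), even if the items are otherwise comparable.
```

If a model systematically ranks one group lower, that group's average exposure drops, which is one way the visibility harms discussed above can be quantified.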

The tutorial presenters have extensive experience in the field of fairness-aware machine learning, including the theory and application of fairness techniques, and have been leading efforts on AI and ethics at their respective organizations. This tutorial incorporates the lessons learned by the presenters while applying fairness techniques in practice, and consists of an overview of algorithmic bias / discrimination, industry best practices, sources of bias in the ML lifecycle, algorithmic techniques for achieving fairness in machine learning systems, and fairness methods in practice. We will motivate the need for adopting a "fairness by design" approach (as opposed to viewing algorithmic bias / fairness considerations as an afterthought) when developing machine learning based models and systems for different applications. Then, we will focus on the application of fairness-aware machine learning techniques in practice, presenting non-proprietary case studies from different technology companies. Finally, based on our experiences working on fairness in machine learning at companies such as Facebook, Google, LinkedIn, and Microsoft, we will present open problems and research directions for the data mining / machine learning community.
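As a minimal illustration of what a "fairness by design" check might look like in practice (a sketch of one common criterion, demographic parity, not a prescription from the tutorial), the snippet below compares a classifier's positive-prediction rates across groups before launch; the predictions and group labels are made-up inputs.

```python
import numpy as np

def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = {g: float(y_pred[groups == g].mean()) for g in np.unique(groups)}
    return max(rates.values()) - min(rates.values()), rates

y_pred = np.array([1, 1, 0, 1, 0, 0, 1, 0])  # hypothetical model decisions
groups = np.array(["g1", "g1", "g1", "g1",   # hypothetical group labels
                   "g2", "g2", "g2", "g2"])
gap, rates = demographic_parity_difference(y_pred, groups)
print(rates, gap)  # {'g1': 0.75, 'g2': 0.25} 0.5 -> a large gap to review
```

Demographic parity is only one of several competing fairness criteria; the tutorial discusses the trade-offs among such metrics and the algorithmic techniques for satisfying them.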

Contributor Bios

The tutorial authors have extensive experience in the field of fairness-aware machine learning, including the theory and application of fairness techniques. They have published several papers on fairness and related topics such as causal analysis and privacy, and have taught tutorials and given invited talks on these topics. They also have rich experience applying fairness techniques in practice, and have been leading efforts on AI and ethics at their respective organizations.

Sarah Bird leads strategic projects at the intersection of AI research and products at Facebook. Her current work focuses on AI ethics and developing AI responsibly at scale. She has also been working on open AI systems: she is one of the co-creators of ONNX, an open standard for deep learning models, and a leader in the PyTorch 1.0 project. Prior to joining Facebook, she was an AI systems researcher at Microsoft Research NYC and a technical advisor to Microsoft's Data Group. She is one of the researchers behind Microsoft's Decision Service, one of the first general-purpose reinforcement-learning-style cloud systems publicly released. She also co-founded the FATE research group for AI ethics at Microsoft. She has a Ph.D. in computer science from UC Berkeley, where she was advised by Dave Patterson, Krste Asanovic, and Burton Smith. Sarah has co-organized several workshops on related topics (Workshop on Ethical, Social and Governance Issues in AI, NIPS 2018; Machine Learning Systems workshop, NIPS 2018; Machine Learning Systems workshop, NIPS 2017; AI Systems workshop, SOSP 2017; Machine Learning Systems workshop, NIPS 2016), and presented an invited keynote talk ("AI and Machine Learning: A Perspective from Facebook") at the Berkeley Privacy Law Forum 2018.

Ben Hutchinson is a Senior Engineer in Google's Research & Machine Intelligence group, where he works on fairness and ethics in artificial intelligence as part of Google's Ethical AI team. His interdisciplinary research includes learning from the social sciences to inform the ethical development of AI. In January 2019, he taught a tutorial at FAT* on how the fields of employment and education have approached quantitative measures of fairness, and how their measures relate to and can inform developments in machine learning. Prior to joining Google Research, he spent ten years working on a variety of Google products, including Google Wave, Google Maps, Knowledge Graph, Google Search, and Social Impact. He now uses this experience to work closely with product teams as a consultant on responsible practices and the development of fair machine learning models. He has a PhD in Natural Language Processing from the University of Edinburgh.

Krishnaram Kenthapadi is part of the AI team at LinkedIn, where he leads the fairness, transparency, explainability, and privacy modeling efforts across different LinkedIn applications. He also serves as LinkedIn's representative on Microsoft's AI and Ethics in Engineering and Research (AETHER) Committee. He shaped the technical roadmap and led the privacy/modeling efforts for the LinkedIn Salary product, and prior to that served as the relevance lead for the LinkedIn Careers and Talent Solutions Relevance team, which powers search/recommendation products at the intersection of members, recruiters, and career opportunities. Previously, he was a Researcher at Microsoft Research Silicon Valley, where his work resulted in product impact (and Gold Star / Technology Transfer awards) and several publications/patents. He received his Ph.D. in Computer Science from Stanford University in 2006. He serves regularly on the program committees of KDD, WWW, WSDM, and related conferences, and co-chaired the 2014 ACM Symposium on Computing for Development. He received the CIKM best case studies paper award, the SODA best student paper award, and a WWW best paper award nomination. He has published 35+ papers, with 2500+ citations, and filed 130+ patents.

Emre Kıcıman is a Principal Researcher at Microsoft Research AI. His current research focuses on causal analysis and data bias in the context of computational social science analyses and decision support systems, and, more broadly, on the implications of AI for people and society. Emre's past research includes foundational work on applying machine learning to fault management in large-scale internet services, now an industry-standard practice. Emre has recently taught tutorials and presented lectures on data bias, AI best practices, causal reasoning, and other topics. At Microsoft, Emre co-organizes MSR AI's efforts on AI's impact on people and society, lectures on best practices for AI and sources of AI bias, and serves on internal working groups on fairness, bias, and best practices. In the academic community, Emre has served as the steering committee chair of the AAAI International Conference on Web and Social Media (ICWSM 2013-2017), as general chair and program chair of major conferences, and on the organizing committee/SPC/PC of many conferences, including FAT*, WWW, KDD, CSCW, WSDM, and IJCAI. Emre received his Ph.D. and M.S. from Stanford University, and his B.S. in Electrical Engineering and Computer Science from U.C. Berkeley.

Margaret Mitchell is a Senior Research Scientist in Google's Research & Machine Intelligence group, working on artificial intelligence, multimodality, and ethics, and she currently leads Google's Ethical AI team. Her research involves vision-language, computer vision, and grounded language generation, focusing on how to evolve artificial intelligence towards positive goals. This includes research on helping computers to communicate based on what they can process, as well as projects to create assistive and clinical technology from the state of the art in AI. She has presented invited talks at forums such as TED and FAT/ML.