Fairness-Aware Machine Learning: Practical Challenges and Lessons Learned
(WWW 2019 Tutorial)
Researchers and practitioners from different disciplines have highlighted the ethical and legal challenges posed by the use of machine learned models and data-driven systems, and the potential for such systems to discriminate against certain population groups due to biases in algorithmic decision-making. This tutorial presents an overview of algorithmic bias / discrimination issues observed over the last few years and the lessons learned, key regulations and laws, and the evolution of techniques for achieving fairness in machine learning systems. We will motivate the need for adopting a "fairness by design" approach (as opposed to viewing algorithmic bias / fairness considerations as an afterthought) when developing machine learning based models and systems for different consumer and enterprise applications. Then, we will focus on the application of fairness-aware machine learning techniques in practice, by presenting non-proprietary case studies from different technology companies. Finally, based on our experiences working on fairness in machine learning at companies such as Facebook, Google, LinkedIn, and Microsoft, we will present open problems and research directions for the data mining / machine learning community.
Sarah Bird (Microsoft, USA)
Ben Hutchinson (Google, USA)
Krishnaram Kenthapadi (LinkedIn, USA)
Emre Kıcıman (Microsoft, USA)
Margaret Mitchell (Google, USA)
The tutorial will be presented at WWW'19 by:
Sarah Bird (Microsoft, USA)
Krishnaram Kenthapadi (LinkedIn, USA)
Emre Kıcıman (Microsoft, USA)
Ben Packer (Google, USA)
Tutorial Outline and Description
The tutorial will cover the following topics; the outline below indicates the relative depth of coverage within each topic.
- Introduction to algorithmic bias / discrimination
- Industry best practices
- Sources of biases in ML lifecycle
- Algorithmic techniques for fairness in ML
- Fairness methods in practice: Challenges and lessons learned
- Image and related topics
- Machine translation
- Conversational agents
- Web search and other ranking domains
- Open problems
- Key takeaways and reflections
Machine learned models and data-driven systems are increasingly being used to make decisions in crucial applications such as lending, hiring, and college admissions, as a result of a confluence of factors: ubiquitous connectivity; the ability to collect, aggregate, and process large amounts of fine-grained data using cloud computing; and easy access to sophisticated machine learning models. There is increasing awareness of the ethical and legal challenges posed by the use of such data-driven systems. Researchers and practitioners from different disciplines have highlighted the potential for such systems to discriminate against certain population groups due to biases in algorithmic decision-making. Several studies have shown that ranked lists produced by a biased machine learning model can result in systematic discrimination and reduced visibility for an already disadvantaged group (e.g., disproportionate association of higher recidivism risk scores with minorities, over/under-representation and racial/gender stereotypes in image search results, and incorporation of gender and other human biases in algorithmic tools). One possible reason is that machine learned prediction models trained on datasets that exhibit existing societal biases end up learning those biases and can reinforce them in their results, potentially even amplifying their effect. In light of the importance of understanding and addressing algorithmic bias in practical applications, this tutorial will be timely and relevant to the data mining / machine learning research and practitioner community.
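As a small illustration of the kind of group-fairness measurement referred to above, the sketch below computes the demographic parity difference (the gap in favorable-decision rates between groups) on toy data. The function name and the data are hypothetical, for illustration only, and are not drawn from any system discussed in this tutorial.

```python
# Hypothetical sketch: demographic parity difference, one common group-fairness
# metric. All decisions and group labels below are illustrative toy data.

def demographic_parity_difference(decisions, groups):
    """Largest gap in positive-decision rates between any two groups.

    decisions: iterable of 0/1 model decisions (1 = favorable outcome).
    groups: iterable of group labels, aligned with decisions.
    """
    rates = {}
    for g in set(groups):
        members = [d for d, grp in zip(decisions, groups) if grp == g]
        rates[g] = sum(members) / len(members)
    vals = sorted(rates.values())
    return vals[-1] - vals[0]

# Toy example: 1 = favorable decision (e.g., loan approved), 0 = unfavorable.
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
groups    = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap = demographic_parity_difference(decisions, groups)
print(round(gap, 2))  # group "a" rate 0.75 vs group "b" rate 0.25, prints 0.5
```

A gap of zero means both groups receive favorable decisions at the same rate; production toolkits expose analogous metrics, along with others (e.g., equalized odds) that the tutorial surveys.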
The tutorial presenters have extensive experience in the field of fairness-aware machine learning, including the theory and application of fairness techniques, and have been leading efforts on AI and ethics at their respective organizations. This tutorial incorporates the lessons learned by the presenters while applying fairness techniques in practice, and consists of an overview of algorithmic bias / discrimination, industry best practices, sources of biases in the ML lifecycle, algorithmic techniques for achieving fairness in machine learning systems, and fairness methods in practice. We will motivate the need for adopting a "fairness-first" approach (as opposed to viewing algorithmic bias / fairness considerations as an afterthought) when developing machine learning based models and systems for different applications. Then, we will focus on the application of fairness-aware machine learning techniques in practice, by presenting non-proprietary case studies from different technology companies. Finally, based on our experiences working on fairness in machine learning at companies such as Facebook, Google, LinkedIn, and Microsoft, we will present open problems and research directions for the data mining / machine learning community.
Related Tutorials / References
- Sara Hajian, Francesco Bonchi, and Carlos Castillo, Algorithmic bias: From discrimination discovery to fairness-aware data mining, KDD Tutorial, 2016.
- Solon Barocas and Moritz Hardt, Fairness in machine learning, NeurIPS Tutorial, 2017.
- Kate Crawford, The Trouble with Bias, NeurIPS Keynote, 2017.
- Arvind Narayanan, 21 fairness definitions and their politics, FAT* Tutorial, 2018.
- Sam Corbett-Davies and Sharad Goel, Defining and Designing Fair Algorithms, Tutorials at EC 2018 and ICML 2018.
- Ben Hutchinson and Margaret Mitchell, Translation Tutorial: A History of Quantitative Fairness in Testing, FAT* Tutorial, 2019.
- Henriette Cramer, Kenneth Holstein, Jennifer Wortman Vaughan, Hal Daumé III, Miroslav Dudík, Hanna Wallach, Sravana Reddy, and Jean Garcia-Gathright, Translation Tutorial: Challenges of incorporating algorithmic fairness into industry practice, FAT* Tutorial, 2019.
- ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*)
The tutorial authors have extensive experience in the field of fairness-aware machine learning, including the theory and application of fairness techniques. They have published several papers on fairness / related topics such as causal analysis and privacy, and taught tutorials & given invited talks on these topics. They also have rich experience applying fairness techniques in practice, and have been leading efforts on AI and ethics at their respective organizations.
Sarah Bird leads research and emerging technology strategy for Azure AI. Sarah works to accelerate the adoption and impact of AI by bringing together the latest research innovations with the best of open source and product expertise to create new tools and technologies. She is currently leading the development of responsible AI tools in Azure Machine Learning. She is also an active member of the Microsoft AETHER committee, where she works to develop and drive company-wide adoption of responsible AI principles, best practices, and technologies. Sarah was one of the founding researchers in the Microsoft FATE research group and, prior to joining Microsoft, worked on AI fairness at Facebook. Sarah is an active contributor to the open source ecosystem: she co-founded ONNX, an open source standard for machine learning models, and was a leader in the PyTorch 1.0 project. She was an early member of the machine learning systems research community and has been active in growing and shaping it. She co-founded the SysML research conference and the Learning Systems workshops. She has a Ph.D. in computer science from UC Berkeley, advised by Dave Patterson, Krste Asanovic, and Burton Smith.
Ben Hutchinson is a Senior Engineer in Google's Research & Machine Intelligence group, working on fairness and ethics in artificial intelligence as part of Google's Ethical AI team. His interdisciplinary research draws on the social sciences to inform the ethical development of AI. In January 2019, he taught a tutorial at FAT* on how the fields of employment and education have approached quantitative measures of fairness, and how their measures relate to and can inform developments in machine learning. Prior to joining Google Research, he spent ten years working on a variety of products, including Google Wave, Google Maps, Knowledge Graph, Google Search, and Social Impact. He now uses this experience to work closely with product teams as a consultant on responsible practices and the development of fair machine learning models. He has a PhD in Natural Language Processing from the University of Edinburgh.
Krishnaram Kenthapadi is part of the AI team at LinkedIn, where he leads the fairness, transparency, explainability, and privacy modeling efforts across different LinkedIn applications. He also serves as LinkedIn's representative on Microsoft's AI and Ethics in Engineering and Research (AETHER) Committee. He shaped the technical roadmap and led the privacy/modeling efforts for the LinkedIn Salary product, and prior to that served as the relevance lead for the LinkedIn Careers and Talent Solutions Relevance team, which powers search/recommendation products at the intersection of members, recruiters, and career opportunities. Previously, he was a Researcher at Microsoft Research Silicon Valley, where his work resulted in product impact (and Gold Star / Technology Transfer awards) and several publications/patents. He received his Ph.D. in Computer Science from Stanford University in 2006. He serves regularly on the program committees of KDD, WWW, WSDM, and related conferences, and co-chaired the 2014 ACM Symposium on Computing for Development. He received the CIKM best case studies paper award and the SODA best student paper award, and was nominated for the WWW best paper award. He has published 35+ papers with 2500+ citations, and has filed 130+ patents.
Emre Kıcıman is a Principal Researcher at Microsoft Research AI. His current research focuses on causal analysis and data bias in the context of computational social science analyses and decision support systems, and, broadly, the implications of AI for people and society. Emre's past research includes foundational work on applying machine learning to fault management in large-scale internet services, now an industry standard practice. Emre has recently taught tutorials and presented lectures on data bias, AI best practices, causal reasoning, and other topics. At Microsoft, Emre co-organizes MSR AI's efforts on AI's impact on people and society, lectures on best practices for AI and sources of AI bias, and serves on internal working groups on fairness and bias and on best practices. In the academic community, Emre has served as the steering committee chair of the AAAI International Conference on Web and Social Media (ICWSM 2013-2017), as general chair and program chair of major conferences, and on the organizing committee/SPC/PC of many conferences, including FAT*, WWW, KDD, CSCW, WSDM, and IJCAI. Emre received his Ph.D. and M.S. from Stanford University, and his B.S. in Electrical Engineering and Computer Science from U.C. Berkeley.
Margaret Mitchell is a Senior Research Scientist in Google's Research & Machine Intelligence group, working on artificial intelligence, multimodality, and ethics, and she currently leads Google's Ethical AI team. Her research involves vision-language, computer vision, and grounded language generation, focusing on how to evolve artificial intelligence towards positive goals. This includes research on helping computers to communicate based on what they can process, as well as projects to create assistive and clinical technology from the state of the art in AI. She has presented invited talks at forums such as TED and FAT/ML.