Addressing rapidly growing public awareness of bias and fairness issues in algorithmic decision-making systems (ADS), the tech industry is now championing a set of tools to assess and mitigate them. Such tools, broadly categorized as algorithmic fairness definitions, metrics, and mitigation strategies, find their roots in recent research from the community on Fairness, Accountability and Transparency in Machine Learning (FAT/ML), which began convening in 2014 at popular machine learning conferences and has since been succeeded by a broader conference on Fairness, Accountability and Transparency in Sociotechnical Systems (FAT*). While this research has value in assisting diagnosis and informed debate about the inherent trade-offs and ethical choices that come with data-driven approaches to policy and decision-making, marketing poorly validated tools as quick-fix strategies to eliminate bias is problematic and threatens to deepen an already growing sense of distrust among companies and institutions procuring data-analysis software and enterprise platforms. This trend coincides with efforts by the IEEE and others to develop certification and marking processes that "advance transparency, accountability and reduction in algorithmic bias in Autonomous and Intelligent Systems". Together, these efforts suggest a checkbox recipe for improving accountability and resolving the many ethical issues that have surfaced in the rapid deployment of ADS. In this talk, we nuance this timely debate by pointing at the inherent technical limitations of fairness metrics as a go-to tool for fixing bias. We discuss earlier attempts at certification to clarify pitfalls. We point to developments in governments adopting ADS and how a lack of accountability and existing power structures are leading to new forms of harm that call into question the very efficacy of ADS.
We end by discussing productive uses of diagnostic tools and the concept of Algorithmic Impact Assessment, a new framework for identifying the value, limitations, and challenges of integrating algorithms into real-world contexts.
Recent discussion in the public sphere about classification by algorithms has involved tension between competing notions of what it means for such a classification to be fair to different groups. We consider several of the key fairness conditions that lie at the heart of these debates. In particular, we study how these properties operate when the goal is to rank-order a set of applicants by some criterion of interest, and then to select the top-ranking applicants. Among other results, we show that imposing a constraint to favor "simple" rules -- for example, to promote interpretability -- can have consequences for the equity of the ranking toward disadvantaged groups.
Manuel Gomez Rodriguez - Enhancing the Accuracy and Fairness of Human Decision Making
Societies often rely on human experts to make a wide variety of decisions affecting their members, from jail-or-release decisions made by judges and stop-and-frisk decisions made by police officers to accept-or-reject decisions made by academics. In this context, each decision is made by an expert who is typically chosen uniformly at random from a pool of experts. However, these decisions may be imperfect due to limited experience, implicit biases, or faulty probabilistic reasoning. Can we improve the accuracy and fairness of the overall decision-making process by optimizing the assignment between experts and decisions?
In this talk, we address the above problem from the perspective of sequential decision making and show that, for different fairness notions from the literature, it reduces to a sequence of (constrained) weighted bipartite matchings, which can be solved efficiently using algorithms with approximation guarantees. Moreover, these algorithms also benefit from posterior sampling to actively trade off exploitation---selecting expert assignments which lead to accurate and fair decisions---and exploration---selecting expert assignments to learn about the experts' preferences and biases. We demonstrate the effectiveness of our algorithms on both synthetic and real-world data and show that they can significantly improve both the accuracy and fairness of the decisions taken by pools of experts.
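The reduction described above can be illustrated with a minimal sketch (the cost model and function names here are illustrative assumptions, not the talk's implementation): in each round, estimated per-expert error rates define edge weights on a bipartite graph of experts and pending decisions, and a minimum-weight matching assigns experts to cases. The full algorithms additionally impose fairness constraints and use posterior sampling over the uncertain error estimates.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_experts(error_rates):
    """Assign each decision to one expert via minimum-weight bipartite matching.

    error_rates[i, j]: estimated probability that expert i errs on decision j
    (an illustrative cost; the talk's constrained, exploration-aware variants
    are not reproduced here).
    Returns an array a with a[j] = index of the expert assigned to decision j.
    """
    rows, cols = linear_sum_assignment(error_rates)  # Hungarian-style solver
    assignment = np.empty(error_rates.shape[1], dtype=int)
    assignment[cols] = rows
    return assignment

# Three experts, three decisions; each expert is strongest on a different case.
costs = np.array([[0.3, 0.1, 0.4],
                  [0.2, 0.5, 0.1],
                  [0.1, 0.4, 0.5]])
print(assign_experts(costs))  # → [2 0 1], minimizing total expected error
```

The matching minimizes the sum of expected errors (total cost 0.3 here), rather than the expected error of any single uniformly random pairing.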
Hannah Wallach - Improving Fairness in Machine Learning Systems: What Do Industry Practitioners Need?
The potential for machine learning systems to amplify social inequities and unfairness is receiving increasing popular and academic attention. A surge of recent research has focused on the development of algorithmic tools to detect and mitigate such unfairness. However, if these tools are to have a positive impact on industry practice, it is crucial that their design be informed by an understanding of industry teams’ actual needs. Through semi-structured interviews with 35 machine learning practitioners, spanning 19 teams and 10 companies, and an anonymous survey of 267 practitioners, we conducted the first systematic investigation of industry teams' challenges and needs for support in developing fairer machine learning systems. I will describe this work and summarize areas of alignment and disconnect between the challenges faced by industry practitioners and solutions proposed in the academic literature. Based on these findings, I will highlight directions for future research that will better address practitioners' needs.
Hoda Heidari - What Can Fair ML Learn from Economic Theories of Distributive Justice?
Recently, a number of technical solutions have been proposed for tackling algorithmic unfairness and discrimination. I will talk about some of the connections between these proposals and long-established economic theories of fairness and distributive justice. In particular, I will overview the axiomatic characterization of measures of (income) inequality, and present them as a unifying framework for quantifying individual- and group-level unfairness; I will propose the use of cardinal social welfare functions as an effective method for bounding individual-level inequality; and last but not least, I will cast existing notions of algorithmic (un)fairness as special cases of economic models of equality of opportunity---through this lens, I hope to offer a better understanding of the moral assumptions underlying technical definitions of fairness.
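One family of axiomatically characterized inequality measures from the economics literature is the generalized entropy index, which has been applied to per-individual "benefits" derived from an algorithm's decisions to quantify individual-level unfairness. The sketch below is an illustrative computation under that framing, not the talk's exact formulation; the benefit vectors are made-up examples.

```python
import numpy as np

def generalized_entropy_index(benefits, alpha=2):
    """Generalized entropy index GE(alpha) of a benefit vector.

    GE = 0 iff everyone receives the same benefit; larger values mean more
    inequality. alpha = 1 gives the Theil index, alpha = 0 the mean log
    deviation. Applying this to per-individual benefits of an algorithm's
    decisions (e.g. b_i = prediction_i - label_i + 1) is one illustrative way
    inequality indices have been used as unfairness measures.
    """
    b = np.asarray(benefits, dtype=float)
    mu = b.mean()
    if alpha == 0:
        return float(np.mean(np.log(mu / b)))
    if alpha == 1:
        return float(np.mean((b / mu) * np.log(b / mu)))
    return float(np.mean((b / mu) ** alpha - 1) / (alpha * (alpha - 1)))

print(generalized_entropy_index([1.0, 1.0, 1.0, 1.0]))  # → 0.0 (equality)
print(generalized_entropy_index([0.5, 0.5, 1.5, 1.5]))  # → 0.125
```

Because the index decomposes into within-group and between-group terms, the same quantity can express both individual- and group-level unfairness, which is the unifying property highlighted above.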
Rich Caruana - Justice May Be Blind But It Shouldn’t Be Opaque: The Risk of Using Black-Box Models in Healthcare & Criminal Justice
In machine learning, a tradeoff must often be made between accuracy and intelligibility. This tradeoff sometimes limits the accuracy of models that can be safely deployed in mission-critical applications such as healthcare and criminal justice, where being able to understand, validate, edit, and ultimately trust a learned model is important. In this talk I’ll present a case study where intelligibility is critical to uncover surprising patterns in the data that would have made deploying a black-box model dangerous. I’ll also show how distillation with intelligible models can be used to detect bias inside black-box models.