Resources

The resources linked have been curated from the working library of the Ethics and Social Justice subcommittee of the Data SRI. Within the field of data science, we have resources categorized by artificial intelligence, computer vision, data, machine learning, and natural language processing. The ethics category covers justice, accountability, fairness, and transparency. For social justice, we have resources loosely organized by race, gender, and intersectional social identities. These socially constructed categories are fluid, and we acknowledge that individuals and community members hold multiple and intersectional identities.

If you have resources you would like us to include, please send them our way via the contact form.

A more complete working library of resources can be found HERE. Our thanks to Cal Poly student Divya Satrawada, who compiled and supported the curation of these resources.

Data Science

Artificial Intelligence

Book / 2018

Meredith Broussard writes a guide to understanding the inner workings and outer limits of technology, and why we should never assume that computers always get it right.

The 2021 AI Index Report summarizes the year's developments in the field. The report covers research and development; technical performance; the economy; AI education; ethical challenges of AI applications; diversity in AI; and AI policy and national strategies.

Recorded webinar / April 19, 2021

A solid overview of contemporary issues in AI ethics: open questions about the social, political, economic, and organizational realities that can obstruct efforts to make AI safer, fairer, and more transparent.

Book / 2019

Michael Kearns and Aaron Roth incorporate philosophy, ethics, politics, and economics in their discussion of the tradeoff between accurate prediction and fairness constraints put on those predictions.
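
To make that tradeoff concrete, here is a minimal sketch in Python (all data and thresholds are invented for illustration; this is not code from the book): a single decision threshold maximizes accuracy but leaves a gap in positive-prediction rates between two groups, while group-specific thresholds shrink the gap at some cost in accuracy.

    # Toy illustration of the accuracy/fairness tradeoff; all numbers invented.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    group = rng.integers(0, 2, n)  # two demographic groups, coded 0 and 1
    # Scores are shifted between groups, a common source of disparate outcomes.
    score = rng.normal(loc=0.45 + 0.10 * group, scale=0.15, size=n)
    label = (score + rng.normal(0, 0.10, n) > 0.55).astype(int)

    def evaluate(threshold_0, threshold_1):
        """Accuracy and demographic-parity gap for per-group thresholds."""
        thr = np.where(group == 0, threshold_0, threshold_1)
        pred = (score > thr).astype(int)
        accuracy = (pred == label).mean()
        parity_gap = abs(pred[group == 0].mean() - pred[group == 1].mean())
        return round(accuracy, 3), round(parity_gap, 3)

    # One shared threshold: higher accuracy, larger gap in positive rates.
    print("one threshold :", evaluate(0.55, 0.55))
    # Group-specific thresholds: smaller gap, slightly lower accuracy.
    print("two thresholds:", evaluate(0.50, 0.58))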

This workshop, held on Dec 11, 2020, offers resources for AI researchers and marginalized communities to discuss and reflect on AI-fueled inequity and to co-create dreams and tactics for working toward Resistance AI.

Computer Vision

Modern society sits at the intersection of two crucial questions: What does it mean when artificial intelligence increasingly governs our liberties? And what are the consequences for the people AI is biased against? When MIT Media Lab researcher Joy Buolamwini discovers that many facial recognition technologies do not accurately detect darker-skinned faces or classify the faces of women, she delves into an investigation of widespread bias in algorithms. As it turns out, artificial intelligence is not neutral, and women are leading the charge to ensure our civil rights are protected.

Recorded webinar / Jan 6, 2021

"Recent papers have also exposed shockingly racist and sexist labels in popular computer vision datasets–resulting in the removal of some. In this talk, Dr. Timnit Gebru highlights some of these issues and proposed solutions to mitigate bias, as well as how some of the proposed fixes could exacerbate the problem rather than mitigate it."

The vision community is well positioned to foster serious conversations about the ethical considerations of some of the current use cases of computer vision technology. This webpage attends to the Fairness, Accountability, Transparency, and Ethics (FATE) of modern computer vision in order to provide a space to analyze controversial research papers that have garnered a lot of attention. The resources presented also seek to highlight research on uncovering and mitigating issues of unfair bias and historical discrimination that trained machine learning models learn to mimic and propagate.

Data Justice

ICQCM's mission is to advance the presence of scholars of color among those using data science methodologies, and to challenge researchers to use those methods in ways that dismantle structural barriers and enable human flourishing for underrepresented communities, professionals, and young people.

The Distributed AI Research Institute is a space for independent, community-rooted AI research free from Big Tech’s pervasive influence, launched by Timnit Gebru.

This resource is run by Carnegie Mellon University and the Data Science for Social Good Foundation. The page includes key questions that data scientists should ask as they pursue their projects. Framing questions include: Who will be affected by our work? How are we ensuring that by doing ‘good’ for one group, we are not inadvertently harming another?

Recorded webinar / April 14, 2021, with Alex Hanna

Problems of algorithmic bias are often framed in terms of a lack of representative data or formal fairness optimization constraints to be applied to automated decision-making systems. However, these discussions sidestep deeper issues with data used in AI, including problematic categorizations and the extractive logics of crowdwork and data mining. In this talk, Dr. Hanna makes two interventions: first, reframing data as a form of infrastructure and, as such, implicating politics and power in the construction of datasets; and second, discussing the development of a research program around the genealogy of datasets used in machine learning and AI systems.

Sample tools from IG&H (Netherlands-based company) that incorporate ethics into the design of data projects.


Housed in Princeton University’s Department of African American Studies, the IDA B. WELLS Just Data Lab brings together students, educators, activists, and artists to develop a critical and creative approach to data conception, production, and circulation. The aim is to rethink and retool the relationship between stories and statistics, power and technology, data and justice. The lab's founding director is Ruha Benjamin.

Machine Learning

Because economists (and many other disciplines!) have long thought about how to deal with biases like selection bias and omitted variable bias, there is a budding literature on the intersection of machine learning and causal inference.

Developed by Dario Sansone at the University of Exeter.
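
As a small illustration of why that intersection matters, the sketch below simulates omitted variable bias (all numbers are invented): a naive regression that omits a confounder overstates the effect of interest, while controlling for the confounder recovers it.

    # Simulated omitted variable bias; the true effect of x on y is 2.0.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000
    z = rng.normal(size=n)                       # confounder
    x = 0.8 * z + rng.normal(size=n)             # treatment influenced by z
    y = 2.0 * x + 3.0 * z + rng.normal(size=n)   # outcome

    # Regression of y on x alone overstates the effect (~3.5 here).
    naive = np.linalg.lstsq(
        np.column_stack([x, np.ones(n)]), y, rcond=None)[0][0]
    # Controlling for z recovers the true coefficient (~2.0).
    adjusted = np.linalg.lstsq(
        np.column_stack([x, z, np.ones(n)]), y, rcond=None)[0][0]
    print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")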

Preprint / Conference paper / 2019

"Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups ... and intersectional groups ... that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. "

Bias / Natural Language Processing

Resources from UCLA scholar Safiya Noble

A critique of the way search engine algorithms are designed to reflect the negative biases of their creators, particularly at the expense of women of color. These algorithms offer a skewed perception of people's thoughts and beliefs, which is often mistaken for reality.

Podcast with Su Lin Blodgett

(March 2021)

How do we define bias? Is all bias the same? Is it possible to eliminate bias completely in our AI systems? Should we even try?

Preprint / Research paper / 2020

A critique of how "bias" is recognized in existing NLP research. Provides three recommendations for "how researchers and practitioners conducting work analyzing 'bias' in NLP systems might avoid the pitfalls presented".

NOVA video / April 14, 2021

Safiya Noble and Latanya Sweeney shed light on the hidden biases in search engine technology and specifically discuss how it marginalizes women of color.

Ethics

Accountability and Power

"This course examines the interactions of technology and power, in particular, how technology enforces and extends both state and privatized forms of power."

Position paper / Jan 2020

Lewis, Jason Edward, ed. 2020. Indigenous Protocol and Artificial Intelligence Position Paper. Honolulu, Hawaiʻi: The Initiative for Indigenous Futures and the Canadian Institute for Advanced Research (CIFAR).

Justice and Fairness

The Algorithmic Justice League is an organization that combines art and research to illuminate the social implications and harms of artificial intelligence.

AJL’s mission is to raise public awareness about the impacts of AI, equip advocates with empirical research to bolster campaigns, build the voice and choice of the most impacted communities, and galvanize researchers, policymakers, and industry practitioners to mitigate AI bias and harms.

Open access text / MIT Press / 2020

by Sasha Costanza-Chock

What is the relationship between design, power, and social justice? “Design justice” is an approach to design that is led by marginalized communities and that aims explicitly to challenge, rather than reproduce, structural inequalities. It has emerged from a growing community of designers in various fields who work closely with social movements and community-based organizations around the world.

Open access text

by Solon Barocas, Moritz Hardt, Arvind Narayanan


This book gives a perspective on machine learning that treats fairness as a central concern rather than an afterthought. We’ll review the practice of machine learning in a way that highlights ethical challenges. We’ll then discuss approaches to mitigate these problems.

Transparency

Popular press article / MIT Technology Review

April 19, 2022

MIT Technology Review's new AI Colonialism series digs into parallels between AI development and the colonial past by examining communities that have been profoundly changed by the technology.

Book / 2019

In The Costs of Connection: How Data is Colonizing Human Life and Appropriating It for Capitalism, Nick Couldry and Ulises A. Mejias argue that the quantified world is not a new frontier, but rather the continuation and expansion of both colonialism and capitalism.

An interactive model for communicating diversity and inclusion metrics in subset selection. A great way to visualize the problems that arise when quantifying representation and bias with machine learning.
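
The underlying idea can be sketched in a few lines (the metric below is a deliberately simple illustration, not the tool's actual formula): compare a selected subset's group shares against target shares.

    # Toy diversity metric for subset selection; targets and data are invented.
    from collections import Counter

    def diversity_gap(subset_groups, target_shares):
        """Mean absolute gap between subset shares and target shares (0 = match)."""
        counts = Counter(subset_groups)
        total = len(subset_groups)
        return sum(
            abs(counts.get(g, 0) / total - share)
            for g, share in target_shares.items()
        ) / len(target_shares)

    target = {"group_a": 0.5, "group_b": 0.3, "group_c": 0.2}
    selected = ["group_a"] * 8 + ["group_b"] * 2   # a skewed selection of 10
    print(diversity_gap(selected, target))          # ~0.2; larger = less diverse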

Social Justice

Race / Bias / Surveillance

Book / 2015

Author Simone Browne writes about the conditions of blackness as a key site through which surveillance is practiced, narrated, and resisted.

Science Research Article / 2019

We show that a widely used algorithm, typical of this industry-wide approach and affecting millions of patients, exhibits significant racial bias: At a given risk score, Black patients are considerably sicker than White patients, as evidenced by signs of uncontrolled illnesses. Remedying this disparity would increase the percentage of Black patients receiving additional help from 17.7% to 46.5%.
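
The structure of that audit can be sketched as follows (the data below is simulated purely for illustration, not the paper's): bucket patients by risk score, then compare average health need across groups within each bucket.

    # Sketch of a calibration-style audit on simulated data.
    import numpy as np

    rng = np.random.default_rng(2)
    n = 20_000
    black = rng.integers(0, 2, n).astype(bool)
    risk_score = rng.uniform(0, 1, n)
    # Simulate the paper's finding: at equal scores, Black patients are sicker
    # (e.g., because a score trained on cost understates their health needs).
    chronic_conditions = rng.poisson(lam=3 * risk_score + 1.5 * black)

    decile = (risk_score * 10).astype(int)
    for d in (3, 6, 9):
        in_d = decile == d
        print(f"decile {d}: "
              f"Black {chronic_conditions[in_d & black].mean():.2f} vs "
              f"White {chronic_conditions[in_d & ~black].mean():.2f} conditions")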

Book / 2019

Race After Technology by Princeton scholar Ruha Benjamin critically examines the ideologies and practices of technology companies in the U.S. and developed countries around the world in the current era of big data, surveillance, and rapid technological development.

Gender Justice

Article / Report / April 2021

"Automatic gender recognition systems can't recognize the transgender people exist and is incompatible with self expression. Algorithmic transparency is needed to enforce a ban on this technology."

Book by Catherine D'Ignazio & Lauren F. Klein / 2020 / MIT Press

Open Access

Today, data science is a form of power. It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic.

The Manifest-No is a declaration of refusal and commitment. It refuses harmful data regimes and commits to new data futures.

Gender Shades is a preliminary excavation of the inadvertent negligence that will cripple the age of automation and further exacerbate inequality if left to fester. The deeper we dig, the more remnants of bias we will find in our technology. We cannot afford to look away this time, because the stakes are simply too high. We risk losing the gains made with the civil rights movement and women's movement under the false assumption of machine neutrality. Automated systems are not inherently neutral. They reflect the priorities, preferences, and prejudices—the coded gaze—of those who have the power to mold artificial intelligence.

Joy Buolamwini, Lead Author

Inequity by Design

Book / 2018

"In Automating Inequality, Virginia Eubanks systematically investigates the impacts of data mining, policy algorithms, and predictive risk models on poor and working-class people in America."

Nature Commentary / 2018

Computer scientists must identify sources of bias, de-bias training data, and develop artificial-intelligence algorithms that are robust to skews in the data, argue James Zou and Londa Schiebinger.
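
One standard step of the kind the commentary calls for is reweighting skewed training data so every group carries equal total weight; a minimal sketch (invented data, and this is just one of several common weighting schemes):

    # Per-example weights that equalize each group's total weight.
    import numpy as np

    def balancing_weights(groups):
        groups = np.asarray(groups)
        labels, counts = np.unique(groups, return_counts=True)
        per_group = {g: len(groups) / (len(labels) * c)
                     for g, c in zip(labels, counts)}
        return np.array([per_group[g] for g in groups])

    groups = ["a"] * 90 + ["b"] * 10   # skewed 90/10 training data
    w = balancing_weights(groups)
    print(w[0], w[-1])                 # ~0.56 for "a", 5.0 for "b"
    # Weights like these can be passed to most estimators that accept
    # per-sample weights, e.g. scikit-learn's fit(X, y, sample_weight=w).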

Book / 2017

Mathematician and data scientist Cathy O’Neil reveals that the mathematical models being used today are unregulated and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination—propping up the lucky, punishing the downtrodden, and undermining our democracy in the process.


Preprint / 2021

Machine learning (ML) currently exerts an outsized influence on the world, increasingly affecting communities and institutional practices. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we present a rigorous examination of the values of the field by quantitatively and qualitatively analyzing 100 highly cited ML papers published at premier ML conferences, ICML and NeurIPS. We annotate key features of papers which reveal their values: how they justify their choice of project, which aspects they uplift, their consideration of potential negative consequences, and their institutional affiliations and funding sources. We find that societal needs are typically very loosely connected to the choice of project, if mentioned at all, and that consideration of negative consequences is extremely rare.