Resources

The resources linked have been curated from the working library of the Ethics and Social Justice subcommittee of the Data SRI. Within the field of data science, we have resources categorized by artificial intelligence, computer vision, data, machine learning, and natural language processing. The ethics category covers justice, accountability, fairness, and transparency. For social justice, we have resources loosely organized by race, gender, and intersectional social identities. These socially constructed categories are fluid, and we acknowledge that individuals and community members hold multiple and intersectional identities.

If you have resources you would like us to include, please send them our way via the contact form.

A more complete working library of resources can be found HERE. Our thanks to Cal Poly student Divya Satrawada, who compiled and supported the curation of these resources.

Data Science

Artificial Intelligence

Book / 2018

Meredith Broussard writes a guide to understanding the inner workings and outer limits of technology, and why we should never assume that computers always get it right.

The 2021 AI Index Report summarizes the year's developments in the field. The report covers research and development; technical performance; the economy; AI education; ethical challenges of AI applications; diversity in AI; and AI policy and national strategies.

Recorded webinar / April 19, 2021

A solid overview of contemporary issues in AI ethics: open questions about the social, political, economic, and organizational realities that can obstruct efforts to make AI safer, fairer, and more transparent.

Book / 2019

Michael Kearns and Aaron Roth incorporate philosophy, ethics, politics, and economics in their discussion of the tradeoff between accurate prediction and fairness constraints put on those predictions.
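
To make that tradeoff concrete, here is a minimal sketch in Python (all data and thresholds are invented for illustration; this is not code from the book): a single decision threshold maximizes accuracy but leaves a gap in positive-prediction rates between two groups, while group-specific thresholds shrink the gap at some cost in accuracy.

    # Toy illustration of the accuracy/fairness tradeoff; all numbers invented.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    group = rng.integers(0, 2, n)  # two demographic groups, coded 0 and 1
    # Scores are shifted between groups, a common source of disparate outcomes.
    score = rng.normal(loc=0.45 + 0.10 * group, scale=0.15, size=n)
    label = (score + rng.normal(0, 0.10, n) > 0.55).astype(int)

    def evaluate(threshold_0, threshold_1):
        """Accuracy and demographic-parity gap for per-group thresholds."""
        thr = np.where(group == 0, threshold_0, threshold_1)
        pred = (score > thr).astype(int)
        accuracy = (pred == label).mean()
        parity_gap = abs(pred[group == 0].mean() - pred[group == 1].mean())
        return round(accuracy, 3), round(parity_gap, 3)

    # One shared threshold: higher accuracy, larger gap in positive rates.
    print("one threshold :", evaluate(0.55, 0.55))
    # Group-specific thresholds: smaller gap, slightly lower accuracy.
    print("two thresholds:", evaluate(0.50, 0.58))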

This workshop, held on Dec 11, 2020, offers resources for AI researchers and marginalized communities to discuss and reflect on AI-fueled inequity and to co-create dreams and tactics for working toward Resistance AI.

Computer Vision

Modern society sits at the intersection of two crucial questions: What does it mean when artificial intelligence increasingly governs our liberties? And what are the consequences for the people AI is biased against? When MIT Media Lab researcher Joy Buolamwini discovers that many facial recognition technologies do not accurately detect darker-skinned faces or classify the faces of women, she delves into an investigation of widespread bias in algorithms. As it turns out, artificial intelligence is not neutral, and women are leading the charge to ensure our civil rights are protected.

Recorded webinar / Jan 6, 2021

"Recent papers have also exposed shockingly racist and sexist labels in popular computer vision datasets–resulting in the removal of some. In this talk, Dr. Timnit Gebru highlights some of these issues and proposed solutions to mitigate bias, as well as how some of the proposed fixes could exacerbate the problem rather than mitigate it."

The vision community is well positioned to foster serious conversations about the ethical considerations of some of the current use cases of computer vision technology. This webpage attends to the Fairness, Accountability, Transparency, and Ethics (FATE) of modern computer vision in order to provide a space to analyze controversial research papers that have garnered a lot of attention. The resources presented also seek to highlight research on uncovering and mitigating issues of unfair bias and historical discrimination that trained machine learning models learn to mimic and propagate.

Data Justice

ICQCM's mission is to advance the presence of scholars of color among those using data science methodologies, and to challenge researchers to use those methods in ways that dismantle structural barriers and enable human flourishing for underrepresented communities, professionals, and young people.

The Distributed AI Research Institute is a space for independent, community-rooted AI research free from Big Tech’s pervasive influence, launched by Timnit Gebru.

This resource is run by Carnegie Mellon University and the Data Science for Social Good Foundation. The page includes key questions that data scientists should ask as they pursue their projects. Framing questions include: Who will be affected by our work? How are we ensuring that by doing ‘good’ for one group, we are not inadvertently harming another?

Recorded webinar / April 14, 2021, with Alex Hanna

Problems of algorithmic bias are often framed in terms of a lack of representative data or formal fairness optimization constraints to be applied to automated decision-making systems. However, these discussions sidestep deeper issues with data used in AI, including problematic categorizations and the extractive logics of crowdwork and data mining. In this talk, Dr. Hanna makes two interventions: first, reframing data as a form of infrastructure and, as such, implicating politics and power in the construction of datasets; and second, discussing the development of a research program around the genealogy of datasets used in machine learning and AI systems.

Sample tools from IG&H (Netherlands-based company) that incorporate ethics into the design of data projects.


Housed in Princeton University’s Department of African American Studies, the IDA B. WELLS Just Data Lab brings together students, educators, activists, and artists to develop a critical and creative approach to data conception, production, and circulation. The aim is to rethink and retool the relationship between stories and statistics, power and technology, data and justice. The lab's founding director is Ruha Benjamin.

Machine Learning

Because economists (and many other disciplines!) have long thought about how to deal with biases like selection bias and omitted variable bias, there is a budding literature on the intersection of machine learning and causal inference.

Developed by Dario Sansone at the University of Exeter.
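
As a small illustration of why that intersection matters, the sketch below simulates omitted variable bias (all numbers are invented): a naive regression that omits a confounder overstates the effect of interest, while controlling for the confounder recovers it.

    # Simulated omitted variable bias; the true effect of x on y is 2.0.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 50_000
    z = rng.normal(size=n)                       # confounder
    x = 0.8 * z + rng.normal(size=n)             # treatment influenced by z
    y = 2.0 * x + 3.0 * z + rng.normal(size=n)   # outcome

    # Regression of y on x alone overstates the effect (~3.5 here).
    naive = np.linalg.lstsq(
        np.column_stack([x, np.ones(n)]), y, rcond=None)[0][0]
    # Controlling for z recovers the true coefficient (~2.0).
    adjusted = np.linalg.lstsq(
        np.column_stack([x, z, np.ones(n)]), y, rcond=None)[0][0]
    print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")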

Preprint / Conference paper / 2019

"Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups ... and intersectional groups ... that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. "

Bias / Natural Language Processing

Resources from UCLA scholar Safiya Noble

A critique of the way search engine algorithms are designed to reflect the negative biases of their creators, particularly at the expense of women of color. These algorithms offer a skewed perception of people's thoughts and beliefs, which is often mistaken for reality.

Podcast with Su Lin Blodgett

(March 2021)

How do we define bias? Is all bias the same? Is it possible to eliminate bias completely in our AI systems? Should we even try?

Preprint / Research paper / 2020

A critique of how "bias" is recognized in existing NLP research. Provides three recommendations for "how researchers and practitioners conducting work analyzing 'bias' in NLP systems might avoid the pitfalls presented".

NOVA video / April 14, 2021

Safiya Noble and Latanya Sweeney shed light on the hidden biases in search engine technology and specifically discuss how it marginalizes women of color.

Ethics

Accountability and Power

"This course examines the interactions of technology and power, in particular, how technology enforces and extends both state and privatized forms of power."

Position paper / Jan 2020

Lewis, Jason Edward, ed. 2020. Indigenous Protocol and Artificial Intelligence Position Paper. Honolulu, Hawaiʻi: The Initiative for Indigenous Futures and the Canadian Institute for Advanced Research (CIFAR).

Justice and Fairness

The Algorithmic Justice League is an organization that combines art and research to illuminate the social implications and harms of artificial intelligence.

AJL’s mission is to raise public awareness about the impacts of AI, equip advocates with empirical research to bolster campaigns, build the voice and choice of the most impacted communities, and galvanize researchers, policymakers, and industry practitioners to mitigate AI bias and harms.

Open access text / MIT Press / 2020

by Sasha Costanza-Chock

What is the relationship between design, power, and social justice? “Design justice” is an approach to design that is led by marginalized communities and that aims explicitly to challenge, rather than reproduce, structural inequalities. It has emerged from a growing community of designers in various fields who work closely with social movements and community-based organizations around the world.

Open access text

by Solon Barocas, Moritz Hardt, Arvind Narayanan


This book gives a perspective on machine learning that treats fairness as a central concern rather than an afterthought. We’ll review the practice of machine learning in a way that highlights ethical challenges. We’ll then discuss approaches to mitigate these problems.

Transparency

Popular press article / MIT Technology Review

April 19, 2022

MIT Technology Review's new AI Colonialism series digs into parallels between AI development and the colonial past by examining communities that have been profoundly changed by the technology.

Book / 2019

In The Costs of Connection: How Data is Colonizing Human Life and Appropriating It for Capitalism, Nick Couldry and Ulises A. Mejias argue that the quantified world is not a new frontier, but rather the continuation and expansion of both colonialism and capitalism.

An interactive model for communicating diversity and inclusion metrics in subset selection. A great way to visualize the problems that arise when quantifying representation and bias with machine learning.
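
The underlying idea can be sketched in a few lines (the metric below is a deliberately simple illustration, not the tool's actual formula): compare a selected subset's group shares against target shares.

    # Toy diversity metric for subset selection; targets and data are invented.
    from collections import Counter

    def diversity_gap(subset_groups, target_shares):
        """Mean absolute gap between subset shares and target shares (0 = match)."""
        counts = Counter(subset_groups)
        total = len(subset_groups)
        return sum(
            abs(counts.get(g, 0) / total - share)
            for g, share in target_shares.items()
        ) / len(target_shares)

    target = {"group_a": 0.5, "group_b": 0.3, "group_c": 0.2}
    selected = ["group_a"] * 8 + ["group_b"] * 2   # a skewed selection of 10
    print(diversity_gap(selected, target))          # ~0.2; larger = less diverse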

Social Justice

Race / Bias / Surveillance

Book / 2015

Author Simone Browne writes about the conditions of blackness as a key site through which surveillance is practiced, narrated, and resisted.

Science Research Article / 2019

We show that a widely used algorithm, typical of this industry-wide approach and affecting millions of patients, exhibits significant racial bias: At a given risk score, Black patients are considerably sicker than White patients, as evidenced by signs of uncontrolled illnesses. Remedying this disparity would increase the percentage of Black patients receiving additional help from 17.7% to 46.5%.
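
The structure of that audit can be sketched as follows (the data below is simulated purely for illustration, not the paper's): bucket patients by risk score, then compare average health need across groups within each bucket.

    # Sketch of a calibration-style audit on simulated data.
    import numpy as np

    rng = np.random.default_rng(2)
    n = 20_000
    black = rng.integers(0, 2, n).astype(bool)
    risk_score = rng.uniform(0, 1, n)
    # Simulate the paper's finding: at equal scores, Black patients are sicker
    # (e.g., because a score trained on cost understates their health needs).
    chronic_conditions = rng.poisson(lam=3 * risk_score + 1.5 * black)

    decile = (risk_score * 10).astype(int)
    for d in (3, 6, 9):
        in_d = decile == d
        print(f"decile {d}: "
              f"Black {chronic_conditions[in_d & black].mean():.2f} vs "
              f"White {chronic_conditions[in_d & ~black].mean():.2f} conditions")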

Book / 2019

Race After Technology by Princeton scholar Ruha Benjamin critically examines the ideologies and practices of technology companies in the U.S. and developed countries around the world in the current era of big data, surveillance, and rapid technological development.

Gender Justice

Article / Report / April 2021

"Automatic gender recognition systems can't recognize the transgender people exist and is incompatible with self expression. Algorithmic transparency is needed to enforce a ban on this technology."

Book by Catherine D'Ignazio & Lauren F. Klein / 2020 / MIT Press

Open Access

Today, data science is a form of power. It has been used to expose injustice, improve health outcomes, and topple governments. But it has also been used to discriminate, police, and surveil. This potential for good, on the one hand, and harm, on the other, makes it essential to ask: Data science by whom? Data science for whom? Data science with whose interests in mind? The narratives around big data and data science are overwhelmingly white, male, and techno-heroic.

The Manifest-No is a declaration of refusal and commitment. It refuses harmful data regimes and commits to new data futures.

Gender Shades is a preliminary excavation of the inadvertent negligence that will cripple the age of automation and further exacerbate inequality if left to fester. The deeper we dig, the more remnants of bias we will find in our technology. We cannot afford to look away this time, because the stakes are simply too high. We risk losing the gains made with the civil rights movement and women's movement under the false assumption of machine neutrality. Automated systems are not inherently neutral. They reflect the priorities, preferences, and prejudices—the coded gaze—of those who have the power to mold artificial intelligence.

Joy Buolamwini, Lead Author

Inequity by Design

Book / 2018

"In Automating Inequality, Virginia Eubanks systematically investigates the impacts of data mining, policy algorithms, and predictive risk models on poor and working-class people in America."

Nature Commentary / 2018

Computer scientists must identify sources of bias, de-bias training data, and develop artificial-intelligence algorithms that are robust to skews in the data, argue James Zou and Londa Schiebinger.
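
One standard step of the kind the commentary calls for is reweighting skewed training data so every group carries equal total weight; a minimal sketch (invented data, and this is just one of several common weighting schemes):

    # Per-example weights that equalize each group's total weight.
    import numpy as np

    def balancing_weights(groups):
        groups = np.asarray(groups)
        labels, counts = np.unique(groups, return_counts=True)
        per_group = {g: len(groups) / (len(labels) * c)
                     for g, c in zip(labels, counts)}
        return np.array([per_group[g] for g in groups])

    groups = ["a"] * 90 + ["b"] * 10   # skewed 90/10 training data
    w = balancing_weights(groups)
    print(w[0], w[-1])                 # ~0.56 for "a", 5.0 for "b"
    # Weights like these can be passed to most estimators that accept
    # per-sample weights, e.g. scikit-learn's fit(X, y, sample_weight=w).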

Book / 2017

Mathematician and data scientist Cathy O’Neil reveals that the mathematical models being used today are unregulated and uncontestable, even when they’re wrong. Most troubling, they reinforce discrimination—propping up the lucky, punishing the downtrodden, and undermining our democracy in the process.


Preprint / 2021

Machine learning (ML) currently exerts an outsized influence on the world, increasingly affecting communities and institutional practices. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we present a rigorous examination of the values of the field by quantitatively and qualitatively analyzing 100 highly cited ML papers published at premier ML conferences, ICML and NeurIPS. We annotate key features of papers which reveal their values: how they justify their choice of project, which aspects they uplift, their consideration of potential negative consequences, and their institutional affiliations and funding sources. We find that societal needs are typically very loosely connected to the choice of project, if mentioned at all, and that consideration of negative consequences is extremely rare.