Many misconceptions about AI arise from overestimating its capabilities and assuming that, as a machine, it makes unbiased decisions. However, AI models are shaped by their training data, which can be biased depending on how it was collected. As a result, AI can generate biased or inaccurate information, raising ethical concerns, particularly when users fail to recognize these biases. This can contribute to broader societal issues such as inequality, racism, and misinformation (Buolamwini, 2019; Crawford, 2021).
To promote algorithmic transparency, “it is very important to train algorithms on non-biased datasets. And it is essential for algorithms not to use sensitive information, such as race, gender, disability, and union affiliation” (Esade, 2024, 2:57).
Many types of bias stem from the data used to train AI. Below are some key types to consider when creating data sets for AI models.
AI systems may unintentionally favor or disadvantage certain groups or traits due to flaws in the algorithms or underlying methods.
Faulty data collection tools or methods can introduce errors that distort the data set and lead to inaccurate conclusions.
When collected data fails to accurately represent the broader population, it results in skewed or misleading outcomes.
An AI system can reinforce societal stereotypes by unfairly prioritizing one gender, ethnicity, or group over another.
Incorrect or biased labels in supervised learning data sets often cause AI systems to make inaccurate predictions or classifications.
Omitting certain data from training sets, often because it is deemed unnecessary, can lead to significant gaps in AI decision-making.
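One of the problems described above, collected data that fails to represent the broader population, can be checked for before any model is trained. The following is a minimal sketch, not a standard method: the function name `representation_gaps`, the `region` field, and the reference shares are all hypothetical and chosen only for illustration. It compares each group's share of a data set with the share that group is assumed to hold in the wider population.

```python
from collections import Counter


def representation_gaps(samples, reference_shares, key):
    """Return observed share minus expected share for each group.

    A positive value means the group is over-represented in the data set;
    a negative value means it is under-represented.
    """
    counts = Counter(sample[key] for sample in samples)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        gaps[group] = observed - expected
    return gaps


# Hypothetical toy data set: 80 urban samples and 20 rural samples.
samples = [{"region": "urban"}] * 80 + [{"region": "rural"}] * 20

# Assumed population shares (illustrative numbers, not real statistics).
reference_shares = {"urban": 0.5, "rural": 0.5}

gaps = representation_gaps(samples, reference_shares, key="region")
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large gap for any group suggests collecting more data for that group before training. What counts as "large" depends on the application and is left open here.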
Imagine you are tasked with creating an AI tool that can identify weather conditions such as sunny, rainy, or cloudy.
What types of images would you include in the training data? Your goal is to create as comprehensive a data set as possible to minimize AI bias. (Consider factors such as geographic diversity, time of day, and seasonal variations.)
How would you assess whether AI bias exists in your trained model? (Think about ways to test its accuracy across different locations, lighting conditions, and weather patterns.)
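One possible starting point for the second question is to measure accuracy separately for each condition rather than as a single overall number. The sketch below assumes you already have a list of test results. The record fields (`lighting`, `actual`, `predicted`), the function name `accuracy_by_group`, and the sample values are hypothetical and stand in for whatever your own model and test set produce.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Compute the fraction of correct predictions within each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[group_key]
        total[group] += 1
        if record["predicted"] == record["actual"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Hypothetical test results for a weather classifier.
records = [
    {"lighting": "day", "actual": "sunny", "predicted": "sunny"},
    {"lighting": "day", "actual": "rainy", "predicted": "rainy"},
    {"lighting": "day", "actual": "cloudy", "predicted": "cloudy"},
    {"lighting": "day", "actual": "cloudy", "predicted": "sunny"},
    {"lighting": "night", "actual": "rainy", "predicted": "cloudy"},
    {"lighting": "night", "actual": "cloudy", "predicted": "cloudy"},
    {"lighting": "night", "actual": "rainy", "predicted": "cloudy"},
    {"lighting": "night", "actual": "sunny", "predicted": "cloudy"},
]

scores = accuracy_by_group(records, "lighting")
accuracy_gap = max(scores.values()) - min(scores.values())

for group, score in scores.items():
    print(f"{group}: {score:.2f}")
print(f"largest gap between groups: {accuracy_gap:.2f}")
```

In this toy example the model is correct on 3 of 4 daytime images but only 1 of 4 nighttime images, a difference that a single overall accuracy figure would hide. The same function can be reused with a location or season field in place of `lighting`.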