Presents itself when a model is trained on data gathered during a time period when society itself was already biased.
An example is gender bias: women in various industries have had their accomplishments overlooked.
Also known as sampling bias
Occurs when the data misrepresents the target demographic.
An example is gathering data about politics only through social media, whose user base skews far younger than the full population of people who may have a say in politics
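The sampling skew above can be sketched with a toy simulation (all numbers hypothetical): support for a policy rises with age, so polling only a young social-media slice understates the true rate.

```python
import random

random.seed(0)

# Hypothetical population: older people are more likely to support the policy.
population = []
for _ in range(10_000):
    age = random.randint(18, 80)
    supports = random.random() < (age / 100)  # support probability grows with age
    population.append((age, supports))

def support_rate(people):
    return sum(s for _, s in people) / len(people)

# Unbiased sample: drawn uniformly from the whole population.
random_sample = random.sample(population, 1000)

# Biased sample: only "social media users", modeled here as ages 18-30.
social_media = [p for p in population if p[0] <= 30]
biased_sample = random.sample(social_media, 1000)

print(f"true rate:        {support_rate(population):.2f}")
print(f"random sample:    {support_rate(random_sample):.2f}")
print(f"social media only:{support_rate(biased_sample):.2f}")
```

The biased sample reports a noticeably lower support rate than either the population or the uniform sample, purely because of who was reachable.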
Can result from poor measurement choices, because the chosen proxy feature may behave differently across groups.
An example is using income to predict happiness, even though income varies widely from group to group (e.g., by age) while happiness may follow a similar distribution in each group
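A quick sketch of why income is a poor proxy here (all group names and numbers hypothetical): income differs sharply across groups while happiness is distributed almost identically in each.

```python
import random
import statistics

random.seed(1)

# Hypothetical groups with very different income levels but
# similarly distributed happiness (0-10 scale).
group_mean_income = {"young": 30_000, "middle": 70_000, "senior": 45_000}

data = {}
for name, mean_income in group_mean_income.items():
    incomes = [random.gauss(mean_income, 5_000) for _ in range(500)]
    happiness = [random.gauss(7.0, 1.0) for _ in range(500)]  # same in every group
    data[name] = (incomes, happiness)

for name, (incomes, happiness) in data.items():
    print(f"{name:>6}: mean income ~ {statistics.mean(incomes):>8.0f}, "
          f"mean happiness ~ {statistics.mean(happiness):.2f}")
```

Any model that maps income to happiness must give different answers per group, even though the target variable barely differs between them.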
Results from combining inherently different groups during modeling or evaluation, producing higher error because a single model cannot capture the differences
Analyzing crime rates can introduce aggregation bias if different sections of a city are combined: the city-wide average may look high even though only certain districts have high rates while the rest are lower
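The crime-rate example can be made concrete with made-up per-district figures: one outlier district drags the city-wide average far above what most districts actually experience.

```python
# Hypothetical per-district crime rates (incidents per 1,000 residents).
districts = {
    "north": 2.1,
    "south": 2.4,
    "east": 1.9,
    "downtown": 38.5,  # a single outlier district
}

citywide_avg = sum(districts.values()) / len(districts)
print(f"city-wide average: {citywide_avg:.1f}")

# Disaggregating shows most districts sit well below that average.
below = [name for name, rate in districts.items() if rate < citywide_avg]
print(f"districts below the average: {below}")
```

Three of the four districts fall far below the aggregate figure; the average describes almost no district accurately.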
Presents itself when trying to test a model using data that the model was not initially intended for
An example includes evaluating the accuracy of a model that predicts the profitability of restaurants on a set of Asian cuisines when the model was trained only on American cuisines
Refers to a bias that happens during deployment, usually when the model is used for something other than its original intended purpose
Usually seen by the end-user
An example would be using a model that gauges the gas mileage of economy cars to predict the mileage of a sports car, resulting in abnormally poor accuracy
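The gas-mileage example can be sketched with a hand-rolled least-squares fit on invented economy-car data: extrapolating to an engine size far outside the training range yields a nonsensical prediction.

```python
# Hypothetical training data: engine displacement (L) vs fuel economy (mpg),
# economy cars only.
econ = [(1.0, 40), (1.2, 38), (1.4, 35), (1.6, 33), (1.8, 31)]

# Ordinary least-squares line fit by hand.
n = len(econ)
mx = sum(x for x, _ in econ) / n
my = sum(y for _, y in econ) / n
slope = (sum((x - mx) * (y - my) for x, y in econ)
         / sum((x - mx) ** 2 for x, _ in econ))
intercept = my - slope * mx

def predict(displacement):
    return slope * displacement + intercept

# Deploying the model on a 5.0 L sports car, far outside 1.0-1.8 L.
print(f"predicted mpg for a 5.0 L engine: {predict(5.0):.1f}")
```

The fit is fine within the economy-car range, but the 5.0 L prediction comes out negative, a clear sign the model is being used outside its intended scope.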