Polluted Data
Your Current Personal Data Pool Is Polluted
Survivorship Bias and Incorrect Data
History Is Full of Racism, Sexism, and Terrible Things, Therefore Our Big Data Set is Racist, Sexist, and Terrible
Click Bait Titles Appeal to Emotion
Cherry Picking
Nutpicking
Media Consolidation
Your Current Personal Data Pool Is Polluted
Video: 10 Things You have Heard and Re-told but are Completely False - Neil deGrasse Tyson - Cosmology Today - Jun 14, 2016
https://en.wikipedia.org/wiki/List_of_common_misconceptions
Color of the sun is . . . ?
What goes up must come down?
The brightest star in the sky is called . . . ?
Days get longer in the summer and shorter in the winter?
Sun rises in the east and sets in the west?
Solar eclipses are rare?
A day is about 24 hours?
How Many Glasses of Water is a Person Supposed to Drink a Day?
Article: The Water Myth - McGill - Christopher Labos MD, MSc - May 31, 2018
When You Look At Data From A Point of View, There Will Be Distortion
How Big Is Greenland Compared to Africa?
Why is the Mercator projection a distortion of truth and how does it affect our understanding of the Earth?
Video: Why All Maps Are Wrong - Vox - Dec 2, 2016
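One way to put a number on the distortion: on a Mercator map, a small region at latitude phi is drawn with its area inflated by a factor of 1 / cos^2(phi). The sketch below applies that factor at a few latitudes and compares two land areas. The area figures (about 2.17 million square kilometers for Greenland and 30.4 million for Africa) and the sample latitudes are rounded assumptions chosen for illustration; they are not taken from the video.

```python
import math

def mercator_area_inflation(latitude_deg):
    """Factor by which a Mercator map inflates small areas at a latitude."""
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2

# Approximate land areas in square kilometers (rounded, assumed figures).
GREENLAND_KM2 = 2_170_000
AFRICA_KM2 = 30_400_000

true_ratio = AFRICA_KM2 / GREENLAND_KM2

# The equator is drawn true to scale; the inflation grows toward the poles.
for lat in (0, 30, 60, 72):
    factor = mercator_area_inflation(lat)
    print(f"latitude {lat:>2} deg: areas drawn about {factor:.1f}x too large")

print(f"Africa is really about {true_ratio:.0f} times the area of Greenland")
```

Most of Greenland sits above 60 degrees north while Africa straddles the equator, which is why the two can look similar in size on a Mercator map.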
Data/Charts?
https://www.youtube.com/watch?v=E91bGT9BjYk
Data Needs to Be Presented In A Human Context
A little, a good amount, enormous? Feel free to pick your own words.
How would you describe 0.5%?
How would you describe 5%?
How would you describe 50%?
How many is 50?
How many is 5 thousand of a thing?
How many is 5 million of a thing?
In math, one always means one. In context, however, one can mean different things.
As a result of his foolishness, one cookie fell on the floor.
As a result of his foolishness, ten cookies fell on the floor.
As a result of his foolishness, all of the cookies fell on the floor.
As a result of his foolishness, all of the Oreos fell on the floor.
As a result of his foolishness, all of the home-baked cookies fell on the floor.
When you bring people into it, the context can be wildly different.
Today, a Sacramento man died of the flu.
Today, your mother died of the flu.
Today, ten people in the United States died of the flu.
Today, ten people in California died of the flu.
Today, ten people in the Natomas community died of the flu.
Today, a thousand people in the United States died of the flu.
Today, a million people in the United States died of the flu.
Today, a thousand Natomas High Schools' worth of people died of the flu.
How we present information can change how people weigh it. A number without human context often gets dismissed as a cold, emotionless data point. Sometimes, to get attention and help people grasp the impact of a number in human terms, you have to put it in a different human context.
Today, a thousand people in the United States died.
Infographic: Covid 19 Cases and Deaths as Percent of US Population 8-14-2020 - data-artist
Image: Vietnam Veterans Memorial
Image: Arlington National Cemetery
Image: Mass Grave for Hundreds of Brazil Coronavirus Victims (Article)
"A single death is a tragedy; a million deaths is a statistic." - Author uncertain. This quote is attributed to many people in several forms.
What Data You're Not Getting
Survivorship Bias
Survivorship Bias: https://www.youtube.com/watch?v=P9WFpVsRtQg
Autism Prevalence Unchanged in 20 Years (Steven Novella) There is no autism epidemic. The number of diagnoses has increased, but the evidence strongly suggests this is due to better diagnosis, changing definitions, and greater acceptance. A new study looked at autism prevalence around the world; it showed no change from 1990 to 2010.
http://www.usccb.org/issues-and-action/child-and-youth-protection/upload/2019-Annual-Report-Final.pdf Why were there so many new allegations against the Catholic Church in 2019? There were 700 to 1,400 allegations per year leading up to 2019, but 4,434 in 2019.
Page 27: "Compared to 2018, the number of allegations increased significantly. This is in part due to the additional allegations received as a result of lawsuits, compensation programs, and bankruptcies, making up approximately 37% of allegations. These programs allow those who have previously reported allegations as well as those who have not yet come forward, to be considered for some type of monetary compensation."
Always Think About What Data You're Not Getting Right and Why
Image: Teen Pregnancies - TruthFacts - Wolff and Morgenthaler
Article: This Is How 'False Positives' And 'False Negatives' Can Bias COVID-19 Testing - Ethan Siegel - May 7, 2020
Backup Copy: This Is How 'False Positives' And 'False Negatives' Can Bias COVID-19 Testing - Ethan Siegel - May 7, 2020
Disease Tests have False Positives and False Negatives
Let's say 1 in 10,000 people has a disease, and a test for it has a false positive rate of 1%.
If you test 1,000,000 people, you will find the 100 who have the disease (assuming the test catches every real case) and about 10,000 false positives. Health statistics will report roughly 10,100 cases of the disease in the population, even though fewer than 1 in 100 of those positive results is real. About 10,000 healthy people will be quarantined, drugged, poked, injected, and highly inconvenienced. The population may be scared, behave irrationally, and buy incredible amounts of toilet paper without being able to explain why.
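The arithmetic in this example can be checked with a short sketch. It assumes the test catches every real case (a sensitivity of 100%, which the example implies but does not state) and applies the 1% false positive rate only to the people who do not have the disease, so it reports 9,999 false positives rather than a round 10,000. The function and parameter names are illustrative.

```python
def screening_counts(population, prevalence, false_positive_rate, sensitivity=1.0):
    """Return (true positives, false positives, share of positives that are real)."""
    sick = population * prevalence
    healthy = population - sick
    true_positives = sick * sensitivity               # sick people who test positive
    false_positives = healthy * false_positive_rate   # healthy people who test positive
    share_real = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_real

# Numbers from the example: 1,000,000 tested, 1 in 10,000 sick, 1% false positives.
tp, fp, share_real = screening_counts(1_000_000, 1 / 10_000, 0.01)

print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"reported cases:  {tp + fp:,.0f}")
print(f"share of positive results that are real: {share_real:.1%}")
```

The rarer the disease, the more the false positives swamp the real cases, even when the test itself is quite accurate.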
False negatives are also a problem. When a test comes back negative for people who actually have the disease, many of them will spread it by behaving normally. This has consequences even in tests that have nothing to do with disease. If there are 6.2 million pregnancies a year in the United States and inexpensive home pregnancy tests give a false negative up to 5% of the time, then as many as 310,000 women a year could be pregnant, think they are not, and unknowingly harm their future child through casual alcohol and drug use.
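The 310,000 figure is an upper bound: it uses the worst-case 5% rate and assumes every pregnancy is checked with a home test. A minimal sketch of how the estimate moves when those assumptions change (the lower rates and the share_tested parameter are hypothetical):

```python
PREGNANCIES_PER_YEAR = 6_200_000  # figure used in the text

def missed_pregnancies(false_negative_rate, share_tested=1.0):
    """Estimated pregnancies per year that a home test wrongly reports as negative."""
    return PREGNANCIES_PER_YEAR * share_tested * false_negative_rate

# 5% is the "up to" rate from the text; the lower rates are illustrative.
for rate in (0.01, 0.02, 0.05):
    print(f"false negative rate {rate:.0%}: about {missed_pregnancies(rate):,.0f} missed")
```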
History Is Full of Racism, Sexism, and Terrible Things, Therefore Our Big Data Set is Racist, Sexist, and Terrible
Search images.google.com for each term below. How many of the first 50 results are White? How many are persons of color?
Beautiful Woman
Successful Woman
Intelligent Woman
Expressive Woman
Woman
Convicted Woman
Beautiful Man
Successful Man
Intelligent Man
Expressive Man
Man
Convicted Man
Farmworkers, Farmers
Teacher, Professor
Charts Can Reveal Data and Can Also Obscure It
https://www.youtube.com/watch?v=O-3Mlj3MQ_Q
Arithmetic Growth adds a fixed amount (+10, +10, +10...)
Logarithmic Growth keeps rising, but by a smaller amount each time
Exponential Growth multiplies by a fixed amount (x1.2, x1.2, x1.2...)
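A short sketch can make the difference between these growth types concrete. It uses the +10 and x1.2 steps from the list; the starting value of 100, the 30 periods, and the scale of the logarithmic curve are assumptions chosen only for illustration.

```python
import math

def arithmetic_growth(start, step, periods):
    """Add the same amount every period."""
    return [start + step * t for t in range(periods + 1)]

def exponential_growth(start, factor, periods):
    """Multiply by the same factor every period."""
    return [start * factor ** t for t in range(periods + 1)]

def logarithmic_growth(start, scale, periods):
    """Keep rising, but by a smaller amount every period."""
    return [start + scale * math.log(t + 1) for t in range(periods + 1)]

START = 100  # hypothetical starting value
arith = arithmetic_growth(START, 10, 30)
expo = exponential_growth(START, 1.2, 30)
loga = logarithmic_growth(START, 10, 30)

print("period  arithmetic  exponential  logarithmic")
for t in (0, 1, 5, 10, 20, 30):
    print(f"{t:>6}  {arith[t]:>10.0f}  {expo[t]:>11.0f}  {loga[t]:>11.0f}")
```

On a chart with a linear axis, the exponential series makes the other two look flat; on a logarithmic axis, the same exponential series becomes a straight line. The choice of axis changes what a reader notices.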
Click Bait Titles Appeal to Emotion
Today (6/30/2020) I loaded the YouTube home page and got eight recommendations. A couple of them were political. Five of the eight had a word in all capital letters. There were screamed verbs ("FOOL", "REACT", and "HAMMERS") and a couple of screamed noun phrases ("SUPER SNOWFLAKE" and "IMPOSSIBLE card trick").
I reload the page and get "POWERFUL EPIC DEBATE" and "UNCLEARED GLITCH", and I reload again and see "WORST TROLL LEVEL" and "LIARS." Again and again: "WRONG", "DETAINED", "VERY", "SUPER TRIGGERED", "REAL REASONS", "THREATENED", "INSANE", "DESTROYS." A few titles are fully capitalized in every word.
Looking at the other titles, you see plenty of editorializing: titles that tell you what to think and how to think about the topic before you watch the video. "Most brutal fails", "Tyrant", "Most terrifying", "Dishonest." The emotionally charged words and the capital letters are doing something to the reader.
Titles with capitals and charged words are appealing to your base self. Your emotional reactive self. Your instant gratification self. Your unconscious non-thinking self. The charged up words interrupt a normal flow of reading from left to right. A reader instead will see those one or two words before processing the sentence. The mind will see "------- ------ ------- ----- -- --- ------- EXPOSED!" before you process the topic or person involved. You are being provoked over and over again.
Good Titles Respect The Audience
A title that respects the audience does not have fully capitalized words other than acronyms (FBI, CIA, USAF).
A title that respects an audience provides information without telling you what to think about it.
A respected audience member is not screamed at to make them listen.
A respected audience is treated like they are capable of making their own decisions.
Articles and videos and sources that use normal language and approach you informationally might not get more hits, but they are far more likely to be quality sources.
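The capital-letters check described above is simple enough to automate. The following is a rough sketch, not a credibility detector: it flags fully capitalized words in a title unless they appear on an allow-list of acronyms. The allow-list here is just the set of acronyms mentioned in these notes, and the sample titles are made up for illustration.

```python
import re

# Acronyms that are allowed to be fully capitalized (an assumed, incomplete list).
KNOWN_ACRONYMS = {
    "FBI", "CIA", "USAF", "US", "UK", "NY", "NYC", "MLB", "MTV",
    "BART", "PG&E", "DOJ", "GOP", "COVID-19",
}

def shouted_words(title, acronyms=KNOWN_ACRONYMS):
    """Return the fully capitalized words in a title that are not known acronyms."""
    words = re.findall(r"[A-Za-z][A-Za-z0-9&'-]*", title)
    # Single letters such as "A" and "I" are skipped so they are not flagged.
    return [w for w in words if len(w) > 1 and w.isupper() and w not in acronyms]

# Hypothetical titles for illustration.
for title in (
    "Senator DESTROYS Rival in EPIC Debate",
    "FBI and DOJ Review COVID-19 Data From PG&E",
):
    print(f"{title!r} -> {shouted_words(title)}")
```

An unfamiliar acronym would be flagged as shouting, so the allow-list needs upkeep, and a title can pass this check and still editorialize in lowercase.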
Now let's go to some places that are generally respectful with their titles. Here's where I tested and what I found today:
news.google.com: The only fully capitalized words were acronyms (MLB, MTV, COVID-19)
www.npr.org: Some acronyms, but also an all-capitalized "CORONAVIRUS LIVE UPDATES"
www.bbc.com: Mostly place acronyms (US, NY, UK)
www.cnn.com: "LIVE UPDATES", "BREAKING", "CORONAVIRUS", "TRENDING"
https://rtumble.com: Just a few acronyms (BART, US, PG&E, COVID-19)
www.foxnews.com: Just a few acronyms (COVID-19, DOJ, GOP, NYC)
Now I'm going to a few places that are probably pretty bad at this:
www.infowars.com: 20 different articles on the home page. Every word of every title is fully capitalized.
www.breitbart.com: Over 20 different articles on the home page. Every word of every title is fully capitalized.
https://tyt.com: All of the section titles are fully capitalized. The names of all the shows they produce are fully capitalized. A surprising lack of titles. The site labels everything with the show's name: "THE YOUNG TURKS", "THE DAMAGE REPORT WITH JOHN IADAROLA", and "TYT INVESTIGATES." Prioritizing the medium over the content?
www.tmz.com: "BALTIMORE DISCRIMINATION MOTHER AND SON ANNOUNCE LAWSUIT After Denial Over Dress Code", "POOR JUDGEMENT", "RAYSHARD BROOKS MURDER: JUDGE SETS $500 BOND FOR GARRETT ROLF... Brooks' Widow Gives Emotional Impact Statement."
Please keep in mind this is a focused analysis of source titles. This analysis is not addressing other forms of journalistic credibility, political slant, or quality of content of any of these sites. This is just one evaluation method you can use to avoid the worst sources.
Infographic: Media Bias Chart 6.0 - June 2020 - Ad Fontes Media Inc.
Spin, Propaganda, and Polluted Data
Because we live in a human system, everything becomes political. There are political agents who will take data and see it their way. They will only present the parts they agree with. They will hide and delete information that goes against their stance. All of this is bad science. The truth is often lost in human power struggles.
Cherry Picking
Cherry picking is the strategy of collecting data from an experiment and then using only the data that matches what you are trying to prove. A true experiment should have a hypothesis, but a good scientist does not force the data to fit that hypothesis. A good scientist reports all processes and data so that the experiment can be peer reviewed. Cherry-picked conclusions will not be reproducible.
Video: BEST TRICK SHOT EVER (LIGHTHOUSE to SHIP)!! - Excerpt starts at 7m40s - How Ridiculous - Jun 2, 2017
Video: We Spent 6 Days Attempting a 200m Basketball Shot in Lesotho, Africa - How Ridiculous - Apr 6, 2018
Example: Five surveys are commissioned in different parts of the country. Each asks whether a politician is a good leader. Four of the five surveys come back very negative. The one from the politician's home state comes back decently positive. The campaign then issues a press release citing that one survey as evidence that people like the politician.
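The survey example can be sketched in a few lines. The approval percentages below are invented for illustration; only the pattern (four poor results and one good one from the home state) comes from the example.

```python
from statistics import mean

# Hypothetical share of respondents who call the politician a good leader (percent).
surveys = {
    "Region A": 31,
    "Region B": 28,
    "Region C": 35,
    "Region D": 30,
    "Home state": 62,
}

# Honest reporting uses every survey; cherry picking reports only the best one.
full_picture = mean(surveys.values())
best_region, best_score = max(surveys.items(), key=lambda item: item[1])

print(f"all five surveys: {full_picture:.1f}% approval on average")
print(f"press release:    {best_score}% approval ({best_region} only)")
```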
Example: The Texas Sharpshooter fallacy is a special form of cherry picking that happens when a tester goes searching for patterns and trends. Imagine a sharpshooter firing at the unpainted side of a barn. He takes a bunch of shots. He then walks up to the wall and finds a group of three bullet holes that are very close together. He paints an archery-style target on the part of the wall with the close bullet holes, poses next to the target, and posts on social media about how precise his shots are. None of the other shots are in the picture. "To be sure of hitting the target, shoot first, and call whatever you hit the target."
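The sharpshooter's trick works even when the shots are completely random. The sketch below scatters shots uniformly over a wall and then searches for the three that landed closest together, which is where the target would get painted. The wall size, shot count, and random seed are arbitrary assumptions.

```python
import itertools
import math
import random

def fire_shots(n, wall_size, rng):
    """Scatter n shots uniformly at random over a square wall."""
    return [(rng.uniform(0, wall_size), rng.uniform(0, wall_size)) for _ in range(n)]

def spread(points):
    """Largest distance between any two of the points."""
    return max(math.dist(a, b) for a, b in itertools.combinations(points, 2))

def tightest_cluster(shots, size=3):
    """Pick, after the fact, the group of shots that landed closest together."""
    return min(itertools.combinations(shots, size), key=spread)

rng = random.Random(2020)  # fixed seed so the run is repeatable
shots = fire_shots(40, wall_size=10.0, rng=rng)
cluster = tightest_cluster(shots)

print(f"all {len(shots)} shots are spread over {spread(shots):.1f} m of wall")
print(f"the three 'best' shots are within {spread(cluster):.2f} m of each other")
```

With enough random shots, some tight group almost always exists; finding one says nothing about the shooter's aim.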
Nutpicking
Nutpicking is the strategy of finding the stupidest, craziest, most out-there members of a group, and then using those nuts to represent everyone in that group. A claim built on nutpicking rests on polluted, negatively focused data: it cherry picks the worst examples and mixes in an ad hominem attack that mischaracterizes the rest of the people in the group.
Video: Can You Name a Country? - Jimmy Kimmel Show - Jul 12, 2018
Video: Jay Leno Science Quiz - Tonight Show with Jay Leno
Video: Jay Leno Astronomy Quiz - Tonight Show with Jay Leno
Video: Jay Leno JayWalking: Jay Interviews College Students - Tonight Show with Jay Leno - Aired on August 1, 2011
Video: Jay Leno JayWalking: Beach Quiz - Tonight Show with Jay Leno
Discovery Abuse: The Document Dump
Imagine asking a teacher a question, and the teacher replies by pointing at the library and saying "good luck." Imagine this scenario with no search engines, no indexes, no glossaries, no tables of contents, and no reference librarians. How does a person acquire truth while they are drowning in data?
"If you don't find it in the index, look very carefully through the entire catalogue." - Sears, Roebuck Co. Catalogue: A Window to Turn-of-the-Century America - 1897
During a lawsuit, plaintiffs and defendants can demand information from the other side. Often a side is ordered to hand over documents it would rather keep back because they would hurt its case. The process of requesting and obtaining evidence from the other side is called discovery.
Example: A prosecution thinks a company's financial documents will reveal foul play and asks the court to order them turned over. The company does not want that information revealed but fails to get the court to deny the subpoena. The company is now ordered to provide the documents, but it still wants to hide things. It knows that people are limited by time and that lawyers and paralegals cost money. So the company hands over its full financial records along with every item even loosely related to the request: thousands upon thousands of documents and files. Finding the evidence becomes a hunt for a needle in a haystack, which the company hopes will drain the prosecution's time and resources.
Media Consolidation
https://en.wikipedia.org/wiki/Mayflower_doctrine (1941-1949)
https://en.wikipedia.org/wiki/FCC_fairness_doctrine (1949-1987)
https://www.youtube.com/watch?v=hWLjYJ4BzvI Sinclair's script for stations
https://www.youtube.com/watch?v=x6U2Un5kEdI 11 Local TV Stations Pushed the Same Amazon-Scripted Segment
The package was produced by Amazon spokesperson Todd Walker, and Amazon also provided a script to news stations. Only one station, Toledo ABC affiliate WTVG, acknowledged that Walker was an Amazon employee, not a news reporter, and noted that Amazon had supplied the video. Other stations that ran the Amazon-provided content as a news package include:
WTVJ-NBC, Miami, FL
WKRN-ABC, Nashville, TN
WLEX-NBC, Lexington, KY (ran twice)
WVVA-NBC, Bluefield, WV
WTVM-ABC, Columbus, GA (ran twice)
KMIR-NBC, Palm Springs, CA (ran three times)
WBTW-CBS, Myrtle Beach, SC
WOAY-ABC, Bluefield, WV (ran twice)
Video: MSNBC interrupts Congresswoman for report on Justin Bieber - Host Andrea Mitchell interrupts former Congresswoman Jane Harman (D-CA) to report breaking news regarding the arrest of popstar Justin Bieber. Aired on Andrea Mitchell Reports on MSNBC, 22 January 2014.