This course focuses on the essential skills of critical thinking: how to form beliefs responsibly, how to evaluate evidence, and how to reduce systematic error in our own reasoning. While human beings are excellent at generating explanations and telling coherent stories, those same capacities make us prone to serious and predictable mistakes. We are easily misled by small samples, vivid experiences, emotional reactions, and conclusions we already want to accept. In the modern world, where we are bombarded with competing claims about health, politics, and technology, the ability to distinguish between a persuasive story and a proven fact is a vital survival skill.
The scientific method is the most successful system humans have developed for correcting these errors. It is not simply a set of laboratory procedures reserved for professional scientists in white coats. Rather, it is a general method for figuring out what is true when intuition, common sense, and personal experience are unreliable. It applies equally to medicine, biology, physics, psychology, and the social sciences. By the end of this chapter, you should understand why ordinary reasoning so often goes wrong, how the scientific method is designed to counter those failures, and why scientific reasoning is indispensable for both individual decision-making and collective progress.
Anecdotal reasoning is the default mode of human thinking. It relies on personal experience, stories, vivid examples, and what we loosely call “common sense.” When we reason anecdotally, we typically generalize from a very small number of cases—often just one—to broad conclusions about how the world works. As epistemic agents, we are constantly trying to figure out what is really going on in the world as fast as possible with as little information as possible, before we get killed, poisoned, or squashed by a boulder. This "fast and frugal" reasoning was an evolutionary advantage on the savannah, but it is a liability when trying to understand complex modern systems like global economics or clinical pharmacology. Here are some of the biases and errors in reasoning that this approach leaves us prone to:
Negativity Bias. When confronted with both good and bad news, our attention and memory skew heavily toward the bad. We react to potential threats faster because natural selection built our systems to favor noticing what might kill us. In evolutionary terms, "good news" (more food, peace, positive social events) is beneficial, but it rarely makes the difference between living and dying in the immediate moment. "Bad news" (loss of food, physical threats, hostile social situations) can mean the end of your life and your genetic line. Consequently, we err on the side of "false positives"—we would rather mistake a rustle in the grass for a lion than mistake a lion for a rustle in the grass.
This bias remains with us today. Consider the psychological weight of a single cockroach in a bowl of cherries versus a single cherry in a bowl of cockroaches. The cockroach completely wrecks the appeal of the fruit, but the cherry does nothing to improve the filth. In modern life, this explains why a single negative review on a product page often carries more weight for a consumer than fifty five-star reviews. We are wired to believe that bad news is more "informative" than good news. This bias is exploited daily by news media, which focuses on rare, dramatic tragedies rather than the slow, steady progress of human health and safety, leading many to believe the world is far more dangerous than it statistically is.
Hindsight bias is the "knew it all along" mistake. Once an event has occurred, we look back and feel that the outcome was more predictable than it actually was at the time. This creates a distorted view of history and personal experience, making us believe we possess better predictive powers than we do. It leads to the "Monday morning quarterback" phenomenon, where observers criticize a decision made under uncertainty as if the decider had access to the final result before they acted.
Outcome bias is a closely related mistake: we use the information from the outcome to evaluate the quality of the prior decision. Imagine a person who chooses to undergo a surgery that has a 95% success rate and is the only way to save their mobility. If they happen to fall into the 5% "bad outcome" category, they—and their family—often retroactively evaluate the choice to have surgery as a "bad decision." Scientifically, however, the decision was excellent based on the 95% probability of success; only the luck was bad. We see this in everyday statements: "I shouldn't have gotten on that plane," or "The school counselors should have seen that coming." We evaluate the past using "outcome information" that was not available when the decision was actually made, leading to unfair blame and a failure to learn from the actual decision-making process.
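To make the point concrete, here is a minimal simulation sketch in Python, using the made-up 95% figure from the surgery example. Every simulated patient makes the same, rationally best decision, yet a predictable handful still land in the unlucky 5%; judging those patients' decisions by their outcomes is exactly the error described above.

```python
import random

random.seed(42)          # fixed seed so the illustration is reproducible

SUCCESS_RATE = 0.95      # assumed probability that the surgery preserves mobility
N_PATIENTS = 10_000      # hypothetical patients, all facing the same choice

# Every patient makes the identical, best-available decision: accept the surgery.
bad_outcomes = sum(1 for _ in range(N_PATIENTS) if random.random() > SUCCESS_RATE)

print(f"Patients who made the best available decision: {N_PATIENTS}")
print(f"Patients who nevertheless had a bad outcome:   {bad_outcomes} "
      f"({bad_outcomes / N_PATIENTS:.1%})")
# Judging each decision by its outcome would brand roughly 5% of these
# identical, optimal decisions as "bad decisions" -- that is outcome bias.
```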
Availability Bias. Availability bias occurs when we mistake the ease with which information comes to mind for its frequency in the real world. We make intuitive judgments of probability by reference to how easily instances of a class can be recalled. This is why people are often terrified of shark attacks or plane crashes—events that are vivid, emotional, and heavily covered in the media—while remaining relatively unconcerned about heart disease or car accidents, which are far more likely to kill them but are "boring" and less available in the mental "news feed."
Consider the public perception of crime. If a student is asked, "Is violence on the rise?" they might think of recent mass shootings or terrorist attacks reported on social media. Because these examples come to mind easily, the student concludes that violence is increasing. However, long-term statistical data often shows the opposite. "Easy to recall" is not the same as "probable." This bias prevents us from properly allocating our resources—both emotional and financial—toward the risks that actually threaten our well-being.
Confirmation Bias is the tendency to seek out, notice, remember, and give greater weight to evidence that supports a belief one already holds, while ignoring, discounting, or failing to look for evidence that would challenge or disconfirm that belief. If you believe that a certain politician is corrupt, you will likely notice every headline that suggests a scandal while skipping over articles that discuss their successful policies. This leads to "over-believing" and jumping to conclusions based on cherry-picked data.
Motivated reasoning is the mistake of wanting a conclusion to be true and then reasoning actively toward it. Perhaps the conclusion protects our identity or our social group—we actively search for arguments that support it and then conclude that the "evidence" justifies our preference. In motivated reasoning, evidence becomes ammunition rather than a constraint. We aren't acting like judges weighing the facts; we are acting like lawyers defending a client. This is why it is so difficult to change someone’s mind on topics like religion, politics, or sports—their reasoning is running backward from a desired conclusion to a justification. Confirmation bias is one mistake we commit within the larger category of motivated reasoning, but it is not the only one.
The Placebo Effect. One of the most common forms of anecdotal reasoning involves health. A person might say, "I took a Vitamin C supplement and my cold got better in two days. It works!" This is a classic anecdote. It uses a sample size of one and fails to account for the placebo effect—a positive reaction based on expectations rather than the treatment itself. Because our bodies have natural healing processes, many people get better regardless of what they take. Without a scientific framework, we mistakenly give credit to the "remedy" when the body or the placebo effect was actually responsible.
In general, anecdotal reasoning as we’ve described it here relies on trusted authority figures, friends, family, small sample sizes, and personal pattern recognition. While these are useful for choosing a restaurant, they can be disastrous for determining whether a medical treatment is effective or whether a social policy is working. Anecdotes provide the "vividness" that humans crave, but they lack the "verifiability" that the modern world requires.
The scientific method is not a set of facts; rather, it is an approach that can rectify many of the mistakes we described above. It acknowledges that individual scientists are just as biased and fallible as anyone else. What makes science powerful is that it does not rely on individual judgment. Instead, it builds procedures, institutions, and norms that systematically counter human cognitive weaknesses. It is an "epistemological flywheel"—a system that transforms unreliable human reasoning into increasingly reliable knowledge. Here are the steps we will attribute to the scientific method:
1. Observe. Scientific inquiry begins with systematic observation. Unlike a casual anecdote (e.g., "I saw a weird light in the sky"), scientific observation focuses on patterns or phenomena that are repeatable and measurable. A single observation is rarely meaningful; what matters is whether a phenomenon can be detected repeatedly, under similar conditions, and by independent observers.
For example, in the mid-19th century, Ignaz Semmelweis observed that women in maternity wards died of "childbed fever" at much higher rates when they were treated by doctors who had just performed autopsies compared to women treated by midwives who did not. This wasn't just a story about one woman; it was a measurable pattern across hundreds of cases. Observation provides the raw material for inquiry, but by itself, it does not yet explain why something is happening.
2. Develop Hypotheses. From observation, scientists move to hypothesis formation. A hypothesis is a proposed explanation that can be tested against evidence. A well-formed hypothesis makes it possible to specify conditions under which it would be shown false. Good hypotheses do not merely describe what has already been observed; they generate predictions about what should be observed if the hypothesis is true. (Beware of claims that cannot in principle be proven false by any evidence.)
If a researcher hypothesizes that "Vitamin D reduces the severity of viral infections," that hypothesis predicts that, in a future study, people with higher Vitamin D levels should have shorter hospital stays than those with lower levels. If the data shows no difference, the hypothesis is undermined. A hypothesis that can explain every possible outcome (e.g., "The medicine works, but sometimes the spirits don't want it to") is not scientific because it cannot be disproven.
Semmelweis, returning to the maternity ward example above, began to suspect that doctors who handled corpses without washing their hands were then transferring something deadly to the women they treated.
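To see how a hypothesis turns into a checkable prediction, here is a small sketch in Python using invented hospital-stay numbers (the data below is fabricated purely for illustration). If the Vitamin D hypothesis from above is true, the high-Vitamin-D group should show shorter average stays; if the two averages come out essentially the same, the hypothesis is undermined.

```python
# Invented hospital stays (in days) for two hypothetical groups of patients.
high_vitamin_d = [4, 5, 3, 6, 4, 5, 4, 3, 5, 4]
low_vitamin_d  = [6, 7, 5, 8, 6, 7, 6, 5, 7, 6]

def mean(values):
    return sum(values) / len(values)

print(f"Average stay, high Vitamin D: {mean(high_vitamin_d):.1f} days")
print(f"Average stay, low Vitamin D:  {mean(low_vitamin_d):.1f} days")

if mean(high_vitamin_d) < mean(low_vitamin_d):
    print("The prediction held in this sample.")
else:
    print("The prediction failed -- the hypothesis is undermined.")
# A real study would also ask whether the gap is bigger than chance
# variation alone would produce, which is why sample size matters below.
```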
3. Develop Questions. Closely tied to hypothesis formation is the development of precise, measurable questions. Poorly framed questions invite ambiguous answers. Scientists must specify variables, comparisons, and outcomes. Instead of asking, "Are schools bad?" a scientist asks, "Does the implementation of a 30-minute daily creative writing session correlate with a 10% increase in standardized literacy scores among 4th-grade students?"
This precision prevents moving the goalposts. In anecdotal reasoning, if someone takes a supplement and doesn't get better, they might say, "Well, it kept me from getting worse!" By asking a precise question beforehand, the scientific method prevents us from changing the criteria for success after the fact to fit our biases.
4. Gather Data Systematically. Once hypotheses and questions are defined, scientists gather data systematically to overcome the "small-sample" problem. If you ask five friends if they like a movie, you have a small, biased sample. If you survey 5,000 randomly selected people, you have a data set that begins to reflect the "ground truth."
Large samples reduce the influence of random variation and outliers. Systematic data collection also requires "standardized procedures." This means everyone in the study is measured the same way, using the same tools. This ensures that the results are not the product of a researcher's mood, a subject's personality, or a faulty piece of equipment used only once.
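A quick simulation sketch in Python (assuming a "true" approval rate of 60%) shows why asking five friends and surveying five thousand people are not interchangeable: small samples swing wildly around the truth, while large samples cluster tightly near it.

```python
import random

random.seed(0)
TRUE_APPROVAL = 0.60   # assumed "ground truth": 60% of people like the movie

def survey(n):
    """Estimate the approval rate from a random sample of n people."""
    return sum(random.random() < TRUE_APPROVAL for _ in range(n)) / n

for sample_size in (5, 50, 5_000):
    estimates = [survey(sample_size) for _ in range(1_000)]
    print(f"n = {sample_size:>5}: estimates ranged from "
          f"{min(estimates):.0%} to {max(estimates):.0%}")
# With n = 5, estimates of 20% or 100% turn up routinely; with n = 5,000,
# virtually every estimate lands within a point or two of 60%.
```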
5. Aggressively Seek Out Disconfirmation. This is the heart of the scientific method, and it provides the biggest contrast to anecdotal reasoning as characterized above. Rather than seeking only evidence that supports a hypothesis (confirmation bias), scientists actively look for evidence that would show it to be false. Surviving a serious attempt at disproof provides far stronger support than easy confirmation. To do this, scientists exercise many precautions. Three important concepts for our analysis are:
Control Groups: a control group is a baseline comparison group in which the treatment, drug, or variable under investigation is not present. To determine whether a drug works, you cannot just give it to 100 people and see if they get better. You must give it to 50 people (the test group) and give nothing, or a neutral substance, to the other 50 (the control group). If both groups get better at the same rate, the drug does not work.
Double-Blind Testing: an experimental design in which neither the participants nor the researchers who interact with them know which participants are receiving the experimental treatment and which are receiving the control or placebo. The purpose of double-blind testing is to prevent expectation effects from distorting results. Participants’ beliefs about whether they are receiving a treatment can influence their reported symptoms or behavior, and researchers’ beliefs can unconsciously influence how they measure, interpret, or record outcomes. For example, a doctor who knows a patient is on the real medication might unconsciously interpret that patient's symptoms more positively.
Control for Placebo Effect: the placebo effect is a change in a person’s symptoms, behavior, or perceived well-being that is caused by their expectations or beliefs about a treatment, rather than by the treatment’s actual physical or chemical properties. When people believe they are receiving an effective treatment, that belief alone can produce real, measurable effects—such as reduced pain, improved mood, or changes in physiological responses—even if the treatment itself is inert.
For example, a person who takes a sugar pill but believes it is a powerful painkiller may report genuine pain relief. The improvement is real, but its cause is psychological expectation, not the drug.
The placebo effect is especially strong in areas involving subjective experience, such as pain, anxiety, fatigue, and nausea, but it can also influence measurable outcomes like heart rate or immune response. Because we know that the expectation of healing can cause actual physiological changes, we must ensure that the treatment outperforms a sugar pill. If a new antidepressant makes people 20% happier, but a sugar pill also makes them 20% happier, the antidepressant is a failure.
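The logic of a placebo-controlled comparison can be sketched in a few lines of Python (the improvement rates below are invented for illustration). The question is never "did the treated group improve?" but "did it improve more than the control group that received only a sugar pill?"

```python
import random

random.seed(1)
N_PER_GROUP = 500

# Invented probabilities of "feeling better" after a few weeks:
P_PLACEBO = 0.50   # natural healing plus expectation effects alone
P_DRUG    = 0.50   # deliberately equal to the placebo rate: the drug adds nothing

def count_improved(p, n):
    """Count how many of n people improve, each with probability p."""
    return sum(random.random() < p for _ in range(n))

drug_group    = count_improved(P_DRUG, N_PER_GROUP)
placebo_group = count_improved(P_PLACEBO, N_PER_GROUP)

print(f"Improved on the drug:    {drug_group}/{N_PER_GROUP}")
print(f"Improved on the placebo: {placebo_group}/{N_PER_GROUP}")
# Roughly half of the drug group got better -- which sounds impressive
# until you notice the sugar-pill group did just as well. Without the
# control group, the anecdote "I took it and felt better" misleads us.
```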
Many studies fail at this stage, and that is a strength of science, not a weakness. Before results are shared widely, they are also submitted to peer review, in which independent experts scrutinize the methods, data, and reasoning. This process helps ensure that individual biases, "p-hacking" (manipulating data to find a result), and sloppy thinking are filtered out before the information reaches the public. Peer review isn't perfect, but it is the most rigorous "BS filter" ever created.
Research that passes peer review is then published in peer-reviewed journals or books, having cleared this vital stage of error checking, so that it can be shared more widely.
Over time, as different researchers using different methods all point toward the same conclusion, the scientific community reaches a "consensus." This consensus is what allows for progress. We don't have to re-prove that germs cause disease or that gravity exists every morning; we can build new technologies and medicines on top of those corroborated truths.
This capacity for self-correction is the defining virtue of the scientific method. Where dogmatic belief systems refuse to change in the face of evidence, science thrives on it. Being "wrong" in science is often as important as being "right," because identifying an error brings us one step closer to the truth.
News coverage and personal experience often create the impression that the world is getting worse. But when we examine long-term data rather than anecdotes, a different picture emerges. By nearly every measurable standard, human lives have improved dramatically over time. Life expectancy has increased; child and maternal mortality have declined; violence has decreased across multiple time scales; and access to food, education, and legal protection has expanded.
These trends are not the result of intuition or tradition. They are the result of applying scientific reasoning to real-world problems. Scientific reasoning matters because false beliefs are costly. Acting on inaccurate models of the world—whether that means refusing a vaccine based on an anecdote or passing a law based on a vivid but rare tragedy—reliably leads to harm, failure, and wasted effort. Science reduces those costs by producing more accurate beliefs that allow us to solve problems effectively.
This tool is designed to help you master:
1. Observe
2. Develop Hypotheses
3. Develop Questions
4. Gather Data Systematically
5. Aggressively Seek Out Disconfirmation
Control Groups
Double-Blind Testing
Control for Placebo Effect
Negativity Bias
Hindsight Bias
Outcome Bias
Availability Bias
Confirmation Bias
Motivated Reasoning
Placebo Effect
Small Sample Sizes
Ask the agent to quiz you on anecdotal reasoning or on the scientific method.
The in-person quizzes and exams use the same structure as the practice tool:
same kinds of arguments,
same questions,
same definitions,
same distinctions.
The only difference is that the AI will not be there.
If you have practiced with the tool until the concepts are automatic, the in-person assessments will feel straightforward. If you have not, they will feel confusing and rushed.
Students who use the tool seriously should expect:
higher quiz scores,
more confidence identifying arguments,
fewer “I knew it but couldn’t explain it” moments.
This tool enforces the definitions used in this course. Philosophers can and do disagree about these definitions in other contexts. For this class, you are being graded on whether you can apply these definitions correctly and consistently.