# MODUS TOLLENS

### The syllogism modus tollens can be used to reject hypotheses.

Modus tollens is a valid deductive syllogism that takes the form:

PREMISE: If A then B.

PREMISE: B is NOT true.

CONCLUSION: Therefore, A is NOT true.
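One way to see why this form is valid is to check every combination of truth values for A and B. The sketch below is only an illustration, assuming "if A then B" is read as the material conditional of classical logic; the helper names `implies` and `is_valid` are hypothetical and do not come from this module.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material conditional: 'if a then b' is false only when a is true and b is false."""
    return (not a) or b


def is_valid(premises, conclusion) -> bool:
    """A form is valid if no truth assignment makes all premises true and the conclusion false."""
    for a, b in product([True, False], repeat=2):
        if all(premise(a, b) for premise in premises) and not conclusion(a, b):
            return False
    return True


# Modus tollens: "If A then B", "B is NOT true", therefore "A is NOT true".
modus_tollens_valid = is_valid(
    premises=[lambda a, b: implies(a, b), lambda a, b: not b],
    conclusion=lambda a, b: not a,
)
print(modus_tollens_valid)  # True
```

The check finds no case in which both premises are true and the conclusion is false, which is what validity means for a deductive form.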

How can we use modus tollens to test hypotheses?

For our first premise, we could imagine making a specific prediction based on a General Hypothesis. For example, we could predict:

PREMISE 1: IF non-repetitive study results in more learning than blocked study of mathematics skills,

THEN serial study (one type of non-repetitive study) will result in significantly higher scores on algebra, geometry, and word problem tests than blocked study during retention tests.

The first part of the premise is very general, implying that all types of non-repetitive study result in more learning than blocked study of mathematics skills. Therefore, the first part of the premise could be thought of as a General Hypothesis. The second part of the premise is a specific prediction (out of MANY possible). The second part of the premise is therefore one possible Measurable Hypothesis.

We could then perform an experiment, collect data, perform statistical tests, and find:

PREMISE 2: Retention test scores on algebra, geometry, and word problem tests after serial study were NOT significantly higher than scores after blocked study (t-tests; P > 0.05).

Using modus tollens, we could come to the conclusion:

CONCLUSION: Serial study does NOT result in more learning than blocked study of mathematics skills. Non-repetitive study does not result in more learning than blocked study of mathematics skills.

Is the argument a valid deductive argument and a form of modus tollens?

Logically, the argument is valid because it has the form of modus tollens. If both premises are also true, then the conclusion follows: serial study does not always result in more learning of mathematics skills than blocked study. Modus tollens therefore provides the opportunity to reject general hypotheses even based on a single experiment.

Is the argument a sound deductive argument?

The argument will be sound if both premises are true. Here we focus on the second premise. You might ask: "How could we question its truthfulness without actually seeing the data?" You would have a legitimate point. HOWEVER, there is one problem with Premise 2 that doesn't depend on the data.

The problem with Premise 2 is that in common practice, statistical tests (like t-tests) are asymmetrical. Statistical tests CAN test for differences among groups to a specified level of confidence (e.g. 95%). However, if a statistical test fails to find significant differences among groups, then the statistical test has simply failed. A failed statistical test is NOT strong evidence of the absence of differences among groups. Additional analysis such as interval identification or power analysis can determine the probability of Type II error (Giere, 2006), and completely different statistical frameworks such as Bayesian statistics can provide less categorical statistical comparisons (Höfler et al., 2018). However, a more extensive or nuanced approach to statistics is outside the scope of the current module.

A failed statistical test is commonly interpreted as: we still don't know if there is a significant difference between groups or not.

For example, solely because our t-test failed to find a significant difference between the serial study and blocked study groups (Premise 2), we cannot conclude (within our agreed-upon 95% confidence) that serial study and blocked study are NOT different. All we can conclude is that our t-test failed to find a significant difference between groups: we still do not know if there is a difference between serial and blocked study or not! Therefore, moving from a failed test to Premise 2 is a non sequitur. The failure of a statistical test does NOT reasonably lead to the conclusion that serial study results in the same amount of learning as blocked study of mathematics skills.

Why can we NOT conclude that two groups are the same if a statistical test fails to find a significant difference?

We cannot come to firm conclusions based solely on the absence of significant differences because there are many ways for statistical tests to fail. A true lack of differences between groups is only one potential reason that a statistical test can fail. Other common reasons for a "false negative" are:

* Sample sizes too small to detect differences between groups (lack of statistical "power").

* Violating one of the assumptions of parametric statistical tests (e.g. non-normal distribution).

* Outliers in the dataset that substantially increase the variance of one or more groups.

* Co-variation among variables that increases variance.

Strong study design and data analysis can mitigate some of the problems that affect statistical tests. However, for the parametric statistical tests commonly used in educational settings, failing to reject a hypothesis does not provide sufficient evidence to "accept" the hypothesis.
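The first reason in the list above, small sample sizes, can be made concrete with a rough power calculation. The sketch below is an illustration under stated assumptions, not the analysis used in this module: it approximates the t-test with a two-sided, two-sample z-test on normally distributed scores with equal group sizes, and it assumes a hypothetical true effect of 0.5 standard deviations. The function name `approx_power` is also hypothetical.

```python
from statistics import NormalDist


def approx_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test with equal group sizes.

    effect_size is the true difference in means divided by the common
    standard deviation (an assumed value, not an estimate from data).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is effect_size.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands beyond either critical value.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


for n in (10, 30, 64, 200):
    power = approx_power(effect_size=0.5, n_per_group=n)
    print(f"n per group = {n:3d}: power ~ {power:.2f}, false negatives ~ {1 - power:.2f}")
```

Under these assumptions, a real difference of this size is missed most of the time with 10 participants per group and far less often with larger groups. This is why a non-significant result from a small study says little about whether a difference exists.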

"Null" Hypotheses allow us to reject hypotheses based on the statistical finding of significant differences.

If a statistical test fails to find a statistically significant difference between groups, without additional analysis we cannot be confident that there is no actual difference between groups. However, if statistical tests are performed correctly, we can be confident (to a specified confidence level) that finding a statistically significant difference between groups indicates that an actual difference exists between groups. The level of our confidence is related to the "P value," which is tied to the potential for a "false positive." A false positive means that even though there is NO difference between two groups, our statistical test finds one. P < 0.05 means that if there were actually no difference between groups, there would be less than a 5% chance of our statistical test finding a difference at least as large as the one we observed.
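A small simulation can illustrate what the 5% threshold controls. The sketch below is a simplified illustration, not the module's own analysis: it draws both groups from the same distribution, so any "significant" result is a false positive, and it uses a z-test with a known standard deviation in place of a t-test. The score distribution (mean 70, standard deviation 10), the group size, and the function names are all hypothetical.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(group_a, group_b, sigma):
    """Two-sided p-value for a difference in means, assuming a known common sigma."""
    standard_error = sigma * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean(group_a) - mean(group_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_experiments=2000, n_per_group=30, alpha=0.05, seed=1):
    """Share of simulated experiments that reach P < alpha when no true difference exists."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Both groups come from the SAME distribution: the null hypothesis is true.
        serial = [rng.gauss(70, 10) for _ in range(n_per_group)]
        blocked = [rng.gauss(70, 10) for _ in range(n_per_group)]
        if z_test_p_value(serial, blocked, sigma=10) < alpha:
            hits += 1
    return hits / n_experiments


rate = false_positive_rate()
print(f"Share of 'significant' results with no true difference: {rate:.3f}")
```

The share should land close to 0.05, which is the sense in which the threshold limits false positives: it describes how often a test signals a difference when none exists.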

Therefore, to use statistics with modus tollens, we must select a reasoning structure that allows us to use significant differences to reject hypotheses. So-called "Null" hypotheses allow us to use modus tollens to reject hypotheses.

DEFINITION: A "null" hypothesis is a prediction of NO differences between or among groups.

Null Hypotheses may seem awkward because we are predicting the absence of differences instead of the presence of differences between groups (even though the presence of differences is commonly why we create the hypotheses in the first place). However, null hypotheses can help to clarify arguments. For example, we could frame our first premise as a null hypothesis:

PREMISE 1: IF non-repetitive study does NOT result in more learning than blocked study of mathematics skills,

THEN serial study will result in scores that are NOT significantly higher than blocked study on algebra, geometry, and word problems during retention tests.

If we conduct an experiment and find:

PREMISE 2: Retention test scores on algebra, geometry, and word problem tests after serial study were significantly higher than scores after blocked study (t-tests; P < 0.05),

we can use modus tollens to come to the conclusion:

CONCLUSION: We reject our null hypothesis. Serial study results in more learning than blocked study of mathematics skills.

Our argument is valid because it has the form of modus tollens, and it is sound as long as both premises are true. Here, the second premise rests on a statistically significant difference, which is the kind of result that statistical tests can support.

A reasonable question might be: what if we performed our experiment and still didn't find a significant difference between groups? In the case of the lack of a significant difference, the argument becomes:

PREMISE 1: IF non-repetitive study does NOT result in more learning than blocked study of mathematics skills,

THEN serial study will result in scores that are NOT significantly higher than blocked study on algebra, geometry, and word problems during retention tests.

PREMISE 2: Retention test scores on algebra, geometry, and word problem tests after serial study were NOT higher than scores after blocked study during retention tests (t-tests; P > 0.05).

CONCLUSION: We support our null hypothesis that serial study does NOT result in more learning than blocked study of mathematics skills.

Is there a problem with the final argument?

The problem with the final argument is that it is in the form of a logical fallacy: affirming the consequent (If A then B; B is true; therefore, A is true). We do not even need to think about the limitations of statistical tests to know that the argument is invalid and cannot be sound. Therefore, null hypotheses can help to clarify reasoning.
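The invalidity of affirming the consequent can be shown by searching for a counterexample: a case where both premises are true but the conclusion is false. The sketch below is only an illustration, assuming "if A then B" is read as the material conditional of classical logic; the helper name `implies` is hypothetical.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material conditional: 'if a then b' is false only when a is true and b is false."""
    return (not a) or b


# Affirming the consequent: "If A then B", "B is true", therefore "A is true".
# A counterexample makes both premises true and the conclusion (A) false.
counterexamples = [
    (a, b)
    for a, b in product([True, False], repeat=2)
    if implies(a, b) and b and not a
]
print(counterexamples)  # [(False, True)]
```

In the terms of the final argument, the counterexample is the case where the null hypothesis is false (A is false) but the scores are still not significantly higher (B is true). This is exactly the "false negative" situation described earlier.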