The Theory of Theories

All material at this website © 1998-2005 by Christopher Michael Langan


You know what they say about theories: everybody’s got one. In fact, some people have a theory about pretty much everything. That’s not one Master Theory of Everything, mind you…that’s a separate theory about every little thing under the sun. (To have a Master Theory, you have to be able to tie all those little theories together.)

But what is a “theory”? Is a theory just a story that you can make up about something, being as fanciful as you like? Or does a theory at least have to seem like it might be true? Even more stringently, is a theory something that has to be rendered in terms of logical and mathematical symbols, and described in plain language only after the original chicken-scratches have made the rounds in academia?

A theory is all of these things. A theory can be good or bad, fanciful or plausible, true or false. The only firm requirements are that it (1) have a subject, and (2) be stated in a language in terms of which the subject can be coherently described. Where these criteria hold, the theory can always be “formalized”, or translated into the symbolic language of logic and mathematics. Once formalized, the theory can be subjected to various mathematical tests for truth and internal consistency.

But doesn’t that essentially make “theory” synonymous with “description”? Yes. A theory is just a description of something. If we can use the logical implications of this description to relate the components of that something to other components in revealing ways, then the theory is said to have “explanatory power”. And if we can use the logical implications of the description to make correct predictions about how that something behaves under various conditions, then the theory is said to have “predictive power”.

From a practical standpoint, in what kinds of theories should we be interested? Most people would agree that in order to be interesting, a theory should be about an important subject…a subject involving something of use or value to us, if even on a purely abstract level. And most would also agree that in order to help us extract or maximize that value, the theory must have explanatory or predictive power. For now, let us call any theory meeting both of these criteria a “serious” theory.

Those interested in serious theories include just about everyone, from engineers and stockbrokers to doctors, automobile mechanics and police detectives. Practically anyone who gives advice, solves problems or builds things that function needs a serious theory from which to work. But three groups who are especially interested in serious theories are scientists, mathematicians and philosophers. These are the groups which place the strictest requirements on the theories they use and construct.

While there are important similarities among the kinds of theories dealt with by scientists, mathematicians and philosophers, there are important differences as well. The most important differences involve the subject matter of the theories. Scientists like to base their theories on experiment and observation of the real world…not on perceptions themselves, but on what they regard as concrete “objects of the senses”. That is, they like their theories to be empirical. Mathematicians, on the other hand, like their theories to be essentially rational…to be based on logical inference regarding abstract mathematical objects existing in the mind, independently of the senses. And philosophers like to pursue broad theories of reality aimed at relating these two kinds of object. (This actually mandates a third kind of object, the infocognitive syntactic operator…but another time.)

Of the three kinds of theory, by far the lion’s share of popular reportage is commanded by theories of science. Unfortunately, this presents a problem. For while science owes a huge debt to philosophy and mathematics – it can be characterized as the child of the former and the sibling of the latter - it does not even treat them as its equals. It treats its parent, philosophy, as unworthy of consideration. And although it tolerates and uses mathematics at its convenience, relying on mathematical reasoning at almost every turn, it acknowledges the remarkable obedience of objective reality to mathematical principles as little more than a cosmic “lucky break”.

Science is able to enjoy its meretricious relationship with mathematics precisely because of its queenly dismissal of philosophy. By refusing to consider the philosophical relationship between the abstract and the concrete on the supposed grounds that philosophy is inherently impractical and unproductive, it reserves the right to ignore that relationship even while exploiting it in the construction of scientific theories. And exploit the relationship it certainly does! There is a scientific platitude stating that if one cannot put a number to one's data, then one can prove nothing at all. But insofar as numbers are arithmetically and algebraically related by various mathematical structures, the platitude amounts to a thinly veiled affirmation of the mathematical basis of knowledge.

Although scientists like to think that everything is open to scientific investigation, they have a rule that explicitly allows them to screen out certain facts. This rule is called the scientific method. Essentially, the scientific method says that every scientist’s job is to (1) observe something in the world, (2) invent a theory to fit the observations, (3) use the theory to make predictions, (4) experimentally or observationally test the predictions, (5) modify the theory in light of any new findings, and (6) repeat the cycle from step 3 onward. But while this method is very effective for gathering facts that match its underlying assumptions, it is worthless for gathering those that do not.
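
To make the cyclic structure of steps (1)-(6) explicit, here is a minimal toy sketch in Python. It is an editor's illustration rather than anything prescribed by the essay: the "world" is a hidden linear law plus noise, the "theory" is a fitted slope and intercept, and each cycle predicts, tests, and refits when the prediction misses.

```python
import random

# Toy illustration of the scientific method's cycle (editor's sketch, not a
# real methodology): nature hides the law y = 3x + 2 plus noise, the "theory"
# is a fitted (slope, intercept) pair, and each pass predicts, tests, and
# modifies the theory when the prediction fails.

random.seed(0)

def world(x):                        # what observation of nature returns
    return 3.0 * x + 2.0 + random.gauss(0, 0.1)

def fit(data):                       # step 2/5: invent or modify the theory (least squares)
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    slope = sum((x - mx) * (y - my) for x, y in data) / sum((x - mx) ** 2 for x, _ in data)
    return slope, my - slope * mx

data = [(x, world(x)) for x in range(5)]          # step 1: observe
theory = fit(data)                                # step 2: invent a theory

for cycle in range(3):                            # step 6: repeat the cycle
    x_new = 5 + cycle
    prediction = theory[0] * x_new + theory[1]    # step 3: predict
    observation = world(x_new)                    # step 4: test
    data.append((x_new, observation))
    if abs(prediction - observation) > 0.5:       # step 5: modify if the test fails
        theory = fit(data)

print("final theory: y = %.2f*x + %.2f" % theory)
```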

In fact, if we regard the scientific method as a theory about the nature and acquisition of scientific knowledge (and we can), it is not a theory of knowledge in general. It is only a theory of things accessible to the senses. Worse yet, it is a theory only of sensible things that have two further attributes: they are non-universal and can therefore be distinguished from the rest of sensory reality, and they can be seen by multiple observers who are able to “replicate” each other’s observations under like conditions. Needless to say, there is no reason to assume that these attributes are necessary even in the sensory realm. The first describes nothing general enough to coincide with reality as a whole – for example, the homogeneous medium of which reality consists, or an abstract mathematical principle that is everywhere true - and the second describes nothing that is either subjective, like human consciousness, or objective but rare and unpredictable…e.g. ghosts, UFOs and yetis, of which jokes are made but which may, given the number of individual witnesses reporting them, correspond to real phenomena.

The fact that the scientific method does not permit the investigation of abstract mathematical principles is especially embarrassing in light of one of its more crucial steps: “invent a theory to fit the observations.” A theory happens to be a logical and/or mathematical construct whose basic elements of description are mathematical units and relationships. If the scientific method were interpreted as a blanket description of reality, which is all too often the case, the result would go something like this: “Reality consists of all and only that to which we can apply a protocol which cannot be applied to its own (mathematical) ingredients and is therefore unreal.” Mandating the use of “unreality” to describe “reality” is rather questionable in anyone’s protocol.

What about mathematics itself? The fact is, science is not the only walled city in the intellectual landscape. With equal and opposite prejudice, the mutually exclusionary methods of mathematics and science guarantee their continued separation despite the (erstwhile) best efforts of philosophy. While science hides behind the scientific method, which effectively excludes from investigation its own mathematical ingredients, mathematics divides itself into “pure” and “applied” branches and explicitly divorces the “pure” branch from the real world. Notice that this makes “applied” synonymous with “impure”. Although the field of applied mathematics by definition contains every practical use to which mathematics has ever been put, it is viewed as “not quite mathematics” and therefore beneath the consideration of any “pure” mathematician.

In place of the scientific method, pure mathematics relies on a principle called the axiomatic method. The axiomatic method begins with a small number of self-evident statements called axioms and a few rules of inference through which new statements, called theorems, can be derived from existing statements. In a way parallel to the scientific method, the axiomatic method says that every mathematician’s job is to (1) conceptualize a class of mathematical objects; (2) isolate its basic elements, its most general and self-evident principles, and the rules by which its truths can be derived from those principles; (3) use those principles and rules to derive theorems, define new objects, and formulate new propositions about the extended set of theorems and objects; (4) prove or disprove those propositions; (5) where the proposition is true, make it a theorem and add it to the theory; and (6) repeat from step 3 onwards.
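
For a concrete micro-example of steps (1)-(5), consider the theory of groups. The following worked derivation is an editor's illustration, not something drawn from the essay itself: a handful of axioms, one proposition, one proof, one new theorem.

```latex
% Axioms for a group (G, \cdot, e)  --  step 2: basic, self-evident principles.
%   (A1)  (a \cdot b) \cdot c = a \cdot (b \cdot c)                 (associativity)
%   (A2)  a \cdot e = e \cdot a = a  for all a                      (identity)
%   (A3)  for each a there is a^{-1} with a \cdot a^{-1} = a^{-1} \cdot a = e   (inverses)
%
% Proposition (step 3), proved (step 4) and added as a theorem (step 5):
% the identity element is unique.
%
% Proof: if e and e' both satisfy (A2), then
e = e \cdot e' = e'
% (the first equality uses (A2) for e', the second uses (A2) for e).  \qed
```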

The scientific and axiomatic methods are like mirror images of each other, but located in opposite domains. Just replace “observe” with “conceptualize” and “part of the world” with “class of mathematical objects”, and the analogy practically completes itself. Little wonder, then, that scientists and mathematicians often profess mutual respect. However, this conceals an imbalance. For while the activity of the mathematician is integral to the scientific method, that of the scientist is irrelevant to mathematics (except for the kind of scientist called a “computer scientist”, who plays the role of ambassador between the two realms). At least in principle, the mathematician is more necessary to science than the scientist is to mathematics.

As a philosopher might put it, the scientist and the mathematician work on opposite sides of the Cartesian divider between mental and physical reality. If the scientist stays on his own side of the divider and merely accepts what the mathematician chooses to throw across, the mathematician does just fine. On the other hand, if the mathematician does not throw across what the scientist needs, then the scientist is in trouble. Without the mathematician’s functions and equations from which to build scientific theories, the scientist would be confined to little more than taxonomy. As far as making quantitative predictions is concerned, he or she might as well be guessing the number of jellybeans in a candy jar.

From this, one might be tempted to theorize that the axiomatic method does not suffer from the same kind of inadequacy as does the scientific method…that it, and it alone, is sufficient to discover all of the abstract truths rightfully claimed as “mathematical”. But alas, that would be too convenient. In 1931, an Austrian mathematical logician named Kurt Gödel proved that there are true mathematical statements that cannot be proven by means of the axiomatic method. Such statements are called “undecidable”. Gödel’s finding rocked the intellectual world to such an extent that even today, mathematicians, scientists and philosophers alike are struggling to figure out how best to weave the loose thread of undecidability into the seamless fabric of reality.

To demonstrate the existence of undecidability, Gödel used a simple trick called self-reference. Consider the statement “this sentence is false.” It is easy to dress this statement up as a logical formula. Aside from being true or false, what else could such a formula say about itself? Could it pronounce itself, say, unprovable? Let’s try it: "This formula is unprovable". If the given formula is in fact unprovable, then it is true and therefore a theorem. Unfortunately, the axiomatic method cannot recognize it as such without a proof. On the other hand, suppose it is provable. Then it is self-apparently false (because its provability belies what it says of itself) and yet true (because provable without respect to content)! It seems that we still have the makings of a paradox…a statement that is "unprovably provable" and therefore absurd.

But what if we now introduce a distinction between levels of proof? For example, what if we define a metalanguage as a language used to talk about, analyze or prove things regarding statements in a lower-level object language, and call the base level of Gödel’s formula the "object" level and the higher (proof) level the "metalanguage" level? Now we have one of two things: a statement that can be metalinguistically proven to be linguistically unprovable, and thus recognized as a theorem conveying valuable information about the limitations of the object language, or a statement that cannot be metalinguistically proven to be linguistically unprovable, which, though uninformative, is at least no paradox. Voilà: self-reference without paradox! It turns out that "this formula is unprovable" can be translated into a generic example of an undecidable mathematical truth. Because the associated reasoning involves a metalanguage of mathematics, it is called “metamathematical”.
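
In modern notation (an editor's summary of the standard construction, with $T$ a consistent, sufficiently strong axiomatic system and $\mathrm{Prov}_T$ its provability predicate), the formula and the resulting metamathematical theorem read roughly as follows.

```latex
% The Gödel sentence: a formula provably equivalent (in T) to the assertion
% of its own unprovability in T.
G \;\leftrightarrow\; \neg\,\mathrm{Prov}_T(\ulcorner G \urcorner)

% First Incompleteness Theorem, proved in the metalanguage about T, not in T:
% if T is consistent, then T does not prove G; and (by Rosser's refinement)
% T does not prove \neg G either. Hence G is undecidable in T.
```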

It would be bad enough if undecidability were the only thing inaccessible to the scientific and axiomatic methods together. But the problem does not end there. As we noted above, mathematical truth is only one of the things that the scientific method cannot touch. The others include not only rare and unpredictable phenomena that cannot be easily captured by microscopes, telescopes and other scientific instruments, but things that are too large or too small to be captured, like the whole universe and the tiniest of subatomic particles; things that are “too universal” and therefore indiscernible, like the homogeneous medium of which reality consists; and things that are “too subjective”, like human consciousness, human emotions, and so-called “pure qualities” or qualia. Because mathematics has thus far offered no means of compensating for these scientific blind spots, they continue to mark holes in our picture of scientific and mathematical reality.

But mathematics has its own problems. Whereas science suffers from the problems just described – those of indiscernibility and induction, nonreplicability and subjectivity - mathematics suffers from undecidability. It therefore seems natural to ask whether there might be any other inherent weaknesses in the combined methodology of math and science. There are indeed. Known as the Löwenheim-Skolem theorem and the Duhem-Quine thesis, they are the respective stock-in-trade of disciplines called model theory and the philosophy of science (like any parent, philosophy always gets the last word). These weaknesses have to do with ambiguity…with the difficulty of telling whether a given theory applies to one thing or another, or whether one theory is “truer” than another with respect to what both theories purport to describe.

But before giving an account of Löwenheim-Skolem and Duhem-Quine, we need a brief introduction to model theory. Model theory is part of the logic of “formalized theories”, a branch of mathematics dealing rather self-referentially with the structure and interpretation of theories that have been couched in the symbolic notation of mathematical logic…that is, in the kind of mind-numbing chicken-scratches that everyone but a mathematician loves to hate. Since any worthwhile theory can be formalized, model theory is a sine qua non of meaningful theorization.

Let’s make this short and punchy. We start with propositional logic, which consists of nothing but tautological, always-true relationships among sentences represented by single variables. Then we move to predicate logic, which considers the content of these sentential variables…what the sentences actually say. In general, these sentences use symbols called quantifiers to assign attributes to variables semantically representing mathematical or real-world objects. Such assignments are called “predicates”. Next, we consider theories, which are complex predicates that break down into systems of related predicates; the universes of theories, which are the mathematical or real-world systems described by the theories; and the descriptive correspondences themselves, which are called interpretations. A model of a theory is any interpretation under which all of the theory’s statements are true. If we refer to a theory as an object language and to its referent as an object universe, the intervening model can only be described and validated in a metalanguage of the language-universe complex.
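
Here is the same vocabulary in an executable miniature, an editor's sketch using the theory of groups as the object theory: each axiom is a predicate over an interpretation (a universe, an operation, and a distinguished element), and a model is any interpretation that makes every axiom true.

```python
from itertools import product

# Model theory in miniature (editor's illustration): a "theory" is a list of
# axioms, each a predicate over an interpretation; an interpretation assigns a
# finite universe U, a binary operation op, and a distinguished element e to
# the theory's symbols. A model is any interpretation satisfying every axiom.

def associativity(U, op, e):
    return all(op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(U, repeat=3))

def identity(U, op, e):
    return all(op(e, a) == a == op(a, e) for a in U)

def inverses(U, op, e):
    return all(any(op(a, b) == e == op(b, a) for b in U) for a in U)

group_theory = [associativity, identity, inverses]

def is_model(theory, U, op, e):
    """True iff the interpretation (U, op, e) satisfies every axiom of the theory."""
    return all(axiom(U, op, e) for axiom in theory)

U = [0, 1, 2]
print(is_model(group_theory, U, lambda a, b: (a + b) % 3, 0))  # True: Z/3Z under addition is a model
print(is_model(group_theory, U, lambda a, b: (a * b) % 3, 1))  # False: 0 has no multiplicative inverse
```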

Though formulated in the mathematical and scientific realms respectively, Löwenheim-Skolem and Duhem-Quine can be thought of as opposite sides of the same model-theoretic coin. Löwenheim-Skolem says that a theory cannot in general distinguish between two different models; for example, any true theory about the numeric relationship of points on a continuous line segment can also be interpreted as a theory of a merely countable domain such as the integers. On the other hand, Duhem-Quine says that two theories cannot in general be distinguished on the basis of any observation statement regarding the universe.
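
The precise statement behind the first half of this coin is the downward Löwenheim-Skolem theorem, paraphrased here by the editor.

```latex
% Downward Löwenheim-Skolem (standard statement, editor's paraphrase):
% a countable first-order theory with an infinite model also has a countable model.
T \text{ countable},\ \exists\,\mathfrak{M} \models T \text{ with } \mathfrak{M} \text{ infinite}
\;\Longrightarrow\;
\exists\,\mathfrak{N} \models T \text{ with } |\mathfrak{N}| = \aleph_0 .

% So a theory meant to describe the uncountable continuum of a line segment is
% also satisfied by some countable structure; the theory itself cannot tell
% which universe it is "really" about.
```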

Just to get a rudimentary feel for the subject, let’s take a closer look at the Duhem-Quine Thesis. Observation statements, the raw data of science, are statements that can be proven true or false by observation or experiment. But observation is not independent of theory; an observation is always interpreted in some theoretical context. So an experiment in physics is not merely an observation, but the interpretation of an observation. This leads to the Duhem Thesis, which states that scientific observations and experiments cannot invalidate isolated hypotheses, but only whole sets of theoretical statements at once. This is because a theory T composed of various laws {Li}, i=1,2,3,… almost never entails an observation statement except in conjunction with various auxiliary hypotheses {Aj}, j=1,2,3,… . Thus, an observation statement at most disproves the complex {Li+Aj}.
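
Rendered as a schema (editor's notation), Duhem's point is simply modus tollens applied to a conjunction.

```latex
% Laws alone rarely entail an observation statement; laws plus auxiliaries do.
(L_1 \wedge \cdots \wedge L_n \wedge A_1 \wedge \cdots \wedge A_m) \;\rightarrow\; O

% A failed observation therefore refutes only the conjunction:
\neg O \;\Longrightarrow\;
\neg L_1 \vee \cdots \vee \neg L_n \vee \neg A_1 \vee \cdots \vee \neg A_m

% At least one conjunct is false, but the observation does not say which one.
```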

To take a well-known historical example, let T = {L1,L2,L3} be Newton’s three laws of motion, and suppose that these laws seem to entail the observable consequence that the orbit of the planet Uranus is O. But in fact, Newton’s laws alone do not determine the orbit of Uranus. We must also consider things like the presence or absence of other forces, other nearby bodies that might exert appreciable gravitational influence on Uranus, and so on. Accordingly, determining the orbit of Uranus requires auxiliary hypotheses like A1 = “only gravitational forces act on the planets”, A2 = “the total number of solar planets, including Uranus, is 7,” et cetera. So if the orbit in question is found to differ from the predicted value O, then instead of simply invalidating the theory T of Newtonian mechanics, this observation invalidates the entire complex of laws and auxiliary hypotheses {L1,L2,L3;A1,A2,…}. It would follow that at least one element of this complex is false, but which one? Is there any 100% sure way to decide?

As it turned out, the weak link in this example was the hypothesis A2 = “the total number of solar planets, including Uranus, is 7”. In fact, there turned out to be an additional large planet, Neptune, which was subsequently sought and located precisely because this hypothesis (A2) seemed open to doubt. But unfortunately, there is no general rule for making such decisions. Suppose we have two theories T1 and T2 that predict observations O and not-O respectively. Then an experiment is crucial with respect to T1 and T2 if it generates exactly one of the two observation statements O or not-O. Duhem’s arguments show that in general, one cannot count on finding such an experiment or observation. In place of crucial observations, Duhem cites le bon sens (good sense), a non-logical faculty by means of which scientists supposedly decide such issues. Regarding the nature of this faculty, there is in principle nothing that rules out personal taste and cultural bias. That scientists prefer lofty appeals to Occam’s razor, while mathematicians employ justificative terms like beauty and elegance, does not exclude less savory influences.
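
In the same notation (editor's rendering), a crucial experiment would require both of the entailments below, with $A$ the shared auxiliary hypotheses; Duhem's claim is that such entailments seldom hold without contestable auxiliaries.

```latex
% A crucial experiment between T_1 and T_2 presupposes:
(T_1 \wedge A) \vdash O
\qquad\text{and}\qquad
(T_2 \wedge A) \vdash \neg O ,
% so that observing O or not-O would settle the matter between T_1 and T_2.
```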

So much for Duhem; now what about Quine? The Quine thesis breaks down into two related theses. The first says that there is no distinction between analytic statements (e.g. definitions) and synthetic statements (e.g. empirical claims), and thus that the Duhem thesis applies equally to the so-called a priori disciplines. To make sense of this, we need to know the difference between analytic and synthetic statements. Analytic statements are supposed to be true by their meanings alone, matters of empirical fact notwithstanding, while synthetic statements amount to empirical facts themselves. Since analytic statements are necessarily true statements of the kind found in logic and mathematics, while synthetic statements are contingently true statements of the kind found in science, Quine’s first thesis posits a kind of equivalence between mathematics and science. In particular, it says that epistemological claims about the sciences should apply to mathematics as well, and that Duhem’s thesis should thus apply to both.

Quine’s second thesis involves the concept of reductionism. Reductionism is the claim that statements about some subject can be reduced to, or fully explained in terms of, statements about some (usually more basic) subject. For example, to pursue chemical reductionism with respect to the mind is to claim that mental processes are really no more than biochemical interactions. Specifically, Quine breaks from Duhem in holding that not all theoretical claims, i.e. theories, can be reduced to observation statements. But then empirical observations “underdetermine” theories and cannot decide between them. This leads to a concept known as Quine’s holism; because no observation can reveal which member(s) of a set of theoretical statements should be re-evaluated, the re-evaluation of some statements entails the re-evaluation of all.

Quine combined his two theses as follows. First, he noted that a reduction is essentially an analytic statement to the effect that one theory, e.g. a theory of mind, is defined on another theory, e.g. a theory of chemistry. Next, he noted that if there are no analytic statements, then reductions are impossible. From this, he concluded that his two theses were essentially identical. But although the resulting unified thesis resembled Duhem’s, it differed in scope. For whereas Duhem had applied his own thesis only to physical theories, and perhaps only to theoretical hypotheses rather than theories with directly observable consequences, Quine applied his version to the entirety of human knowledge, including mathematics. If we sweep this rather important distinction under the rug, we get the so-called “Duhem-Quine thesis”.

Because the Duhem-Quine thesis implies that scientific theories are underdetermined by physical evidence, it is sometimes called the Underdetermination Thesis. Specifically, it says that because the addition of new auxiliary hypotheses, e.g. conditionals involving “if…then” statements, would enable each of two distinct theories on the same scientific or mathematical topic to accommodate any new piece of evidence, no physical observation could ever decide between them.

The messages of Duhem-Quine and Löwenheim-Skolem are as follows: universes do not uniquely determine theories according to empirical laws of scientific observation, and theories do not uniquely determine universes according to rational laws of mathematics. The model-theoretic correspondence between theories and their universes is subject to ambiguity in both directions. If we add this descriptive kind of ambiguity to ambiguities of measurement, e.g. the Heisenberg Uncertainty Principle that governs the subatomic scale of reality, and the internal theoretical ambiguity captured by undecidability, we see that ambiguity is an inescapable ingredient of our knowledge of the world. It seems that math and science are…well, inexact sciences.

How, then, can we ever form a true picture of reality? There may be a way. For example, we could begin with the premise that such a picture exists, if only as a “limit” of theorization (ignoring for now the matter of showing that such a limit exists). Then we could educe categorical relationships involving the logical properties of this limit to arrive at a description of reality in terms of reality itself. In other words, we could build a self-referential theory of reality whose variables represent reality itself, and whose relationships are logical tautologies. Then we could add an instructive twist. Since logic consists of the rules of thought, i.e. of mind, what we would really be doing is interpreting reality in a generic theory of mind based on logic. By definition, the result would be a cognitive-theoretic model of the universe.

Gödel used the term incompleteness to describe that property of axiomatic systems due to which they contain undecidable statements. Essentially, he showed that all sufficiently powerful axiomatic systems are incomplete by showing that if they were not, they would be inconsistent. Saying that a theory is “inconsistent” amounts to saying that it contains one or more irresolvable paradoxes. Unfortunately, since any such paradox destroys the distinction between true and false with respect to the theory, the entire theory is crippled by the inclusion of a single one. This makes consistency a primary necessity in the construction of theories, giving it priority over proof and prediction. A cognitive-theoretic model of the universe would place scientific and mathematical reality in a self-consistent logical environment, there to await resolutions for its most intractable paradoxes.
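
The reason a single irresolvable paradox is fatal is the classical principle of explosion (ex falso quodlibet), stated here by the editor.

```latex
% Ex falso quodlibet: from a contradiction, any formula Q whatsoever follows.
P \wedge \neg P \;\vdash\; Q

% An inconsistent theory therefore proves everything, and provability no
% longer separates the true from the false within it.
```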

For example, modern physics is bedeviled by paradoxes involving the origin and directionality of time, the collapse of the quantum wave function, quantum nonlocality, and the containment problem of cosmology. Were someone to present a simple, elegant theory resolving these paradoxes without sacrificing the benefits of existing theories, the resolutions would carry more weight than any number of predictions. Similarly, any theory and model conservatively resolving the self-inclusion paradoxes besetting the mathematical theory of sets, which underlies almost every other kind of mathematics, could demand acceptance on that basis alone. Wherever there is an intractable scientific or mathematical paradox, there is dire need of a theory and model to resolve it.

If such a theory and model exist – and for the sake of human knowledge, they had better exist – they use a logical metalanguage with sufficient expressive power to characterize and analyze the limitations of science and mathematics, and are therefore philosophical and metamathematical in nature. This is because no lower level of discourse is capable of uniting two disciplines that exclude each other’s content as thoroughly as do science and mathematics.

Now here’s the bottom line: such a theory and model do indeed exist. But for now, let us satisfy ourselves with having glimpsed the rainbow under which this theoretic pot of gold awaits us.