In the past, connectionist models have concentrated on developing a cause-and-effect relationship between the model's inputs and its outputs. When a model was successful, analysis would reveal that it had extracted key components (features) from the inputs that were good predictors of the output. This program was successful as long as the inputs were causal determiners of the outputs.
The program was successfully challenged by Pinker and Prince [PINKER88]. In that paper, they identified many problems in the Rumelhart and McClelland [RUMEL86a] model of the acquisition of verbal past-tense forms. Among those problems was the fact that the root form of the verb was not a sufficient predictor of its past-tense form. Much recent connectionist work has been directed toward overcoming the problems identified by Pinker and Prince.
As part of this recent work, Elman has been extending the work of Jordan [JORDAN86]. This work includes time as an element in the causal relationship between inputs and outputs (that is, the connectionist network uses both the input and its own previous state to determine its output). In particular, Elman has found that having the network predict the next input has been a successful method for identifying significant regularities in the input data. In [ELMAN90], he reports on a model that can identify word boundaries when the input is a stream of letters that spell out the words (with no indication of where one word ends and the next begins). Another model can identify word categories when given a stream of input consisting of words that form sentences (with no indication of where one sentence ends and the next begins). In both of these models, the input is not a sufficient predictor of the output. However, the inclusion of the network's state as part of the input allows the network to successfully extract relevant statistical properties from the input. Although this does not directly address the prediction of the verbal past-tense form, it does open a new avenue of research that might aid in solving the problem.
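The recurrence described above (the network's output depends on both the current input and the network's own previous state) can be made concrete with a minimal sketch. This is not Elman's implementation: the class name SimpleRecurrentNet, the three-letter alphabet, the layer sizes, and the random untrained weights are all assumptions chosen for illustration, and the training on next-input prediction error is omitted entirely.

```python
import math
import random

def make_matrix(rows, cols, rng):
    """Small random weights; the values are arbitrary, not trained."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def softmax(z):
    """Turn raw scores into a probability distribution."""
    top = max(z)
    exps = [math.exp(a - top) for a in z]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

class SimpleRecurrentNet:
    """Hypothetical, untrained Elman-style network over a small alphabet."""

    def __init__(self, n_symbols, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(n_hidden, n_symbols, rng)   # input -> hidden
        self.w_ctx = make_matrix(n_hidden, n_hidden, rng)   # previous state -> hidden
        self.w_out = make_matrix(n_symbols, n_hidden, rng)  # hidden -> prediction
        self.n_hidden = n_hidden

    def step(self, x, context):
        """Return (predicted distribution over the next input, new state)."""
        from_input = matvec(self.w_in, x)
        from_state = matvec(self.w_ctx, context)
        hidden = [math.tanh(p + q) for p, q in zip(from_input, from_state)]
        return softmax(matvec(self.w_out, hidden)), hidden

    def run(self, sequence):
        """Feed a stream of inputs, carrying the state from step to step."""
        context = [0.0] * self.n_hidden  # empty state before the first input
        predictions = []
        for x in sequence:
            prediction, context = self.step(x, context)
            predictions.append(prediction)
        return predictions

ALPHABET = "abc"  # assumed toy alphabet
net = SimpleRecurrentNet(n_symbols=len(ALPHABET), n_hidden=4)
stream = [one_hot(ALPHABET.index(ch), len(ALPHABET)) for ch in "abca"]
preds = net.run(stream)

# The letter "a" occurs at positions 0 and 3, but the carried state differs,
# so the two predictions differ even though the input is identical.
print(preds[0] != preds[3])
```

The point of the sketch is only the data flow: the same letter yields different predictions in different contexts, which is why the input alone need not be a sufficient predictor of the output. Whether such predictions reveal word boundaries or word categories depends on training, which the sketch does not attempt.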
Elman's work is reminiscent of the work performed by the pre-Chomskyan linguists. These "structuralists" believed that the correct way to study language was to look for regularities in actual usage. That is, they believed that word categories, grammar, etc. could be identified by direct observation of the "word streams" produced by native speakers of the language. Unfortunately, the computational tools available to the linguists of the early 1950s were not sufficient to make this program successful.
In the mid-1950s, Chomsky broke from the "structuralist" tradition, starting his revolution in linguistics. His break included a key change in methodology. He believed that actual language was an aberration. It contained far too much "noise" to ever allow the induction of general principles. Instead, he postulated two levels of language: performance and competence. Whereas the "structuralists" were committed to the study of performance, he believed that there was a Platonic, internal language competence. He felt that this internal language must be very orderly, free from the noise present in language users' "word streams." Thus, Chomsky's research program involved: (1) postulating the principles underlying the internal language, (2) deducing grammar, etc. of the internal language from these basic language principles, and (3) describing how the "noisy" language performance arises from the internal language competence. In short, the methodological shift was a reversal of direction. Post-Chomsky linguists perform deduction from basic principles to the empirical language data. Pre-Chomsky linguists performed induction from the empirical language data to basic principles.
I believe that Elman's recent connectionist models are built in the pre-Chomskyan tradition. His goal is to have his networks discover linguistic properties from empirical data. In a recent critique of the connectionist research program, Fodor and Pylyshyn [FODOR88] argued forcefully that connectionism was only valid as a computational tool being used to implement post-Chomsky linguistic theories [see NOTE 1]. This conviction is firmly grounded in their commitment to a deductive methodology. They can see no "revolution" in the connectionist program. However, when viewed the way I am proposing, there is a very definite connectionist revolution -- or, better, connectionist counter-revolution. If Elman is successful in his current research program, he will indeed have created a powerful computational tool. It may be just the tool that the "structuralists" of the early 1950s lacked. If so, we may be witnessing the origin of a neo-structuralism.
In the following, I will discuss the implications of a neo-structuralist revolution. Then I will discuss in detail two of Elman's recent models, and contrast them with the current work of Zelig Harris -- a linguist who has continued to follow the structuralist methodology.