Fitz, H. (2009). Neural Syntax. Amsterdam. ILLC publication series.

Hartmut Fitz

Abstract:

Children learn their mother tongue spontaneously and effortlessly
through communicative interaction with their environment; they do not
have to be taught explicitly or learn how to learn first. The ambient
language to which children are exposed, however, is highly variable
and arguably deficient with regard to the learning target.
Nonetheless, most normally developing children learn their native
language rapidly and with ease.

To explain this accomplishment, many theories of acquisition posit
innate constraints on learning, or even a biological endowment that
is specific to language. Usage-based theories, on the other hand,
place more emphasis on the role of experience and domain-general
learning mechanisms than on innate language-specific knowledge. But
languages are lexically open and combinatorial in structure, so no
amount of experience can cover their full expressivity. Usage-based
theories therefore have to explain how children can generalize the
properties of their linguistic input to an adult-like grammar.

In this thesis I provide an explicit computational mechanism with
which usage-based theories of language can be tested and evaluated.
My work focuses on complex syntax and the human ability to form
sentences which express more than one proposition by means of
relativization. This 'capacity for recursion' is a hallmark of an
adult grammar and, as some have argued, of the human language faculty
itself.

The manuscript is organized as follows. In the second chapter, I give
an overview of results that characterize the properties of neural
networks as mathematical objects and review previous attempts at
modelling the acquisition of complex syntax with such networks. The
chapter introduces the conceptual landscape in which the current work
is located.

In the third chapter, I argue that the construction and use of meaning
is essential in child language acquisition and adult processing.
Neural network models need to incorporate this dimension of human
linguistic behavior. I introduce the Dual-path model of sentence
production and syntactic development, which can represent semantics
and learns from exposure to sentences paired with their meaning (cf.
Chang et al. 2006). I explain the architecture of this model, motivate
critical assumptions behind its design, and discuss existing research
using this model.

The fourth chapter describes and compares several extensions of the
basic architecture to accommodate the processing of multi-clause
utterances. These extensions are evaluated against computational
desiderata, such as good learning and generalization performance and
the parsimony of input representations. A single best solution for
encoding the meaning of complex sentences with restrictive relative
clauses is identified, which forms the basis for all subsequent
simulations.

Chapter five analyzes the learning dynamics in more detail. I first
examine the model's behavior for different relative clause types.
Syntactic alternations prove to be particularly difficult to learn
because they complicate the meaning-to-form mapping the model has to
acquire. In the second part, I probe the internal representations the
model has developed during learning. It is argued that the model
acquires the argument structure of the construction types in its input
language and represents the hierarchical organization of distinct
multi-clause utterances.

The core of this thesis is contained in chapters six to eight. In
chapter six, I test the Dual-path model's generalization capacities in
a variety of tasks. I show that its syntactic representations are
sufficiently transparent to allow structural generalization to novel
complex utterances. Semantic similarities between novel and familiar
sentence types play a critical role in this task. The Dual-path model
also has a capacity for generalizing familiar words to novel slots in
novel constructions (strong semantic systematicity). Moreover, I
identify learning conditions under which the model displays recursive
productivity. It is argued that the model's behavior is consistent
with human behavior in that production accuracy degrades with depth of
embedding, and right-branching is learned faster than center-embedding
recursion.

In chapter seven, I address the issue of learning complex polar
interrogatives in the absence of positive exemplars in the input. I
show that the Dual-path model can acquire the syntax of these
questions from simpler and similar structures which are attested in a
child's linguistic environment. The model's errors closely match
children's errors, and it is suggested that children might not require
an innate learning bias to acquire auxiliary fronting. Since the model
does not implement a traditional kind of language-specific universal
grammar, these results are relevant to the poverty of the stimulus
debate.

English relative clause constructions give rise to similar performance
orderings in adult processing and child language acquisition. This
pattern matches the typological universal called the noun phrase
accessibility hierarchy. I propose an input-based explanation of these
data in chapter eight. The Dual-path model displays this ordering in
syntactic development when exposed to plausible input distributions.
But it is possible to manipulate and completely remove the ordering by
varying properties of the input from which the model learns. This
indicates, I argue, that patterns of interference and facilitation
among input structures can explain the hierarchy when all structures
are simultaneously learned and represented over a single set of
connection weights.

Finally, I draw conclusions from this work, address some unanswered
questions, and give a brief outlook on how this research might be
continued.

