Becoming syntactic

  Chang, F., Dell, G. S., and Bock, K. (2006). Becoming syntactic. Psychological Review, 113(2), 234–272.


Psycholinguistic research has shown that the influence of abstract syntactic knowledge on performance is shaped by particular sentences that have been experienced. To explore this idea, a connectionist model of sentence production was applied to the development and use of abstract syntax. The model makes use of (a) error-based learning to acquire and adapt sequencing mechanisms and (b) meaning-form mappings to derive syntactic representations. The model is able to account for most of what is known about structural priming in adult speakers, as well as key findings in preferential looking and elicited production studies of language acquisition. The model suggests how abstract knowledge and concrete experience are balanced in the development and use of syntax.

Contents:

1) Software

2) Architecture

3) Representing the Message

4) Input

5) Structural Priming 

6) Downloads

7) Simple Message-Sentence Generator

8) Tutorial

Required Software:

LENS - the program that implements the model (version 2.6 or higher)

Perl  - creates the input and parses the output (version 5.8 or higher)

R     -  statistics (optional)

LENS uses Tcl/Tk, so knowledge of that language is useful for reading the code below.
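
For readers unfamiliar with Tcl, here is a minimal illustrative sketch (not part of the model code) of the constructs used in the scripts below:

set size 40                    ;# assign a variable

set doubled [expr $size * 2]   ;# evaluate an arithmetic expression

proc greet {name} {            ;# define a procedure with one argument

    puts "hello $name"

}

greet world                    ;# call it: prints "hello world"

foreach j {1 5 3} {            ;# iterate over a list

    puts $j

}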

Architecture of the model:

Single arrows are learned weights.  Dashed arrows are fixed copy connections.  Thick gray lines are fast-changing message weights.  Double arrows indicate target comparison (the copy of the what units' output that serves as the target for the cwhat units is not shown).

The LENS code is below:

addNet dualPath -i 20

## these are the input layers; their sizes are a bit bigger than the number of units in the paper,

## because the actual mapping from linguistic units to model units was slightly different (e.g., sometimes 0 was not used).

set semSize 147

set lexSize 165

set eventsemSize 19

set whereSize 20

## hidden layers (as in paper)

set hiddenSize 40

set contextSize $hiddenSize

set compressSize 20

set ccompressSize 20

## create layers

addGroup cword $lexSize ELMAN ELMAN_CLAMP ELMAN_CLAMP -BIASED OUT_NORM

addGroup ccompress $ccompressSize -BIASED

addGroup cwhat $semSize OUTPUT TARGET_COPY -BIASED -WRITE_OUTPUTS

addGroup cwhere2 $whereSize ELMAN ELMAN_CLAMP ELMAN_CLAMP -BIASED

addGroup cwhere $whereSize SOFT_MAX -BIASED

addGroup eventsem $eventsemSize LINEAR -BIASED

addGroup context $contextSize ELMAN OUT_INTEGR -BIASED

addGroup hidden $hiddenSize -BIASED

addGroup where $whereSize -BIASED

addGroup what $semSize -BIASED

addGroup compress $compressSize -BIASED

addGroup targ $lexSize INPUT

addGroup word $lexSize OUTPUT SOFT_MAX STANDARD_CRIT -BIASED

## parameters for connections

## hysteresis: 1 = full copy, 0 = no change

setObj context.dtScale 1

## connect layers

connectGroups cword cwhat -type cwordcwhat

connectGroups cwhat cwhere -type ww

connectGroups where what -type ww

connectGroups what word -type whatword

connectGroups hidden where -type hidwhere

connectGroups context hidden -type conthid

connectGroups cwhere hidden -type prehid

connectGroups cwhere2 hidden -type prehid

connectGroups eventsem hidden -type esemhid

connectGroups hidden compress word -type hidword

connectGroups cword ccompress hidden -type cwordhid

## connect bias

connectGroups bias eventsem -type bt

connectGroups bias what -type low

connectGroups bias cwhat -type low

## copy output of what units as training signal for cwhat units

copyConnect what cwhat outputs

## create elman unit connections and initial states

elmanConnect targ cword -r 1 -init 0.0

elmanConnect word cword -r 1 -init 0.0

elmanConnect cwhere cwhere2 -r 1 -init 0.0

elmanConnect cwhere2 cwhere2 -r 1 -init 0.0

elmanConnect hidden context -r 1 -init 0.5

## turn off learning for what-where cwhat-cwhere message weights

setLinkValues learningRate 0 -t ww

setLinkValues randMean  0 -t ww

setLinkValues randRange 0 -t ww

## turn off learning for event-semantic weights

setLinkValues learningRate 0 -t bt

setLinkValues randMean  0 -t bt

setLinkValues randRange 0 -t bt

## set bias of what units so that normal activation is low

setLinkValues learningRate 0 -t low

setLinkValues randMean  -3 -t low

setLinkValues randRange 0 -t low

## randomize the bias-to-what weights (type low), then freeze them

randWeights -t low

freezeWeight -t low
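
The script above only constructs the network.  A minimal sketch of how one might then load an environment file and train the model in LENS (the file name and parameter values here are assumptions, not the settings used in the paper):

loadExamples train.ex -s trainset   ;# hypothetical training environment file

useTrainingSet trainset

setObj learningRate 0.1             ;# assumed value, not the paper's

train 10000                         ;# assumed number of weight updates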

Representing the Message in Weights

One novel aspect of the Dual-path model is the use of weights to instantiate the message.  This is done using functions that are called within the environment files (*.ex).

Here is an example pattern for the sentence "a dog sleeps".

name:{  a dog sleep -ss }

proc: { clear;link 1 62;link 5 19;plink .33 5 120;targlink 1 5;}

#mes: AINTRANS 1A=SLEEP,PRES,PERF 5X=DOG,INDEF  TLINK AINTRANS 1A=  5X=

6

t:{word 1.0} 121

t:{word 1.0} 19

t:{word 1.0} 62

t:{word 1.0} 138

t:{word 1.0} 156

t:{word 1.0} 156  ;

In the proc field, four functions are present.

clear - resets the weights to 0

link(A,B,..)  - links the where unit (A) to all the remaining units in the what layer (B,...).

plink(V,A,B,...) - links the where unit (A) with value (V) to all the remaining units in the what layer (B,...)

targlink(A,..)  - sets the listed event-semantic units to a value scaled by the alternation parameter.  This value is initially 0.5.  When a -1 appears in the list, the value is multiplied by the alternation parameter (0.95) for all subsequent units.  So targlink 1 5 -1 8 means that the activation of unit 8 is 0.95 times the value of unit 5 (0.475 rather than 0.5).

In normal training, tlink is used instead of targlink.  It is the same except that its alternation parameter is randomly either 0.5 or 0.75.

Here are the functions in the prodc.in file.  The proc line above calls these functions before a pattern is processed.

proc clear {} {

    randWeights -t ww

    randWeights -t bt

}

#strength = 6

proc link {input args} {

    global strength

    foreach j $args {

        setObj what.unit($j).incoming($input).weight $strength    ;# where -> what (production path)

        setObj cwhere.unit($input).incoming($j).weight $strength  ;# cwhat -> cwhere (comprehension path)

    }

}

# prostrength adjusts strength by the first argument prop.

proc plink {prop input args} {

    global strength

    set prostrength [expr $prop * $strength]

    foreach j $args {

        setObj what.unit($j).incoming($input).weight $prostrength;

        setObj cwhere.unit($input).incoming($j).weight $prostrength;

    }

}

# reducer is the alternation parameter, inittlink is the starting value of 0.5

# randlevel is a random value that makes tred either 0.5 or 0.75.

proc tlink {args} {

    global reducer

    global inittlink

    set tstrength $inittlink

    set randlevel [randInt 2]

    set tred [expr $reducer + $randlevel * 0.25]

    foreach j $args {

        if {$j < 0} {

            set tstrength [expr $tstrength * $tred]

        } else {

            setObj eventsem.unit($j).incoming(0).weight $tstrength;

        }

    }

}

# treduce is the alternation parameter for targets = 0.95

# inittargtlink is the starting value of 0.5

proc targlink {args} {

    global treduce

    global inittargtlink

    set tstrength $inittargtlink

    foreach j $args {

        if {$j < 0} {

            set tstrength [expr $tstrength * $treduce]

        } else {

            setObj eventsem.unit($j).incoming(0).weight $tstrength;

        }

    }

}
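
As a worked example, reading the values directly off the functions above (with strength = 6 and inittargtlink = 0.5), the proc line for "a dog sleeps" shown earlier sets the following weights:

## proc: { clear;link 1 62;link 5 19;plink .33 5 120;targlink 1 5;}

## link 1 62:       what.unit(62).incoming(1).weight = 6 and cwhere.unit(1).incoming(62).weight = 6

## link 5 19:       what.unit(19).incoming(5).weight = 6 and cwhere.unit(5).incoming(19).weight = 6

## plink .33 5 120: what.unit(120).incoming(5).weight = .33 * 6 = 1.98 (and likewise for cwhere)

## targlink 1 5:    eventsem.unit(1).incoming(0).weight = 0.5 and eventsem.unit(5).incoming(0).weight = 0.5

## with "targlink 1 5 -1 8", unit 8 would instead get 0.5 * 0.95 = 0.475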

Input for the Model

The model is trained using message-sentence pairs.  Here is an example.  The first line is the name of the pattern, and shows the sentence.  The proc line is the Tcl code that sets the message.  The third line is a comment that explains what the Tcl code is doing in terms of the input grammar.  Next is a number giving the number of words in the sentence (including the two sentence-final periods).  Then come the input (i:) and target (t:) pairs.  The input is placed into a buffer that is copied to the cword layer on the next time step.

name:{  mary was hurt -par by the bird  #c}

proc: { clear;link 1 77;link 5 3;link 3 22;plink .66 3 120;tlink 1 15 17 5 -1 3;}

#cmes: THEMEEXP 1A=HURT,PAST,INPERF 5Y=MARY, :CAUSE 3X=BIRD,DEF  TLINK THEMEEXP 1A= PAST INPERF  5Y= -1  3X=

9

i:{targ 1.0} 3

t:{word 1.0} 3

i:{targ 1.0} 160

t:{word 1.0} 160

i:{targ 1.0} 77

t:{word 1.0} 77

i:{targ 1.0} 155

t:{word 1.0} 155

i:{targ 1.0} 134

t:{word 1.0} 134

i:{targ 1.0} 120

t:{word 1.0} 120

i:{targ 1.0} 22

t:{word 1.0} 22

i:{targ 1.0} 156

t:{word 1.0} 156

i:{targ 1.0} 156 

t:{word 1.0} 156  ;

A messageless pattern would be similar except the proc: field would be just "{ clear; }".

Structural Priming

The model is tested for structural priming with this code.   The prime is "trained" in the same way that the model uses to learn the language.

proc structuralPrimingExp {wtfile set {lag 0} {fillerlist spfillers}} {

    setObj batchSize 1

    ## run all the pairs in the structural priming experiment

    repeat [expr [getObj $set.numExamples] / 2] {

        ## reset network and weights to final (post-training) state

        resetNet

        loadWeights $wtfile

        ## train prime

        doExample -train

        updateWeights -algorithm steepest

        ## train fillers if testing lag

        if {$lag > 0} {

            for {set i 0} {$i < $lag} {incr i} {

                doExample $i -train -s $fillerlist

                updateWeights -algorithm steepest

            }

        }

        ## do target

        doExample -train

    }

}
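
A sketch of how this procedure might be invoked (the file and set names here are hypothetical, not from the distribution): load a set of alternating prime-target pairs and a filler set, make the pair set the training set, and run the experiment with a lag of two fillers between prime and target.

loadExamples primetarg.ex -s primetarg     ;# hypothetical prime-target pairs

loadExamples fillers.ex -s spfillers       ;# hypothetical filler sentences

useTrainingSet primetarg

structuralPrimingExp final.wt primetarg 2  ;# final.wt is an assumed trained-weights file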

Here is a passive prime pattern with a transitive target.  Notice that the prime is messageless.

name:{  the beer is smash -par by marty  #p0}

proc: { clear;}

#pmes: TRANS 1A=SMASH,PRES,PERF 5Y=BEERZZ,DEF :CAUSE 3X=MARTY,  TLINK TRANS 1A=  5Y= -1  3X=

9

i:{targ 1.0} 120

t:{word 1.0} 120

i:{targ 1.0} 35

t:{word 1.0} 35

i:{targ 1.0} 153

t:{word 1.0} 153

i:{targ 1.0} 73

t:{word 1.0} 73

i:{targ 1.0} 155

t:{word 1.0} 155

i:{targ 1.0} 134

t:{word 1.0} 134

i:{targ 1.0} 5

t:{word 1.0} 5

i:{targ 1.0} 156

t:{word 1.0} 156

i:{targ 1.0} 156 

t:{word 1.0} 156  ;

This target's event semantics are biased toward the active structure (hence the pattern's name shows the active form), but there are also patterns with this same target whose event semantics bias it toward the passive.

name:{  the boy hurt -ss a sister  #t0}

proc: { clear;link 1 77;link 3 7;plink .66 3 120;link 5 13;plink .33 5 120;targlink 1 3 -1 5;}

#tmes: THEMEEXP 1A=HURT,PRES,PERF 3X=BOY,DEF :CAUSE 5Y=SISTER,INDEF  TLINK THEMEEXP 1A=  3X= -1  5Y=

8

t:{word 1.0} 120

t:{word 1.0} 7

t:{word 1.0} 77

t:{word 1.0} 138

t:{word 1.0} 121

t:{word 1.0} 13

t:{word 1.0} 156

t:{word 1.0} 156  ;
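
For instance (a sketch based on the targlink description above, not a pattern from the distribution), a passive-biased version of this same target would reverse the order of the two role arguments to targlink, making the patient's event-semantic unit the stronger one:

proc: { clear;link 1 77;link 3 7;plink .66 3 120;link 5 13;plink .33 5 120;targlink 1 5 -1 3;}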

Downloads

April 28, 2007: Cleaned up directories: XYZ (52MB) and Trad Roles.

Oct 30, 2005: Added tar file of the final version of the model: XYZ (102MB) and Trad Roles (92MB).

May 11, 2005: Added download of the Simple Message-Sentence Generator.

May 3, 2005: Added a small compressed tar file with a copy of the model.