Let's take a sentence! Any old sentence.
Can we mark the "head" word of each of the phrases? Sure we can!
Now let's just draw an arrow from each head to each "governed" or "subordinate" word. OK.
If we get rid of the phrases, we've still got kind of a nice structure for the sentence, right?
Every word in the sentence has some head? That's cool.
Maybe this is nicer for some purposes?
What does this buy us?
- free word order is handled much more cleanly.
- Try writing CFG rules for a free-word-order language. It's hard!!
- do we really believe in constituents, anyway? Some linguists do.
Which constituents? ...
So this sort of thing is really old, dating back to traditional Latin grammar, and even earlier, to Pāṇini's ideas about Sanskrit syntax:
http://en.wikipedia.org/wiki/P%C4%81%E1%B9%87ini
This is really clean: we don't have to believe in constituents, we just have to agree that some words are describing other words.
What's hard to represent with dependency syntax?
(probably coordination: you have to make a choice about how to wire it together)
(also you have to make a choice, in doing this at all, about what kinds of dependency relations you have.)
- which way do the arrows point?
- is there a HEAD node?
(whatever, these are just notational differences)
- do we allow non-projective structures?
Wait, what's a non-projective structure?
"John saw a dog yesterday which was a Yorkshire Terrier."
What did John see yesterday? A dog, and the relative clause "which was a Yorkshire Terrier" modifies "dog". But "yesterday" depends on "saw" and sits between "dog" and its relative clause, so if you draw the arcs above the sentence, they cross. Crossing arcs: that's non-projectivity.
These things correspond nicely to the things that are hard to express with a CFG. That's kind of nice.
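To make "projective" concrete, here's a toy check for crossing arcs (my own sketch, not from any real parser):

def is_projective(heads):
    """heads: dict mapping each word position d (1..n) to its head position (0 = ROOT)."""
    spans = [(min(h, d), max(h, d)) for d, h in heads.items()]
    for (l1, r1) in spans:
        for (l2, r2) in spans:
            if l1 < l2 < r1 < r2:  # one arc starts inside the other and ends outside it
                return False
    return True

In the Yorkshire Terrier sentence, the arc from "dog" out to the relative clause crosses the arc from "saw" to "yesterday", so a check like this comes back False. So how do you parse with dependencies, anyway? A few options: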
- well, you could have a CKY-style parser
- Eisner 1996 --> this is pretty new! Jason Eisner is still working on parsing; he's a professor at JHU!
- MST-style parser. Score every possible head-dependent arc, then build a maximum spanning tree over all the words in the sentence; there's a toy sketch of this just after the MaltParser bullets below. (Ryan McDonald, at Google NYC)
- constraint-based: have a bunch of rules that describe what can link to what, under what circumstances...
- ... or transition-based parsing!
MaltParser is one such!
- greedy, deterministic parsing
- very high accuracy, super fast.
- discriminative classifiers (in fact, any classifier you happen to want)
- big win
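First, a toy sketch of that MST idea from the list above. This isn't McDonald's implementation (the real thing runs the Chu-Liu/Edmonds algorithm and learns the arc scores); here networkx does the tree-finding and score() is a made-up stand-in:

import networkx as nx

def mst_parse(n_words, score):
    """score(h, d): how good an arc from head h to dependent d looks; word 0 is ROOT."""
    G = nx.DiGraph()
    for h in range(n_words + 1):
        for d in range(1, n_words + 1):    # no arcs point INTO 0, so ROOT stays the root
            if h != d:
                G.add_edge(h, d, weight=score(h, d))
    tree = nx.maximum_spanning_arborescence(G)
    return set(tree.edges())               # the highest-scoring dependency tree

Now, back to MaltParser and transition-based parsing.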
Here's the machinery. A parser configuration has three parts:
- a stack (words we've started working on)
- a buffer (the input words still to be read)
- a set of arcs (the dependency arcs built so far)
On top of that you need a transition system: the moves that take you from one configuration to the next. There are several different possibilities for transition systems!
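As a minimal sketch (the names are mine, not MaltParser's):

from dataclasses import dataclass, field

@dataclass
class Configuration:
    stack: list                             # word indices; starts as [0], the artificial ROOT
    buffer: list                            # word indices still waiting to be read
    arcs: set = field(default_factory=set)  # (head, dependent) pairs built so far

Parsing starts with just ROOT on the stack and the whole sentence in the buffer, and it's done when the buffer is empty and only ROOT is left on the stack.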
Let's run through an example of parsing a sentence!!
This example is courtesy of Sandra Kübler, Ryan McDonald, and Joakim Nivre, from their lovely book Dependency Parsing.
It turns out that this sort of minimal one, with just three operations -- shift, left arc, right arc -- is sufficient to get any projective dependency parse. In fact, this transition system can *only* create projective dependency parses. But it can produce *any* of them. It's both sound and complete wrt projective dependency parses.
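Here's a minimal sketch of that three-operation system, in the common formulation where both arc operations look at the top two words on the stack (Kübler, McDonald & Nivre state it with arcs between stack and buffer instead, but it comes to the same thing). choose_transition stands in for the classifier, and I'm assuming it only ever proposes legal moves:

def arc_standard_parse(n_words, choose_transition):
    """Word 0 is the artificial ROOT; the real words are 1..n_words."""
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), set()
    while buffer or len(stack) > 1:
        action = choose_transition(stack, buffer, arcs)
        if action == "SHIFT":                  # move the next input word onto the stack
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":             # stack top governs the word beneath it
            arcs.add((stack[-1], stack[-2]))   # (head, dependent); beneath-word can't be ROOT
            del stack[-2]                      # the dependent is finished, so it goes away
        else:                                  # RIGHT-ARC: the word beneath governs the top
            arcs.add((stack[-2], stack[-1]))
            stack.pop()                        # again, the dependent goes away
    return arcs

Notice that both arc operations throw the dependent away, so a word has to finish collecting its own dependents before it can be attached to its head. Hold that thought.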
That doesn't mean that there aren't better transition systems. One that works rather better is what's called the arc-eager transition system, which has four operations instead of three, and whose right-arc operation doesn't get rid of the dependent. How can this be better if the basic one is sound and complete with respect to projective dependency parses? Because "sound and complete" only says every projective parse is reachable; it says nothing about how easy the decisions along the way are to make. Arc-standard has to delay attaching a right dependent until that dependent has found all of its own dependents; arc-eager can add an arc as soon as the evidence for it shows up, which makes the classifier's decisions easier.
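A matching sketch of arc-eager, same caveats as before (preconditions simplified; choose_transition is assumed to pick only legal moves):

def arc_eager_parse(n_words, choose_transition):
    stack, buffer, arcs = [0], list(range(1, n_words + 1)), set()
    while buffer:                              # simplified stopping condition
        action = choose_transition(stack, buffer, arcs)
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":             # buffer front governs stack top
            arcs.add((buffer[0], stack[-1]))   # (legal only if the top has no head yet)
            stack.pop()
        elif action == "RIGHT-ARC":            # stack top governs buffer front, but the
            arcs.add((stack[-1], buffer[0]))   # dependent sticks around for more dependents
            stack.append(buffer.pop(0))
        else:                                  # REDUCE: pop a word that already has its head
            stack.pop()
    return arcs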
So how do we learn to make those transition decisions? This just amounts to training the classifier. But you have to extract all the "instances" from the treebank.
Instances are just all of the configurations that we were in, coupled with the correct transition decision to make in that configuration. If you always make the correct transition decision, then you're going to get the right parses!
This leaves us with two problems: given a gold-standard tree, how do we know which transition was the correct one in each configuration? And what features do we use to describe a configuration?
For the first one, there's an algorithm for that! (It's usually called an "oracle".)
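A minimal sketch of such an oracle for the arc-standard sketch above: take LEFT-ARC when it matches the gold tree, take RIGHT-ARC only once the top word has already collected all of its gold dependents, otherwise SHIFT.

def oracle(stack, buffer, arcs, gold_heads):
    """gold_heads: dict mapping each word to its gold head (0 = ROOT)."""
    if len(stack) >= 2:
        top, below = stack[-1], stack[-2]
        if below != 0 and gold_heads[below] == top:
            return "LEFT-ARC"
        if gold_heads[top] == below and all(
                (top, d) in arcs for d, h in gold_heads.items() if h == top):
            return "RIGHT-ARC"                 # safe: top won't need any more dependents
    return "SHIFT"

Plug it in as choose_transition (say, lambda s, b, a: oracle(s, b, a, gold_heads)), record the configuration and the chosen action at every step, and those are your training instances.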
Let's do this live: take a dependency parse for "I ate the bagels" and turn it into a series of configurations and transitions.
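Here's what the sketches above produce (word indices I=1, ate=2, the=3, bagels=4; gold heads: ate -> I, ROOT -> ate, bagels -> the, ate -> bagels):

stack                      buffer                   transition
[ROOT]                     [I, ate, the, bagels]    SHIFT
[ROOT, I]                  [ate, the, bagels]       SHIFT
[ROOT, I, ate]             [the, bagels]            LEFT-ARC  (ate -> I)
[ROOT, ate]                [the, bagels]            SHIFT
[ROOT, ate, the]           [bagels]                 SHIFT
[ROOT, ate, the, bagels]   []                       LEFT-ARC  (bagels -> the)
[ROOT, ate, bagels]        []                       RIGHT-ARC (ate -> bagels)
[ROOT, ate]                []                       RIGHT-ARC (ROOT -> ate)
[ROOT]                     []                       (done)

Eight transitions, four arcs, and the arcs are exactly the gold parse.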
And the features? Well, that's something you'll have to decide, but common ones include...
- the word on the top of the stack?
- the word at the front of the buffer?
- the word after that in the buffer?
- the part of speech of the word {on the top of the stack, at the front of the buffer}?
- dependency relationships already going into or out of either of those words...
To get these, what do you have to know?
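A rough sketch of feature extraction (the feature names here are made up; MaltParser has its own feature specification language). The signature answers the question: you need the words, their part-of-speech tags, and the arcs built so far, so you'll want a POS tagger to have run before you parse.

def extract_features(stack, buffer, arcs, words, tags):
    """words and tags are lists indexed by position; entry 0 is a dummy for ROOT."""
    feats = []
    if stack:
        s0 = stack[-1]
        feats += ["s0.word=" + words[s0], "s0.tag=" + tags[s0]]
        feats.append("s0.ndeps=%d" % sum(1 for h, d in arcs if h == s0))
    if buffer:
        b0 = buffer[0]
        feats += ["b0.word=" + words[b0], "b0.tag=" + tags[b0]]
    if len(buffer) > 1:
        feats.append("b1.tag=" + tags[buffer[1]])
    return feats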
Get this file: pierre.conll
Get a pre-trained parsing model here: http://www.maltparser.org/mco/english_parser/engmalt.html (also has instructions for how to use it)
Get MaltParser here: http://www.maltparser.org/download.html
Now parse pierre.conll!
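The invocation looks roughly like this (exact jar and model file names depend on the versions you downloaded, so check the pages above; note that -c takes the model name without its .mco extension):

java -jar maltparser-1.9.2.jar -c engmalt.linear-1.7 -i pierre.conll -o pierre-parsed.conll -m parse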
Do you like the parse that came out?
Jurafsky and Martin section 12.7
https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2FParsing-09-LPCFGs-Intro.pdf
https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2FParsing-10-LPCFGs-Charniak-Model.pdf
https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2FParsing-12-PCFGs-Unlexicalized-Return.pdf
https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2FParsing-14-Dependency-Intro.pdf
https://d19vezwu8eufl6.cloudfront.net/nlp/slides%2FParsing-15-Dependency-Transition-Based.pdf