1. Motivation: Linguistic and Computational Linguistic Background
Building complete grammars for natural language processing requires considering three categories of sentences (Roche, 1997). The first category consists of free sentences such as 'John likes Mary'. Beyond free sentences, two further categories call for special syntactic and semantic treatment. The second category is illustrated in (1): the construction 'make a decision' can be replaced by the verb 'decide' with no change in meaning. A further problem is that the complement 'to take this course' is actually an argument of the noun 'decision', not of the verb 'make'.
(1) a. I made a decision to take this course.
b. I decided to take this course.
This construction is called a support verb construction (or light verb construction) because the verb only carries agreement and tense and does not have the semantic weight of other main verbs. As explained above, its argument structure differs from that of ordinary verbs, so recognizing this construction is important for semantic parsing.
Moreover, this construction is very important in machine translation, because it is an almost universal phenomenon and each language has its own set of light verbs, which do not correspond in meaning from one language to another. For example,
(2) a. I made a decision to take this course.
b. na-nun i swuep-ul tut-kiro kyelceng-ha-ess-ta
I-TOP this class-ACC take decision-do-PAST-DEC
'(literally) I this class take decision-do-PAST'
The sentence in (2b) is the Korean counterpart of the English sentence in (2a). The English verb 'make' is rendered as 'ha' (DO) in Korean. Support verb constructions should therefore be handled explicitly in natural language processing.
The third category consists of frozen sentences, or idiomatic expressions, such as 'kick the bucket'.
(3) John kicks the bucket.
Because the meaning of 'kick the bucket' cannot be obtained by parsing it in the same way as a free sentence, we need a way to treat this kind of frozen sentence.
2. The Aim of This Project
This project aims to implement a part-of-speech tagger that correctly handles the three categories of sentences shown above by extending DCG (Definite Clause Grammar) in Prolog.
3. Future Work
The problems raised by support verb constructions and idiomatic expressions are ultimately semantic parsing problems.
As a first step, I implement a part-of-speech tagger by extending DCG.
The next step will be to introduce feature structures for semantic parsing.
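As a rough illustration of where this could lead, the following is a minimal sketch, not the project's implementation. It assumes that a plain Prolog term sem(Pred, Subj, Tense) stands in for a feature structure, and all predicate names and lexical entries are hypothetical. The light verb contributes only tense, while the predicate comes from the noun, so 'made a decision' and 'decided' receive the same analysis.

```prolog
% Hypothetical sketch: sem(Pred, Subj, Tense) stands in for a feature structure.
s(sem(Pred, Subj, Tense)) --> np(Subj), vp(Pred, Tense).

np(i) --> [i].

% Full verb: predicate and tense both come from the verb.
vp(Pred, Tense) --> v(Pred, Tense).
% Light verb: tense comes from the verb, the predicate from the noun.
vp(Pred, Tense) --> lv(Tense), det, predn(Pred).

v(decide, past) --> [decided].
lv(past)        --> [made].
det             --> [a].
predn(decide)   --> [decision].
```

With these assumed entries, the queries ?- phrase(s(S), [i, made, a, decision]). and ?- phrase(s(S), [i, decided]). both bind S to sem(decide, i, past).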
4. What to Do
1) XTAG Verb Class Checking
(1) Support Verb Class (light verb class)
- light verbs
e.g. give groan, have discussion, make comment
- ditransitive light verbs
e.g. give wave, give look, make promise
(2) Idiomatic Expressions
- Idiom with V, D, and N anchors
e.g. kick the bucket, bury the hatchet, break the ice
- Idiom with V, D, A, and N anchors
e.g. have a green thumb, sing a different tune
- Idiom with V and N anchors
e.g. draw blood, cry wolf
- Idiom with V, D, A, N, and Prep anchors
e.g. make a big deal about, make a great show of
- Idiom with V, A, N, and Prep anchors
e.g. make short work of
- Idiom with V, N, and Prep anchors
e.g. look daggers at, keep track of
- Idiom with V, D, N, and Prep anchors
e.g. make a mess of, keep the lid on
2) Extend DCG based on the XTAG Trees
I extend DCG to cover support verb constructions and idiomatic expressions with the help of the XTAG trees.
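One way idioms might be added to a DCG is sketched below. This is an assumption for illustration, not code derived from the XTAG trees: each idiom is listed as a fixed word sequence (its anchors), and the vp rule records whether the idiomatic or the free reading was used. The argument on s and vp and the small lexicon are hypothetical.

```prolog
% Hypothetical sketch: idioms are stored as fixed word sequences.
s(Reading) --> np, vp(Reading).

np --> [john].
np --> det, n.

% Idiomatic reading: the whole anchor sequence is matched as one unit.
vp(idiom(Words)) --> idiom(Words).
% Free reading: ordinary transitive verb plus object noun phrase.
vp(free) --> tv, np.

idiom([kicks, the, bucket]) --> [kicks, the, bucket].   % V, D, N anchors
idiom([cries, wolf])        --> [cries, wolf].          % V, N anchors

det --> [the].
n   --> [bucket].
n   --> [ball].
tv  --> [kicks].
```

Under these assumptions, ?- phrase(s(R), [john, kicks, the, bucket]). first yields R = idiom([kicks, the, bucket]) and, on backtracking, R = free, which reflects the ambiguity between the frozen and the literal reading. For [john, kicks, the, ball] only R = free is found.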
3) DCG extension experiment so far: light verb construction
(1) DCG code
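% Sentence and noun phrases
s --> np, vp.
np --> det, n.
np --> pron.
% Predicative noun phrase: the complement of a light verb
prednp --> det, predn.
% Verb phrases: intransitive, transitive, and light verb
vp --> v.
vp --> tv, np.
vp --> tv, prednp.
vp --> lv, prednp.
% Lexicon
det --> [the].
det --> [a].
det --> [every].
n --> [man].
n --> [woman].
n --> [park].
pron --> [we].
predn --> [discussion].
tv --> [likes].
tv --> [loves].
lv --> [had].
v --> [walks].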
(2) Query
?- s([we, had, a, discussion],[]).
yes
?- s([we, had, a, woman],[]).
no
?- s([a, man, likes, a, discussion],[]).
yes
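The grammar in (1) only accepts or rejects a sentence. A minimal sketch of how it might be turned into a tagger is given below: each nonterminal carries an extra argument that collects Word/Tag pairs. The argument layout and the helper predicate tag/2 are assumptions for illustration; the tag names simply reuse the category names of the grammar.

```prolog
:- use_module(library(lists)).   % append/3, member/2

% Hypothetical tagger sketch: every rule returns a list of Word/Tag pairs.
s(Tags) --> np(T1), vp(T2), { append(T1, T2, Tags) }.

np([D/det, N/n])         --> det(D), n(N).
np([P/pron])             --> pron(P).
prednp([D/det, N/predn]) --> det(D), predn(N).

vp([V/v])     --> v(V).
vp([V/tv|Ts]) --> tv(V), np(Ts).
vp([V/tv|Ts]) --> tv(V), prednp(Ts).
vp([V/lv|Ts]) --> lv(V), prednp(Ts).   % light verb + predicative noun

% Lexicon: same words as in the grammar above.
det(W) --> [W], { member(W, [the, a, every]) }.
n(W)   --> [W], { member(W, [man, woman, park]) }.
tv(W)  --> [W], { member(W, [likes, loves]) }.
pron(we)          --> [we].
predn(discussion) --> [discussion].
lv(had)           --> [had].
v(walks)          --> [walks].

% tag(+Words, -Tags): succeeds if Words is a sentence of the grammar.
tag(Words, Tags) :- phrase(s(Tags), Words).
```

For example, ?- tag([we, had, a, discussion], Tags). gives Tags = [we/pron, had/lv, a/det, discussion/predn], while ?- tag([we, had, a, woman], Tags). fails, matching the queries in (2).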
5. References
- Amitabha Mukerjee, Ankit Soni, and Achla M. Raina, 2006, Detecting Complex Predicates in Hindi using POS Projection across Parallel Corpora, in Proceedings of the 5th Workshop on Important Unresolved Matters, pp. 11-18, Sydney, July 2006, Association for Computational Linguistics. www.inf.ufrgs.br/~avillavicencio/mwe-papers/mukerjee.pdf
- Patrick Blackburn and Kristina Striegnitz, 2002, Natural Language Processing Techniques in Prolog. http://www.coli.uni-saarland.de/~kris/nlp-with-prolog/html/index.html
- Emmanuel Roche, 1997, Parsing with Finite-State Transducers, in Emmanuel Roche and Yves Schabes, eds., Finite-State Language Processing, MIT Press, Cambridge. http://www.merl.com/papers/docs/TR96-30.pdf
- XTAG Research Group, 2001, XTAG English Grammar: Release 2.24.2001. http://www.cis.upenn.edu/~xtag/gramrelease.html
6. Other Links
- ProNTo - Prolog Natural Language Tools
- NLP Software Registry
- NLP Glossary
- Natural Language Processing
- Mary D. Taffet's NLP homepage
- The Copenhagen Tree Tracer Project
- An Introduction to Language Processing with Perl and Prolog
- www.sics.se/~jussi/Opetus/texter/nl-kompendium.ps
- www.phon.ucl.ac.uk/home/hans/courses/plinc101/building_a_parser/building_a_parser.pdf
- http://online.mq.edu.au/pub/COMP248/practicals/practical07a.html