The project aims to compare argument structure in typologically different languages, and to develop tools for the quantitative analysis of argument structure data. This should lead to a better understanding of the relation between the verb and the other sentence constituents, while at the same time improving the available methods for language comparison. In the future, our results might also lead to practical improvements deriving from a deeper understanding of the translatability of verbs and clauses.
We will use two typologically distant languages, namely Ancient Greek and Yucatec Maya, and will develop a tool for querying argument structure in annotated corpora of texts in the two languages. The data will then be analyzed and compared using quantitative methods. In particular, statistical methods will be used to assess the frequency and productivity of argument structures, as well as their developments and the tendencies that point to possible developments.
The typological and sociolinguistic differences exhibited by the two
languages will make the results of our research more widely
applicable, and will allow the methods and tools developed in our
research to be applied to other languages.
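
As a rough illustration of what assessing the frequency and productivity of argument structures could involve, the following minimal Python sketch counts tokens, verb types, and hapax legomena for each argument frame. It assumes a hapax-based productivity measure (verbs attested only once in a frame, divided by the frame's token count), which is one common choice and not necessarily the measure the project will adopt. The function name, the frame labels, and the toy data are hypothetical and serve only as an example.

```python
from collections import Counter


def frame_statistics(observations):
    """Summarize argument frames from (verb_lemma, frame) pairs.

    Assumption: a frame is a plain string label such as "NOM-ACC";
    a real corpus query would derive it from the annotation.
    """
    verbs_by_frame = {}
    for verb, frame in observations:
        verbs_by_frame.setdefault(frame, Counter())[verb] += 1

    stats = {}
    for frame, verbs in verbs_by_frame.items():
        tokens = sum(verbs.values())  # all occurrences of the frame
        types = len(verbs)            # distinct verbs attested in the frame
        hapaxes = sum(1 for count in verbs.values() if count == 1)
        stats[frame] = {
            "tokens": tokens,
            "types": types,
            "hapaxes": hapaxes,
            # Assumed measure: hapax-based productivity (hapaxes / tokens).
            "productivity": hapaxes / tokens,
        }
    return stats


# Hypothetical toy data (transliterated lemmas), not drawn from any corpus.
TOY_OBSERVATIONS = [
    ("didomi", "NOM-ACC-DAT"),
    ("didomi", "NOM-ACC-DAT"),
    ("pempo", "NOM-ACC-DAT"),
    ("horao", "NOM-ACC"),
    ("horao", "NOM-ACC"),
    ("akouo", "NOM-GEN"),
]

if __name__ == "__main__":
    for frame, values in sorted(frame_statistics(TOY_OBSERVATIONS).items()):
        print(frame, values)
```

With realistic corpus sizes, the same per-frame counts could be computed separately for each language or period and then compared, which is where statistical testing would come in.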
Basic concepts in
our understanding of language and language structure are those of
text and usage. Following this view, linguistic structures emerge
from usage, rather than existing before and independently of it. For
this reason we make extensive use of large electronic corpora, whose
size makes it possible to reach statistically significant results.
Numerous useful tools for corpus research have been developed
over the last few decades, including syntactic treebanks, which,
however, are not yet based on large corpora. For this reason we will
need a preparatory stage during which we will build on existing
corpora, in order to enlarge the usable database and smooth out
differences between the corpora of the two languages. Query tools for
argument structures exist and are implemented for several languages,
both Indo-European and non-Indo-European, but never for
head-marking languages such as Yucatec. For this reason, a stage in our
research will be devoted to the implementation of such tools. In
addition, conversion tools will be developed to deal with
different types of annotation. This will lead in particular to an
enlargement of the Greek corpus, as corpora already exist based on
different annotation standards (PerseusTreebank, Prague Treebank,
Proiel, Exmaralda, Penn Treebank).
The envisaged results of the project are summarized as follows: