Abstracts

Talks

Gianluca Lebani and Alessandro Lenci  

"Beware the Jabberwock, dear reader!" Testing the distributional reality of construction semantics.

We show that a simple corpus-based representation can be efficiently exploited to model the semantic content of verb argument constructions. Relying on the idea that the meaning of a construction can be inferred from the verbs that most frequently appear in it, we model it in a distributional space as the weighted centroid of the vectors representing its typical verbs. We tested our proposal on two tasks. First, we replicated the priming effect described by Johnson and Goldberg (2013) as a function of the semantic distance between a construction and its prototypical verbs. In a second step, we evaluated whether our distributional information can be used to model behavioral data collected in a crowd-sourced production experiment.
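
A minimal sketch of the weighted-centroid representation described above, assuming a pre-trained distributional space; the example verbs, frequencies and dimensionality are illustrative, not the authors' data:

```python
import numpy as np

# Toy distributional space: each verb is a row vector. In practice these
# would come from a corpus-trained distributional semantic model.
rng = np.random.default_rng(0)
vectors = {v: rng.normal(size=50) for v in ["give", "hand", "send", "pass"]}

# Hypothetical frequencies of each verb in the construction.
freqs = {"give": 1200, "hand": 300, "send": 250, "pass": 150}

def construction_vector(vectors, freqs):
    """Weighted centroid of the vectors of a construction's typical verbs."""
    total = sum(freqs.values())
    return sum((freqs[v] / total) * vectors[v] for v in freqs)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

cxn = construction_vector(vectors, freqs)
# Semantic distance between the construction and one of its verbs,
# the quantity used to model the priming effect.
print(cosine(cxn, vectors["give"]))
```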

Anastasiya Lopukhina and Konstantin Lopukhin 

Regular polysemy: from sense vectors to sense patterns

Regular polysemy has been extensively investigated in lexical semantics, but has received little attention in distributional semantics. We propose a model for regular polysemy detection that is based on sense vectors and makes it possible to work directly with senses in a semantic vector space. Our method detects polysemous words that display the same regular sense alternation as a given example (a word with two automatically induced senses that represent one polysemy pattern, such as ANIMAL / FOOD). The method works equally well for nouns, verbs and adjectives, achieving an average recall of 0.55 and an average precision of 0.59 over ten different polysemy patterns.
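
A minimal sketch of how such pattern matching might look, assuming sense vectors have already been induced for each word; the matching criterion (aligning a candidate's sense pair with the example pair by cosine similarity) is an illustration, not necessarily the authors' exact procedure:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def matches_pattern(cand_senses, seed_senses, threshold=0.5):
    """Does a candidate word's sense pair align with the example pair
    (e.g., chicken: ANIMAL / FOOD), in either order?"""
    (s1, s2), (c1, c2) = seed_senses, cand_senses
    direct = min(cosine(c1, s1), cosine(c2, s2))
    swapped = min(cosine(c1, s2), cosine(c2, s1))
    return max(direct, swapped) >= threshold

# Toy sense vectors: "lamb" should show the same ANIMAL / FOOD
# alternation as the seed word "chicken".
rng = np.random.default_rng(0)
animal, food = rng.normal(size=50), rng.normal(size=50)
chicken = (animal + 0.1 * rng.normal(size=50), food + 0.1 * rng.normal(size=50))
lamb = (animal + 0.1 * rng.normal(size=50), food + 0.1 * rng.normal(size=50))
print(matches_pattern(lamb, chicken))   # True
```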

Vered Shwartz and Ido Dagan 

Path-based vs. Distributional Information in Recognizing Lexical Semantic Relations

Recognizing various semantic relations between terms is beneficial for many NLP tasks. While path-based and distributional information sources are considered complementary for this task, the superior results recently shown by the latter suggested that the former's contribution might have become obsolete. We follow the recent success of an integrated neural method for hypernymy detection (Shwartz et al., 2016) and extend it to recognize multiple relations. The empirical results show that this method is effective in the multiclass setting as well. We further show that the path-based information source always contributes to the classification, and we analyze the cases in which it mostly complements the distributional information.
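
A schematic of what such an integrated model can look like, assuming a PyTorch setup: the dependency paths connecting a term pair are encoded by an LSTM, averaged, and concatenated with the terms' word embeddings before classification. The dimensions, names, and details below are simplifications, not the paper's exact model:

```python
import torch
import torch.nn as nn

class PathDistModel(nn.Module):
    """Integrated path-based + distributional relation classifier (sketch)."""
    def __init__(self, vocab_size, emb_dim, path_vocab, path_dim, n_relations):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.edge_emb = nn.Embedding(path_vocab, path_dim)
        self.path_lstm = nn.LSTM(path_dim, path_dim, batch_first=True)
        self.classifier = nn.Linear(2 * emb_dim + path_dim, n_relations)

    def forward(self, x_id, y_id, paths):
        # paths: (n_paths, path_len) tensor of path-edge ids for this pair
        _, (h, _) = self.path_lstm(self.edge_emb(paths))
        path_vec = h[-1].mean(dim=0)          # average over the pair's paths
        pair = torch.cat([self.word_emb(x_id), self.word_emb(y_id), path_vec])
        return self.classifier(pair)          # relation logits

model = PathDistModel(vocab_size=10000, emb_dim=100, path_vocab=500,
                      path_dim=60, n_relations=5)
logits = model(torch.tensor(3), torch.tensor(7),
               torch.randint(0, 500, (4, 6)))  # 4 paths of length 6
```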

Vivian Santos, Manuela Hürlimann, Brian Davis, Siegfried Handschuh and André Freitas

Semantic Relation Classification: Task Formalisation and Refinement

The identification of semantic relations between terms within texts is a fundamental task in Natural Language Processing which can support applications requiring a lightweight semantic interpretation model. Currently, semantic relation classification concentrates on relations which are evaluated over open-domain data. This work provides a critique of the set of abstract relations used for semantic relation classification with regard to their ability to express relationships between terms found in domain-specific corpora. Based on this analysis, we propose an alternative semantic relation model that reuses and extends the set of abstract relations present in the DOLCE ontology. The resulting set of relations is well grounded, captures a wide range of relations and could thus be used as a foundation for the automatic classification of semantic relations.

Mohammed Attia, Ayah Zirikly and Mona Diab

The Power of Language Music: Arabic Lemmatization through Patterns

The interaction between roots and patterns in Arabic has intrigued lexicographers and morphologists for centuries. While roots provide the consonantal building blocks, patterns provide the syllabic vocalic moulds. While roots provide abstract semantic classes, patterns realize these classes in specific instances. In this way both roots and patterns are indispensable for understanding the derivational, morphological and, to some extent, the cognitive aspects of the Arabic language. In this paper we perform lemmatization (a high-level lexical processing task) without relying on a lookup dictionary. We use a hybrid approach that consists of a machine learning classifier to predict the lemma pattern for a given stem, and mapping rules to convert stems to their respective lemmas with the vocalization defined by the pattern.
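
To make the pattern-mapping step concrete, here is a toy sketch of how a predicted pattern can be applied to a root; the transliteration and the two example patterns are simplifications for illustration, not the paper's actual rules:

```python
# Once a classifier has predicted a lemma pattern for a stem, the root
# consonants are slotted into that pattern. Buckwalter-style
# transliteration; the pattern inventory here is hypothetical.

def apply_pattern(root, pattern):
    """Fill the C slots of a vocalized pattern with the root consonants."""
    out = []
    consonants = iter(root)
    for ch in pattern:
        out.append(next(consonants) if ch == "C" else ch)
    return "".join(out)

# e.g., root k-t-b with two different predicted patterns:
print(apply_pattern("ktb", "CaCaC"))   # katab ("to write")
print(apply_pattern("ktb", "CiCAC"))   # kitAb ("book")
```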

Mikael Kågebäck and Hans Salomonsson

Word Sense Disambiguation using a Bidirectional LSTM

In this paper we present a clean, yet effective, model for word sense disambiguation. Our approach leverages a bidirectional long short-term memory network which is shared between all words. This enables the model to share statistical strength and to scale well with vocabulary size. The model is trained end-to-end, directly from the raw text to sense labels, and makes effective use of word order. We evaluate our approach on two standard datasets, using identical hyperparameter settings, which are in turn tuned on a third, held-out set. We employ no external resources (e.g., knowledge graphs or part-of-speech tagging), no language-specific features, and no hand-crafted rules, yet still achieve results statistically equivalent to the best state-of-the-art systems, which are not subject to such restrictions.
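
A minimal sketch of the shared BiLSTM idea in PyTorch: one network reads the whole sentence, and the hidden state at the target word's position is mapped to sense logits. The sizes are illustrative, and the single output layer stands in for the per-word sense classifiers of the full model:

```python
import torch
import torch.nn as nn

class BiLSTMWSD(nn.Module):
    """Shared bidirectional LSTM for word sense disambiguation (sketch)."""
    def __init__(self, vocab_size, emb_dim=100, hidden=64, n_senses=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, bidirectional=True,
                            batch_first=True)
        # One linear layer stands in here for one target word's senses.
        self.out = nn.Linear(2 * hidden, n_senses)

    def forward(self, token_ids, target_pos):
        states, _ = self.lstm(self.emb(token_ids))   # (batch, len, 2*hidden)
        target = states[torch.arange(states.size(0)), target_pos]
        return self.out(target)                      # sense logits

model = BiLSTMWSD(vocab_size=5000)
logits = model(torch.randint(0, 5000, (2, 8)), torch.tensor([3, 5]))
```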

Michael Zock and Chris Biemann

Towards a resource based on users' knowledge to overcome the Tip-of-the-Tongue problem.

Language production is largely a matter of words which, in the case of access problems, can be searched for in an external resource (lexicon, thesaurus). In this kind of dialogue the user provides the momentarily available knowledge concerning the target and the system responds with the best guess(es) it can make given this input. As tip-of-the-tongue (ToT) studies have shown, people always have some information about the target (meaning fragments, number of syllables, ...) even if its complete form eludes them. We will show here how to tap this knowledge to build a resource likely to help authors (speakers/writers) overcome the ToT problem. Yet, before doing so, we need a better understanding of the various kinds of knowledge people have when looking for a word. To this end, we asked crowdworkers to provide cues for a given target, in the hope that these could help others find the elusive word. Next, we checked how well a given search strategy worked when applied to differently built lexico-semantic networks. The results showed quite dramatic differences, which is not really surprising: after all, different networks are built for different purposes, and hence each one is more or less suited for a given task. What was more surprising, though, is that we were not able to make use of relational information for elusive word retrieval in WordNet. Since we do believe in the virtues of relational information at the input, we will revisit the problem of navigation. To this end, we plan to combine resources and to build a hybrid semantic network, that is, an association thesaurus containing typed and untyped relations. Our ultimate goal is the creation of a resource that helps people overcome the ToT problem.

Posters

Natsuno Aoki and Kentaro Nakatani

A Study of the Bump Alternation in Japanese from the Perspective of Extended/Onset Causation

This paper deals with a little-studied object/oblique alternation phenomenon in Japanese, which we call the bump alternation. This peculiar alternation is similar to the English with/against alternation (e.g., hit the wall with the bat [=immobile-as-direct-object frame] vs. hit the bat against the wall [=mobile-as-direct-object frame]), but it does not involve any changes in case-marking: the mobile theme (e.g., bullet) and the immobile object (e.g., target) alternate between two case markers, accusative -o and dative -ni, without a change in meaning, according to Sadanobu (1990). Although we question Sadanobu's acceptability judgment of the immobile-as-direct-object example (in Japanese), casting doubt on the existence of the alternation itself, we claim that the causation type (i.e., whether the event is an instance of onset or extended causation; Talmy 1988, 2000) could make a difference: an extended causative interpretation could improve the acceptability of the otherwise awkward immobile-as-direct-object frame. We examined this through a rating study. Results showed an interaction between the causation type (extended/onset) and the object type (mobile/immobile) in the direction we predicted. We propose that the "extended causation" advantage is caused by a perspective shift on what is moving.

Stefan Bott, Nana Khvtisavrishvili, Max Kisselew, and Sabine Schulte im Walde

GhoSt-PV: A Representative Gold Standard of German Particle Verbs

German particle verbs represent a frequent type of multi-word expression that forms a highly productive paradigm in the lexicon. Like other multi-word expressions, particle verbs exhibit various levels of compositionality. One of the major obstacles for the study of compositionality is the lack of representative gold standards of human ratings. To address this bottleneck, this paper presents such a gold standard data set containing 400 randomly selected German particle verbs. It is balanced across several particle types and three frequency bands, and annotated with human ratings on the degree of semantic compositionality.

Mohammad Daoud

Discovering Potential Terminological Relationships from Twitter’s Timed Content

This paper presents a method to discover possible terminological relationships from tweets. We match the histories of terms (frequency patterns); a similar history indicates a possible relationship between terms. For example, if two terms (t1, t2) appeared frequently on Twitter on particular days, and there is a 'similarity' in their frequencies over a period of time, then t1 and t2 can be related. Maintaining a standard terminological repository with updated relationships can be difficult, especially in a dynamic domain such as social media, where thousands of new terms (neologisms) are coined every day. We therefore propose to construct a raw repository of lexical units with unconfirmed relationships. We experimented with our method on time-sensitive Arabic terms used by the online Arabic community of Twitter, drawing relationships between these terms by matching their similar frequency patterns (timelines), with dynamic time warping as the similarity measure. For evaluation, we selected 630 candidate terms (we call them preterms) and matched their similarity over a period of 30 days. Around 270 correct relationships were discovered, with a precision of 0.61. These relationships were extracted without considering the textual context of the terms.
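
A minimal sketch of the timeline-matching step, with a textbook dynamic time warping implementation and hypothetical 30-day frequency series; the real system's preprocessing and decision thresholds are not shown:

```python
import numpy as np

def dtw(a, b):
    """Plain dynamic time warping distance between two frequency timelines."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Hypothetical daily tweet frequencies of two terms over 30 days: a low
# DTW distance suggests similar histories, hence a possible relationship.
t1 = np.array([0, 2, 15, 40, 22, 8, 3] + [1] * 23)
t2 = np.array([0, 0, 3, 18, 45, 20, 7] + [1] * 23)
print(dtw(t1, t2))
```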

Alexsandro Fonseca and Fatiha Sadat

Lexfom: a lexical functions ontology model

A lexical function represents a type of relation that exists between lexical units (words or expressions) in any language. For example, antonymy is a type of relation represented by the lexical function Anti: Anti(big) = small. These relations include both paradigmatic relations, i.e. vertical relations such as synonymy, antonymy and meronymy, and syntagmatic relations, i.e. horizontal relations such as objective qualification (legitimate demand), subjective qualification (fruitful analysis), positive evaluation (good review) and support verbs (pay a visit, subject to an interrogation). In this paper, we present the Lexical Functions Ontology Model (lexfom) to represent lexical functions and the relations among lexical units. Lexfom is divided into four modules. Moreover, we show how it combines with another lexical ontology, the Lexical Model for Ontologies (lemon), for the transformation of lexical networks into Semantic Web formats, enriched with the semantic information given by the lexical functions, such as the representation of syntagmatic relations (e.g. collocations), which are usually absent from lexical networks.

Marie-Claude L'Homme, Carlos Subirats and Benoît Robichaud

A Proposal for combining "general" and specialized frames

The objectives of the work described in this paper are: 1. to list the differences between a general language resource (namely FrameNet) and a specialized resource on the environment; 2. to devise solutions for merging their contents in order to increase the coverage of the general resource. Both resources are based on Frame Semantics (Fillmore 1985; Fillmore and Baker 2010), and this raises specific challenges, since the theoretical framework and the methodology derived from it provide for both a lexical description and a conceptual representation. We propose a series of strategies that handle both lexical and conceptual (frame) differences, and we implemented them in the specialized resource. We also show that most differences can be handled in a straightforward manner. However, other differences, such as frames defined exclusively for a specialized domain or relations between these frames, are likely to be much more difficult to take into account, as they appear to be inherently domain-specific.

Andreana Pastena and Alessandro Lenci

Antonymy and Canonicity: Experimental and Distributional Evidence

The present paper focuses on the phenomenon of antonym canonicity. In particular, we examine the behaviour of antonymous Italian adjectives. The main question is why some pairs of antonyms, the canonical pairs, are perceived to be better examples of opposition than others, and are thus considered representative of the whole category. Two different approaches have dealt with this issue. The first, the lexical-categorical approach, finds the cause of canonicity in the high frequency of co-occurrence of the two members of a canonical pair. The other, the cognitive-prototype approach, holds that two adjectives form a canonical pair because they are aligned along a simple and salient dimension. We propose to join the two approaches, showing that the cause of canonicity has a cognitive basis (salience of dimension), which can, however, find a better empirical characterization in a distributional perspective.

Vivian Silva, André Freitas and Siegfried Handschuh

Categorization of Semantic Roles for Dictionary Definitions

Understanding the semantic relationships between terms is a fundamental task in natural language processing applications. Structured resources that can express those relationships in a formal way, such as ontologies, are still scarce, while a large number of linguistic resources gathering dictionary definitions is becoming available; however, understanding the semantic structure of natural language definitions is fundamental to making them useful in semantic interpretation tasks. Based on an analysis of a subset of WordNet's glosses, we propose a set of semantic roles that compose the semantic structure of a dictionary definition, and show how they are related to the definition's syntactic configuration, identifying patterns that can be used in the development of information extraction frameworks and semantic models.

Mutsuko Tomokiyo and Christian Boitet

Corpus and dictionary development for classifiers/quantifiers towards a French-Japanese machine translation

Although quantifier/classifier expressions occur frequently in everyday communication and written documents, they are not described in classical paper bilingual dictionaries, nor in machine-readable dictionaries. This paper describes corpus and dictionary development for quantifiers/classifiers, and their usage in the framework of French-Japanese machine translation (MT). These expressions often cause problems of lexical ambiguity and set-phrase recognition in the analysis phase, in particular for a distant language pair like French and Japanese. For the development of a dictionary aiming at ambiguity resolution for expressions with quantifier and classifier meanings, our corpus is annotated using an extended UNL-UWs (Universal Networking Language-Universal Words) dictionary. Keywords: classifiers, quantifiers, phraseology study, corpus annotation, UNL (Universal Networking Language), UWs dictionary, Tori Bank, French-Japanese machine translation (MT).

Shared task

Mohammed Attia, Suraj Maharjan, Younes Samih, Laura Kallmeyer and Thamar Solorio

GHHH - Detecting Semantic Relations via Word Embeddings

This paper describes our system submitted to the CogALex-2016 Shared Task on the Corpus-Based Identification of Semantic Relations. On the test set, our system achieves an f-measure of 88.1% (79.0% for TRUE only) for Task 1, detecting semantic similarity, and 76.0% (42.3% when excluding RANDOM) for Task 2, identifying finer-grained semantic relations. In our experiments, we try word analogy, linear regression, and multi-task Convolutional Neural Networks (CNNs) with word embeddings from publicly available word vectors. We found that linear regression performs better for binary classification (Task 1), while the CNN performs better for multi-class semantic classification (Task 2). We assume that word analogy is better suited to deterministic answers than to handling the ambiguity of one-to-many and many-to-many relationships. We also show that classifier performance can benefit from balancing the frequency of labels in the training data.
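
As one concrete reading of the regression approach for Task 1, here is a sketch that classifies a word pair from its embeddings; using logistic regression over the concatenated vectors is an assumption for illustration, as are the toy data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy embeddings standing in for publicly available word vectors.
rng = np.random.default_rng(1)
emb = {w: rng.normal(size=50) for w in ["car", "vehicle", "dog", "banana"]}

# Hypothetical training pairs: 1 = related (TRUE), 0 = unrelated.
pairs = [("car", "vehicle", 1), ("dog", "banana", 0)]
X = np.array([np.concatenate([emb[a], emb[b]]) for a, b, _ in pairs])
y = np.array([label for _, _, label in pairs])

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X))   # TRUE/FALSE relatedness decisions
```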

Emmanuele Chersoni, Giulia Rambelli and Enrico Santus

ROOT18

In this paper, we describe ROOT18, a classifier using the scores of several unsupervised distributional measures as features to discriminate between semantically related and unrelated words, and then to classify the related pairs according to their semantic relation (i.e. synonymy, antonymy, hypernymy, part-whole meronymy). Our classifier participated in the CogALex-V Shared Task, showing a solid performance on the first subtask but a poor performance on the second. The low scores on the second subtask suggest that distributional measures are not sufficient to discriminate between multiple semantic relations at once.

Stefan Evert

Mach5 -- A traditional DSM approach to semantic relatedness

This contribution provides a strong baseline result for the CogALex-V shared task based on a traditional "count"-type DSM. Parameter tuning experiments show some surprising effects and suggest that the use of random word pairs as negative examples may be problematic.
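
For readers unfamiliar with "count"-type models, a minimal sketch of the pipeline (co-occurrence counts, PPMI weighting, cosine similarity) on a toy corpus; the actual system's corpus, window size, and tuned parameters differ:

```python
import numpy as np

# Toy corpus; the real model is trained on a large corpus with tuned
# window size, weighting, and dimensionality reduction.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
vocab = sorted({w for s in corpus for w in s})
idx = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts with a window of one word.
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for c in sent[max(0, i - 1):i] + sent[i + 1:i + 2]:
            C[idx[w], idx[c]] += 1

# Positive pointwise mutual information weighting.
total, row, col = C.sum(), C.sum(1, keepdims=True), C.sum(0, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log(C * total / (row * col))
ppmi = np.where(C > 0, np.maximum(pmi, 0.0), 0.0)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

print(cosine(ppmi[idx["cat"]], ppmi[idx["dog"]]))
```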

Chinnappa Guggilla

Classifying Semantic Relations using Convolutional Neural Networks

In this paper, we describe a system (CGSRC) for classifying four semantic relations: synonym, hypernym, antonym and meronym, using convolutional neural networks (CNNs). We participated in the CogALex-V semantic shared task on the corpus-based identification of semantic relations. We achieved 43.15% weighted F1 on subtask 1 (whether a relation between terms exists) and 25.24% on subtask 2 (detecting relation types) with a CNN-based deep neural network approach leveraging pre-compiled word2vec distributional neural embeddings.
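
A schematic of a CNN over pre-trained embeddings for this setting, where the pair's two word vectors are stacked as a short sequence; this PyTorch sketch is a simplification for illustration, not the paper's exact network:

```python
import torch
import torch.nn as nn

class PairCNN(nn.Module):
    """CNN relation classifier over pre-trained embeddings (sketch):
    the two words' vectors are stacked, convolved, and classified."""
    def __init__(self, emb_dim=300, n_filters=32, n_classes=5):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=2)
        self.out = nn.Linear(n_filters, n_classes)

    def forward(self, pair_emb):                  # (batch, 2, emb_dim)
        h = self.conv(pair_emb.transpose(1, 2))   # (batch, n_filters, 1)
        return self.out(torch.relu(h).squeeze(-1))

model = PairCNN()
logits = model(torch.randn(4, 2, 300))   # a batch of four word pairs
```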

Kanan Luce, Jiaxing Yu and Shu-Kai Hsieh

LOPE

Automatic discovery of semantically related words is one of the most important NLP tasks and has great impact on theoretical psycholinguistic modeling of the mental lexicon. In this shared task, we employ word embedding models to test two assumptions explicitly or implicitly held by the NLP community: (1) word embedding models can reflect syntagmatic similarities in usage between words as distances in the projected vector space; (2) word embedding models can reflect paradigmatic relationships between words.

Vered Shwartz and Ido Dagan

LexNET: Integrated Path-based and Distributional Method for the Identification of Semantic Relations.

We present a submission to the CogALex 2016 shared task on the corpus-based identification of semantic relations, using LexNET (Shwartz and Dagan, 2016), an integrated path-based and distributional method for semantic relation classification. Combined with a common similarity measure, LexNET performs fairly well on subtask 1 (word relatedness). Subtask 2, however, has proven to be more difficult, and while LexNET performs better than the baselines, the results are still mediocre, emphasizing the need to develop additional methods for this task.

Christian Wartena and Rosa Tsegaye Aga

HsH-Supervised -- Supervised similarity learning of word pairs using dot-product of the context vectors

Our system is based on a supervised approach: for each pair of words, we construct a pair vector from the dot product of the two words' context vectors and use it directly as input to an SVM. However, the system did not perform as well as expected: using distributional features directly in a classifier could not outperform the considered baseline for the given task and data. Instead, the simple cosine method that we considered as a baseline performed better than our system.
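
A small sketch contrasting the two approaches on toy vectors. The "dot product of the context vectors" is read here as the element-wise product, which yields a vector an SVM can take as input; this reading and all data below are assumptions for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# Toy context vectors standing in for corpus-derived ones.
rng = np.random.default_rng(2)
ctx = {w: np.abs(rng.normal(size=20)) for w in "abcd"}

# Supervised route: pair vectors (element-wise products) fed to an SVM.
pairs = [("a", "b", 1), ("c", "d", 0)]   # 1 = similar, 0 = not
X = np.array([ctx[u] * ctx[v] for u, v, _ in pairs])
y = np.array([lab for _, _, lab in pairs])
svm = SVC().fit(X, y)

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Unsupervised baseline: threshold the cosine of the two context vectors.
print(svm.predict(X), cosine(ctx["a"], ctx["b"]))
```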