Monday:
- Mehrnoosh Sadrzadeh: An overview of categorial statistical models of meaning
This is a short introduction to the theme of the workshop. I will briefly overview recent advances in combining logical and statistical models of meaning. My focus will be on models whose starting point is a categorial grammar.
- Gemma Boleda: The interplay between conceptual and referential aspects of meaning
Imagine that you and I are in my office, in front of a table with several objects. If I ask you to pass me the mug, you will be able to do it because you are able to resolve the reference of the linguistic expression "the mug". That includes, among other skills, knowing how to identify mugs (such that you grab the mug rather than the mouse), and understanding the instructions encapsulated in the singular definite determiner "the" (e.g. expecting a single mug rather than several). Reference, thus, crucially requires us both to handle generic, conceptual aspects of meaning and to reason about individual entities in particular situations. I will discuss the interplay between conceptual and referential aspects of meaning based on results obtained with empirical computational linguistic methods. In particular, I will examine referential effects in semantic composition and present a neural network that can learn to refer directly from reference acts, integrating visual and linguistic information.
- Laura Rimell: Compositional Distributional Semantics for Relative Clauses
In this talk I will describe RELPRON, a new dataset of subject and object relative clauses for the evaluation of compositional distributional semantic models. RELPRON targets an intermediate level of grammatical complexity, between content-word pairs and full sentences.
The dataset consists of pairs of terms and representative properties, such as 'telescope: device that astronomer uses' and 'popularity: quality that wins elections', and the associated task involves matching terms with their properties by producing vector representations for the properties that are close to the vectors for the terms. Relative clauses are an interesting test case for compositional distributional semantic models because they contain a closed-class function word and a long-distance dependency. They pose a new challenge for type-based approaches, where evaluations have mostly focused on short phrase types such as adjective-noun and subject-verb-object, due to the difficulty of learning the higher-order tensors required for more complex grammatical constructions. I will present results on RELPRON obtained within a type-based composition framework, using a variety of approaches to simplify the learning of higher-order tensors, as well as results obtained using neural networks for composition. In line with many existing datasets, vector addition provides a challenging baseline for RELPRON, but it is possible to match or improve on the baseline by finding appropriate training data and models for the semantics of the relative pronoun.
Tuesday:
- Glyn Morrill: Remarks on quantification in logical grammar
We comment on displacement calculus and the simulation of Montague's treatment of quantification which it affords.
- Richard Moot: Computational semantics for debates
I will present a preliminary investigation into how we can apply the tools of computational semantics to the analysis of online debates. How far can we go in inferring discourse relations and potential logical fallacies from a structured debate? I will touch upon themes from categorial grammar, discourse representation theory, ontologies and their interactions.
- Jules Hedges and Mehrnoosh Sadrzadeh: A generalised quantifier theory of natural language in categorical compositional distributional semantics with bialgebras
Categorical compositional distributional semantics is a model of natural language that combines the statistical vector space models of words with the compositional models of grammar. Recently, in the paper http://arxiv.org/pdf/1602.01635.pdf (submitted for publication elsewhere), we formalised within this model the generalised quantifier theory of natural language due to Barwise and Cooper. The underlying setting is that of a compact closed category with bialgebras. We developed an abstract categorical compositional semantics, then instantiated it to sets and relations and to finite-dimensional vector spaces and linear maps. We proved the equivalence of the relational instantiation to the truth-theoretic semantics of generalised quantifiers and provided concrete corpus-based instantiations. The contribution of our work is threefold: first, it is the first time quantifiers are formalised in categorical compositional distributional semantics; second, it is the first time bialgebras are used; third, it is the first time equivalence of the setting to a truth-theoretic semantics is formally proved (and not just exemplified).
Wednesday:
- Glyn Morrill: Remarks on relativisation in logical grammar
We comment on displacement calculus and ACG and the treatment of relativisation which they afford, and why we think a treatment in terms of a relevant contraction modality is preferable.
- Mark Steedman: A Theory of Content for NLP
The talk compares collocation-based and extension-based distributional semantics for NLP question answering with respect to compatibility with logical operators. Recent work with Mike Lewis and others seeking to define a novel form of semantics for relational terms using semi-supervised machine learning methods over unlabeled text is described.
True paraphrases are represented by the same cluster identifier. Common-sense inference, in the form of an entailment graph, is represented directly in the lexicon, rather than delegated to meaning postulates and theorem-proving. The method can be applied cross-linguistically, in support of machine translation. Ongoing work extends the method to extract multi-word items, light verb constructions and an aspect-based semantics for temporal/causal entailment. This representation of content has interesting implications concerning the nature of the hidden language-independent conceptual language that must underlie all natural languages in order for them to be learnable by children, but which has so far proved resistant to discovery.
- Gijs Wijnholds: Non-Local Information Flow in Compositional Vector Space Models: From enhanced syntax to enhanced semantics
The DisCoCat framework offers a categorical setting for combining a type-logical account of meaning composition with a distributional account of lexical semantics based on vector space models. In previous work, we have extended the DisCoCat framework to incorporate some of the refinements of the Lambek Calculus to overcome syntactic restrictions of the basic models, notably the Lambek-Grishin Calculus (adding a symmetric decomposition operator) and modal Lambek Calculus (adding unary operators that behave as residuals). Although this shows that the DisCoCat approach is not limited to a simple syntactic engine, the interpretation process is still inadequate: symmetric or modal operators are lost in translation, and semantic effects are limited. One strategy to distinguish the role of the different operators in the Lambek-Grishin Calculus is to introduce a continuation-passing style (CPS) translation using a focused sequent calculus. This approach has been shown to have semantic applications already in the simple 'Lambek' fragment of the symmetric calculus.
In this talk, we show how the CPS strategy can be incorporated in the larger DisCoCat framework, and illustrate its effects on a practical example of scope construal.
Thursday:
Learning from Entailment for Semantic Parsing
There has been a lot of recent interest in NLP in semantic parsing, or the task of translating text to formal (compositional) meaning representations. Such work has focused on learning translations using parallel collections of text/meaning pairs, often using techniques from statistical machine translation and parsing. We describe a recent approach (Richardson and Kuhn, TACL 2016) that uses natural logic reasoning as a tool to learn representations and entailment patterns. Rules of the logic are learned using high-level judgments about entailment as the main supervision. Results are reported on a benchmark dataset, and ongoing work is discussed.
- Dimitri Kartsaklis: A Coordination Account for the DisCo Model
An open problem with categorical compositional distributional semantics (informally referred to as the “DisCo” model) is the representation of words that are considered semantically vacuous from a distributional perspective, such as determiners, prepositions, relative pronouns or coordinators. This work outlines a construction that addresses the majority of coordination cases in language, by exploiting the compact closed structure of the underlying category and Frobenius operators canonically induced over the fixed basis of finite-dimensional vector spaces. Linguistic intuitions are provided, and the importance of the Frobenius operators as an addition to the compact closed setting with regard to language is discussed.
- Desislava Bankova, Bob Coecke, Martha Lewis, Dan Marsden: The categorical compositional distributional model of natural language provides a conceptually motivated procedure to compute the meaning of sentences, given their grammatical structure and the meanings of their words.
This approach has outperformed other models in mainstream empirical language processing tasks. However, until recently it has lacked the crucial feature of lexical entailment -- as do other distributional models of meaning. In this paper we solve the problem of entailment for categorical compositional distributional semantics. Taking advantage of the abstract categorical framework allows us to vary our choice of model. This enables the introduction of a notion of entailment, exploiting ideas from the categorical semantics of partial knowledge in quantum computation. The new model of language uses density matrices, on which we introduce a novel robust graded order capturing the entailment strength between concepts. This graded measure emerges from a general framework for approximate entailment, induced by any commutative monoid. Quantum logic embeds in our graded order. Our main theorem shows that entailment strength lifts compositionally to the sentence level, giving a lower bound on sentence entailment. We describe the essential properties of graded entailment such as continuity, and provide a procedure for calculating entailment strength.
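The graded order on density matrices mentioned in this abstract can be illustrated with a small numerical sketch. The abstract does not state the definition, so the sketch assumes, purely for illustration, that A entails B to degree k when B - kA is positive semidefinite, and it searches for the largest such k by bisection. The function names and the toy "dog" and "pet" matrices are hypothetical and are not taken from the authors' work.

```python
import numpy as np


def is_k_hyponym(a, b, k, tol=1e-9):
    """Check the ASSUMED graded order: is b - k*a positive semidefinite?"""
    diff = b - k * a
    # Symmetrise so that eigvalsh (for Hermitian input) is not upset by rounding.
    diff = (diff + diff.conj().T) / 2
    return bool(np.linalg.eigvalsh(diff).min() >= -tol)


def entailment_strength(a, b, steps=50):
    """Largest k in [0, 1] with b - k*a positive semidefinite, by bisection.

    Bisection is valid because the condition is monotone in k:
    if it holds for some k, it holds for every smaller k as well.
    """
    if is_k_hyponym(a, b, 1.0):
        return 1.0
    lo, hi = 0.0, 1.0  # k = 0 always holds because b itself is PSD
    for _ in range(steps):
        mid = (lo + hi) / 2
        if is_k_hyponym(a, b, mid):
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical toy concepts on a two-dimensional space.
dog = np.array([[1.0, 0.0], [0.0, 0.0]])  # pure state
pet = np.array([[0.5, 0.0], [0.0, 0.5]])  # mixture containing the "dog" direction

strength_dog_pet = entailment_strength(dog, pet)
strength_pet_dog = entailment_strength(pet, dog)
```

Under this assumed definition the measure is asymmetric: the narrower concept entails the broader one to a positive degree, while the reverse direction gets a degree of essentially zero.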
- Mehrnoosh Sadrzadeh and Reinhard Muskens: We sketch a vector semantics for natural language using a simply typed lambda calculus in the tradition of Montague. Our approach is based on a dynamic interpretation of distributions of words. These take vectors and contexts as arguments and return updated contexts. Sentences are "context change potentials" in the style of Heim: they take contexts as input and return contexts as output. Contexts for us are co-occurrence matrices formed from distributions of words in corpora of text. The update instructions thread context updates compositionally in the phrases and sentences of language, based on the syntactic roles and the dynamic meanings of the words.
- Nicholas Asher, Tim Van de Cruys and Antoine Bride: In this talk, I explore an integration of a formal semantic approach to lexical meaning and an approach based on distributional methods. I argue that such an integration is beneficial, and I |