The notion of meaning has been a central topic of study in many disciplines, including Philosophy, Linguistics, Cognitive Science, Computer Science, and Artificial Intelligence (AI). From an AI perspective, developing a computational theory of meaning (i.e. one which can be represented in, and exploited by, a computer) is crucial to developing computer programs which can display intelligent behaviour.
The DisCoTex project (Distributional Compositional Semantics for Text Processing) is focused on a sub-branch of AI called Natural Language Processing (NLP, also known as Computational Linguistics), which is concerned with the automatic processing, analysis, and generation of natural language. The need to develop effective NLP technology is becoming more pressing as humans produce more information in the form of electronic text. Biomedical scientists are producing exponentially increasing numbers of articles each year, far too many to read and study individually. Vast quantities of "social text" are appearing via social media such as Twitter, which commercial organisations would like to analyse to understand the needs of their customers. Government agencies track the large amounts of electronic communication happening daily in order to detect potential terrorist activity. All of these applications require automatic tools to process and extract information from the textual data.
The ambitious aim of this project is to develop a novel computational theory of meaning and exploit it in NLP tasks, thereby addressing a fundamental problem shared by a number of disciplines and helping to develop next-generation text processing technology.