ANAQONDA - Analogy Queries by Ontology-based Data Analytics

Despite the huge success of mobile and Web-based information systems, interaction with such systems still follows system-centric paradigms such as hierarchical categorization, list browsing, or keyword search. Human-centered approaches like natural language queries or question answering, which try to emulate the natural interaction of humans with each other, are still few and in their infancy. The core goal of this research proposal is to advance the ambitious vision of human-centered information systems. Analogy-enabled information systems are at the center of interest. In addition, other approaches such as novel browsing paradigms, information system personalization techniques, and human-aided information processing are researched.

Analogies are a core concept in human cognition, and it has been suggested that analogical inference is the “thing that makes us smart”. An analogy can be seen as a pattern of speech that triggers a cognitive process transferring some high-level meaning from one subject (often called the analogue or the source) to another subject, usually called the target. When using an analogy, one emphasizes that the “essence” of source and target is similar, i.e. their most discriminating and prototypical properties are perceived in a similar way. For example, the brief statement “West Shinjuku (a district of Tokyo) is like Lower Manhattan” allows readers who know New York but not Tokyo to infer, with a certain vagueness, some of the more significant properties of the unknown district (e.g., it is an important business district, hosts the headquarters of many companies, features many skyscrapers, etc.). A more metaphorical example is the famous Rutherford analogy “atoms are like the solar system”, which explained the complex and newly discovered mechanics of the microcosm by pointing out similarities to the well-known celestial mechanics.

However, understanding the semantics of analogies is surprisingly hard: besides handling the linguistic peculiarities, it requires extensive domain knowledge in order to perform the analogical reasoning. This project addresses some of these issues, such as extracting analogy statements from texts, building knowledge bases, and modelling the required notion of ‘similarity’.

In addition to analogies, other aspects of human-centered information systems are considered in this project. Here, Skyline queries are of particular interest. Skyline queries are a powerful technique for database query personalization, returning only those items which are optimal with respect to a user’s preferences. Despite their popularity, Skyline queries still have many open issues, especially with respect to their semantics and the requirements they place on the underlying datasets. Some of these aspects, for example data incompleteness, are also researched within this project.
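To make the notion of "optimal with respect to a user's preferences" concrete, the following is a minimal sketch of a Skyline computation under the usual Pareto-dominance reading. It is an illustration only, not the method developed in this project: the function names, the hotel data, and the assumption that smaller values are preferred in every dimension are all hypothetical, and the data is assumed to be complete.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b.

    Assumption for this sketch: smaller is better in every dimension.
    a dominates b if it is at least as good everywhere and strictly
    better in at least one dimension.
    """
    at_least_as_good = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(items: List[Point]) -> List[Point]:
    """Naive quadratic Skyline: keep every item no other item dominates."""
    return [p for p in items if not any(dominates(q, p) for q in items)]


# Hypothetical hotels as (price per night, distance to the beach in km).
hotels = [(50, 8), (80, 2), (60, 9), (100, 1), (90, 3)]
print(skyline(hotels))
```

In this example, (60, 9) is dropped because (50, 8) is both cheaper and closer, and (90, 3) is dropped because of (80, 2). The remaining hotels each represent a different trade-off between price and distance, so none of them can be ruled out without knowing more about the user.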

The third focal topic, which finds its application both in the research on analogy-enabled information systems and in query personalization techniques, is the challenge of integrating human feedback into information processing. Moreover, obtaining feedback from actual humans is essentially mandatory for the design of any human-centered information system. Here, crowd-sourcing techniques are particularly promising for eliciting feedback from human workers efficiently in a just-in-time fashion. Crowd-sourcing can be used, for example, for training machine learning algorithms, providing missing information in knowledge bases or databases, or evaluating the performance of human-centered information systems. The challenge lies in an intelligent design of the crowd-sourcing tasks and quality control mechanisms in order to maximize the result quality while minimizing the costs, even when faced with potentially malicious users.
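As a rough illustration of what a quality control mechanism can look like, the sketch below aggregates redundant worker answers by majority vote and withholds a result when agreement is too low. Majority voting is only one simple, widely used baseline and is not claimed to be the mechanism used in this project; the function name, the threshold parameter, and the sample answers are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def majority_vote(answers: List[str], min_agreement: float = 0.5) -> Optional[str]:
    """Aggregate the answers of several workers for one task.

    Returns the most frequent answer if its share of all answers is
    strictly greater than min_agreement, otherwise None (the task
    would then need more judgments or a manual check).
    """
    if not answers:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count / len(answers) > min_agreement else None


# Hypothetical judgments for "Is West Shinjuku like Lower Manhattan?"
print(majority_vote(["yes", "yes", "no"]))  # clear majority
print(majority_vote(["yes", "no"]))         # tie, no decision
```

Asking several workers for the same judgment raises the cost per task, which is exactly the quality-versus-cost trade-off described above: a higher threshold or more redundant answers make a single malicious worker less influential, but each accepted result becomes more expensive.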

Funded since October 2012 by DAAD FIT.