Brief Overview

Research on Natural Language Interfaces (NLIs) has been around for more than four decades. From the end user's point of view, natural language is easy to use because it is used every day in human-to-human communication, and it is therefore considered a useful and efficient way for people to interact with computers [Ogden and Bernick, 1997]. NLI systems take Natural Language questions as input and are built for various purposes. Most of them address the knowledge access problem; these differ further in the underlying knowledge structure and can be grouped into three main categories:

  1. NLIs to structured data. NLIs to structured data allow users to interact with a system using written or spoken language (e.g. English) to perform tasks that usually require knowledge of a formal query language. The intention is to let users with no knowledge of formal languages use such systems with minimal (ideally no) training. These systems are often domain-specific, and are usually referred to as closed-domain question answering systems. Two major subgroups are:
    1. NLIs to relational databases (NLIDBs) translate Natural Language into SQL in order to retrieve answers from the database. Most of the developed NLIs to structured data belong to this group (e.g., Popescu et al. [2003], Thompson et al. [2005], and Hallett et al. [2007], to mention a few recent ones). More recently, these have evolved towards interfaces to semantically richer data in the form of ontologies.
    2. NLIs to ontologies translate a Natural Language query into a formal query language, which is then used to retrieve knowledge expressed in one of the knowledge representation languages (such as OWL). The most common query language is SPARQL. Recently developed systems include ORAKEL [Cimiano et al., 2007], AquaLog [Lopez et al., 2007], PANTO [Wang et al., 2007], and Querix [Kaufmann et al., 2006].
  2. NLIs to unstructured or semi-structured data differ from the previous group in that they do not translate the Natural Language query into any formal language; rather, they process a collection of documents (e.g. news articles on the Web, or Frequently Asked Questions as in Burke et al. [1996]). However, like the previously mentioned NLIs, these systems also aim to find the answer to the question posed by the user. The most prominent systems of this kind are open-domain Question-Answering systems, which process large collections of documents in order to find answers. Examples include MURAX [Kupiec, 1993], MULDER [Kwok et al., 2001], and AnswerBus [Zheng, 2002]. Another group that belongs here is Reading Comprehension systems such as Deep Read [Hirschman et al., 1999], which are used to test the reading level of children. They find the answers to a set of questions related to a story written in simple Natural Language.
  3. Interactive NLIs are used in dialogue systems, e.g., a chat bot called Asimov which answers simple questions in English (http://asimovsoftware.com). These kinds of systems do not treat a set of questions as an independent collection, but rather act as agents or robots engaged in a conversation with the user, with the capability to remember the sequence of previously asked questions, to interpret the input from the user, and to learn the answers to questions which they could not answer before. They are more challenging to develop than the previously mentioned NLIs because they must model multiple conversational turns. These turns can refer to one another, so such systems must be able to remember and respond to all of this context.

Lastly, a few NLI systems are developed for purposes other than knowledge access, such as systems to replace a programming language, e.g., the NLC system [Biermann et al., 1983].
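To make the first category more concrete, the following is a minimal sketch of how an NLI to a relational database might map a question onto SQL. It is not the method of any system cited here: the `city` table, its rows, the two question patterns, and the helper names (`build_demo_db`, `translate`, `answer`) are all hypothetical, and real NLIDBs rely on far richer linguistic analysis than regular expressions.

```python
import re
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE city (name TEXT, country TEXT, population INTEGER)")
    conn.executemany(
        "INSERT INTO city VALUES (?, ?, ?)",
        [
            ("Paris", "France", 2100000),
            ("Lyon", "France", 520000),
            ("Berlin", "Germany", 3600000),
        ],
    )
    return conn


# Each rule pairs a question pattern with a parameterized SQL template
# and the names of the captured slots that fill the template.
RULES = [
    (
        re.compile(r"^which cities are in (?P<country>[\w ]+?)\??$", re.IGNORECASE),
        "SELECT name FROM city WHERE country = ? COLLATE NOCASE ORDER BY name",
        ("country",),
    ),
    (
        re.compile(r"^what is the population of (?P<name>[\w ]+?)\??$", re.IGNORECASE),
        "SELECT population FROM city WHERE name = ? COLLATE NOCASE",
        ("name",),
    ),
]


def translate(question: str):
    """Return (sql, params) for a recognized question, or None otherwise."""
    text = question.strip()
    for pattern, sql, slots in RULES:
        match = pattern.match(text)
        if match:
            return sql, tuple(match.group(slot).strip() for slot in slots)
    return None


def answer(conn: sqlite3.Connection, question: str):
    """Translate the question to SQL, run it, and return the first column."""
    translated = translate(question)
    if translated is None:
        return None  # the question falls outside the covered (closed) domain
    sql, params = translated
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    db = build_demo_db()
    print(answer(db, "Which cities are in France?"))
    print(answer(db, "What is the population of Berlin?"))
    print(answer(db, "Who wrote Hamlet?"))
```

Even this toy version shows why such systems tend to be closed-domain: any question that no rule covers is simply rejected rather than answered.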

References

  • William Ogden and Philip Bernick. Using Natural Language Interfaces. In M. Helander, editor, Handbook of Human-Computer Interaction. Elsevier Science Publishers B.V. (North-Holland), 1997.
  • Ana-Maria Popescu, Oren Etzioni, and Henry Kautz. Towards a theory of natural language interfaces to databases. In IUI ’03: Proceedings of the 8th international conference on Intelligent user interfaces, pages 149–157, New York, NY, USA, 2003. ACM. ISBN 1-58113-586-6. doi: http://doi.acm.org/10.1145/604045.604070.
  • Craig W. Thompson, Paul Pazandak, and Harry R. Tennant. Talk to Your Semantic Web. IEEE Internet Computing, 9(6):75–78, 2005.
  • Catalina Hallett, Donia Scott, and Richard Power. Composing Questions through Conceptual Authoring. Computational Linguistics, 33(1):105–133, 2007.
  • Philipp Cimiano, Peter Haase, and Jörg Heizmann. Porting Natural Language Interfaces Between Domains: an Experimental User Study with the ORAKEL System. In IUI ’07: Proceedings of the 12th international conference on Intelligent user interfaces, pages 180–189, New York, NY, USA, 2007. ACM. ISBN 1-59593-481-2. doi: http://doi.acm.org/10.1145/1216295.1216330.
  • Vanessa Lopez, Victoria Uren, Enrico Motta, and Michele Pasin. AquaLog: An ontology-driven question answering system for organizational semantic intranets. Web Semantics: Science, Services and Agents on the World Wide Web, 5(2):72–105, June 2007.
  • Chong Wang, Miao Xiong, Qi Zhou, and Yong Yu. PANTO: A Portable Natural Language Interface to Ontologies. In The Semantic Web: Research and Applications, pages 473–487. Springer, 2007. doi: 10.1007/978-3-540-72667-8_34.
  • Esther Kaufmann, Abraham Bernstein, and Renato Zumstein. Querix: A natural language interface to query ontologies based on clarification dialogs. In 5th International Semantic Web Conference (ISWC 2006), pages 980–981. Springer, November 2006.
  • Robin D. Burke, Kristian J. Hammond, Vladimir Kulyukin, Steven L. Lytinen, Noriko Tomuro, and Scott Schoenberg. Question answering from frequently-asked question files: Experiences with the FAQ Finder system. Technical report, AI Magazine, 1996.
  • Julian Kupiec. MURAX: A Robust Linguistic Approach for Question Answering Using an On-Line Encyclopedia. In Robert Korfhage, Edie M. Rasmussen, and Peter Willett, editors, SIGIR, pages 181–190. ACM, 1993. ISBN 0-89791-605-0.
  • Cody Kwok, Oren Etzioni, and Daniel S. Weld. Scaling Question Answering to the Web. ACM Trans. Inf. Syst., 19(3):242–262, 2001. ISSN 1046-8188. doi: http://doi.acm.org/10.1145/502115.502117.
  • Zhiping Zheng. AnswerBus Question Answering System. In Proceedings of the second international conference on Human Language Technology Research, pages 399–404, San Francisco, CA, USA, 2002. Morgan Kaufmann Publishers Inc.
  • Lynette Hirschman, Marc Light, Eric Breck, and John D. Burger. Deep Read: A Reading Comprehension System. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics, pages 325–332, 1999.
  • A.W. Biermann, B.W. Ballard, and A.H. Sigmon. An experimental study of natural language programming. International Journal of Man-Machine Studies, 18:71–87, 1983.