REPARQA: REthinking PAssage Retrieval for Question-Answering
1st International Workshop, in conjunction with CIKM2021, from November 01 to November 05, 2021, Online
Research on Question-Answering (QA) systems has recently achieved considerable success in simplified closed-domain settings such as the SQuAD dataset, which provides a preselected passage for each question. Researchers have since turned to open-domain QA, which remains a key challenge in natural language processing (NLP). Open-domain QA considers a large text corpus, such as Wikipedia pages, instead of a preselected passage for answering a given question. In this context, the Natural Questions (NQ) dataset presents a more challenging problem: instead of providing one short passage for each question, NQ gives an entire Wikipedia page, which is significantly longer than the passages provided in other datasets.
An effective open-domain QA system must be able to retrieve the relevant document and passage on the one hand, and comprehend the question context to produce an answer on the other. Current state-of-the-art deep learning-based systems for open-domain QA are often complicated and consist mainly of two components: (1) a passage retriever that selects a small subset of passages from documents (e.g., Wikipedia pages), and (2) a machine comprehension component that examines the retrieved passages to identify the final answer. Several studies have shown that passage retrieval has a strong impact on the question-answering task and can significantly improve it.
Several elements are important for the passage retriever, such as question and passage representation, similarity and attention mechanisms between the question and the passages, and passage ranking techniques.
The REPARQA workshop is the first to tackle the issue of passage retrieval for open-domain QA. It aims to bring together experts from industry, science, and academia to exchange ideas and discuss ongoing research in open-domain QA and, more precisely, the passage retrieval component. We encourage submissions that describe novel problem definitions of passage retrieval for open-domain QA and new datasets in this context. We also encourage contributions that develop new techniques for document retrieval in open-domain QA.
Traditional research on passage and document retrieval mainly focuses on superficial similarities between the question and the passage (or document), such as cosine similarity. The main distinguishing focus of this workshop will be the use of deep neural networks and encoders for passage retrieval, such as using encoders to represent questions and passages and integrating attention mechanisms into the passage retrieval framework.
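To make the traditional baseline concrete, here is a minimal Python sketch of ranking passages by the cosine similarity between bag-of-words count vectors. It is an illustration only and does not reproduce any particular system. The function names (tokenize, cosine_similarity, rank_passages) and the sample texts are assumptions introduced for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_passages(question, passages):
    """Return (score, passage) pairs sorted from most to least similar."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine_similarity(q_vec, Counter(tokenize(p))), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample data, for illustration only.
passages = [
    "Hamlet is a tragedy written by William Shakespeare.",
    "The Eiffel Tower is located in Paris.",
]
for score, passage in rank_passages("Who wrote the tragedy Hamlet?", passages):
    print(f"{score:.3f}  {passage}")
```

In this sketch, "wrote" in the question and "written" in the first passage contribute nothing to the score because they are different surface tokens. This is the kind of purely lexical matching that learned encoders are meant to move beyond.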
This workshop aims to survey recent advances in passage retrieval for open-domain QA and to improve open-domain QA systems. The REPARQA workshop is thus an opportunity for experts and researchers to share theoretical and practical knowledge of the various aspects of QA systems and to hold focused discussions that turn novel ideas into future innovations.