Review Opinion Diversification

IJCNLP Shared Task 2017

December 1, 2017

Taipei, Taiwan

This shared task has concluded. See you in Baltimore, Maryland, at ACM Hypertext (9-12 July 2018) for RevOpiD-2018.


The shared task aims to produce, for each product, the top-k reviews from its review set such that the selected reviews act as a summary of all the opinions expressed in that set. The three independent subtasks select the top-k reviews in three different ways, based on the helpfulness, representativeness, and exhaustiveness of the opinions expressed in the review set.
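As an illustration only, the exhaustiveness criterion can be sketched as a greedy coverage selection. This is not the organizers' method or evaluation metric. It assumes each review has already been reduced to a set of opinion labels; the input format, the label strings, and the function name `select_top_k` are all hypothetical.

```python
def select_top_k(review_opinions, k):
    """Greedily pick up to k review ids that together cover the most opinions.

    review_opinions: dict mapping a review id to a set of opinion labels
    (a hypothetical input format, assumed for this sketch).
    """
    covered = set()
    selected = []
    remaining = dict(review_opinions)
    while remaining and len(selected) < k:
        # Choose the review that adds the most not-yet-covered opinions.
        # Iterating in sorted order makes ties resolve to the smallest id.
        best = max(sorted(remaining), key=lambda rid: len(remaining[rid] - covered))
        selected.append(best)
        covered |= remaining.pop(best)
    return selected


# Toy data: the opinion labels below are invented for illustration.
reviews = {
    "r1": {"battery:+", "screen:+"},
    "r2": {"battery:+"},
    "r3": {"price:-", "screen:+", "camera:-"},
    "r4": {"battery:-"},
}
print(select_top_k(reviews, 2))
```

Greedy selection is a common heuristic for coverage problems of this kind; a helpfulness or representativeness subtask would need a different scoring function in place of the coverage gain.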

In the famous Asch conformity experiments, individuals were asked to judge which of several comparison lines matched a reference line in length. The same task was then performed in a group whose other members were all actors deliberately giving the wrong answer. The error rate leapt from less than 1% to 36.8% when the people around the participant expressed the wrong perception. This shows how heavily others’ opinions can influence our own. For example, if we search for 'iPhone reviews' and the results (ranked by, say, PageRank) coincidentally happen to be against the product, we might form an incorrect perception of the general opinion around the world regarding the smartphone. To avoid such a misconception, Opinion Diversification needs to be incorporated when summarizing documents. As an introductory impetus to this approach, we propose this shared task, focusing on Product Reviews Summarization (in the form of a ranked list).

Ever since information technology became a common part of life, reviews have played a crucial role in helping customers make informed product choices. Given the large volume of reviews now available, it is difficult for customers to extract the relevant information, and they often end up skipping useful content and making wrong choices as a result. Thus, it is important to extract a representative subset of reviews from a large review set while keeping all the important content available in that subset.