Data - Please Register to Get the Dataset - Registration Form

Task 1: Argument-based Sentiment Analysis (Research Report & Earnings Conference Call)

Research Report

There are three subtasks in the argument-based sentiment analysis task: (1) argument classification, (2) premise sentiment analysis, and (3) claim sentiment analysis. Each instance has one 'argument classification label' (claim or premise). If the instance is a "claim", its 'sentiment analysis label' is bullish, neutral, or bearish. If it is a "premise", its 'sentiment analysis label' is positive, neutral, or negative.

Data example:

{'report_id': 1093,
 'paragraph_num': 0,
 'annotator_id': 0,
 'start_index': 269,
 'end_index': 338,
 'argument classification label': 'claim',
 'sentiment analysis label': 'Bearish'}
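The label rule described above can be checked mechanically. The following is a minimal sketch, not part of the official release: the helper name `validate_instance` is hypothetical, and it assumes labels may differ in capitalization (the example uses 'Bearish' while the description uses lowercase), so it lowercases both labels before comparing.

```python
# Allowed sentiment labels per argument type, taken from the task description.
ALLOWED_SENTIMENT = {
    "claim": {"bullish", "neutral", "bearish"},
    "premise": {"positive", "neutral", "negative"},
}


def validate_instance(instance: dict) -> bool:
    """Return True if the sentiment label is allowed for the argument type.

    Hypothetical helper: labels are lowercased first, which is an assumption
    about how capitalization should be handled.
    """
    arg_type = instance["argument classification label"].lower()
    sentiment = instance["sentiment analysis label"].lower()
    return sentiment in ALLOWED_SENTIMENT.get(arg_type, set())


# The data example shown above.
example = {
    "report_id": 1093,
    "paragraph_num": 0,
    "annotator_id": 0,
    "start_index": 269,
    "end_index": 338,
    "argument classification label": "claim",
    "sentiment analysis label": "Bearish",
}

print(validate_instance(example))
```

A claim labelled "positive" or a premise labelled "bearish" would be rejected by this check.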

Earnings Conference Call

Argument Unit Classification

The goal of this task is to classify a given argumentative sentence into one of two classes:

Class 0: Premise.

Class 1: Claim.

Data examples:

["First of all, I want to remind you that Q3 is typically a lower operating income quarter as we're preparing for the Q4 holiday peak.", 1],

["On the international, on an FX neutral basis, the growth was 15% in Q3 and 19% in Q4.", 0]
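Each example above is a two-element list of the form [sentence, class id]. As a rough sketch, the class ids can be mapped back to their names as follows; the names `UNIT_CLASSES` and `decode_unit` are hypothetical and only illustrate the format.

```python
# Class ids as defined above: 0 is a premise, 1 is a claim.
UNIT_CLASSES = {0: "premise", 1: "claim"}


def decode_unit(example):
    """Split a [sentence, class_id] pair and return (sentence, class name)."""
    sentence, class_id = example
    return sentence, UNIT_CLASSES[class_id]


# The two data examples shown above.
examples = [
    ["First of all, I want to remind you that Q3 is typically a lower "
     "operating income quarter as we're preparing for the Q4 holiday peak.", 1],
    ["On the international, on an FX neutral basis, the growth was 15% in Q3 "
     "and 19% in Q4.", 0],
]

for item in examples:
    sentence, name = decode_unit(item)
    print(name, "|", sentence)
```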

Argument Relation Detection and Classification

Given two sentences, the task is a three-class classification problem:

Class 0: There is no detected relation between the two sentences.

Class 1: There is a “Support” relation from sentence 1 to sentence 2.

Class 2: There is an “Attack” relation from sentence 1 to sentence 2.

Data examples:

["Some have a 24-month clock, and there are even some that have a 30-month clock.", "They come back in and they pay less for the service but they pay more for their smartphone.", 0],

["Japan as a geography for us is a high transactional market.", "The improvement in that in Q3 is obviously very high margin and also the bottom.", 1],

["And that in fact in Q1 caused the market to expand.", "So, at least in the intermediate timeframe, we do not see cannibalization.", 2]
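Each example above is a three-element list of the form [sentence 1, sentence 2, class id], where the relation runs from sentence 1 to sentence 2. One possible way to make the class ids explicit in code is an enumeration; `Relation` and `decode_pair` below are hypothetical names, not part of the dataset.

```python
from enum import IntEnum


class Relation(IntEnum):
    """Relation classes as defined above (from sentence 1 to sentence 2)."""
    NONE = 0
    SUPPORT = 1
    ATTACK = 2


def decode_pair(example):
    """Split a [sentence_1, sentence_2, class_id] triple into typed parts."""
    sentence_1, sentence_2, class_id = example
    return sentence_1, sentence_2, Relation(class_id)


# The three data examples shown above.
examples = [
    ["Some have a 24-month clock, and there are even some that have a 30-month clock.",
     "They come back in and they pay less for the service but they pay more for their smartphone.", 0],
    ["Japan as a geography for us is a high transactional market.",
     "The improvement in that in Q3 is obviously very high margin and also the bottom.", 1],
    ["And that in fact in Q1 caused the market to expand.",
     "So, at least in the intermediate timeframe, we do not see cannibalization.", 2],
]

for item in examples:
    _, _, relation = decode_pair(item)
    print(relation.name)
```

Passing a class id outside 0 to 2 to `Relation` raises a `ValueError`, which gives a cheap sanity check while loading the data.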

Task 2: Identifying Attack and Support Argumentative Relations in Social Media Discussion Threads (Social Media)

Each instance contains two posts, and the dataset uses three labels: support (1), attack (2), and none (0). The goal is to identify whether the second post supports or attacks the first post. The "none" label denotes that the two posts have no support or attack relationship.
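The sketch below shows one way an instance of this task could be represented in code. The field names `first_post` and `second_post` are assumptions made for illustration (the actual file layout is not described here), and the post texts are placeholders rather than real data. Note the direction: the label describes how the second post relates to the first.

```python
from collections import Counter
from dataclasses import dataclass

# Label ids as defined above.
LABEL_NAMES = {0: "none", 1: "support", 2: "attack"}


@dataclass
class PostPair:
    first_post: str   # hypothetical field name
    second_post: str  # hypothetical field name
    label: int        # 0 = none, 1 = support, 2 = attack

    def describe(self) -> str:
        """Spell out the relation from the second post to the first post."""
        name = LABEL_NAMES[self.label]
        if name == "none":
            return "no support/attack relation between the two posts"
        return f"second post -> first post: {name}"


def label_distribution(pairs):
    """Count how often each label name occurs in a list of PostPair objects."""
    return Counter(LABEL_NAMES[pair.label] for pair in pairs)


# Placeholder instances, not taken from the dataset.
pairs = [
    PostPair("<first post text>", "<second post text>", 2),
    PostPair("<first post text>", "<second post text>", 0),
]

for pair in pairs:
    print(pair.describe())
print(label_distribution(pairs))
```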

Data Example 1:

2330已經成為世界級的晶圓代工廠,無庸置疑. 這種世界級的企業也必然吸引世界級的客戶,此外,她還會有訂價能力,毛利率能維持或提升,獲利自然持續增加. 此外,5G的時代已經來到,以及網路雲端的更多需求,都需要各種更高階的運算處理器,而高階的處理器需要高階的晶圓製程才做得出來,2330正好處於這樣的有利位置,論技術無人能及,論需求正要大量爆發,2330能不得利於此嗎? 當然,產業競爭隨時隨在,2330目前取得領先不必然未來也能領先,但,只要企業高層不亂來,持續投入心力,培養優秀接班人才,長保競爭力,持續成為世界一級企業,應該也不致於是幻想吧?,



Data Example 2:

我設定看到 7.99雖然不一定會來 買進 預計用今年領到的股利現金64萬多和現金30萬 總共94萬多 全部跟他賭一把 先用房貸借出64萬 一拿到股利現金 馬上還掉,

金融股這麼多檔有必要一直跟這隻拼嗎 ??? 買這隻的盲點就是價低殖利率高 未來要考慮的是 商譽已經嚴重受損???生意越來越少??? 未來它的獲利????,




The annotated dataset is licensed under the Creative Commons Attribution-Non-Commercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.