Data
Task 1: Argument-based Sentiment Analysis (Research Report & Earnings Conference Call)
Research Report
There are three subtasks in the argument-based sentiment analysis task: (1) argument classification, (2) premise sentiment analysis, and (3) claim sentiment analysis. Each instance has one 'argument classification label' (claim or premise). If the instance is a claim, its 'sentiment analysis label' is bullish, neutral, or bearish. If it is a premise, its 'sentiment analysis label' is positive, neutral, or negative.
Data example:
{'report_id': 1093,
'paragraph_num': 0,
'annotator_id': 0,
'start_index': 269,
'end_index': 338,
'argument classification label': 'claim',
'sentiment analysis label': 'Bearish'}
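The pairing rule above (claims take bullish/neutral/bearish, premises take positive/neutral/negative) can be checked mechanically. The following is a minimal sketch, not part of the dataset tooling: the function name `is_consistent` is hypothetical, and lowercasing the labels before comparison is an assumption made because the example instance uses 'Bearish' while the description uses lowercase names.

```python
# Label sets taken from the task description above.
CLAIM_SENTIMENTS = {"bullish", "neutral", "bearish"}
PREMISE_SENTIMENTS = {"positive", "neutral", "negative"}


def is_consistent(instance: dict) -> bool:
    """Return True if the sentiment label belongs to the label set
    that matches the instance's argument type (hypothetical helper)."""
    # Assumption: label casing may vary, so compare in lowercase.
    arg_type = instance["argument classification label"].strip().lower()
    sentiment = instance["sentiment analysis label"].strip().lower()
    if arg_type == "claim":
        return sentiment in CLAIM_SENTIMENTS
    if arg_type == "premise":
        return sentiment in PREMISE_SENTIMENTS
    return False  # unknown argument type


example = {
    "report_id": 1093,
    "paragraph_num": 0,
    "annotator_id": 0,
    "start_index": 269,
    "end_index": 338,
    "argument classification label": "claim",
    "sentiment analysis label": "Bearish",
}
print(is_consistent(example))
```

A check like this may help catch instances where a premise was accidentally given a claim-style label, or the reverse, before training.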
Earnings Conference Call
Argument Unit Classification
The goal of this task is to classify a given argumentative sentence into one of two classes:
Class 0: Premise.
Class 1: Claim.
Data examples:
["First of all, I want to remind you that Q3 is typically a lower operating income quarter as we're preparing for the Q4 holiday peak.", 1],
["On the international, on an FX neutral basis, the growth was 15% in Q3 and 19% in Q4.", 0]
Argument Relation Detection and Classification
Given two sentences, the task is a three-class classification problem:
Class 0: There is no detected relation between the two sentences.
Class 1: There is a “Support” relation from sentence 1 to sentence 2.
Class 2: There is an “Attack” relation from sentence 1 to sentence 2.
Data examples:
["Some have a 24-month clock, and there are even some that have a 30-month clock.", "They come back in and they pay less for the service but they pay more for their smartphone.", 0],
["Japan as a geography for us is a high transactional market.", "The improvement in that in Q3 is obviously very high margin and also the bottom.", 1],
["And that in fact in Q1 caused the market to expand.", "So, at least in the intermediate timeframe, we do not see cannibalization.", 2]
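The two Earnings Conference Call formats shown above differ only in length: a unit-classification item is `[sentence, label]` and a relation item is `[sentence 1, sentence 2, label]`. The sketch below decodes either form into named fields. The helper name `decode_ecc_item` and the output keys are hypothetical, and it assumes the items have already been loaded as Python lists.

```python
# Integer-to-name mappings taken from the class lists above.
UNIT_LABELS = {0: "premise", 1: "claim"}
RELATION_LABELS = {0: "none", 1: "support", 2: "attack"}


def decode_ecc_item(item: list) -> dict:
    """Turn a raw list item into a dict with readable label names
    (hypothetical helper; output keys are illustrative)."""
    if len(item) == 2:
        sentence, label = item
        return {"task": "unit", "sentence": sentence,
                "label": UNIT_LABELS[label]}
    if len(item) == 3:
        # The relation is directed from sentence 1 to sentence 2.
        source, target, label = item
        return {"task": "relation", "source": source, "target": target,
                "label": RELATION_LABELS[label]}
    raise ValueError(f"unexpected item length: {len(item)}")


unit = decode_ecc_item(
    ["On the international, on an FX neutral basis, "
     "the growth was 15% in Q3 and 19% in Q4.", 0]
)
relation = decode_ecc_item(
    ["And that in fact in Q1 caused the market to expand.",
     "So, at least in the intermediate timeframe, "
     "we do not see cannibalization.", 2]
)
print(unit["label"], relation["label"])
```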
Task 2: Identifying Attack and Support Argumentative Relations in Social Media Discussion Threads (Social Media)
Each instance contains two posts and carries one of three labels: support (1), attack (2), or none (0). The goal is to identify whether the second post supports or attacks the first post. The "none" label denotes that the two posts have no support or attack relationship.
Data Example 1:
[
最近兩天2330似乎已經走量縮漸漲的穩健走勢,若能如此,表示長線投資人看好2330,消息面一直顯示她的利多,個人以為應以長線投資方式持有她,
2330已經成為世界級的晶圓代工廠,無庸置疑. 這種世界級的企業也必然吸引世界級的客戶,此外,她還會有訂價能力,毛利率能維持或提升,獲利自然持續增加. 此外,5G的時代已經來到,以及網路雲端的更多需求,都需要各種更高階的運算處理器,而高階的處理器需要高階的晶圓製程才做得出來,2330正好處於這樣的有利位置,論技術無人能及,論需求正要大量爆發,2330能不得利於此嗎? 當然,產業競爭隨時隨在,2330目前取得領先不必然未來也能領先,但,只要企業高層不亂來,持續投入心力,培養優秀接班人才,長保競爭力,持續成為世界一級企業,應該也不致於是幻想吧?,
1
]
Data Example 2:
[
我設定看到 7.99雖然不一定會來 買進 預計用今年領到的股利現金64萬多和現金30萬 總共94萬多 全部跟他賭一把 先用房貸借出64萬 一拿到股利現金 馬上還掉,
金融股這麼多檔有必要一直跟這隻拼嗎 ??? 買這隻的盲點就是價低殖利率高 未來要考慮的是 商譽已經嚴重受損???生意越來越少??? 未來它的獲利????,
2
]
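Note that the relation in this task runs from the second post to the first. The sketch below makes that direction explicit and counts labels across a set of instances. The names `describe` and `label_distribution` are hypothetical, the instances are assumed to be already loaded as `[first post, second post, label]` lists, and the placeholder strings stand in for real posts.

```python
from collections import Counter

# Integer-to-name mapping taken from the task description above.
LABEL_NAMES = {0: "none", 1: "support", 2: "attack"}


def describe(instance: list) -> str:
    """Spell out the relation direction for one instance (hypothetical helper)."""
    _first_post, _second_post, label = instance
    name = LABEL_NAMES[label]
    if name == "none":
        return "no support/attack relation between the posts"
    return f"second post {name}s first post"


def label_distribution(instances: list) -> Counter:
    """Count how often each label name occurs (hypothetical helper)."""
    return Counter(LABEL_NAMES[inst[2]] for inst in instances)


# Placeholder posts, not taken from the dataset.
sample = [
    ["post A", "reply to A", 1],
    ["post B", "reply to B", 2],
    ["post C", "reply to C", 0],
    ["post D", "reply to D", 1],
]
print(describe(sample[1]))
print(label_distribution(sample))
```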
License
The annotated dataset is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.