Datasets

Dataset for Task 1:

Opinion dataset with classes (Noise, Objective, Positive, Negative, Neutral Sentiment, Question, Advertisement, Miscellaneous)

Dataset length - 5000 posts for each social media platform (Twitter and Reddit)
This dataset has three levels of annotation:

Level 1: It has three classes, NOISE, OBJECTIVE, and SUBJECTIVE, which are marked with 0, 1, and 2 respectively.

Level 2: It divides the SUBJECTIVE class further into three categories, NEUTRAL, NEGATIVE, and POSITIVE, which are marked with 0, 1, and 2 respectively.

Level 3: It divides the NEUTRAL class further into four categories, NEUTRAL SENTIMENTS, QUESTIONS, ADVERTISEMENTS, and MISCELLANEOUS, which are marked with 0, 1, 2, and 3 respectively.

A post in the QUESTIONS class will have:

Level 1 marking - 2

Level 2 marking - 0

Level 3 marking - 1

A post in the OBJECTIVE class will have:

Level 1 marking - 1

Level 2 marking - [Blank]

Level 3 marking - [Blank]

A post in the NEGATIVE class will have:

Level 1 marking - 2

Level 2 marking - 1

Level 3 marking - [Blank]
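
The three-level scheme can be collapsed into a single class name per post. The following Python sketch is only an illustration of the markings described above: the function name final_class is hypothetical, and it assumes that a [Blank] marking is read in as None (the actual encoding of blanks in the distributed files may differ).

```python
from typing import Optional

# Label codes exactly as listed in the level descriptions.
LEVEL1 = {0: "NOISE", 1: "OBJECTIVE", 2: "SUBJECTIVE"}
LEVEL2 = {0: "NEUTRAL", 1: "NEGATIVE", 2: "POSITIVE"}
LEVEL3 = {
    0: "NEUTRAL SENTIMENTS",
    1: "QUESTIONS",
    2: "ADVERTISEMENTS",
    3: "MISCELLANEOUS",
}


def final_class(level1: int,
                level2: Optional[int] = None,
                level3: Optional[int] = None) -> str:
    """Return the most specific class name for one post.

    Assumption: a [Blank] marking is represented as None.
    """
    # NOISE and OBJECTIVE posts carry no Level 2 or Level 3 marking.
    if level1 != 2:
        return LEVEL1[level1]
    if level2 is None:
        raise ValueError("SUBJECTIVE posts need a Level 2 marking")
    # NEGATIVE and POSITIVE posts carry no Level 3 marking.
    if level2 != 0:
        return LEVEL2[level2]
    if level3 is None:
        raise ValueError("NEUTRAL posts need a Level 3 marking")
    return LEVEL3[level3]


# The three worked examples from the description above.
questions = final_class(2, 0, 1)
objective = final_class(1)
negative = final_class(2, 1)
```

Applied to the three examples above, this yields QUESTIONS, OBJECTIVE, and NEGATIVE.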

Dataset for Task 2:

QnA dataset.

Queries and comments.
Dataset length - approximately 30k for the Reddit dataset.
Entries are marked as relevant or not relevant.

Here, every question is given with its respective comments and their relevance score/likes.
The column "Relevant" contains binary labels: relevant or not relevant.
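
As a rough sketch of how such a file might be read, the Python snippet below groups comments under their question. Only the column name "Relevant" comes from the description above. The column names "Question", "Comment", and "Score", the CSV format, the 1/0 label encoding, and the sample rows are all assumptions made for illustration and may not match the distributed data.

```python
import csv
import io
from collections import defaultdict

# Toy rows invented for illustration; they are not taken from the dataset.
# Assumed columns: Question, Comment, Score. Assumed labels: 1 = relevant, 0 = not.
SAMPLE = """Question,Comment,Score,Relevant
How do I reset my router?,Hold the reset button for ten seconds.,12,1
How do I reset my router?,Nice username.,3,0
"""


def group_by_question(csv_text: str) -> dict:
    """Map each question to a list of (comment, score, relevant) tuples."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["Question"]].append(
            (row["Comment"], int(row["Score"]), int(row["Relevant"]))
        )
    return dict(grouped)


groups = group_by_question(SAMPLE)
```

Each question then maps to its comments together with the score and the binary relevance label, which is a convenient shape for training a ranker or classifier per question.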

Training data: Mailed to the registered candidates.

Test data: Mailed to the registered candidates.