Opinion dataset with classes (Noise, Objective, Positive, Negative, Neutral Sentiment, Question, Ads, Miscellaneous)
Level 1: Three classes, NOISE, OBJECTIVE, and SUBJECTIVE, marked 0, 1, and 2 respectively.
Level 2: Divides the SUBJECTIVE class further into three categories, NEUTRAL, NEGATIVE, and POSITIVE, marked 0, 1, and 2 respectively.
Level 3: Divides the NEUTRAL class further into four categories, NEUTRAL SENTIMENTS, QUESTIONS, ADVERTISEMENTS, and MISCELLANEOUS, marked 0, 1, 2, and 3 respectively.
A post in the QUESTIONS class will have:
Level 1 marking - 2
Level 2 marking - 0
Level 3 marking - 1
A post in the OBJECTIVE class will have:
Level 1 marking - 1
Level 2 marking - [Blank]
Level 3 marking - [Blank]
A post in the NEGATIVE class will have:
Level 1 marking - 2
Level 2 marking - 1
Level 3 marking - [Blank]
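The three-level marking scheme above can be decoded into a single class name. The following is a minimal sketch, not part of the dataset tooling: the function name decode_label and the dictionary names are hypothetical, and it assumes that a [Blank] marking is read in as None and that a deeper level is filled only when its parent class is SUBJECTIVE (Level 2) or NEUTRAL (Level 3).

```python
from typing import Optional

# Codes taken from the level descriptions above.
LEVEL1 = {0: "NOISE", 1: "OBJECTIVE", 2: "SUBJECTIVE"}
LEVEL2 = {0: "NEUTRAL", 1: "NEGATIVE", 2: "POSITIVE"}
LEVEL3 = {
    0: "NEUTRAL SENTIMENTS",
    1: "QUESTIONS",
    2: "ADVERTISEMENTS",
    3: "MISCELLANEOUS",
}


def decode_label(
    level1: int,
    level2: Optional[int] = None,
    level3: Optional[int] = None,
) -> str:
    """Return the most specific class name for a (Level 1, Level 2, Level 3) triple.

    Assumption: a [Blank] marking is represented as None.
    """
    if level1 not in LEVEL1:
        raise ValueError(f"unknown Level 1 marking: {level1}")

    # NOISE and OBJECTIVE are leaf classes: deeper levels must be blank.
    if LEVEL1[level1] != "SUBJECTIVE":
        if level2 is not None or level3 is not None:
            raise ValueError("Level 2/3 must be blank unless Level 1 is SUBJECTIVE")
        return LEVEL1[level1]

    # SUBJECTIVE posts are assumed to always carry a Level 2 marking.
    if level2 not in LEVEL2:
        raise ValueError(f"unknown or missing Level 2 marking: {level2}")

    # NEGATIVE and POSITIVE are leaf classes: Level 3 must be blank.
    if LEVEL2[level2] != "NEUTRAL":
        if level3 is not None:
            raise ValueError("Level 3 must be blank unless Level 2 is NEUTRAL")
        return LEVEL2[level2]

    # NEUTRAL posts are assumed to always carry a Level 3 marking.
    if level3 not in LEVEL3:
        raise ValueError(f"unknown or missing Level 3 marking: {level3}")
    return LEVEL3[level3]


# The three worked examples from the description.
print(decode_label(2, 0, 1))  # QUESTIONS
print(decode_label(1))        # OBJECTIVE
print(decode_label(2, 1))     # NEGATIVE
```

If the released files encode blanks differently (for example as empty strings or NaN), the values would need to be converted to None before calling this function.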
For Reddit:
Queries - Approx 1K
Comments - Approx 26K
Each comment is marked as relevant or not relevant.
Every query is given with its respective comments and the corresponding score/likes.
The column "Relevant" contains the binary label: relevant or not relevant.
For YouTube:
Two sub-tasks built from cryptocurrency YouTube transcripts.
Question Answering (Q&A): answer a question using its source transcript.
Multiple-Choice (MCQ): select the correct option (1 of 4) for a question.
Each sub-task is split into train / validation / test (1000 / 250 / 500).
Detailed file formats and fields will be provided with the data.