Dataset Contributions

Below are the datasets I have contributed through my work, each published in the proceedings of a computational linguistics venue.

Transport Complaint Dataset

This dataset is a collection of 3,700 tweets about complaints in the transport domain, annotated with two classes: complaint and non-complaint. For details on collection, usage, and pre-processing, please refer to the accompanying publication.

Dataset Link | Publication

#MeTooMA

This dataset contains 9,973 tweets related to the #MeToo social movement on Twitter, manually annotated for five linguistic aspects: (i) relevance, (ii) stance (support, opposition), (iii) hate speech (directed hate, generalized hate), (iv) sarcasm (sarcastic, non-sarcastic), and (v) dialogue acts (allegation, refutation, justification). The dataset is supported by the HuggingFace datasets library. For more information, please refer to the links below.

Dataset Link | Publication | GitHub
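
As a rough illustration of the annotation scheme described above, here is a minimal Python sketch. The dictionary only restates the aspects and labels listed in the description. The names `ANNOTATION_ASPECTS`, `labels_for`, and `load_metooma` are hypothetical, and the dataset identifier `"metooma"` is an assumption that should be checked against the dataset link. The actual column names and label encodings in the released data may differ.

```python
# Aspects and labels as listed in the description above.
# No labels are enumerated for "relevance", so it is left empty here.
ANNOTATION_ASPECTS = {
    "relevance": (),
    "stance": ("support", "opposition"),
    "hate_speech": ("directed hate", "generalized hate"),
    "sarcasm": ("sarcastic", "non-sarcastic"),
    "dialogue_acts": ("allegation", "refutation", "justification"),
}


def labels_for(aspect):
    """Return the labels listed for an aspect (hypothetical helper)."""
    return ANNOTATION_ASPECTS[aspect]


def load_metooma(dataset_id="metooma"):
    """Load the dataset through the HuggingFace datasets library.

    The identifier "metooma" is an assumption, not confirmed by this page.
    Requires `pip install datasets` and network access.
    """
    from datasets import load_dataset  # imported lazily so the schema works offline

    return load_dataset(dataset_id)
```

Calling `labels_for("stance")` returns the two stance labels without needing the library installed, while `load_metooma()` would fetch the data only if the assumed identifier is correct.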
