StackOverflow Dataset

To acquire the dataset, complete the following steps:

  • Go to the StackOverflow Academic Partnership Programme page, read the information there, and open the application form.
  • You can fill in the application form more quickly by using our boilerplate text (below) to answer the question "What research is being proposed? What are the specific requirements of the project? What datasets are you interested in, if any?"

“We are requesting comment data from StackExchange, without user information, that has been removed from the site as a result of moderation. The data will be used to research computational methods for the detection of abusive or harmful content, with the aim of submitting our work to the 3rd Workshop on Abusive Language Online. To investigate automated methods for abuse detection, we require a dataset that contains the labels available in the StackOverflow flagging structure, though a part may be unlabelled so that we can experiment with empirical evaluation of the trained methods.”