Data

Most social media data cannot be shared, or at least not in its raw format. Below are links to some data sets that are available for use. For more information about a specific data set, follow the link next to it to obtain author contact information.

  • Buzzfeed Election Data Set - This data set, gathered during the months leading up to the 2016 United States Presidential Election, is a collection of real and fake news stories with the highest Facebook engagement. Buzzfeed News gathered this data using keyword searches on the content analysis tool BuzzSumo (Horne & Adali, 2017). [LINK]
  • Buzzfeed Hyperpartisan Facebook Page Dataset - Not to be confused with the previous Buzzfeed dataset, this dataset contains a series of articles published on Facebook over the span of a week in late September 2016. Each article was fact-checked by 5 Buzzfeed journalists. The corpus includes 1,627 articles: 826 from mainstream news agencies, 256 from left-wing sources, and 545 from right-wing sources (Granik & Mesyura, 2017; Potthast, Kiesel, Reinartz, Bevendorff, & Stein, 2017). [LINK]
  • Dataworld - Hosts 6 datasets on disinformation. [LINK]
  • Media Cloud 2016 election data - [LINK]
  • Obama Administration Social Media Archives - [LINK]
  • PLOS ONE: ISIS - De-identified Twitter data collected to identify networks of actors who were members of or supported ISIS. [LINK]
  • ProQuest Congressional Government Social Media - Search Facebook and Twitter posts by Members of Congress and government agencies going back to 2013. [LINK]
  • Statista - Data and facts about fake news. [LINK]
  • Social Media for Public Health - Flu Vaccination Tweets, Vaccination Sentiment and Relevance Tweets, and Zika Conspiracy Tweets Data Sets [LINK]
  • TAMU Twitter honeypot dataset - [LINK]
  • Twitter synchronized malicious behavior data - [LINK]
  • Wikipedia personal attack dataset - A collection of data sets on Wikipedia Talk page discussions, by Ellery Wulczyn. [LINK]