Methodology and Data Sources Used

  • Used the open-source database at www.breachlevelindex.com to source the top 100 breaches, by records lost, for each year from 2014 to 2018, totaling 500 companies.
  • Corroborated that data against the databases at www.privacyrightsclearinghouse.org to add a layer of validation to the data set.
  • Added organizational details to the data set, including employee base, revenue, brand value, and the type and confidentiality of the information breached.
  • Researched and added social media exposure to the data set from the top applicable social sites, including Facebook, Twitter, Instagram, YouTube, Reddit, LinkedIn, and Pinterest.
  • Utilized a combination of Postman and the Webhose.io API to pull down dark web mentions for the breach data set.
  • While all additional data elements were applied to the entire set of 500 companies, for the purposes of research, we isolated the top 50 U.S. organizations, excluding healthcare groups and government agencies.
  • Removed any inconsequential data; some entries listed had no specific organizational name (e.g., “Australian website”).
  • Researched any “major” news events within 12 months of the listed breach date to isolate possible actions that might have acted as a beacon for a breach; this included open source
  • To ensure that the data was sound, the breach set was tested against a control group.
  • Used Forbes' 2018 Top 100 Companies by Brand Value as the control group.
  • Added organizational details to the data set, including employee base, revenue, brand value, and the type and confidentiality of the information breached.
  • Researched and added social media exposure to the data set from the top applicable social sites, including Facebook, Twitter, Instagram, YouTube, Reddit, LinkedIn, and Pinterest.
  • Utilized a combination of Postman and the Webhose.io API to pull down dark web mentions for the breach data set.
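
The selection and filtering steps listed above can be illustrated with a minimal Python sketch. It is not the tooling used for this research. The record fields, the sector labels, the placeholder-name list, and the function names are all hypothetical. The sketch also assumes that the top 50 U.S. organizations are ranked by records lost, and that "within 12 months" means 365 days on either side of the breach date; neither detail is specified above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Breach:
    # Hypothetical record layout; the real data set's columns are not given.
    organization: str
    country: str
    sector: str
    breach_date: date
    records_lost: int


# Assumed labels for the excluded groups and for entries without a real name.
EXCLUDED_SECTORS = {"healthcare", "government"}
GENERIC_NAMES = {"australian website"}


def top_n_per_year(breaches, n=100):
    """Keep the n largest breaches (by records lost) for each calendar year."""
    by_year = {}
    for b in breaches:
        by_year.setdefault(b.breach_date.year, []).append(b)
    selected = []
    for year in sorted(by_year):
        ranked = sorted(by_year[year], key=lambda b: b.records_lost, reverse=True)
        selected.extend(ranked[:n])
    return selected


def has_specific_name(breach):
    """Reject entries whose organization is only a generic placeholder."""
    return breach.organization.strip().lower() not in GENERIC_NAMES


def us_research_subset(breaches, n=50):
    """Top n U.S. organizations, excluding healthcare and government entries.

    Ranking by records lost is an assumption of this sketch.
    """
    eligible = [
        b for b in breaches
        if b.country == "US"
        and b.sector.lower() not in EXCLUDED_SECTORS
        and has_specific_name(b)
    ]
    return sorted(eligible, key=lambda b: b.records_lost, reverse=True)[:n]


def within_12_months(event_date, breach_date):
    """True if a news event falls within 365 days of the breach, either side."""
    return abs((event_date - breach_date).days) <= 365
```

Applied to the full data set, top_n_per_year would produce the 500-company set, us_research_subset would produce the 50-organization research set, and within_12_months would flag candidate news events for manual review.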