Schedule:
1:30 – 1:45 Welcome and workshop overview
1:45 – 2:15 Brief participant introductions
2:15 – 3:00 Review of questions/issues; preparation for group exercise
3:30 – 3:45 Break
3:45 – 4:45 Group Exercise
4:45 – 5:30 Report out, Discussion, and Wrap up
Exercise Topics:
Researchers plan to scrape public comments from online newspaper pages to predict election outcomes. They will aggregate their analysis to determine public sentiment. The researchers don’t plan to inform commenters, and they plan to collect potentially identifiable usernames. Scraping comments violates the newspaper’s terms of service.
Researchers plan to scrape profile photos, which are visible to any member of the service, from a dating site to build models that predict sexual preference or behavior. Researchers will not inform the dating site users, but they will not collect any identifying information and their photograph dataset will not be released publicly. Creating a fake profile, necessary to access the photos, violates the dating site’s terms of service.
Researchers plan to scrape public posts and interactions from Facebook to study group-level dynamics. They plan to collect informed consent from the original poster, but not from those the poster interacted with, and they may collect identifying information. Scraping posts with permission of the original poster does not violate Facebook’s terms of service.
Researchers plan to scrape data from an open teen forum and combine it with scraped tweets to predict mental health conditions. The researchers will not inform forum users, and they may collect potentially identifying information. Scraping data violates neither the health forum’s nor Twitter’s terms of service.
Researchers plan to scrape Facebook and Twitter posts in which people self-identify as COVID+ and cross-reference them with voter registrations to track political affiliation and other demographic information connected to publicly COVID+ individuals.
Exercise Instructions:
Discuss the following items. Be prepared to report out: if you were starting this project, what would you decide to do, and why?
Outline the tension(s) related to the following stakeholders:
Researchers,
The people/person creating the data,
Funding/manuscript reviewer,
Person from the online platform, and/or
Other stakeholders as needed (e.g. parent of a teen).
Propose how researchers might address these tension(s).
What are the open questions related to the tension(s)?