ICWSM 2024 Data Challenge:

Research Data in a Post-API, Decentralized,
and Walled Garden Web


Workshop colocated with ICWSM 2024
at the Jacobs School of Medicine, Buffalo, NY,
on June 3, 2024

For the 5th ICWSM Data Challenge, we invite papers that discuss strategies for collecting or sharing data in an increasingly post-API, decentralized, and walled garden web. This includes collecting or sharing data from longstanding online platforms, as their APIs become increasingly inaccessible (e.g. Twitter and Reddit), from newer online platforms, whose decentralized nature creates new challenges (e.g. Mastodon and Bluesky), or from other websites and apps whose content is delivered in walled gardens (e.g. WhatsApp, Discord, in-app browsers). 

Authors might write about and share specific software tools from past, current, or upcoming research projects that could be brought to bear on these new challenges. Where sharing software or data has become difficult because of these challenges, authors could instead address general strategies, or specific techniques, for gathering or handling data. Additionally, we invite submissions that explain how these challenges evolved to their current state, how they might evolve in the future, and the effects they have had on research thus far.

Finally, we are also interested in papers (position or empirical) that discuss the role that the ICWSM community, and similar communities, might play in addressing ethical issues related to data gathering or handling in light of these challenges.

To encourage open sharing of tips and tricks within the community, we will accept for review submissions intended to be published anonymously, and this workshop will be held under the Chatham House Rule; participants are free to use the information received, but neither the identity nor the affiliation of the speaker(s) or any other participant may be disclosed.



Important Dates

Workshop Announcement: Mar 8, 2024

Submission Deadline: May 5, 2024 (AoE) (extended from April 22, 2024)

Submission Notifications: May 19, 2024 (AoE) (previously May 6, 2024)

Camera-Ready Deadline: May 26, 2024 (AoE) (previously May 13, 2024)

Workshop Date: June 3, 2024


Call for Participation 


Tracks


We welcome submissions on various topics that address collecting or sharing data in an increasingly post-API, decentralized, and walled garden web. The data challenge includes three tracks that may overlap and are broadly focused on:


1) Researcher-driven data collection: Tools and techniques for data collection and sharing in light of the post-API and decentralization challenges discussed above. This might include software for collecting platform data from publicly accessible endpoints (e.g., GoGettr), browser extensions that collect data as you scroll (e.g. Zeeschuimer), frameworks for simulating user behavior, or techniques for sharing existing datasets. It could also include new techniques, considerations/caveats, and resources for gathering data from decentralized social media sites (e.g. Mastodon) or walled gardens (e.g. WhatsApp, Discord, in-app browsers).


2) User-driven data collection: Tools and techniques that enable researchers to recruit or ask users for their data. For instance, work aiming to support “grassroots” data donation, citizen science, peer production, or passive data collection. This may also include techniques, considerations/caveats, and resources for obtaining participants, which has been a central challenge to such studies. Some aspects of the submissions relevant to this track may overlap with Track 1, and that is okay.


3) New ethical issues, or new discussions of old issues, with web scraping and web-scale computational social science. Submissions to this track might also add to the discussion of how and why the “post-API” and “walled gardens” era came to be, or argue for a particular future paradigm. Other questions that submissions in this track might tackle include (a) the re-use of old datasets when new data is hard to come by, and the implications that such re-use may have for future research, and (b) the use of “simulations”, especially those using LLM-powered agents, and the implications of turning to simulation-based work.


Archival and Non-archival Submissions


All tracks allow for archival or non-archival submissions. Archival submissions will be posted on the ICWSM website, but not as part of the AAAI proceedings. We especially encourage authors to share challenges and successes from difficult data-handling situations, such as data that might be required to be “dehydrated” or cases that might require some degree of plausible deniability about the specific data access strategy. To further encourage open sharing of tips and tricks within the community, we will also accept for review submissions intended to be published anonymously.


Participation


The data challenge is open to everyone. Given the nature of the discussions that may occur, this workshop will be held under the Chatham House Rule: participants are free to use the information received, but neither the identity nor the affiliation of the speaker(s) or any other participant may be disclosed. For the same reason, we encourage in-person attendance; the workshop will not be recorded or live-streamed, and there will be no hybrid option.


Submission Instructions


Submission Website: https://easychair.org/conferences/?conf=icwsmdc2024 


Submission Guidelines: Submissions should follow the formatting guidelines for ICWSM 2024.


Workshop Organizers

Data Challenge Chairs



Contact


Please send questions to data.challenge@icwsm.org 


Acknowledgements 


We are grateful to the past organizers of the ICWSM Data Challenge for sharing their templates and resources, and to the organizers of the Post-API conference, which was an inspiration for this year's theme.