The financial crisis of 2007-8 was an extremely complex, worldwide event. The U.S. government's response to the crisis was arguably just as complex, but it is better documented. We invite researchers with natural language processing expertise to consider a corpus of reports, hearings, bills, and other transcripts related to the crisis. We have organized a research competition around the data and these questions:
In contrast to “shared tasks,” which are common exercises in the NLP community, an unshared task does not specify a quantitative performance measure for comparing solutions, nor does it specify what a solution might look like. Instead, the organizers provide data and an open-ended prompt. Participants are invited to explore the use of NLP methods to help scholars in political science, communications, and other related fields make sense of a large, complicated corpus. They can show what they can do in short papers describing exploratory research, with optional demos. We expect many such papers to present quantitative and qualitative analyses of existing NLP tools and systems applied to portions of the data, though new implementations are also welcome, as are newly processed datasets that may be more directly usable in future research projects.
Papers will be reviewed by a panel of judges. These judges will author public responses discussing the relevance of unshared task submissions and suggesting uses that may be unfamiliar to NLP researchers, as well as new research directions. Above all, the judges will emphasize the potential for future interdisciplinary research stemming from unshared task entries. In addition, the panel may present an award to the entry (or entries) with the greatest potential. Our hope is that new collaborations between NLP researchers and those with substantive interests in political science will develop as a result of the unshared task.
The following comprise the official data sources for the unshared task.
Finally, we suggest some other kinds of data that might be interesting, but which will require you to start from scratch. We ask that any data you gather be shared publicly with other researchers in its original and most useful formats.
Who is organizing this competition?
PoliInformatics leverages advances in computer science, machine learning, and data visualization to promote analyses of very large and unstructured datasets related to the study of government and politics. The PoliInformatics Research Coordination Network (PInet) is a working group funded by the National Science Foundation to build community and capacity for data-intensive research using open government data. PInet has focused its work on the 2007-8 financial crisis, government policy relating to the crisis, and public response to that policy. PInet has provided the data and the panel of judges who will respond to the unshared task entries.
The NLP unshared task in PoliInformatics is being organized by: