Visualizing Data for Digital Humanities

Producing Semantic Maps with Information Extracted from Corpora and Other Media

Monday, 29 June 2015, 13:30-16:30, Building EA, University of Western Sydney, Parramatta Campus
Workshop at the 2015 Digital Humanities Conference (DH2015)

The goal of this workshop is to explore efficient methods to extract information from large quantities of text and produce meaningful and readable representations. The session will include formal presentations, demonstrations and discussions with the audience. Different kinds of texts will be considered, ranging from literary to social science texts, including technical corpora. Different visualisation techniques and toolkits will also be presented. The last part of the workshop will be devoted to hands-on activities: participants will be able to work with the workshop organizers to understand how to use relevant pieces of software and apply them to different kinds of data, including participants' own data.

The Challenge

We now face an information deluge, and experts in domains such as the social sciences and literary studies have long realized that available texts, and more generally the mass of data available through different media, constitute an important source of knowledge. However, computers cannot directly access the information encoded in texts: it must first be extracted, normalized, and structured in order to be usable. Meaningful representations must then be provided so that humans can use it. This whole process is not trivial, and many groups face the same problem: the information is there, available on the web or in more remote databases, but manipulating it is hard because it requires a complex process that is most of the time out of the hands of social scientists and experts in literary studies. See for example this quotation (from the Médialab in Paris), which is typical of the current situation:

Qualitative researchers (…) arrive at the Médialab bringing rich data and longing to explore them. Their problem is that qualitative data cannot be easily fed into network analysis tools. Qualitative data can have many different forms (from a video recording to the very memory of the researcher), but they are often stored in a textual format (i.e. interviews transcriptions, field notes or archive documents…). The question therefore becomes: how can texts be explored quali-quantitatively? Or, more pragmatically, how can texts be turned into networks? (Venturini and Guido, 2012, "Once Upon a Text")

The goal of this workshop is to address the question practically. It will include three presentations detailing some challenges and solutions. A large part of the workshop will be devoted to the presentation of practical tools and to discussion with the audience.
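
As a purely illustrative sketch of what "turning texts into networks" can mean in its simplest form, the Python snippet below builds a weighted co-occurrence network: two terms are linked each time they appear in the same sentence. This is not one of the tools presented at the workshop. The stop-word list, the function names and the sample text are all assumptions made for the example, and real corpora would call for proper linguistic preprocessing.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are"}


def tokenize(sentence):
    """Lowercase a sentence and return its distinct non-stop-word terms, sorted."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sorted(set(words) - STOPWORDS)


def cooccurrence_edges(text):
    """Count, for each pair of terms, the sentences in which both occur."""
    edges = Counter()
    # Naive sentence splitting on ., ! and ? (an assumption for this sketch).
    for sentence in re.split(r"[.!?]+", text):
        for pair in combinations(tokenize(sentence), 2):
            edges[pair] += 1
    return edges


# Made-up sample text, not taken from any workshop corpus.
sample = "Texts encode knowledge. Networks encode knowledge."
edges = cooccurrence_edges(sample)

# Each edge is (term, term) with a weight; print the heaviest edges first.
for (source, target), weight in edges.most_common():
    print(source, target, weight)
```

The resulting weighted edge list is the kind of structure that network analysis and visualisation tools typically take as input; sorting the terms inside each sentence ensures that a given pair is always counted under the same key.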