Schedule

For information on speakers, see the participants page.

Friday, May 26, 2017

9:00 am

Registration & Continental Breakfast

Open to all.

10:00 am

"The Semantics of Race Under Empire"

Hoyt Long

This talk is an early effort to identify and characterize patterns of racial semantics present in Japanese fiction in the early 20th century. This period, marked by aggressive imperial expansion on the part of the Japanese state, coincided with the rise of new representations of race capable of justifying the power relations that resulted from such expansion. This "neo-racism," as scholars have described it, relied on various cultural or linguistic markers to identify racial difference and subsequently inscribe it into existing semiotic codes. Fiction provided a critical testing ground for the instantiation, repetition, and evolution of such codes.

Here, I draw on previous computational work on racial semantics to understand the broader contours of this ground. By applying word embedding models to a corpus of several thousand fictional works from Japan’s colonial period, I explore how racial otherness was encoded via sets of terms appearing adjacent to common ethnic markers (e.g., Japanese, Chinese, Korean, Westerner). I then use these trained models to address two important literary-historical questions. First, is there any correspondence between the racial semantics of Japanese writers and those found in Anglo-American fiction of the 19th century, particularly with regard to East Asia? Second, within Japanese fiction, are there notable differences in racial semantics across different genres of writing (e.g., high-brow, popular, leftist)? Answers to such questions will help to form the basis of a new understanding of how the literary semantics of race operate at scale.

11:00 am

"What Figures? Danish Literature as World Literature"

Mads Rosendahl Thomsen

It has long been possible to get a rough idea of the impact of individual authors from the sheer number of translations of their works, but access to information on such matters has vastly improved and expanded in the past decade. To give the contributors to the edited volume Danish Literature as World Literature a more nuanced picture of how the authors involved circulate, we collected information from a number of sources and made a series of comparisons with authors from Scandinavia and Europe. First and foremost, this included data on translations and library holdings gathered through UNESCO’s Index Translationum and worldcat.org, along with the vast and varied data collected by Google; Amazon Sales Rank and Goodreads complemented the picture of contemporary interest and gave a more nuanced understanding of which works are in particular demand today. This approach produced insights into a wider and stronger impact than one would typically assume, and into a much broader pattern of translations that can easily be forgotten. The global dissemination of Hans Christian Andersen’s work is astounding, even compared to that of authors from major literatures. However, the data also showed that the international reception of certain authors took place rather narrowly, and in particular that there is no guarantee that success in one European country will be noticeable in the other larger literatures of Europe, not to mention the world as such. In this presentation, I will reflect on further findings from this research as well as the methodological challenges involved in making sense of such a varied set of sources.

noon

Lunch

Lunch provided for participants and registered attendees.

1:00 pm

"The Jay Z Dataset"

Kenton Rambsy and Howard Rambsy II

Over the years, we have collaborated on projects that seek to address the divide between digital humanities and African American literary studies. We gathered quantitative data from a variety of sources in order to build and explore large bodies of artistic compositions by black writers and performers. One result was the development of our "Jay Z Dataset," a collection of information on the rap artist, including the lyrics from 189 songs comprising his 12 solo albums. For our presentation, we will discuss some of the main components of our dataset as well as activities for students that have emerged from the project. In addition, we will highlight the ways that building this Jay Z dataset has strengthened our abilities to address challenges and opportunities concerning the relationships between DH and African American literary studies. Overall, our presentation addresses the idea of integrating computational methods into hip hop studies and, more broadly, highlights the promise of thinking about Black Studies data projects as a way of addressing a challenging set of problems in the fields.

2:00 pm

"Beyond Poet Voice: Sampling the Vocal Performance Styles of 100 American Poets"

Marit MacArthur

In literary studies, and in polemics about poetry readings and performance styles, scholars and poets frequently opine about trends in poetry performance and their evolution. However, little quantitative research has been done on actual audio recordings of poetry readings, nor are literary scholars typically familiar with linguistic and neuroscientific perspectives on prosody and speech perception. Minimally, the acoustic features of performative speech, as in a recorded poem, include pitch and timing. Such features might be further defined as average pitch (the fundamental frequency of the human voice, as measured in hertz) and pitch range (often measured in octaves), intonation patterns, volume/intensity, speaking rate, tempo, rhythm, emphasis, and vocal timbre. This talk will share new research applying linguistic and computational methods of analysis to recordings of 100 American poets (drawn from PennSound, the Academy of American Poets website, Harvard’s Woodberry Poetry Room, and other sites) and 20 conversational speakers (from the Buckeye Corpus at Ohio State University). Some comparisons will also be made to performance styles in religious and political contexts, stand-up comedy, and film acting. While speculations will be offered in regard to cultural questions of expressivity (including gender, race/ethnicity, class, historical period, geographic region, venue, and media format), the emphasis will be on learning to interpret quantitative data about performative speech, and on exploring the potential and limitations of such analysis for sound studies and humanistic research on the audio archive.
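
As a rough illustration of two of the features named above, average pitch in hertz and pitch range in octaves, here is a minimal Python sketch. It assumes the fundamental-frequency track has already been extracted by some pitch tracker and that unvoiced frames are reported as 0; the function names and that convention are assumptions for illustration, not the speaker's actual method.

```python
import math


def voiced_frames(f0_hz):
    """Keep only frames with a positive F0 (assumption: 0 marks unvoiced frames)."""
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in the F0 track")
    return voiced


def mean_pitch_hz(f0_hz):
    """Average fundamental frequency, in hertz, over the voiced frames."""
    voiced = voiced_frames(f0_hz)
    return sum(voiced) / len(voiced)


def pitch_range_octaves(f0_hz):
    """Pitch range in octaves: log2 of the highest over the lowest voiced F0."""
    voiced = voiced_frames(f0_hz)
    return math.log2(max(voiced) / min(voiced))
```

A doubling of frequency corresponds to one octave, so a reading whose pitch moves between 100 Hz and 200 Hz spans exactly one octave under this definition.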

3:00 pm

Coffee

3:30 pm

"Quantitative Intertextuality: Analyzing the Markers of Information Reuse"

Walter Scheirer & Chris Forstall

A growing number of studies in the Digital Humanities take text reuse as a proxy for intertextuality, but how do we calibrate such quantitative metrics against the human experience of intertextual significance? In this talk, we demonstrate computational models designed to predict reader response to allusion in Latin epic and then examine their performance in present-day intertextual contexts.

We consider two case studies in social media: allusions by Twitter users to the HBO television show "Game of Thrones" and the novels on which it is based, and allusions to Christian and Islamic religious texts in amateur fiction on the online writing platform Wattpad. We consider variants of two popular feature-sets for intertext discovery: word-based n-grams and topic models via latent semantic indexing.

Results show that, although developed for a specific literary context distant in time and genre, existing tools function more or less "out of the box" to surface easily recognizable instances of intertextual behavior from data mined on the net. Differences that appear between the ancient and modern results shed new light on the strengths and weaknesses of particular features and on the facets of intertextual practice to which each is best suited.
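
The word-based n-gram feature set mentioned in this abstract can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the tokenizer, the function names, and the idea of reporting the raw set of shared n-grams are illustrative choices, not the presenters' tools or scoring method.

```python
import re


def word_ngrams(text, n=2):
    """Return the set of word n-grams in a text (lowercased, punctuation dropped)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(source, target, n=2):
    """N-grams that occur in both texts: candidate markers of reuse."""
    return word_ngrams(source, n) & word_ngrams(target, n)
```

For example, `shared_ngrams("The night is long and cold.", "She said the night is long!", n=3)` surfaces the trigrams the two toy sentences have in common. A real system would still need to weight or filter such matches, since common phrases overlap by chance.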

4:30 pm

"Object Lessons: Modeling Characters and Things in Novels"

David Bamman and Richard Jean So

In this paper, we develop a computational model to parse the story-based interactions between humans and objects within a large corpus of US novels from 1880 to 2000. The model tracks the following interactions: the "affect" attached to objects by both narrator and characters; the confusability of humans and objects within narrative; and the shifting role played by objects as the subjects of stories. With this model, we then identify clusters of texts with particular densities of these relations, pointing to a new orientation for US literary history across the 20th century, especially after 1950. Further, the model contributes to recent work in "Thing Theory" and "Object Oriented Ontology" on the evolving relationship between humans and objects in society, particularly in the past century.

5:30 pm

"Impact Assessment of Information Products and Data Provenance"

Jana Diesner

The emerging field of human-centered data science has led to several transformative advances in research and technology: With groups of people generating digital data, some social effects can be measured instead of having to be estimated. Also, the availability of such data may allow us to listen to people's signals instead of having to ask them questions. Finally, both the structure and content of human interactions can be considered for data analysis, and applying mixed methods to such data is becoming a routine approach.

These advances have broadened the scope of possibilities in impact assessment research, among other fields. I present our work on developing new computational solutions for identifying the impact of information products on people by leveraging theories from linguistics and the social sciences as well as methods from natural language processing and machine learning. I focus on a study where we developed and evaluated a theoretically grounded categorization schema, codebook, corpus annotation, and prediction model for detecting multiple practically relevant types of impact that documentary films can have on individuals, such as change versus reaffirmation of people’s behavior, cognition, and emotions. This work uses reviews as a form of user-generated content. We use linguistic, lexical, and psychological features for supervised learning, achieving an accuracy of about 81% (F1 score).

The outlined advances also imply several challenges: Verifying the accuracy of large-scale data is crucial for enabling collaborations, sharing data, and generating reliable results, but is challenging if the data provenance process lacks transparency. While choices about data collection, preparation, and analysis are increasingly embedded in datasets and technologies, we still have a poor understanding of the impact of these decisions on research results and further actions. I present our work on entity resolution of social network data, highlight the impact of common strategies and shortcomings on node and graph level properties, and discuss implications of biased results for decision and policy making.
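
For readers unfamiliar with the F1 measure cited in this abstract, the following Python sketch shows how it is computed for a single class from counts of true positives, false positives, and false negatives. The function name and the example counts are hypothetical; the abstract does not say how the reported figure was averaged across impact types.

```python
def f1_score(tp, fp, fn):
    """F1 for one class: the harmonic mean of precision and recall.

    Written in the equivalent closed form 2*TP / (2*TP + FP + FN).
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        raise ValueError("F1 is undefined when there are no positives at all")
    return 2 * tp / denominator
```

Because F1 ignores true negatives, it is often preferred to plain accuracy when some categories are much rarer than others.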

7:00 pm

Dinner

Dinner for speakers and invited guests at the Morris Inn.

Saturday, May 27, 2017

9:00 am

Continental Breakfast

Open to all.

10:00 am

"What is Cultural Analytics? Four Propositions"

Lev Manovich

11:00 am

"Urbanization and Geographic Attention in Twentieth-Century Fiction"

Matthew Wilkens

The rise in urbanization is one of the most striking and important demographic trends of the twentieth century. Globally, the fraction of people living in cities rose to 50% by 2000 from just over 10% in 1900; in the United States, the fraction more than doubled (to over 80%) in the same period. In both cases, however, there has been significant regional variation. This presentation uses data from the Textual Geographies project to assess the ways in which urbanization was reflected in (and shaped by) twentieth-century fiction in the United States, Great Britain, Germany, and selected other sites. It also assesses the temporal relationships between demographic, economic, and literary-geographic changes, with the aim of better understanding how cultural conditions interact with literary production.

noon

Lunch

Lunch provided for participants and registered attendees.

1:00 pm

"The Shape of Reading"

Mark Algee-Hewitt

This project joins other recent efforts in computational literary analysis to invert the longstanding research focus on either the formal features or the authorial creation of the text. Instead, it seeks to provisionally recover the critical reading practices of the nineteenth and twentieth centuries using sequence alignment to identify patterns of quotation in published periodicals of the period. By identifying the loci of critical attention - which parts of which novels are quoted in articles and reviews published in which periodicals - we study how practices of quotation and citation change over the course of the past two centuries in the wake of the professionalization of both literary criticism and reviewing. The results from the study allow us to interrogate the language of the 'quotable' and explore how the historically contingent selection processes reveal the changing criteria of 'representativeness' across time and between different types of publications and novels.
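
To give a sense of how quotation can be located computationally, here is a minimal Python sketch that finds the longest run of words shared by a novel passage and a review. It uses the standard library's `difflib.SequenceMatcher` as a simple stand-in for sequence alignment; the function name and the whitespace tokenization are assumptions, and the project described here may use a different alignment algorithm entirely.

```python
from difflib import SequenceMatcher


def longest_shared_passage(novel_text, review_text):
    """Return the longest contiguous run of words found in both texts."""
    novel_words = novel_text.lower().split()
    review_words = review_text.lower().split()
    # autojunk=False keeps frequent words from being ignored in long texts.
    matcher = SequenceMatcher(None, novel_words, review_words, autojunk=False)
    match = matcher.find_longest_match(0, len(novel_words), 0, len(review_words))
    return " ".join(novel_words[match.a:match.a + match.size])
```

Exact matching of this kind misses quotations that a reviewer has abridged or altered, which is one reason alignment methods that tolerate gaps and substitutions are used in practice.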

2:00 pm

"Bestsellers and Critical Favorites, 1850-1949"

Ted Underwood

An article modeling prestige in poetry 1820-1919 recently appeared in MLQ; this is the second part of that project, on fiction. Our sources for the reception of fiction are richer than for poetry, so we're able to add a new dimension to the project by contrasting different kinds of prominence. We look at a subset of texts reviewed in elite periodicals, but also a subset of bestsellers, and a subset of pulp magazines. Some aspects of change are the same in poetry and fiction: critical standards change slowly, and literature seems to move in the direction of critical judgment steadily across long timelines. But considering different kinds of prominence makes literary stratification more visible. When you look at any single measure of prominence, it's hard to detect Huyssen's "great divide" between elite and mass culture: critical favorites are not easier to pick out in the twentieth century than they were in the nineteenth; neither are bestsellers. But we do find clues that different forms of prominence diversified, and became less tightly correlated with each other, in the early decades of the twentieth century.

3:00 pm

Coffee

3:30 pm

Round table

On the place of cultural analytics in and around the humanities.