Analytical Digressions

Image credit: Prof. Hall's photo of notes about project development, May 8, 2023.

Digressions During Our Search

Our explorations initially took us in many directions. Many seemed fruitful, but several required more data, time, or analytical tools than we could access during the first semester of this project.

The Importance of Sounds

To what extent does the poem emphasize the sound of the first letters of Orlando's name: "or"?

The image to the left shows the number of times that the character pattern "or" appears in each octave in the first canto of the poem. The peaks represent an average of nearly one occurrence per line. After reviewing several of these plots, we arrived at important questions, chief among them:

Answering a question like this will require more digitized texts from the period (beyond just the most obvious competition, Torquato Tasso's Gerusalemme liberata from 50 years later).
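The per-octave count behind a plot like this can be sketched in a few lines of Python. This is only a minimal illustration, not the code we ran: the function name is hypothetical, it assumes a plain-text file in which octaves are separated by blank lines, and the sample text is a toy string rather than lines from the poem.

```python
import re

def count_pattern_per_octave(text, pattern="or"):
    """Count case-insensitive, non-overlapping matches of `pattern` in each stanza."""
    # Assumption: octaves are separated by one or more blank lines.
    octaves = [block for block in re.split(r"\n\s*\n", text.strip()) if block.strip()]
    regex = re.compile(re.escape(pattern), re.IGNORECASE)
    return [len(regex.findall(octave)) for octave in octaves]

# Toy input, not text from the poem: two short "stanzas" of two lines each.
sample = "Orlando for honor\nmore or less\n\nno match here\nforlorn Orlando"
counts = count_pattern_per_octave(sample)
print(counts)  # [5, 3]
```

Dividing each count by the number of lines in the octave would give the per-line average mentioned above, and running the same function over another poem would allow a side-by-side comparison.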

Untranslated Virtues

One digression pointed us to the challenges of working bilingually for text mining and analysis. One member of the group embarked on an ambitious exploration of virtue as part of our larger investigation of the moon:

The English translation took such liberties with the terms for these virtues that the preliminary results were unreliable. The inconsistencies created discrepancies that could not be resolved during analysis. For example, "greedy" appears 21 times in the English translation, but avido/a appears only twice in the Italian text. Ultimately, because these terms are so subject to cultural context, we do not recommend this approach in a bilingual setting.
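A comparison of this kind comes down to counting a term and its inflected forms in each language. The sketch below shows one way to do that; the function name and the two sample sentences are invented for illustration and are not drawn from either text.

```python
import re
from collections import Counter

def term_frequency(text, variants):
    """Count whole-word occurrences of any of the given variants, ignoring case."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    return sum(counts[v.lower()] for v in variants)

# Toy sentences, not quotations: one English word covers two different Italian words.
english = "the greedy king and the greedy knight"
italian = "il re avido e il cavaliere ingordo"

english_count = term_frequency(english, ["greedy"])
italian_count = term_frequency(italian, ["avido", "avida"])
print(english_count, italian_count)  # 2 1
```

Even in this toy case the counts disagree, because the translation renders two Italian words with a single English one. That is the same kind of mismatch that made our preliminary results unreliable.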

NLP Does Not Understand Epic Poetry

Hoping to avoid the work of cleaning the OCR of Reynolds' index, we thought that natural language processing (NLP) could identify characters even when they were referred to by pronouns or epithets.

On the left is an excerpt of the output from Stanford's CoreNLP model, which we used to infer people in the Rose translation of the Orlando furioso. Given results like "prepared", "reposing", "the", "like", "running", and "shoots", we had to abandon this work. Cleaning the data from Reynolds' index was much faster!

Epic Requires its Own Sentiment Dictionary

Many of our discussions revolved around how characters were being presented to readers in positive or negative lights. In addition to the virtues question raised above, we wondered about positive and negative descriptions of people and places. But positive and negative mean something very different in the context of the poem than they do in sentiment models built on customer reviews or crowd-sourced evaluations of modern English. We remain convinced that a fun iteration of this course would engage directly with this question and build a custom sentiment model!
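As a starting point, a custom sentiment dictionary can be as simple as a hand-built word list with polarities that fit the poem's values. The sketch below is an assumption about how such a model might begin, not something we built: the lexicon entries, their scores, and the function name are placeholders.

```python
import re

# Hypothetical, hand-assigned polarities; placeholders, not derived from the poem.
EPIC_LEXICON = {"valiant": 1, "courteous": 1, "faithless": -1, "cruel": -1}

def lexicon_score(passage, lexicon=EPIC_LEXICON):
    """Sum the polarity of every token that appears in the lexicon."""
    tokens = re.findall(r"\w+", passage.lower())
    return sum(lexicon.get(token, 0) for token in tokens)

# Toy sentence, not a quotation from any translation.
score = lexicon_score("The valiant knight met a cruel, faithless foe")
print(score)  # -1
```

A real version would need entries and weights agreed on through close reading, which is exactly the kind of discussion a future course could take up.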

Alternative Indexing of the Moon

https://docs.google.com/spreadsheets/d/1vobemxDKdPEbrISDcVHOcSQL-r_AIFJD8XjU9UH0DVA/edit?usp=share_link

The moon presented several difficult cases in which there is room for interpretation of the relevancy or "personhood" of certain items, people, and ideas within the poem. More often than not, Reynolds seemed to leave these out in order to create a more concrete index with more information and cross-listing per entry. As an experiment, we chose to take the 60-some octaves Astolfo and St. John spend on the moon and index them ourselves.

This index does not claim to be more complete, more accurate, or even more useful for the specific purposes of our project. Rather, re-indexing the moon was an attempt to capture some of the personifications and other gray-area subjects that Reynolds did not index. While the index for the octaves on the moon was interesting, especially in conversation with Reynolds', it ultimately proved too labor-intensive and redundant to continue for the rest of the poem.