Connections in Cultural Heritage and Biases in Museums
When we look at the marvellous pieces in museums, do we see the “tags” behind them? Cultural Heritage carries a myriad of labels. Although these labels cannot replace the meaning and value of Cultural Heritage itself, they show us how artifacts are connected. This project analyzes the textual descriptions of Cultural Heritage, counting keyword occurrences and visualizing them as word clouds, in order to find similarities and differences between cultures.
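The keyword-counting step can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function name `keyword_counts`, the short stop-word list, and the sample descriptions are all assumptions made up for the example.

```python
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "with", "for", "to", "from", "by"}

def keyword_counts(descriptions):
    """Count how often each non-stop-word appears across all descriptions."""
    counts = Counter()
    for text in descriptions:
        # Lowercase the text and keep alphabetic runs only.
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Made-up sample descriptions, used only to show the call.
sample = [
    "Bronze vessel with dragon motif",
    "Porcelain bowl with dragon and cloud design",
    "Bronze mirror from the Han dynasty",
]
counts = keyword_counts(sample)
print(counts.most_common(3))
```

A frequency table like `counts` is the kind of input a word-cloud renderer typically takes, so the visualization step can be layered on top without changing the counting logic.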
Copyright: © Valentina Cosentino
The Harvard Art Museums webpage after adding filter "Chinese".
The Harvard Art Museums website is visually appealing but lacks guidance for non-specialists in exploring Cultural Heritage. While the filter function helps users narrow their search, the image grid presentation of artifacts makes it hard to see the collection as a whole. To improve this, I plan to use the Categorical Scraping API to analyze the website’s CSV files. This will help me quickly identify relationships and patterns between artifacts, offering a more comprehensive and structured view of the collection, beyond the fragmented individual presentations.
Upon analyzing the CSV file containing information about the cultures represented in the Harvard Art Museums’ collections, several intriguing patterns emerge.
First, the museum appears to prioritize artifacts from well-known cultures, particularly those that were influential during the Renaissance period. Chinese, Japanese, European, Egyptian, Byzantine, and American cultures are heavily represented, with over 1,000 artifacts each. By contrast, cultures such as Lebanese, Turkestan, South Asian, and African are represented by far fewer artifacts. Some categories also overlap: Chinese, Japanese, and Korean cultures are listed separately alongside an overarching "East Asian" category that already encompasses them. This suggests a potential bias in how cultural groups are classified and represented within the museum's collection.
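A per-culture count of this kind could be produced along the following lines. This is only a sketch under stated assumptions: the column name `culture`, the helper names, and the tiny sample CSV are hypothetical, and the threshold is set to 2 here so the toy data splits the way the 1,000-artifact cut-off does in the text.

```python
import csv
import io
from collections import Counter

def count_by_culture(csv_file, column="culture"):
    """Count rows per culture; `column` is an assumed header name."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        culture = (row.get(column) or "").strip()
        if culture:  # skip rows with no culture recorded
            counts[culture] += 1
    return counts

def split_by_threshold(counts, threshold):
    """Separate cultures with more than `threshold` artifacts from the rest."""
    heavy = {c: n for c, n in counts.items() if n > threshold}
    light = {c: n for c, n in counts.items() if n <= threshold}
    return heavy, light

# Made-up sample data; not taken from the museum's files.
SAMPLE_CSV = """title,culture
Vase,Chinese
Scroll,Chinese
Bowl,Chinese
Print,Japanese
Screen,Japanese
Jar,Korean
"""

counts = count_by_culture(io.StringIO(SAMPLE_CSV))
heavy, light = split_by_threshold(counts, threshold=2)
print(heavy, light)
```

With a real export, the same functions would take an opened file instead of `io.StringIO`, and `threshold=1000` would reproduce the split described above.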
To further explore this, I used the Categorical Scraping API to isolate artifacts from the "Chinese" culture in the 20th century. Initial statistical results reveal a significant disparity in artifact viewership, with one object consistently receiving more attention than others. This could be attributed to its more frequent display in the museum, leading to higher visibility and interest among visitors.
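The filtering and viewership comparison described here might look something like the sketch below. It does not use the Categorical Scraping API itself; it works on rows already loaded from a CSV. The column names `culture`, `century`, and `pageviews`, the function names, and the sample rows are assumptions for illustration only.

```python
import csv
import io

def filter_objects(rows, culture, century):
    """Keep rows whose culture and century match exactly."""
    return [r for r in rows if r["culture"] == culture and r["century"] == century]

def top_view_share(rows, views_column="pageviews"):
    """Return the most-viewed row and its share of all views in `rows`.

    Expects a non-empty list of rows.
    """
    total = sum(int(r[views_column]) for r in rows)
    top = max(rows, key=lambda r: int(r[views_column]))
    share = int(top[views_column]) / total if total else 0.0
    return top, share

# Made-up sample data; titles and view counts are placeholders, not museum figures.
SAMPLE_CSV = """title,culture,century,pageviews
Object A,Chinese,20th century,900
Object B,Chinese,20th century,60
Object C,Chinese,20th century,40
Object D,Japanese,20th century,500
Object E,Chinese,19th century,300
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
chinese_20th = filter_objects(rows, "Chinese", "20th century")
top, share = top_view_share(chinese_20th)
print(top["title"], f"{share:.0%}")
```

Reporting the top object's share of views, rather than its raw count, makes the disparity comparable across cultures of different collection sizes.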
Next, I explored chronological biases in the heritage of particular cultures, focusing on the "Chinese," "Japanese," and "Korean" cultures in the Harvard Art Museums' collection.
The program first filters all the artifacts associated with these cultures. The analysis shows that the number of artifacts from these three cultures significantly exceeds the average for other cultures in the collection. Japanese and Chinese artifacts are particularly well represented, with Chinese artifacts outnumbering Korean ones. The statistical graph further shows that accessions of Japanese works peaked around 1933, while accessions of Chinese artifacts spiked in 1943. These patterns may reflect historical events: Japan’s military expansion during the 1930s likely drew increased global attention to its culture, while the Second Sino-Japanese War (1937-1945) may have displaced Chinese artifacts, some possibly taken by retreating military forces.
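The accession-year tally behind such a graph can be sketched as below. This is a hedged illustration rather than the program used in the project: the field names `culture` and `accessionyear`, the function names, and the sample records (with arbitrary placeholder years) are assumptions.

```python
from collections import Counter, defaultdict

def accessions_per_year(records, cultures):
    """Tally accessions per year for each culture of interest."""
    per_culture = defaultdict(Counter)
    for rec in records:
        culture = rec.get("culture")
        year = rec.get("accessionyear")
        # Skip other cultures and records with no accession year.
        if culture in cultures and year:
            per_culture[culture][int(year)] += 1
    return per_culture

def peak_year(year_counts):
    """Return (year, count) for the year with the most accessions."""
    return max(year_counts.items(), key=lambda kv: kv[1])

# Made-up records with placeholder years; not real accession data.
records = [
    {"culture": "Japanese", "accessionyear": "1901"},
    {"culture": "Japanese", "accessionyear": "1901"},
    {"culture": "Japanese", "accessionyear": "1950"},
    {"culture": "Chinese", "accessionyear": "1902"},
    {"culture": "Chinese", "accessionyear": "1902"},
    {"culture": "Chinese", "accessionyear": "1920"},
    {"culture": "Korean", "accessionyear": "1960"},
    {"culture": "Korean", "accessionyear": ""},
    {"culture": "Egyptian", "accessionyear": "1900"},
]

per_culture = accessions_per_year(records, {"Chinese", "Japanese", "Korean"})
peaks = {culture: peak_year(years) for culture, years in per_culture.items()}
print(peaks)
```

The per-year counters in `per_culture` are also what a line or bar chart would plot, so peak detection and visualization can share the same intermediate data.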
By leveraging the Categorical Scraping API, this project demonstrates how computational tools can help reveal hidden patterns and structural biases in digital cultural heritage collections. Through the systematic extraction and analysis of metadata from museum databases, we can identify imbalances in cultural representation, such as the overrepresentation of dominant cultures and the marginalization of others, as well as temporal and categorical inconsistencies in how artifacts are classified.
This approach highlights the power of data-driven methods in critically engaging with institutional narratives. Rather than passively accepting what museums present, the Categorical Scraping API allows researchers to reframe and reinterpret collections at scale, offering a more holistic and inclusive view of cultural heritage. In doing so, it not only exposes biases embedded in curation and digitization practices but also opens new possibilities for transparency, accountability, and more equitable storytelling in the digital age.