We are using the “Technique du Peuple Annamite” dataset, which was commissioned by a French officer, Henri Oger. Vietnamese contributors must have been involved and would have brought their own cultural perspective to the project, but overall the dataset is dominated by the French colonial perspective, from the French captions (and, by extension, the English translations) to the selection of scenarios and images depicted.
We manually sorted all 4,000+ images and categorized them by stage of food preparation and enjoyment. The categories are eating food, food dishes, preparing dish, preparing dinner table, gathering food/materials, and cooking tools. We chose to label and organize the dataset by hand because we wanted more in-depth detail about each image than the original dataset provides. However, this makes our cleaning method subjective and possibly unreliable: due to time constraints, not every team member was able to categorize every image in the dataset.
This pie chart shows the distribution of the dataset. Clearly, most of the images were classified as "N/A", which means they depicted other aspects of Vietnamese life at the beginning of French colonization and the urbanization of Saigon.
The next biggest category was "cooking tools", which is what we focused on, since our project is about food practices and etiquette in Vietnam during the French colonial period.
This pie chart shows the distribution of the images that were classified in a specific food-related category, essentially everything except "N/A". We can tell that "cooking tools" is the most populated category, but "gathering food/materials" and "preparing dish" are close seconds.
In order to further visualize the connections in this dataset, we used a radial spanning tree.
This helps us see how the images in our dataset are connected and lets us explore patterns within the data.
We used this code to create the spanning tree, using the keyword "Exquisite Meal" as the center.
We chose this specific phrase because we found the accompanying picture to be a good representation of the topics our project focuses on. We referred to it as a "golden image" because it includes visuals of people eating, food dishes, table settings, etc.
Once the tree above was created, we went through and recreated the structure using the actual images to make it easier for us to understand. The images were also color-coded based on the previously discussed categories.
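As a rough illustration of the idea only (this is not the code used to produce our figure), a radial spanning tree can be built by a breadth-first search outward from a central node, where each node's depth gives the ring it sits on. The sketch below assumes the connections between images are already available as an adjacency dictionary. The function name `bfs_spanning_tree` and the toy graph are hypothetical; only the center label "Exquisite Meal" comes from our project.

```python
from collections import deque


def bfs_spanning_tree(graph, root):
    """Return {node: (parent, depth)} for every node reachable from root.

    graph is an adjacency dict: {node: [neighbouring nodes]}.
    depth is the number of edges from the root, i.e. the ring in a radial layout.
    """
    tree = {root: (None, 0)}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in tree:  # first visit fixes the parent, so no cycles
                tree[neighbor] = (node, tree[node][1] + 1)
                queue.append(neighbor)
    return tree


# Hypothetical toy graph: "image A/B/C" are placeholders, not real captions.
toy_graph = {
    "Exquisite Meal": ["image A", "image B"],
    "image A": ["Exquisite Meal", "image C"],
    "image B": ["Exquisite Meal", "image C"],
    "image C": ["image A", "image B"],
}

tree = bfs_spanning_tree(toy_graph, "Exquisite Meal")
for node, (parent, depth) in tree.items():
    print(f"ring {depth}: {node} (parent: {parent})")
```

Because each node keeps only the first parent that reaches it, the result has no cycles even though the toy graph does, which is what makes it a tree rather than a general network.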
Of the 85 images represented in the radial spanning tree (RST):
29 are "N/A"
18 are "preparing dish"
16 are "gathering food/materials"
11 are "eating food"
5 are "cooking tools"
3 are "preparing dinner table"
3 are "food dishes"
Several observations follow from this processed RST. While the largest single group of images is still "N/A", the next most represented category is "preparing dish". This differs from the overall distribution of the data: "cooking tools", the largest food-related category in the full dataset, is one of the less-represented categories in the RST.
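To make the comparison easier to check, the RST counts listed above can be turned into percentage shares, both across all 85 images and across the food-related images only. This is a minimal sketch that uses only the counts reported here; the variable and function names are our own choices for illustration.

```python
# Category counts for the images in the RST, as reported above.
rst_counts = {
    "N/A": 29,
    "preparing dish": 18,
    "gathering food/materials": 16,
    "eating food": 11,
    "cooking tools": 5,
    "preparing dinner table": 3,
    "food dishes": 3,
}


def shares(counts):
    """Convert raw counts into percentages of their own total."""
    denom = sum(counts.values())
    return {name: 100 * n / denom for name, n in counts.items()}


total = sum(rst_counts.values())

# Shares across every image in the RST.
all_shares = shares(rst_counts)

# Shares across food-related images only (everything except "N/A").
food_only = {name: n for name, n in rst_counts.items() if name != "N/A"}
food_shares = shares(food_only)

print(f"total images in RST: {total}")
for name in rst_counts:
    line = f"{name}: {all_shares[name]:.1f}% of all"
    if name in food_shares:
        line += f", {food_shares[name]:.1f}% of food-related"
    print(line)
```

The same `shares` function could be applied to the counts for the full dataset to compare the two distributions side by side.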