A person's social interaction during mealtimes is known to be an important indicator for detecting certain mental health problems such as eating disorders, depression, and PTSD. We propose an unobtrusive wearable device, combining an RGB camera and a thermal camera, that implements facial recognition, gaze-direction estimation, and mouth tracking to monitor social interaction.
Social interaction is generally defined as the process by which people act and react to those around them.
Research has shown that an individual's social behavior can reflect their mental state, and social interaction during mealtimes in particular is a strong indicator. Social anxiety disorder and eating disorders are known to be highly comorbid, and social appearance anxiety has a strong relationship with bulimic symptoms. Eating alone at mealtimes has also been associated with several health issues, including depression, eating disorders, and post-traumatic stress disorder.
Because these illnesses and disorders can be harmful, it is important to monitor whether a person is eating alone during mealtimes. Detecting this would make it possible to spot several health problems, including depression and eating disorders, earlier: doctors could look for these signals and intervene before the disorders develop.
The design and implementation of an unobtrusive wearable device combining RGB and thermal cameras that records RGB frames and thermal IR data, together with a computational approach for analyzing the recordings and classifying them as eating alone or not alone.
A specific experimental paradigm for testing our prototype on three broad scenarios for model training: eating alone, eating with social interaction, and eating in the presence of other people but without any interaction. The paradigm is designed to facilitate the acquisition of more informative facial and gesture features.
An in-the-wild evaluation of our prototype with one participant over two mealtimes, during which the participant wore the prototype and recorded their eating state (alone or not alone).
Overview of the modeling process. The device is worn on the chest, tied around the neck like a necklace or attached to clothing, with the cameras facing outward. First, we record the RGB and thermal video simultaneously. We reduce the recording rate to 5 FPS, cutting storage and computation costs while preserving measurement quality. The first stage operates on the thermal IR data to detect whether another person is in view. Once someone has been detected, we run the more computationally expensive OpenPose system on those frames and extract the facial landmarks it detects. Finally, we use an SVM to classify whether the person's mouth indicates talking and whether their face is directed toward the wearer. Because people at a meal are not continuously talking and looking directly at each other, the overall social interaction time must be predicted from the scattered talking time.
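The per-frame stages can be sketched in host-side Python. This is a minimal illustration rather than our exact implementation: the 30 °C skin threshold, the minimum warm-region size, the landmark indices, and the RBF-kernel SVMs are all assumptions made for the example, and the keypoints are assumed to follow OpenPose's 70-point face layout (obtained through its separately built `pyopenpose` bindings).

```python
import numpy as np
from sklearn.svm import SVC  # the two per-frame classifiers, fitted offline

SKIN_TEMP_C = 30.0      # assumed threshold separating skin from background
MIN_WARM_PIXELS = 20    # assumed minimum warm-region size for "person present"

def person_in_thermal(ir_frame):
    """Cheap first stage on the 32x24 thermal frame: flag frames whose
    image contains a warm region large enough to be another person."""
    return np.count_nonzero(np.asarray(ir_frame) > SKIN_TEMP_C) >= MIN_WARM_PIXELS

def face_features(kpts):
    """Toy feature vector from a 70-point OpenPose face detection
    (one row of x, y, confidence per point): mouth opening and mouth
    width, normalized by inter-ocular distance so scale cancels out."""
    eye_dist = np.linalg.norm(kpts[36, :2] - kpts[45, :2])  # outer eye corners
    mouth_h = np.linalg.norm(kpts[51, :2] - kpts[57, :2])   # lip centers
    mouth_w = np.linalg.norm(kpts[48, :2] - kpts[54, :2])   # lip corners
    return np.array([mouth_h / eye_dist, mouth_w / eye_dist])

def classify_frame(ir_frame, face_kpts, talking_clf, facing_clf):
    """Staged decision for one RGB/thermal frame pair. `talking_clf` and
    `facing_clf` are SVMs (e.g. SVC(kernel="rbf")) trained on labeled
    frames; `face_kpts` is OpenPose's output for the frame, or None."""
    if not person_in_thermal(ir_frame) or face_kpts is None:
        return {"person": False, "talking": False, "facing": False}
    x = face_features(face_kpts).reshape(1, -1)
    return {"person": True,
            "talking": bool(talking_clf.predict(x)[0]),
            "facing": bool(facing_clf.predict(x)[0])}
```

Gating on the thermal frame first keeps the expensive OpenPose pass off the frames in which no one is in view, which is typically most of them.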
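For the final step, one simple way to recover overall interaction time from intermittent talking frames is windowed voting: a window counts as interaction if enough of its frames were classified positive. The window length and threshold below are illustrative guesses, not values tuned in our system.

```python
import numpy as np

def interaction_seconds(per_frame_talking, fps=5, window_s=60, min_frac=0.2):
    """Aggregate scattered per-frame talking decisions (0/1) into an
    estimate of total social-interaction time in seconds: each window
    with at least `min_frac` positive frames is counted in full."""
    labels = np.asarray(per_frame_talking, dtype=float)
    step = fps * window_s
    total = 0.0
    for start in range(0, len(labels), step):
        if labels[start:start + step].mean() >= min_frac:
            total += window_s
    return total

# e.g. at 5 FPS a 30-minute meal yields 9000 labels; a burst of chat in
# 3 of the 30 one-minute windows would be reported as 180.0 seconds.
```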
Left: The main board and main components of our camera device; the MLX90640 thermal camera is at the top and the OpenMV camera below. We integrated the two sensors using a PCB shield;
Right: The 3D-printed case for our device, which lets it be worn on the front of the wearer's chest.
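On the device side, both sensors are readable from the OpenMV board's MicroPython runtime. The capture-loop sketch below is illustrative rather than our actual firmware: it assumes the MLX90640 is driven through OpenMV's `fir` module, and the file naming and the crude 5 FPS pacing are example choices.

```python
# OpenMV MicroPython: record paired RGB + thermal frames at roughly 5 FPS.
import sensor, fir, time, ujson

sensor.reset()
sensor.set_pixformat(sensor.RGB565)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)        # let the RGB sensor settle

fir.init(type=fir.FIR_MLX90640)      # MLX90640 on the shield's I2C bus

clock = time.clock()
frame_id = 0
while True:
    clock.tick()
    img = sensor.snapshot()                    # RGB frame
    ta, ir, to_min, to_max = fir.read_ir()     # ir: 32x24 temperature grid
    img.save("/rgb_%06d.jpg" % frame_id)
    with open("/ir_%06d.json" % frame_id, "w") as f:
        f.write(ujson.dumps(ir))
    frame_id += 1
    # crude pacing toward 5 FPS (200 ms per frame); real firmware
    # would use a hardware timer instead
    time.sleep_ms(max(0, 200 - int(clock.avg())))
```

Recording at 5 FPS rather than the sensors' native rates keeps both the on-device storage and the downstream OpenPose workload manageable.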
Aditi Agarwal
George Pan
Chixiang Wang
Xuan Zhang
* The authors are listed in alphabetical order.