UTA Real-Life Drowsiness Dataset

"In honor of my grandfather (G.Hossein Zarif) who taught me to love unconditionally. May he be blessed by every life that is saved as a result of this dataset.": R.G

Description :

The University of Texas at Arlington Real-Life Drowsiness Dataset (UTA-RLDD) was created for the task of multi-stage drowsiness detection, targeting not only extreme and easily visible cases, but also subtle cases in which micro-expressions are the discriminative factors. Detecting these subtle cases is important for catching drowsiness at an early stage, so that drowsiness prevention mechanisms can be activated. Subtle micro-expressions of drowsiness have physiological and instinctive sources, so it is difficult for actors pretending to be drowsy to simulate such expressions realistically. UTA-RLDD is, to date, the largest realistic drowsiness dataset.

The RLDD dataset consists of around 30 hours of RGB videos of 60 healthy participants. For each participant we obtained one video for each of three different classes: alertness, low vigilance, and drowsiness, for a total of 180 videos. Subjects were undergraduate students, graduate students, and staff members who took part voluntarily or in exchange for extra credit in a course. All participants were over 18 years old. There were 51 men and 9 women, from different ethnicities (10 Caucasian, 5 non-white Hispanic, 30 Indo-Aryan and Dravidian, 8 Middle Eastern, and 7 East Asian) and ages (from 20 to 59 years old, with a mean of 25 and standard deviation of 6). The subjects wore glasses in 21 of the 180 videos and had considerable facial hair in 72 of the 180 videos. Videos were taken from different angles in different real-life environments and backgrounds. Each video was self-recorded by the participant, using their cell phone or web camera. The frame rate was always less than 30 fps, which is representative of the frame rate expected of typical cameras used by the general population.

Data Collection :

Sixty healthy participants took part in the data collection. Subjects were instructed to record three videos of themselves with their phone or web camera (of any model or type) in three different drowsiness states, based on the KSS table (Table 1), for around 10 minutes each, and to upload the videos along with their corresponding labels to an online portal provided via a link. Subjects were given ample time (20 days) to produce the three videos. Furthermore, they were given the freedom to record the videos at home or at the university, any time they felt alert, low vigilant, or drowsy, while keeping the camera setup (angle and distance) roughly the same. All videos were recorded at an angle at which both eyes were visible, with the camera placed within one arm's length of the subject. These instructions were intended to make the videos similar to those that would be obtained in a car, with a phone placed in a holder on the dashboard while driving. The proposed setup was to lean the phone against the display of their laptop while watching or reading something on the computer (Fig. 1). They were also asked to do the same task (reading, watching, or idle) in all three videos for consistency. The three classes were explained to the participants as follows:

1) Alert : Corresponds to one of the first three levels of the KSS table in Table 1. Subjects were told that being alert meant they were completely conscious and could easily drive for long hours.

2) Low Vigilant : As stated in levels 6 and 7 of Table 1, this state corresponds to subtle cases in which some signs of sleepiness appear, or sleepiness is present but no effort is required to stay alert. While subjects could possibly drive in this state, driving would be discouraged.

3) Drowsy : This state means that the subject needs to actively try not to fall asleep (levels 8 and 9 in Table 1). A small sketch of this level-to-class mapping follows this list.
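As a minimal Python sketch of the class definitions above, the function below (the name is ours, not part of the dataset) maps a KSS level to the numeric class labels used for the videos (0, 5, 10; see the Content section). KSS levels 4 and 5 fall between the "alert" and "low vigilant" definitions and are not assigned a class.

def kss_to_class(level):
    # Map a KSS level (1-9) to the UTA-RLDD class labels used for the videos.
    if 1 <= level <= 3:
        return 0     # alert
    if level in (6, 7):
        return 5     # low vigilant
    if level in (8, 9):
        return 10    # drowsy
    return None      # levels 4-5 are not covered by the three recorded classes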

Table 1 : KSS Table

Fig. 1 : Recommended Camera Set up for Video Recording

Content :

This dataset consists of 180 RGB videos. Each video is around 10 minutes long and is labeled as belonging to one of three classes: alert (labeled as 0), low vigilant (labeled as 5), and drowsy (labeled as 10). The labels were provided by the participants themselves, based on their predominant state while recording each video. This type of labeling takes into account and emphasizes the transition from alertness to drowsiness. Each set of videos was recorded with a personal cell phone or web camera, resulting in various video resolutions and qualities. The 60 subjects were randomly divided into five folds of 12 participants each, for the purpose of cross-validation. The dataset has a total size of 111.3 gigabytes.
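As an illustration only, the snippet below indexes the videos into (path, label, fold) triples. It assumes a hypothetical directory layout, one directory per fold and one subdirectory per participant, with each video file named by its class label (0, 5, 10); adjust the paths and extensions to match the actual download.

from pathlib import Path

LABELS = {"0": 0, "5": 5, "10": 10}    # alert, low vigilant, drowsy
VIDEO_EXTS = {".mp4", ".mov", ".avi"}  # assumed extensions; cameras differ

def index_dataset(root):
    # Return (video_path, class_label, fold_name) triples under the assumed
    # layout root/Fold*/<participant>/<label>.<ext>.
    samples = []
    for fold_dir in sorted(Path(root).glob("Fold*")):
        for video in fold_dir.rglob("*"):
            if video.suffix.lower() in VIDEO_EXTS and video.stem in LABELS:
                samples.append((str(video), LABELS[video.stem], fold_dir.name))
    return samples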

Fig. 2 : Sample frames from the UTA-RLDD dataset in the alert (first row), low vigilant (second row) and drowsy (third row) states.

Publishable Images

36 out of 60 participants allowed us to publish their faces in the paper or in any future publication resulting from this dataset. While these images can be shown, the identity of the subjects (which we do not provide) cannot be revealed in any way.



RLDD Dataset Videos :

You can access the data using this link:

http://vlm1.uta.edu/~athitsos/projects/drowsiness/



Instructions :

To make sure that the results of all research on this dataset are comparable, we encourage researchers to follow the instructions below when evaluating their results.

Use one fold of the UTA-RLDD dataset as your test set and the remaining four folds for training. After repeating this process for each fold, the results should be averaged across the five folds.
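A minimal sketch of this protocol is given below. Here train_model and evaluate are placeholders for whatever model and metric a study uses, and samples_by_fold is assumed to map each fold name to its list of (video, label) pairs, e.g. as produced by the indexing sketch above.

def cross_validate(samples_by_fold, train_model, evaluate):
    # Leave-one-fold-out evaluation over the five predefined folds.
    scores = []
    for held_out, test_set in samples_by_fold.items():
        train_set = [s for fold, items in samples_by_fold.items()
                     if fold != held_out for s in items]
        model = train_model(train_set)            # train on the other four folds
        scores.append(evaluate(model, test_set))  # test on the held-out fold
    return sum(scores) / len(scores)              # average across the five folds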

Citation

All documents (such as publications, presentations, posters, etc.) that report results, analysis, or research obtained using the UTA-RLDD dataset should cite the following research paper:

https://arxiv.org/abs/1904.07312

@inproceedings{ghoddoosian2019realistic,
  title={A Realistic Dataset and Baseline Temporal Model for Early Drowsiness Detection},
  author={Ghoddoosian, Reza and Galib, Marnim and Athitsos, Vassilis},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops},
  pages={0--0},
  year={2019}
}

Contact

For any questions regarding the UTA Real-Life Drowsiness Dataset (UTA-RLDD), please contact Reza Ghoddoosian at reza.ghoddoosian@mavs.uta.edu.