The dataset was collected at the MIDAS (Multimodal Digital Media Analysis) Lab, Indraprastha Institute of Information Technology-Delhi, India. It was designed to facilitate research on typing through gestures, also referred to as touchless typing.
The database contains video recordings of 22 subjects typing three types of text: words, short phrases, and sentences. The goal was to capture the participants' head movements, both via cameras and via the motion sensors (accelerometer and gyroscope) built into a headband (Muse 2), while the users looked at the key clusters one by one. The layout of the on-screen keyboard is shown below (Figure 2), where the colors represent the different clusters of letters. In total, there are 9 clusters (8 key clusters and 1 for the spacebar).
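To make the text-to-cluster mapping concrete, here is a minimal Python sketch of how a typed string could be converted into the cluster sequence that the head gestures trace out. The letter groupings in CLUSTERS below are hypothetical placeholders, not the actual color-coded layout of Figure 2; only the overall structure (8 letter clusters plus the spacebar) matches the dataset.

```python
# Minimal sketch: convert typed text into a cluster-id sequence.
# NOTE: the letter groupings below are HYPOTHETICAL placeholders;
# the real assignment is defined by the color-coded keyboard (Figure 2).
CLUSTERS = {
    1: "qwer", 2: "tyui", 3: "op",
    4: "asdf", 5: "ghjk", 6: "l",
    7: "zxcv", 8: "bnm",
    9: " ",  # cluster 9: spacebar
}

# Invert to a letter -> cluster-id lookup table.
LETTER_TO_CLUSTER = {ch: cid for cid, letters in CLUSTERS.items() for ch in letters}

def text_to_clusters(text):
    """Map a word, phrase, or sentence to its sequence of cluster ids."""
    return [LETTER_TO_CLUSTER[ch] for ch in text.lower() if ch in LETTER_TO_CLUSTER]

# Under the placeholder layout above:
print(text_to_clusters("hello world"))  # -> [5, 1, 6, 6, 3, 9, 1, 3, 1, 6, 4]
```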
The dataset is available for FREE for public research. To download it, you will have to sign this agreement and upload the signed copy on this form. The link to the dataset and the readme file will be shared with you after a review by the organizers.
Contact Yaman Kumar in case of any issues.
The data collection setup consisted of three cameras placed on tripods at -45, 0, and 45 degrees, a virtual QWERTY keyboard displayed on a 17" screen, and a headband (Muse 2) worn by the participant. The camera at 0 degrees directly faced the participant. The three cameras captured the visual aspect of the participant's head movement, whereas the headband's accelerometer and gyroscope sensors captured the acceleration and rotation of the head, respectively.
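As a rough illustration of how the headband stream might be consumed, the sketch below loads accelerometer and gyroscope samples from a CSV file. The filename and column names are assumptions made for illustration only; the actual file format and schema are documented in the dataset readme.

```python
import numpy as np
import pandas as pd

# Sketch: load the Muse 2 motion-sensor recording for one session.
# ASSUMPTION: a CSV with per-sample accelerometer/gyroscope columns;
# the filename and column names here are hypothetical.
df = pd.read_csv("subject01_session1_sensors.csv")

acc = df[["acc_x", "acc_y", "acc_z"]].to_numpy()      # head acceleration
gyro = df[["gyro_x", "gyro_y", "gyro_z"]].to_numpy()  # head rotation rate

# Example feature: acceleration magnitude per sample.
acc_mag = np.linalg.norm(acc, axis=1)
print(acc.shape, gyro.shape, acc_mag.mean())
```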
The videos were recorded at 30 frames per second with a resolution of 1920x1080 pixels, framed to capture mainly the participant's torso. The recordings were made in a regular lab environment with adequate lighting.
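For sanity-checking the released recordings, a short OpenCV snippet can confirm the nominal frame rate and resolution. This assumes the videos are in a container OpenCV can read; the filename is hypothetical.

```python
import cv2

# Verify the nominal recording parameters: 30 fps, 1920x1080.
cap = cv2.VideoCapture("subject01_session1_center.mp4")  # hypothetical filename
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f"{width}x{height} @ {fps:.1f} fps")  # expected: 1920x1080 @ 30.0 fps
cap.release()
```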
We have released only the central (0-degree) view video and the sensor data, which will be used for the Grand Challenges competition.
Figure 1: The data collection environment, consisting of a monitor on which the virtual keyboard was displayed, three cameras placed on tripods recording the participant's head movement-based gestures, a Muse 2 headband worn on the participant's forehead, a moderator's laptop, and a laser light. All three cameras faced the participant.
Figure 2: The color-coded QWERTY keyboard that was displayed to the participants on a 17" monitor.
Table 1: Text to clusters: examples of a word, a phrase, and a sentence with their corresponding cluster sequences.
Table 2: List of all the words, phrases, and sentences that were typed by each participant. The exercise was repeated three times.