MIRACL-VC1 is a lip-reading dataset that includes both depth and color images. It can be used in diverse research fields such as visual speech recognition, face detection, and biometrics. Fifteen speakers (five men and ten women), positioned in the frustum of an MS Kinect sensor, each utter a set of ten words and ten phrases ten times (see the table below). Each instance of the dataset consists of a synchronized sequence of color and depth images (both 640x480 pixels). The MIRACL-VC1 dataset contains 3000 instances in total.
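
The total follows directly from the counts above: 15 speakers x (10 words + 10 phrases) x 10 repetitions = 3000 instances. As a minimal sketch, the instances can be enumerated as shown here. The speaker IDs are the ones listed in the download links on this page; the labels "words" and "phrases", the constant names, and the function name are illustrative assumptions and do not necessarily match the naming used inside the dataset.

```python
from itertools import product

# Speaker IDs as listed in the download links: ten women (F..) and five men (M..).
SPEAKERS = [
    "F01", "F02", "F04", "F05", "F06", "F07", "F08", "F09", "F10", "F11",
    "M01", "M02", "M04", "M07", "M08",
]
KINDS = ("words", "phrases")  # hypothetical labels for the two utterance sets
ITEMS_PER_KIND = 10           # ten words and ten phrases
REPETITIONS = 10              # each item is uttered ten times


def enumerate_instances():
    """Yield one (speaker, kind, item, repetition) tuple per dataset instance."""
    for speaker, kind in product(SPEAKERS, KINDS):
        for item in range(1, ITEMS_PER_KIND + 1):
            for repetition in range(1, REPETITIONS + 1):
                yield speaker, kind, item, repetition


instances = list(enumerate_instances())
print(len(instances))  # 15 * 2 * 10 * 10 = 3000
```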

Details and results for visual speech recognition can be found in the following paper:

Ahmed Rekik, Achraf Ben-Hamadou, Walid Mahdi: A New Visual Speech Recognition Approach for RGB-D Cameras. ICIAR 2014: 21-28

The MIRACL-VC1 dataset is made available for research purposes only. To access the dataset, please follow the 15 links below, each of which corresponds to the data of one person. For more information, do not hesitate to send an email to Ahmed Rekik (rekikamed@gmail.com).

It is recommended to use the following link to access the full dataset on Google Drive:

Link to Google Drive location.

Otherwise, you can use the following links, one per speaker:

F01 - F02 - F04 - F05 - F06 - F07 - F08 - F09 - F10 - F11 - M01 - M02 - M04 - M07 - M08

Download the sensor calibration to align the depth maps with the color images.
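
As a rough illustration of what such an alignment involves, the sketch below maps a single depth pixel into color-image coordinates with a standard pinhole camera model. It is not the authors' procedure. The function name, the intrinsics `(fx, fy, cx, cy)`, the rotation `R`, the translation `t`, and all numeric values are placeholder assumptions; the real values must come from the calibration download, whose file format is not described here. Depth and translation are assumed to be in the same unit (metres in this example).

```python
def depth_pixel_to_color_pixel(u, v, z, depth_K, color_K, R, t):
    """Map depth pixel (u, v) with depth z to color-image coordinates.

    depth_K and color_K are (fx, fy, cx, cy) tuples, R is a 3x3 rotation
    (nested lists) and t a 3-vector taking depth-camera coordinates to
    color-camera coordinates. Returns (u_c, v_c), or None if the point
    falls behind the color camera.
    """
    fx_d, fy_d, cx_d, cy_d = depth_K
    fx_c, fy_c, cx_c, cy_c = color_K

    # 1. Back-project the pixel to a 3D point in the depth camera frame.
    point = ((u - cx_d) * z / fx_d, (v - cy_d) * z / fy_d, z)

    # 2. Apply the rigid transform into the color camera frame.
    xc, yc, zc = (
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )
    if zc <= 0:
        return None

    # 3. Project the 3D point onto the color image plane.
    return (fx_c * xc / zc + cx_c, fy_c * yc / zc + cy_c)


# Placeholder calibration (assumed values, NOT taken from the dataset).
K = (525.0, 525.0, 319.5, 239.5)
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
BASELINE = (0.025, 0.0, 0.0)  # hypothetical 2.5 cm horizontal offset

u_c, v_c = depth_pixel_to_color_pixel(320, 240, 1.0, K, K, IDENTITY, BASELINE)
print(u_c, v_c)
```

With these placeholder values, a point one metre away shifts horizontally by fx * 0.025 / 1.0 pixels, and the shift shrinks as depth increases, which is why a per-pixel mapping is needed rather than a fixed image offset.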


You are required to cite our work whenever publishing anything directly or indirectly using the data:


author = {Ahmed Rekik and Achraf {Ben-Hamadou} and Walid Mahdi},

title = {A New Visual Speech Recognition Approach for {RGB-D} Cameras},

booktitle = {Image Analysis and Recognition - 11th International Conference, {ICIAR} 2014, Vilamoura, Portugal, October 22-24, 2014},

year = {2014}, pages = {21--28} }

The folder architecture of the dataset is as follows: