Open Data Sets

Introduction

Research in general, and machine learning in particular, depends on big data. Red Hen Lab seeks to create Open Data Sets and to list here open data sets that might be useful for research in multimodal communication. For Red Hen, at least, "open" does not necessarily mean "public": some data sets may be available only to certain researchers, such as Red Hens, and only under certain research licenses.

Open Data Sets

  • ViMELF - The Corpus of Video-Mediated English as a Lingua Franca Conversations, Version 1.0.
    • Dataset: ViMELF contains 20 fully transcribed Skype conversations with gestures and pragmatic elements between 40 speakers from Germany (20 speakers), Spain (5), Italy (5), Finland (5), and Bulgaria (5), totaling 744.5 minutes (ca. 12.5 hours), with an average conversation length of 37.23 minutes. The corpus comprises 113 670 words in the plain text version and 152 472 items in the annotated version. The transcripts are available as .docx and .txt files; the anonymized videos in MPEG4 format. Several versions are available: the fully annotated pragmatic version as text and XML (XTranscript, Gee 2018), a lexical version (XTranscript, Gee 2018), and a POS-tagged version (auto-tagged with the CLAWS C7 tagset).
    • Website and further info: http://umwelt-campus.de/case
    • Access: ViMELF transcripts are freely available for non-commercial research purposes. If you would like to use the dataset, please register via the project website – you will then receive download instructions. The video and audio data is available separately for viewing/listening via a dedicated university server.
    • Project coordination: Stefan Diemer & Marie-Louise Brunner, Language & Communication, Trier University of Applied Sciences, Germany
    • Citation: To cite ViMELF in your own research, please use the following citation:
      ViMELF. 2018. Corpus of Video-Mediated English as a Lingua Franca Conversations. Birkenfeld: Trier University of Applied Sciences. Version 1.0. The CASE project [http://umwelt-campus.de/case].
    • Contact: sk@umwelt-campus.de
  • Red Hen Interview Gesture Collection (RHIGC)
    • Dataset: The RHIGC is based on 30 interviews from The Ellen DeGeneres Show, which were hand-annotated for gesture by Suwei Wu and Yao Tong at VU Amsterdam for their PhD projects under the supervision of Prof. Alan Cienki. It will contain video snippets of hand gestures (and possibly of similar shots without hand gestures). An alternative version with pre-annotated data generated with OpenPose may also be made available.
    • Project coordination: Yao Tong (VU Amsterdam) & Peter Uhrig (Universität Osnabrück/FAU Erlangen-Nürnberg).
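
For readers who want to work with OpenPose-style pre-annotations, the following is a minimal sketch of how one might check whether a frame contains detected hands. It assumes OpenPose's standard per-frame JSON output, in which each person carries flat lists of (x, y, confidence) triples under keys such as hand_left_keypoints_2d and hand_right_keypoints_2d. The format of any future RHIGC release is not specified here, so the function name, the confidence threshold, and the shortened sample frame are illustrative assumptions only.

```python
import json


def hands_detected(frame, min_confidence=0.2):
    """Return one {'left': bool, 'right': bool} dict per person in the frame.

    `frame` is a parsed OpenPose per-frame JSON object. The threshold
    `min_confidence` is an illustrative choice, not a recommended value.
    """
    results = []
    for person in frame.get("people", []):
        flags = {}
        for side in ("left", "right"):
            flat = person.get(f"hand_{side}_keypoints_2d", [])
            # Keypoints are stored as a flat list of x, y, confidence
            # triples, so every third value is a confidence score.
            confidences = flat[2::3]
            flags[side] = any(c >= min_confidence for c in confidences)
        results.append(flags)
    return results


# Hypothetical, shortened sample frame (real output has 21 keypoints per hand).
sample = json.loads("""
{
  "version": 1.3,
  "people": [
    {
      "pose_keypoints_2d": [],
      "hand_left_keypoints_2d": [10.0, 20.0, 0.9, 11.0, 21.0, 0.8],
      "hand_right_keypoints_2d": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    }
  ]
}
""")

print(hands_detected(sample))  # [{'left': True, 'right': False}]
```

A check of this kind could help separate snippets with visible hand gestures from similar shots without them, though confidence scores alone do not show that a gesture is being performed.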

Some Open Data Sets for Gesture Recognition

Hat tip to Søren Gran for this section!
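  • MSRGesture3D
    • Website: https://www.uow.edu.au/~wanqing/#Datasets
    • 2D or 3D: 3D
    • Number of gestures: 12
    • Number of people: 10
    • Actual video files available? No
    • Description: Gestures are from American Sign Language. Captured via a Kinect device.
  • MSRDailyActivity3D
    • Website: https://www.uow.edu.au/~wanqing/#Datasets
    • 2D or 3D: 3D
    • Number of gestures: 16
    • Number of people: 10
    • Actual video files available? No
    • Description: Gestures are everyday activities (e.g. drink, eat, read book). Captured via a Kinect device.
  • Kinect Gesture Data Set
    • Website: https://www.microsoft.com/en-us/download/details.aspx?id=52283&from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fum%2Fcambridge%2Fprojects%2Fmsrc12%2F
    • 2D or 3D: 3D
    • Number of gestures: 12
    • Number of people: 30
    • Actual video files available? Yes
    • Description: Captured via Kinect.
  • Two-Handed Datasets
    • Website: http://www-prima.inrialpes.fr/FGnet/data/04-TwoHand/main.html
    • 2D or 3D: 3D
    • Number of gestures: 7
    • Number of people: 7
    • Actual video files available? Yes
    • Description: 7 different two-handed gestures (rotations in all 6 directions and a "push" gesture). 4 people for training, 3 for testing.
  • CVRR-HANDS 3D
    • Website: http://cvrr.ucsd.edu/LISA/hand.html
    • 2D or 3D: 3D
    • Number of gestures: 19
    • Number of people: 8
    • Actual video files available? Yes
    • Description: Captured via Kinect. Focuses on driving-related motions.
  • Praxis Gesture Dataset
    • Website: http://riemenschneider.hayko.at/vision/dataset/task.php?did=452
    • 2D or 3D: 3D
    • Number of gestures: 29
    • Number of people: 64
    • Actual video files available? Yes
    • Description: Captured via Kinect v2. 2 types of gestures: dynamic (14) and static (15).
  • ChAirGest
    • Website: https://project.heia-fr.ch/chairgest/Pages/Overview.aspx
    • 2D or 3D: 3D
    • Number of gestures: 13
    • Number of people: 10
    • Actual video files available? Yes
    • Description: Captured via Kinect and Inertial Motion Units.
  • CHALEARN Multi-modal Gesture Challenge
    • Website: http://sunai.uoc.edu/chalearn/
    • 2D or 3D: 2D/3D
    • Number of gestures: 2742
    • Number of people: Unsure
    • Actual video files available? No
    • Description: Captured via Kinect. Common Italian gestures.
  • Sheffield Kinect Gesture dataset
    • Website: http://riemenschneider.hayko.at/vision/dataset/task.php?did=170
    • 2D or 3D: 3D
    • Number of gestures: 10
    • Number of people: 6
    • Actual video files available? Unsure
    • Description: Captured via Kinect. Hand gestures.
  • Sebastien Marcel Dynamic Hand Gesture Database
    • Website: http://www.idiap.ch/resource/gestures/
    • 2D or 3D: 2D
    • Number of gestures: 4
    • Number of people: 10
    • Actual video files available? No
    • Description: 2D hand trajectories in a normalized body-face space.
  • VIVA Hand Gesture Challenge Dataset
    • Website: http://cvrr.ucsd.edu/vivachallenge/index.php/hands/
    • 2D or 3D: 3D
    • Number of gestures: 19
    • Number of people: 8
    • Actual video files available? Yes
    • Description: Captured via Kinect. Real-world driving gestures.
  • Dynamic Hand Gesture 14/28 dataset
    • Website: http://www-rech.telecom-lille.fr/DHGdataset/
    • 2D or 3D: 2D/3D
    • Number of gestures: 14
    • Number of people: 20
    • Actual video files available? No
    • Description: Captured via an Intel RealSense depth camera. Hand gestures.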

