Machine Learning

Introduction

Multimodal annotation is time-consuming, and Red Hen is involved in a series of projects that recruit the help of computers to simplify the task. Computer scientists are actively developing tools for building new classifiers that encode the regularities and patterns in a particular set of manual annotations. Such classifiers can in turn be used to propagate the manual annotations automatically to a larger dataset. Automated gesture recognition of this kind can then generate new metadata that makes new forms of communication research possible.

The methods are imperfect, and the types of manual annotations are rich and varied, so high-quality classifiers typically need feedback from the user in an iterative learning process. One of Red Hen's goals is to integrate ELAN into such semi-supervised machine learning systems.
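
As a rough illustration of the propagate-and-review loop described above, the following minimal sketch trains a toy classifier on a few manual annotations, labels new items automatically when it is confident, and sets the rest aside for the annotator. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the two-number feature vectors, the labels "rest" and "gesture", and the `margin` threshold are hypothetical and do not describe any Red Hen pipeline.

```python
from math import dist


def centroids(labelled):
    """Return the mean feature vector per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in labelled:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def propagate(labelled, unlabelled, margin=0.5):
    """Auto-label confident items; return the uncertain ones for human review.

    An item counts as confident when its nearest centroid is closer than the
    second-nearest one by at least `margin` (a hypothetical threshold).
    Requires at least two distinct labels in `labelled`.
    """
    cents = centroids(labelled)
    auto, review = [], []
    for features in unlabelled:
        ranked = sorted((dist(features, c), label) for label, c in cents.items())
        (best_d, best_label), (second_d, _) = ranked[0], ranked[1]
        if second_d - best_d >= margin:
            auto.append((features, best_label))
        else:
            review.append(features)
    return auto, review


# Hypothetical manual annotations: (feature vector, label).
manual = [
    ((0.1, 0.2), "rest"),
    ((0.2, 0.1), "rest"),
    ((0.9, 0.8), "gesture"),
    ((0.8, 0.9), "gesture"),
]
# Hypothetical unannotated items from the larger dataset.
new_items = [(0.0, 0.1), (0.9, 1.0), (0.5, 0.5)]

auto, review = propagate(manual, new_items)
```

In a feedback loop, the annotator would label the items in `review`, those pairs would be appended to `manual`, and `propagate` would be run again on the remaining data.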

Machine learning projects

Red Hen is involved in a series of machine learning projects, and is rapidly developing new capacities in this area.

Chinese speech to text (ASR)
Gesture detection with OpenPose
Social trait analysis with Jungseock Joo and Song-Chun Zhu (since 2014). Weixin Li has set up a facial analysis pipeline on the Case HPC that uses Torch7.
iMotion project with Heiko Schuldt's team at the University of Basel (since Spring 2015)
Soumya Ray's graduate course project on timeline gestures (since Fall 2015); see Gesture detection 2017 and Tagging for Likelihood of Gesture Data, which involves not a classifier but a motion detector.

Red Hen mentored several projects in Google Summer of Code 2016, 2017, and 2018 focused on machine learning; see Summer of Code.

Resources

Deep learning frameworks

Facebook's wav2letter automatic speech recognition
DL4J vs. Torch vs. Theano vs. Caffe vs. TensorFlow -- tool comparison
TensorFlow
- TensorLayer -- Documentation
- TensorFlow tutorials
Torch -- a scientific computing framework with wide support for machine learning algorithms that puts GPUs first
DMLC for Scalable and Reliable Machine Learning
Theano -- Python-based deep learning
MXNet (about) -- deep learning
Caffe -- deep learning framework, mainly computer vision
Audiocaffe
WaveNet (released by Google December 2016)
Deep learning benchmarks

Reading

Yoshua Bengio, Ian Goodfellow, and Aaron Courville (2015). Neural Networks and Deep Learning.
Andrew Ng's Machine Learning course on Coursera
2015-06-11 Microsoft researchers tie for best image captioning technology
Christopher Manning (2015). Computational Linguistics and Deep Learning
2016-06-01 Deep Learning Trends @ ICLR 2016
fastML -- Machine learning made easy
Yann LeCun's Twitter
Lip reading with TensorFlow