Sign Language Recognition Using LSTM/RNN/GRU

Project by Devin White, An Do, Aaron Daniel

Abstract

People who are deaf or hard of hearing use sign language to communicate with others. This can create a communication barrier when the other person does not know sign language. Research has been conducted extensively in this area, such as [1], which uses Recurrent Neural Networks and Convolutional Neural Networks to watch a user sign a word and accurately predict that word. Current methods use a Long Short-Term Memory (LSTM) layer for recurrent information and gesture prediction; however, LSTMs are computationally intensive. We propose that, for short phrases or words, a Gated Recurrent Unit (GRU) can work as well as, if not better than, the more computationally intensive LSTM, lowering training time while maintaining a similar inference time. Our results show that GRU networks can perform better under certain conditions and are quicker to train, at the cost of slightly longer inference time.
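The efficiency argument above can be made concrete with a parameter count: an LSTM layer has four gated weight blocks (input, forget, cell, output) while a GRU has only three (reset, update, candidate), so a GRU of the same width carries roughly 25% fewer recurrent parameters. The sketch below illustrates this; the input and hidden sizes are hypothetical, not the ones used in this project.

```python
def rnn_layer_params(input_size: int, hidden_size: int, num_gates: int) -> int:
    """Parameter count for a gated recurrent layer.

    Each gate has an input weight matrix (input_size x hidden_size),
    a recurrent weight matrix (hidden_size x hidden_size), and a bias.
    """
    per_gate = input_size * hidden_size + hidden_size * hidden_size + hidden_size
    return num_gates * per_gate

def lstm_params(input_size: int, hidden_size: int) -> int:
    # LSTM: input, forget, cell, and output gates -> 4 weight blocks
    return rnn_layer_params(input_size, hidden_size, num_gates=4)

def gru_params(input_size: int, hidden_size: int) -> int:
    # GRU: reset, update, and candidate gates -> 3 weight blocks
    return rnn_layer_params(input_size, hidden_size, num_gates=3)

if __name__ == "__main__":
    # Illustrative sizes only: e.g. 63 pose-keypoint features, 128 hidden units.
    lstm = lstm_params(input_size=63, hidden_size=128)
    gru = gru_params(input_size=63, hidden_size=128)
    print(f"LSTM params: {lstm}")   # 98304
    print(f"GRU params:  {gru}")    # 73728
    print(f"GRU/LSTM ratio: {gru / lstm:.2f}")  # 0.75
```

The 3:4 ratio holds for any layer size, which is the structural reason a GRU trains faster per step than an equally sized LSTM.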

[1] Real-Time Sign Language Gesture (Word) Recognition from Video Sequences Using CNN and RNN

[2] Real-Time Sign Language Detection using Human Pose Estimation