Project Homepage of SP-EMD for Hand Gesture Recognition with Kinect

Q&A

Here we list some frequently asked questions related to SP-EMD and our system.

What is in-plane rotation?

In our work, in-plane rotation refers to hand rotation confined to the image plane with respect to the camera, as shown below. It therefore has little effect on the size of the fitted maximum inscribed circle.

What is out-of-plane rotation?

Out-of-plane rotation is rotation whose direction leaves the image plane, so the visible part of the hand changes dramatically, as shown below. It therefore has a larger impact on the size of the palm circle.

How to remove the wrist part from the segmented hand shape?

We use a distance-thresholding method to remove the visible wrist. All hand shapes are first rotated to the same orientation according to the hand, wrist and elbow joint points, so that the wrist always lies below the palm center. Then, all pixels lying farther below the palm center than a distance threshold (normally the radius of the palm circle) are cropped. An example of this step is shown in the figure below: (a) and (b) are the image before and after rotation, while (c) and (d) are the segmented hand shape before and after wrist removal, respectively.
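The cropping step can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not our actual implementation: the mask is assumed to be a binary array that has already been rotated so the wrist lies below the palm center (larger row index), and the function name remove_wrist and its parameters are hypothetical.

```python
import numpy as np

def remove_wrist(mask, palm_center_row, threshold):
    """Zero out all pixels more than `threshold` rows below the palm center.

    Assumes `mask` is a 2-D binary hand mask already rotated so that the
    wrist is below the palm center (image rows grow downward).
    `threshold` would normally be the radius of the palm circle.
    """
    cleaned = mask.copy()
    # First row that lies strictly farther than `threshold` below the center.
    cut_row = int(round(palm_center_row + threshold)) + 1
    cleaned[max(cut_row, 0):, :] = 0
    return cleaned

# Toy example: a vertical bar standing in for a hand plus forearm.
mask = np.zeros((10, 7), dtype=np.uint8)
mask[1:10, 2:5] = 1
cleaned = remove_wrist(mask, palm_center_row=4, threshold=2)
```

In this toy example, rows 7 to 9 are cleared while everything at or above row 6 is left untouched.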

What is the camera setup for view angle sensitivity test?

The samples at view angles of -20°, -10°, +10° and +20° are captured using the camera setup shown below.

Why is gesture 7 confused with gestures 6 and 9 in SP-EMD?

The confusion between gestures 7 and 6, and between gestures 7 and 9, is mainly caused by a particular habit of subject 4 when he performs gesture 7. As shown below, subject 4’s gesture 7 is quite different from that of the other 4 subjects. In particular, the figure shows that his middle, ring and little fingers are only half folded, as highlighted in yellow.

Such an “abnormal” gesture unfortunately generates additional finger-like superpixels and also leads to inaccurate ICP alignment. More precisely, gesture 7 of subject 4 may be wrongly recognized as gesture 9 of other subjects (instead of their gesture 7). These erroneous samples also degrade the recognition accuracy when they are used as templates in the proposed algorithm. For example, in L4O CV, these erroneous samples are the only available templates of gesture 7 when subject 4 is the training subject. As a result, other subjects’ gesture 7 may have a larger SP-EMD distance to subject 4’s gesture 7 than to his other gestures, such as gesture 6. This is why gesture 7 is confused with gesture 6 in some cases. However, this undesired confusion does not occur in LOO CV, because gesture 7 templates from 3 other subjects are always available.
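The difference in template availability between the two protocols can be checked with a short sketch. It assumes five subjects numbered 1 to 5, that L4O means four subjects are held out for testing (one training subject), and that LOO means one subject is held out (four training subjects); the helper leave_p_out_splits is hypothetical and is not part of our code.

```python
from itertools import combinations

SUBJECTS = [1, 2, 3, 4, 5]  # assumed numbering of the five subjects

def leave_p_out_splits(subjects, p):
    """Yield (train, test) subject lists with p subjects held out for testing."""
    for test in combinations(subjects, p):
        train = [s for s in subjects if s not in test]
        yield train, list(test)

l4o = list(leave_p_out_splits(SUBJECTS, 4))  # one training subject per fold
loo = list(leave_p_out_splits(SUBJECTS, 1))  # four training subjects per fold

# Number of template subjects other than subject 4 in each fold.
l4o_other = [len([s for s in train if s != 4]) for train, _ in l4o]
loo_other = [len([s for s in train if s != 4]) for train, _ in loo]

print(min(l4o_other), min(loo_other))
```

Under these assumptions, one L4O fold has subject 4 as its only template source, whereas every LOO fold contains at least three template subjects other than subject 4.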

More questions?

Let us know! ^_^
