Abstract - We present a vision-based method that recognizes human emotion directly from monocular modern dance image sequences in real time. The method exploits only the visual information within the image sequences and does not require cumbersome attachments such as sensors, which makes it easy to use and human-friendly.
As shown in the figure, our approach is divided into three parts.
To extract information from the given frames, we must first segment them. That is, we remove needless information and keep only the information concerning the desired object in each frame. In our work, we remove the background because it carries no useful information. Next, we extract meaningful features from the preprocessed frames. In general, more than one camera is needed to capture movement in three dimensions. However, we use only one camera and adopt the rectangle surrounding the moving object; that is, we track the movement of the rectangle instead of the object itself. This makes it impossible to track subtle motions, but it is well suited to extracting the elements of movement (space, time, energy) on which Laban's theory is based. In our experiments, we extract features such as the size of the box, the coordinates of the centroid, etc.
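As a minimal sketch of this preprocessing stage, the following Python/OpenCV code removes the background and extracts the bounding box and centroid of the largest moving region in each frame. The paper does not specify its segmentation algorithm, so the MOG2 background subtractor is used here only as a stand-in:

import cv2
import numpy as np

# Background subtractor as a stand-in for the paper's (unspecified) segmentation.
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

def extract_box_features(frame):
    """Return (x, y, w, h, cx, cy) of the largest moving region, or None."""
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((3, 3), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    dancer = max(contours, key=cv2.contourArea)   # assume the dancer is the largest blob
    x, y, w, h = cv2.boundingRect(dancer)         # the rectangle surrounding the object
    m = cv2.moments(dancer)
    if m["m00"] == 0:
        return None
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]  # silhouette centroid
    return (x, y, w, h, cx, cy)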
Without contour information, we could not discriminate between similar but contextually different human motions. We therefore introduced the number of dominant points on the boundary of the silhouette as a new feature. We use the Teh-Chin algorithm to detect the dominant points because it has shown reliable results even when the object is dynamically scaled or deformed.
With only a bounding box, we cannot discriminate between the two motions shown in the figure. In the right image, a slight motion of the dancer's left leg gives rise to new dominant points, which allows the two motions to be distinguished.
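OpenCV exposes the Teh-Chin chain approximation through the CHAIN_APPROX_TC89 flags of findContours, so a rough way to count dominant points on the silhouette boundary is sketched below, assuming a binary silhouette mask as input:

import cv2

def count_dominant_points(silhouette_mask):
    """Count dominant points on the outer silhouette boundary.

    The TC89_KCOS flag applies the Teh-Chin chain approximation, so the
    vertices it keeps serve as an estimate of the dominant points.
    """
    contours, _ = cv2.findContours(silhouette_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_TC89_KCOS)
    if not contours:
        return 0
    boundary = max(contours, key=cv2.contourArea)  # outer silhouette contour
    return len(boundary)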
We use PCA to capture the stochastic characteristics of the features. First, we apply SVD (Singular Value Decomposition) to the matrix containing the extracted features and obtain a contribution measure by the LMS (Least Mean Square) method. Using this measure, we calculate the principal values from the features.
Here, a denotes the vector of feature values and α the contribution measure.
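A minimal sketch of this step follows. The paper's LMS-based contribution measure is not specified, so the normalized singular values are used here only as a stand-in for α:

import numpy as np

def principal_values(A, k=3):
    """Project feature vectors onto their top-k principal directions.

    A : (n_frames, n_features) matrix of extracted features a.
    Returns the principal values and a contribution measure alpha
    (approximated here by normalized singular values; the paper derives
    alpha with an LMS fit, which is not reproduced in this sketch).
    """
    A_centered = A - A.mean(axis=0)            # remove the mean of each feature
    U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
    alpha = s / s.sum()                        # stand-in contribution measure
    principal = A_centered @ Vt[:k].T          # principal values of each frame
    return principal, alpha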
We use a TDMLP (Time-Delay Multi-Layer Perceptron) to classify the features. It is an inherently nonlinear classification scheme and exploits the time dependency between features, which makes it well suited to classifying dynamic data such as motion.
A set of features and its time-delayed copies are used as the input vectors.
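A minimal sketch of this classifier is given below. It approximates the TDMLP with scikit-learn's MLPClassifier applied to a tapped delay line of features; the delay depth and hidden layer size are assumptions, not values from the paper:

import numpy as np
from sklearn.neural_network import MLPClassifier

def tapped_delay_inputs(features, delay=4):
    """Concatenate each frame's features with its `delay` predecessors.

    features : (n_frames, n_features) array; frame t yields the input
    [f(t), f(t-1), ..., f(t-delay)], so the MLP sees the time dependency.
    """
    n, d = features.shape
    rows = [np.concatenate([features[t - i] for i in range(delay + 1)])
            for t in range(delay, n)]
    return np.asarray(rows)

# Hypothetical usage: X holds per-frame features, y the emotion label per frame.
# X_delayed = tapped_delay_inputs(X, delay=4)
# clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000)
# clf.fit(X_delayed, y[4:])   # labels aligned with the delayed inputs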
We obtained a recognition rate above 70% on sequences outside the training set. This compares favorably with human performance: people cannot recognize another person's emotion with that precision when they watch only his or her natural motion. In practice, we confirmed through a subjective evaluation with college students that people achieve a correctness of approximately 50~60%.