Group Activity Recognition Problem

Human activities can be classified into the following 4 types.

We previously discussed action recognition, where typically a single person performs a specific atomic action. In a more complex setup, a scene can contain several people performing an activity together. Group activity recognition addresses that setting.

The following 2 examples highlight the nature of the problem. In the first one, a group of people in a street are doing activities such as walking and standing. One can classify the scene based on what the majority are doing: e.g., if 6 people are walking and 2 are standing, this is a walking scene. This is called a collective activity.
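The majority rule from the street example can be sketched in a few lines of Python. This is only an illustration of the idea, not a method from any of the cited papers; the function name collective_activity and the plain-string action labels are assumptions made for the sketch.

```python
from collections import Counter


def collective_activity(person_actions):
    """Return the action performed by the largest number of people."""
    if not person_actions:
        raise ValueError("scene has no annotated people")
    # most_common(1) returns a one-element list [(action, count)].
    # On a tie, Counter keeps the action that was encountered first.
    action, _count = Counter(person_actions).most_common(1)[0]
    return action


# The scene from the text: 6 people walking, 2 standing.
scene = ["walking"] * 6 + ["standing"] * 2
print(collective_activity(scene))  # prints "walking"
```

A real system would first have to predict each person's action from video; here the per-person labels are simply given.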

The second example is a rally in a volleyball scene. Here, the high-level activity is determined by the main action taking place, i.e., a player on the left side is spiking. Therefore, we could label this scene as left_spike.
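In contrast to a majority vote, the volleyball label depends on one key player. A minimal sketch of such a rule is shown below. The Player type, the KEY_ACTIONS table, and every action name other than the left_spike label are hypothetical choices for illustration, not the annotation scheme of any particular dataset.

```python
from typing import List, NamedTuple, Optional


class Player(NamedTuple):
    side: str    # "left" or "right" half of the court (assumed encoding)
    action: str  # per-person action label, e.g. "spiking"


# Hypothetical table: key person-level action -> scene label suffix.
# Dict order doubles as priority: earlier entries win.
KEY_ACTIONS = {"spiking": "spike", "setting": "set"}


def scene_label(players: List[Player]) -> Optional[str]:
    """Label the scene after the first player found doing a key action."""
    for action, suffix in KEY_ACTIONS.items():
        for player in players:
            if player.action == action:
                return f"{player.side}_{suffix}"
    return None  # no key action visible in this scene


rally = [
    Player("left", "spiking"),
    Player("left", "standing"),
    Player("right", "blocking"),
    Player("right", "standing"),
]
print(scene_label(rally))  # prints "left_spike"
```

Note that most players here are standing or blocking, so a majority vote would give the wrong scene label; the key actor decides.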

As you can see: Group Activity = Function(list of the people's actions).

A recent deep learning technique by [1] is based on a 2-stage model, as follows:

Other deep learning works are [2], [6], and [7]. Work from the hand-crafted features era includes [4] and [5]. An older survey is [3].

Few people work on this problem; one of them is Prof. Greg Mori from Simon Fraser University. So it is a nice area to put effort into. One of the obstacles in this area is the lack of datasets, and even the existing ones are not large scale. The difficulty is that the problem needs 2 levels of annotation: per-person annotations and a scene activity annotation.
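To make the 2-level annotation concrete, here is a minimal sketch of what one annotated frame might look like as a data structure. The class names, field names, and the bounding-box convention are assumptions for illustration and do not reproduce the file format of any existing dataset.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PersonAnnotation:
    """Level 1: one entry per person in the frame."""
    box: Tuple[int, int, int, int]  # assumed (x, y, width, height) in pixels
    action: str                     # e.g. "walking", "spiking"


@dataclass
class FrameAnnotation:
    """Level 2: one scene-level label plus all person-level entries."""
    frame_id: int
    group_activity: str
    people: List[PersonAnnotation] = field(default_factory=list)

    def labels_needed(self) -> int:
        # One scene label plus one action label per person.
        return 1 + len(self.people)


frame = FrameAnnotation(
    frame_id=0,
    group_activity="left_spike",
    people=[
        PersonAnnotation(box=(120, 80, 40, 110), action="spiking"),
        PersonAnnotation(box=(300, 95, 38, 105), action="standing"),
    ],
)
print(frame.labels_needed())  # prints 3
```

The annotation effort grows with the number of people per frame, which is one reason such datasets tend to stay small.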

One of the older datasets is the "Collective Activity" dataset. A more recent one, by [1], is the volleyball dataset. A newer approach published by Google Research [2] finds the scene activity without explicitly learning the per-person annotations. They also publish a larger-scale dataset for basketball.


[1] A Hierarchical Deep Temporal Model for Group Activity Recognition

[2] Detecting events and key actors in multi-person videos

[3] Machine recognition of human activities: A survey

[4] Discriminative latent models for recognizing contextual group activities

[5] Social roles in hierarchical models for human activity recognition

[6] Structure Inference Machines: Recurrent Neural Networks for Analyzing Relations in Group Activity Recognition

[7] Deep Structured Models For Group Activity Recognition