Introduction
In many surveillance videos, objects move only within a certain part of the frame. For example, objects usually move along a road in the centre of the frame but not on the surrounding grass. The video volume therefore has a block-sparse structure: the sparse component (the foreground) is constrained to appear only in certain regions.
Formulation
The formulation of this optimization problem is as follows:
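One way to write it, assuming the standard RPCA notation in which the data matrix M (one vectorized frame per column) is split into a low-rank background L and a block-sparse foreground S, is sketched below. The symbols M, L and the block index i are assumptions; only S, the norm subscript and the tuning factor a appear in the text, and the exact grouping of pixels into blocks is not specified.

```latex
\min_{L,\,S} \; \|L\|_{*} + a\,\|S\|_{2,1}
\quad \text{subject to} \quad M = L + S,
\qquad
\|S\|_{2,1} = \sum_{i} \big\| S_{B_i} \big\|_{2}
```

Here \|L\|_{*} is the nuclear norm (the sum of the singular values of L) and S_{B_i} denotes the entries of S that fall in block B_i.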
This is similar to the RPCA formulation, as it couples nuclear norm minimization with a sparsity-inducing norm. The ||S||_{2,1} norm acts as the sparsity-inducing term: it penalizes the sum of the l2 norms of the blocks, which drives entire blocks to zero and so reduces the number of blocks that contain non-zero elements. A tuning factor a can be adjusted to obtain the desired results.
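To illustrate why a block-wise l2,1 norm favours sparsity at the block level, here is a minimal sketch assuming non-overlapping square blocks. The function name block_l21_norm, the block size and the tiling scheme are illustrative assumptions; the text does not say how the blocks are defined.

```python
import numpy as np

def block_l21_norm(S, block):
    """Sum of the l2 (Frobenius) norms of non-overlapping block x block tiles of S.

    The square tiling is an assumption made for illustration only.
    """
    rows, cols = S.shape
    total = 0.0
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            total += np.linalg.norm(S[r:r + block, c:c + block])
    return total

# Two matrices with the same number of non-zeros and the same energy.
clustered = np.zeros((4, 4))
clustered[0:2, 0:2] = 1.0  # all four non-zeros sit in a single 2x2 block

scattered = np.zeros((4, 4))
for r, c in [(0, 0), (0, 2), (2, 0), (2, 2)]:
    scattered[r, c] = 1.0  # one non-zero in each of the four 2x2 blocks

print(block_l21_norm(clustered, 2))  # 2.0
print(block_l21_norm(scattered, 2))  # 4.0
```

The clustered matrix has the smaller norm even though both matrices contain four ones, so minimizing this quantity prefers foreground that is concentrated in few blocks.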
Results
Owing to our limited experience with optimization, we solved this problem with the CVX package, which did not seem suitable for large data. We therefore constructed a small, very low-resolution toy dataset whose background resembles the top view of a road with objects travelling along it. The moving objects appear only on the road, which makes the toy video block sparse.
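A toy video of this kind could be generated along the following lines. This is a sketch under stated assumptions, not the dataset actually used: the frame size, intensity values, road position, object size and the function name make_toy_video are all hypothetical.

```python
import numpy as np

def make_toy_video(height=16, width=16, frames=20, road=(6, 10), seed=0):
    """Build a low-resolution video with a static road and one moving object.

    All sizes and intensities are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    background = np.full((height, width), 0.3)   # "grass"
    background[road[0]:road[1], :] = 0.7         # horizontal "road" band
    video = np.repeat(background[None, :, :], frames, axis=0)

    # A single 2x2 object travels left to right and stays inside the road band.
    row = int(rng.integers(road[0], road[1] - 1))
    for t in range(frames):
        col = t % (width - 1)
        video[t, row:row + 2, col:col + 2] = 1.0

    # Stack each vectorized frame as a column: pixels x frames.
    M = video.reshape(frames, -1).T
    return video, background, M

video, background, M = make_toy_video()
foreground = video - background  # non-zero only where the object is

print(M.shape)  # (256, 20)
# Number of foreground pixels that fall outside the road band.
print(np.count_nonzero(foreground[:, :6, :]) + np.count_nonzero(foreground[:, 10:, :]))  # 0
```

Because every frame shares the same background, the background part of M has rank one, while the foreground is confined to the rows of the road band.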
The following images show frames from the toy video:
Fig: Background (Road top view)
Fig: Video frame with road and objects
Fig: Separation with left part showing objects and right part showing the background (road)
Here are the input videos and the separation video.
Fig: Input video
Fig: Separated video