t-GRASTA

This webpage introduces t-GRASTA, a variant of GRASTA with geometric transforms. Our work was inspired by the insightful review comments on our CVPR 2012 paper on GRASTA; the reviewers pointed us to a nice approach to robust PCA that incorporates geometric transforms on each image frame: RASL. Working with my undergraduate student Dejiao Zhang (now a PhD student with Prof. Laura Balzano), we propose t-GRASTA (transformed GRASTA), an online transformed robust PCA algorithm. It iteratively performs incremental gradient descent constrained to the Grassmann manifold of subspaces in order to simultaneously estimate a decomposition of a collection of images into a low-rank subspace, a sparse part of occlusions and foreground objects, and a transformation of each image, such as a rotation or translation.

Our work can be regarded as an extension of both GRASTA and RASL: we extend GRASTA to handle transformations, and we extend RASL to the incremental gradient optimization framework, which makes t-GRASTA amenable to streaming and real-time applications. For example, in the case of camera jitter, the following figure shows that t-GRASTA successfully separates the foreground and the background while simultaneously aligning the perturbed images.
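Concretely, for each image I with unknown transformation \tau, t-GRASTA seeks subspace weights w and a sparse part s so that the warped image fits a low-rank background model. A compact restatement in the spirit of the paper's notation (see [1] for the exact formulation):

\min_{w, s, \tau} \|s\|_1 \quad \text{subject to} \quad I \circ \tau = U w + s,

where U is an orthonormal basis of the background subspace, treated as a point on the Grassmann manifold.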

Fig. 1 Video background and foreground separation by t-GRASTA despite camera jitter.

1st row: misaligned video frames generated by simulating camera jitter;

2nd row: images aligned by t-GRASTA;

3rd row: background recovered by t-GRASTA;

4th row: foreground separated by t-GRASTA;

5th row: background recovered by GRASTA;

6th row: foreground separated by GRASTA.

Update

[July 9, 2013]: Set up the project webpage. t-GRASTA v1.0 is released.

The Key Point

For each video frame I, the nonlinear geometric transform I \circ \tau makes it problematic to learn the low-rank subspace directly online. Here we approach this as a manifold learning problem, supposing that the low-dimensional image subspace under nonlinear transformations forms a nonlinear manifold, and we propose to learn the manifold approximately using a union of subspaces U_l, l = 1, ..., L. The locally linearized model for the nonlinear problem is:

I \circ \tau + J \Delta\tau = U_l w + s,

where J is the Jacobian of the warped image with respect to the transformation parameters \tau, w is the vector of subspace weights, and s is the sparse part of occlusions and foreground objects.

Then, for each misaligned image I with its unknown transformation \tau, we iteratively update the union of subspaces U_l, l = 1, ..., L, and estimate the transformation \tau.
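To make this concrete, below is a minimal NumPy sketch of the inner robust fit on the locally linearized model, solved with a plain ADMM loop. The function names, the fixed penalty rho, and the synthetic data are illustrative assumptions, not the released implementation, which embeds this fit inside the full t-GRASTA subspace and transformation updates.

import numpy as np

def soft_threshold(x, t):
    # Elementwise soft-thresholding: the proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def robust_fit(b, U, J, rho=1.0, n_iter=200):
    # ADMM sketch for:  min ||s||_1  s.t.  b + J @ dtau = U @ w + s,
    # where b is the vectorized warped image I \circ \tau and U has
    # orthonormal columns (which makes the w-update a simple projection).
    n, d = U.shape
    w, dtau = np.zeros(d), np.zeros(J.shape[1])
    s, y = np.zeros(n), np.zeros(n)            # sparse part, dual variable
    JtJ_inv = np.linalg.pinv(J.T @ J)
    for _ in range(n_iter):
        w = U.T @ (b + J @ dtau - s + y / rho)                 # fit weights
        dtau = JtJ_inv @ (J.T @ (U @ w + s - b - y / rho))     # refine warp
        s = soft_threshold(b + J @ dtau - U @ w + y / rho, 1.0 / rho)
        y = y + rho * (b + J @ dtau - U @ w - s)               # dual ascent
    return w, dtau, s

# Tiny synthetic check: a subspace point plus a few gross outliers.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((200, 5)))
J = rng.standard_normal((200, 3))
w_true, dtau_true = rng.standard_normal(5), rng.standard_normal(3)
s_true = np.zeros(200); s_true[:10] = 5.0
b = U @ w_true + s_true - J @ dtau_true   # so b + J @ dtau_true = U @ w_true + s_true
w, dtau, s = robust_fit(b, U, J)
print("dtau recovery error:", np.linalg.norm(dtau - dtau_true))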

Applications

Here we show that t-GRASTA can handle robust image alignment tasks and tackle the interesting problem of background and foreground separation despite video jitter. Please refer to our papers [1][2] for a detailed performance evaluation.

Robust Image Alignment

Fig. 2 Aligning handwritten digits taken from the MNIST database.

Robust BK/FG Separation Despite Camera Jitter

Fig. 3 Video background and foreground separation with jittered video

1st row: misaligned video frames randomly selected from artificially perturbed images;

2nd row: images aligned by t-GRASTA;

3rd row: background recovered by t-GRASTA;

4th row: foreground separated by t-GRASTA;

5th row: background recovered by GRASTA;

6th row: foreground separated by GRASTA.

Sidewalk.avi

Video 1. We apply the "fully online" mode of t-GRASTA to the real-world jittered video "sidewalk" from the "change-detection" dataset.

Al_Gore.avi

Video 2. We apply the "fully online" mode of t-GRASTA to the "Gore" dataset. Here, we simply crop the face from each image with a fixed 68 × 44 rectangle placed at the same position in every frame, so the jitter is caused by the target's motion and pose variation relative to the fixed crop rectangle.


Visual Tracking via Image Alignment

car4.avi

Video 3. We apply the "fully online" mode of t-GRASTA to visual tracking.

t-GRASTA Code Package

Package Files

The currently released t-GRASTA package v1.0, around 50 MB including the 5 datasets used in our paper, can be downloaded here.

In the code package, there are three slightly different variants of the algorithm -- "Batch mode", "Fully online mode", and "Trained online mode" -- corresponding to the algorithms of our paper [1]:

    • "Batch mode" - referring to tgrasta_batch_training:

      • Similar to RASL, this mode treats the unaligned images as a batch and iteratively learns the aligned subspaces, but the subspace learning method in each iteration follows GRASTA. We always use this approach to train the initial L subspaces for the other online-mode algorithms.

    • "Fully online mode" - referring to tgrasta_fully_online:

      • The main contribution of our work: this approach treats each video frame as streaming input and iteratively updates the union of subspaces (a minimal sketch of the underlying Grassmannian update step follows this list).

    • "Trained online mode" - referrring to tgrasta_trained_online:

      • The simplified version of t-GRASTA: this approach works when a well-aligned subspace is known a priori. For example, we can use "Batch mode" to train a good subspace, then apply this method to align the remaining images.
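For the subspace update shared by all three modes, here is a minimal NumPy sketch of a GRASTA-style incremental gradient step along a Grassmannian geodesic. The generic residual r and the constant step size eta are simplifying assumptions: GRASTA derives the residual from the augmented Lagrangian of its l1 fit and schedules the step size adaptively.

import numpy as np

def grassmann_step(U, w, r, eta):
    # Move the orthonormal basis U along the geodesic defined by the
    # rank-one gradient (outer product of residual r and weights w).
    sigma = np.linalg.norm(r) * np.linalg.norm(w)
    if sigma == 0.0:
        return U
    r_hat = r / np.linalg.norm(r)
    w_hat = w / np.linalg.norm(w)
    p = U @ w_hat                 # unit vector: current prediction direction
    return (U
            + (np.cos(eta * sigma) - 1.0) * np.outer(p, w_hat)
            + np.sin(eta * sigma) * np.outer(r_hat, w_hat))

# The update keeps U orthonormal when r is orthogonal to span(U):
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((100, 5)))
w, r = rng.standard_normal(5), rng.standard_normal(100)
r -= U @ (U.T @ r)                # project residual off the subspace
U_new = grassmann_step(U, w, r, eta=0.1)
print("orthonormality error:", np.linalg.norm(U_new.T @ U_new - np.eye(5)))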

Disclaimer: some basic image transform functions used in t-GRASTA depend on RASL; we include those functions in the subdirectory 'RASL_toolbox_2010'. Interested users should refer to the RASL authors' webpage for the latest version: http://perception.csl.illinois.edu/matrix-rank/rasl.html

License

The source code for the t-GRASTA package can be distributed and/or modified under the terms of the GNU Lesser General Public License (LGPL) as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

For other usage, please contact the authors.

References

[1] Jun He, Dejiao Zhang, Laura Balzano, and Tao Tao. "Iterative Grassmannian Optimization for Robust Image Alignment." Image and Vision Computing (Best of Automatic Face and Gesture Recognition 2013) 32.10 (2014): 800–813. doi:10.1016/j.imavis.2014.02.015.

[2] Jun He, Dejiao Zhang, Laura Balzano, and Tao Tao. "Iterative Online Subspace Learning for Robust Image Alignment." In IEEE Conference on Automatic Face and Gesture Recognition (FG), April 2013. (Oral)