Abstract: Human tracking is one of the most challenging problems in computer vision. Past work has used efficient inference algorithms and motion models based on kinematic priors to address the problem. However, such algorithms fail, especially in cases of complex human poses and occlusion. Results can be improved by incorporating knowledge about the dynamics of the system into the inference algorithm. Little work has been done to incorporate system dynamics for accurate tracking of an articulated human subject. In this work, we propose the use of biomechanical and statistical motion models to track a single articulated human subject without using any motion capture data. A prior based on a 3D physical dynamics simulation, incorporating the control and dynamics of the motion, is introduced into the Bayesian framework. Such a prior not only takes into account knowledge of the range of motion of human joints, which is often pose dependent, but also helps establish the feasibility of a predicted motion. This plausibility of motion takes into consideration the person’s mass, interaction with the ground, self-collisions, etc., and the likelihood of a human motion for a given pose.
Human tracking is typically formulated as a Bayesian filtering problem, based on a Particle Filter (PF). In a PF, the posterior is approximated using a set of weighted samples (particles) and is computed recursively. In this work, we focus on developing a dynamics-based temporal prior that contributes to the posterior, as opposed to a first- or second-order linear dynamical system with Gaussian noise, which is often adopted because more realistic priors are unavailable. We assume that, for simulating the dynamics of the scene, the segment shapes, mass properties, collision geometries, and other associated parameters (e.g., direction of gravity) are known and remain constant throughout the motion sequence. We also treat the human as a loop-free articulated structure. Finally, Bayesian filtering (i.e., the PF) is employed with the proposed dynamics-based prior.
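To make the role of the temporal prior concrete, here is a minimal sketch of a bootstrap particle filter in which the prior is a pluggable function. It is not the tracker described above: the state is a toy point mass (height, velocity) rather than an articulated body, and every name in it (`particle_filter_step`, `constant_velocity_prior`, `dynamics_prior`, `gaussian_likelihood`, `track`, and the constants `DT` and `GRAVITY`) is hypothetical. It only illustrates how a linear-Gaussian prior can be swapped for one that respects gravity and ground contact.

```python
import math
import random

DT = 0.05        # assumed time step in seconds (illustrative)
GRAVITY = -9.81  # assumed known and constant, as in the text


def constant_velocity_prior(state, rng, sigma=0.05):
    """First-order linear model with Gaussian noise (the common default)."""
    y, v = state
    return (y + v * DT + rng.gauss(0.0, sigma), v + rng.gauss(0.0, sigma))


def dynamics_prior(state, rng, sigma=0.05):
    """Toy physics-based prior: gravity plus an inelastic ground contact."""
    y, v = state
    v = v + GRAVITY * DT + rng.gauss(0.0, sigma)
    y = y + v * DT
    if y < 0.0:
        # The ground is impenetrable, so infeasible states are never proposed.
        y, v = 0.0, 0.0
    return (y, v)


def gaussian_likelihood(z, state, sigma=0.1):
    """Unnormalized likelihood of observing height z given the state."""
    return math.exp(-0.5 * ((z - state[0]) / sigma) ** 2)


def particle_filter_step(particles, weights, z, propagate, likelihood, rng):
    """One predict / update / resample cycle."""
    # Predict: draw each particle from the temporal prior p(x_t | x_{t-1}).
    predicted = [propagate(p, rng) for p in particles]
    # Update: reweight by the observation likelihood p(z_t | x_t).
    new_w = [w * likelihood(z, p) for w, p in zip(weights, predicted)]
    total = sum(new_w)
    n = len(predicted)
    if total <= 0.0:
        new_w = [1.0 / n] * n  # all weights underflowed: fall back to uniform
    else:
        new_w = [w / total for w in new_w]
    # Resample (multinomial) and reset to uniform weights.
    resampled = rng.choices(predicted, weights=new_w, k=n)
    return resampled, [1.0 / n] * n


def track(observations, propagate, n_particles=300, seed=0):
    """Run the filter and return the posterior-mean height at each step."""
    rng = random.Random(seed)
    particles = [(1.0 + rng.gauss(0.0, 0.05), rng.gauss(0.0, 0.1))
                 for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for z in observations:
        particles, weights = particle_filter_step(
            particles, weights, z, propagate, gaussian_likelihood, rng)
        estimates.append(sum(p[0] for p in particles) / n_particles)
    return estimates
```

Calling `track(obs, dynamics_prior)` and `track(obs, constant_velocity_prior)` on the same list of height observations shows the design point: only the `propagate` argument changes, while the prediction, weighting, and resampling steps stay the same. With `dynamics_prior`, no particle is ever proposed below the ground plane, which is a toy analogue of ruling out physically infeasible poses.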
Research Issues:
How to reduce dependency on motion capture data?
How can we use dynamics-based models to reduce dependency on motion capture data?
How much physics?
What level of abstraction is effective for accurate tracking of the human pose?
Robustness:
Do physics-based models for vision need to be as robust and rich as those in robotics?
Inference:
How can pose and other unknown environmental parameters be inferred simultaneously in dynamics-based models? Alternatively, if possible, how can Bayesian filtering be used to reduce the dependency of dynamics-based models on such unknown parameters?
Publications
Conference/Journal
P. Agarwal, S. Kumar, J. Ryde, J. Corso, and V. Krovi, "Estimating Dynamics On-the-fly Using Monocular Video For Vision-Based Robotics", IEEE/ASME Transactions on Mechatronics, 2013. [IEEE Xplore]
P. Agarwal, S. Kumar, J. Ryde, J. Corso, and V. Krovi, "An Optimization Based Framework for Human Pose Estimation in Monocular Videos", International Symposium on Visual Computing, Rethymnon, Crete, Greece, July 16-18, 2012. [PDF] [Springer] [Videos]
P. Agarwal, S. Kumar, J. Ryde, J. Corso, and V. Krovi, "Estimating Human Dynamics On-the-fly Using Monocular Video for Pose Estimation", Robotics: Science and Systems Conference, University of Sydney, Sydney, Australia, July 9-13, 2012. [PDF] [RSS12] (Selected for the National ICT Australia (NICTA) Student Fellowship)
P. Agarwal, S. Kumar, J. Corso, and V. Krovi, "Estimating Dynamics On-the-fly Using Monocular Video", Dynamic Systems and Control Conference, California, October 12-14, 2011. [PDF] [ASME]
Book Chapters
P. Agarwal, S. Kumar, J. Ryde, J. Corso, and V. Krovi, "Estimating Human Dynamics On-The-Fly Using Monocular Video For Pose Estimation", in Robotics: Science and Systems VIII, P. Newman, N. Roy, and S. Srinivasa (Eds.), MIT Press, pp. 1-8, August 2013. [IEEE Xplore]
Theses
P. Agarwal, "Dynamics-based Human Pose Estimation Using Monocular Vision", Master's Thesis, State University of New York at Buffalo (SUNY Buffalo), 2012. [PDF]
Reports
S. Kumar, P. Agarwal, J. Corso, and V. Krovi, "Product of Tracking Experts for Human Tracking", European Conference on Computer Vision, Firenze, Italy, October 7-13, 2012.
P. Agarwal, "An Optimization Framework for Pose Estimation of Human Lower Limbs from a Single Image", Project Report, Optimization in Engineering Design. [Report] [Poster]
Interim Report [PDF]