Project 1

Object Removal on 3D models

Here is the pdf of the paper.

3D reconstruction from a single photograph is an attractive way to explore images. At the same time, foreground objects in a scene, such as people and cars, make such reconstruction cumbersome. We use user-generated foreground masks to remove these objects and aid in making better 3D models. By combining the reconstruction method of Saxena et al. [9] with a region filling method [1], we demonstrate improvement not only in the quality of reconstruction, but also in the efficacy of object removal. The novelty of the paper is in creating a seamless algorithm that jointly removes objects while producing 3D models.
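To make the role of the user-generated foreground mask concrete, here is a minimal sketch of mask-driven filling. It is not the region filling method [1] used in the paper: it swaps in a much simpler neighbour-averaging fill that works inward from the mask boundary. The function name `fill_masked_region` and the list-of-rows grayscale image format are assumptions made for illustration.

```python
def fill_masked_region(image, mask, max_passes=100):
    """Fill masked pixels by averaging already-known 4-neighbours.

    image: list of rows of floats (grayscale intensities).
    mask:  list of rows of bools, True where the foreground object is.
    Returns a new image; the inputs are left unchanged.
    Note: a simple stand-in, not the region filling method cited as [1].
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    unknown = [row[:] for row in mask]
    for _ in range(max_passes):
        updates = []
        for y in range(h):
            for x in range(w):
                if not unknown[y][x]:
                    continue
                # Collect neighbours that are inside the image and already known.
                vals = [
                    out[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w and not unknown[ny][nx]
                ]
                if vals:
                    updates.append((y, x, sum(vals) / len(vals)))
        if not updates:
            break  # nothing left to fill (or no known pixels to fill from)
        # Apply the whole layer at once so each pass peels one ring of the mask.
        for y, x, v in updates:
            out[y][x] = v
            unknown[y][x] = False
    return out
```

Each pass fills one ring of masked pixels from the surrounding known pixels, so the fill propagates from the boundary of the removed object towards its centre.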

The figure shows the high-level picture of what we are trying to achieve.

Saxena's Make3D paper, which is the basis for our 3D models, can be downloaded here.

The way it fits into our work is shown in this figure.

Some results showing how inpainting on 3D models does better.

Project 2

Monocular Visual SLAM by learning Optical Flow Subspaces

Here is the pdf of the paper.

This paper is a specific experiment building on the paper by Richard Roberts. The link to that paper is here. He used an EM algorithm to find the outliers in optical flow. This is important because outliers bias the system. In this paper, we explore doing the same thing using RANSAC to see whether it improves the performance of the algorithm.

Simultaneous Localization and Mapping (SLAM) involves simultaneously estimating the locations of newly perceived landmarks and the location of the robot itself while incrementally building a map of an unknown environment. Both SLAM (the more popular term in mobile robotics research) and Structure from Motion (SfM) in computer vision do the same job of estimating sensor motion (a camera in our case) and the structure of an unknown static environment. One of the important motivations behind this is to estimate the 3D scene structure and camera motion from an image sequence in real time so as to help guide robots. Thus visual SLAM takes as input the 2D motion from images and seeks to infer, in a fully automated manner, the 3D structure of the viewed scene and the camera locations where the images were captured. In this work, we propose an approach where we estimate the egomotion of the camera by learning the optical flow subspace and use this estimate as an odometry input for visual SLAM. Optical flow between images is often noisy, and as the number of outliers increases, the accuracy of the motion estimates drops. While Roberts et al. incorporate outliers into the generative model, the resulting Expectation Maximization (EM) procedure takes a long time to converge to the right estimate. We propose an approach where we use RANSAC to remove outliers and, by doing so, estimate the egomotion based purely on the inliers present in the system. We believe that this could lead to faster convergence of the EM step and give odometry estimates quickly. The output of the visual SLAM algorithm could then be used to build an occupancy grid map, suitable for purposes like robot motion planning and exploration.
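The RANSAC step described above can be sketched as follows. This is only an illustration under simplifying assumptions, not the implementation from the paper: it assumes a hypothetical two-parameter egomotion `v` and an already-learned linear subspace in which each scalar flow measurement equals `basis[k] · v`. The names `ransac_egomotion`, `basis`, `threshold`, and `iterations` are illustrative.

```python
import random


def solve2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule; None if near-singular."""
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        return None
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


def least_squares(rows, flows):
    """Least-squares fit of flow = row . v via the 2x2 normal equations."""
    s11 = sum(r[0] * r[0] for r in rows)
    s12 = sum(r[0] * r[1] for r in rows)
    s22 = sum(r[1] * r[1] for r in rows)
    t1 = sum(r[0] * f for r, f in zip(rows, flows))
    t2 = sum(r[1] * f for r, f in zip(rows, flows))
    return solve2x2(s11, s12, s12, s22, t1, t2)


def ransac_egomotion(basis, flow, threshold=0.05, iterations=200, seed=0):
    """Estimate a 2-parameter egomotion v from flow[k] ~ basis[k] . v.

    Returns (v, inlier_indices); v is None if no consensus set was found.
    """
    rng = random.Random(seed)
    n = len(flow)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: two measurements determine the two unknowns.
        i, j = rng.sample(range(n), 2)
        v = solve2x2(basis[i][0], basis[i][1],
                     basis[j][0], basis[j][1],
                     flow[i], flow[j])
        if v is None:
            continue
        # Consensus set: measurements the candidate motion explains well.
        inliers = [
            k for k in range(n)
            if abs(basis[k][0] * v[0] + basis[k][1] * v[1] - flow[k]) < threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        return None, best_inliers
    # Refit on the inliers only, so outliers no longer bias the estimate.
    v = least_squares([basis[k] for k in best_inliers],
                      [flow[k] for k in best_inliers])
    return v, best_inliers
```

The inlier set returned here is what would be handed to the subsequent estimation step, so that the outliers are excluded up front rather than being modelled inside EM.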

Natesh S,
Dec 18, 2011, 5:38 AM
Natesh S,
Dec 18, 2011, 5:52 AM