Research

At Facebook, I work on computer vision and machine learning projects with real-world applications. In particular, I have built deep learning computer vision systems for e-commerce. I was deeply involved in the whole project cycle: planning and scoping in the early stages, implementation, and finally shipping the models into our products.

Before that, I was at Intel, where I was responsible for researching and designing efficient, accurate algorithms and implementing them in fully functional prototypes. For instance, I was part of a mobile augmented reality project for image matching and retrieval on mobile devices. I also worked on a 3D mobile augmented reality project, where we analyzed large amounts of images, 3D point clouds, and metadata to extract higher-level 3D information, perform image recognition, and register and track the camera as it moved through the physical world. Another project was a fully automatic system that takes input photos from many users, extracts the foregrounds and the perspective of each photo relative to a reference background (homographies), and creates new, appealing forms of media.

While at AUB, I advised four CS Master’s students on projects in computer vision, image processing, and pattern recognition:

  • Hani Masri (MSc 2012; senior architecture consultant at Murex)
  • Lama Affara (MSc 2013; 2013 Best CS Master’s thesis award; PhD student at KAUST)
  • Huda Nassar (PhD student at Purdue)
  • Lea Boutros

Additionally, I worked on a camera tracking project that uses depth and image data acquired by a Kinect.

Stabilized High-Speed Video from Camera Arrays

Presented "Stabilized high-speed video from camera arrays" at the Digital Photography and Mobile Imaging Conference on January 30, 2017 in Burlingame, California.

We present an algorithm for generating high-speed video from a camera array with good perceptual quality in realistic scenes that may have clutter and complex backgrounds. We synchronize the cameras so that each captures an image at a different time offset. The algorithm processes the jittery interleaved frames and produces a stabilized video. Our method consists of two stages: synthesizing views from a virtual camera to correct for differences in camera perspectives, and video compositing to remove the remaining artifacts, especially around disocclusions. More explicitly, we process the optical flow of the raw video to estimate, for each raw frame, the disparity to the target virtual frame. We feed these disparities into content-aware warping to synthesize the virtual views, which significantly alleviates the jitter. While the warping fills the disocclusion holes, the filling may not be temporally coherent, so a small amount of jitter remains visible in static or slow regions around large disocclusions. These regions, however, do not benefit from a high rate in high-speed video. Therefore, we take the low-frame-rate regions from only one camera and composite them with the remaining high-motion regions taken from all cameras. The final video is smooth and effectively has a high frame rate in high-motion regions.
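
To make the view-synthesis step concrete, below is a minimal Python/OpenCV sketch of the idea, not our actual implementation: Farneback dense optical flow stands in for our flow processing, a plain backward warp (cv2.remap) stands in for content-aware warping, and the compositing stage is omitted. The frame sizes, the function name, and the blending factor alpha are made up for illustration.

    import cv2
    import numpy as np

    def warp_to_virtual_view(raw_frame, reference_frame, alpha=1.0):
        """Warp raw_frame toward the reference camera's viewpoint.

        alpha scales the estimated disparity: 1.0 warps all the way to
        the reference view, smaller values stop at an intermediate
        (virtual) view. A stand-in for content-aware warping.
        """
        g_raw = cv2.cvtColor(raw_frame, cv2.COLOR_BGR2GRAY)
        g_ref = cv2.cvtColor(reference_frame, cv2.COLOR_BGR2GRAY)
        # Dense flow from the reference view to the raw view; acts as
        # the per-pixel disparity estimate between the two cameras.
        flow = cv2.calcOpticalFlowFarneback(
            g_ref, g_raw, None, pyr_scale=0.5, levels=3, winsize=21,
            iterations=3, poly_n=5, poly_sigma=1.1, flags=0)
        h, w = g_ref.shape
        xs, ys = np.meshgrid(np.arange(w), np.arange(h))
        # Backward warp: sample the raw frame along the scaled flow.
        map_x = (xs + alpha * flow[..., 0]).astype(np.float32)
        map_y = (ys + alpha * flow[..., 1]).astype(np.float32)
        return cv2.remap(raw_frame, map_x, map_y, cv2.INTER_LINEAR)

    # Toy usage with synthetic frames standing in for camera-array input.
    ref = np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8)
    raw = np.roll(ref, 3, axis=1)  # fake a 3-pixel horizontal parallax
    stabilized = warp_to_virtual_view(raw, ref)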

3D Mobile Augmented Reality

The fast growth of powerful mobile platforms and the constant generation of large amounts of data enable compelling new applications such as the augmentation of 3D scenes. We built a complete prototype to investigate the challenges of running a real-time 3D augmented reality system on a mobile device and the room for improvement.

Gabriel Takacs, Maha El Choubassi, Yi Wu, and Igor Kozintsev, "3D Mobile Augmented Reality in Urban Scenes," IEEE International Conference on Multimedia and Expo (ICME), Barcelona, Spain, July 2011.

Yi Wu, Maha El Choubassi, and Igor Kozintsev, "Augmenting 3D Urban Environment Using Mobile Devices," IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Basel, Switzerland, October 2011.

Creative Transformations of Images

We implemented a fully automatic system that processes the appearance and geometry information in simple photos and transforms them into much more interactive and expressive forms of media.
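
As a rough sketch of the geometric part of such a pipeline, the snippet below estimates the homography relating a photo to a reference background using ORB keypoints and RANSAC. This is a generic OpenCV recipe, not our system: the foreground extraction and media compositing are omitted, and all names, sizes, and thresholds are made up.

    import cv2
    import numpy as np

    def homography_to_reference(photo, reference, min_matches=10):
        """Estimate the homography mapping photo onto reference via
        ORB keypoint matches and RANSAC; None if too few matches."""
        orb = cv2.ORB_create(nfeatures=1000)
        kp1, des1 = orb.detectAndCompute(photo, None)
        kp2, des2 = orb.detectAndCompute(reference, None)
        if des1 is None or des2 is None:
            return None
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(des1, des2)
        if len(matches) < min_matches:
            return None
        src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
        return H

    # Toy usage: a synthetic "background" and a perspective-warped "photo".
    rng = np.random.default_rng(0)
    reference = np.zeros((240, 320, 3), np.uint8)
    for _ in range(40):  # draw random rectangles so ORB finds corners
        x, y = int(rng.integers(0, 300)), int(rng.integers(0, 220))
        color = tuple(int(c) for c in rng.integers(50, 255, 3))
        cv2.rectangle(reference, (x, y), (x + 20, y + 15), color, -1)
    H_true = np.array([[1.0, 0.02, 8.0], [0.01, 1.0, -5.0], [0.0, 0.0, 1.0]])
    photo = cv2.warpPerspective(reference, H_true, (320, 240))
    print(homography_to_reference(photo, reference))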

Yi Wu, Kalpana Seshadrinathan, Wei Sun, Maha El Choubassi, Joshua Ratcliff, and Igor Kozintsev, "Creative Transformations of Personal Photographs," IEEE International Conference on Multimedia and Expo (ICME), Melbourne, Australia, July 2012.

Mobile Augmented Reality in Static Images

We developed a fully functional end-to-end prototype that matches an image acquired by a mobile device against a database of half a million images and quickly retrieves relevant information, such as the corresponding Wikipedia page or matching Flickr images.

The system combines visual features (state-of-the-art keypoint descriptors) with contextual cues such as GPS and orientation data.
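
As an illustration of how such contextual cues can prefilter a visual search, here is a small Python sketch, not our actual pipeline: the brute-force matcher, the 500 m radius, the function names, and the database layout are all invented, and a real half-million-image index would use approximate nearest-neighbor search rather than exhaustive matching.

    import math
    import cv2
    import numpy as np

    def gps_distance_m(lat1, lon1, lat2, lon2):
        """Great-circle (haversine) distance in meters."""
        r = 6371000.0
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = (math.sin(dp / 2) ** 2
             + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
        return 2 * r * math.asin(math.sqrt(a))

    def retrieve(query_des, query_gps, database, radius_m=500.0):
        """Rank database entries by keypoint-match count after dropping
        entries farther than radius_m from the query's GPS position.
        database: list of (descriptors, (lat, lon), label) tuples."""
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        scored = []
        for des, gps, label in database:
            if gps_distance_m(*query_gps, *gps) > radius_m:
                continue  # contextual prefilter: skip far-away images
            scored.append((len(matcher.match(query_des, des)), label))
        return sorted(scored, reverse=True)

    # Toy usage with random ORB-like binary descriptors.
    rng = np.random.default_rng(1)
    def fake_des():
        return rng.integers(0, 256, (200, 32), dtype=np.uint8)
    db = [(fake_des(), (40.7128, -74.0060), "landmark_a"),
          (fake_des(), (48.8566, 2.3522), "landmark_b")]
    print(retrieve(fake_des(), (40.7130, -74.0055), db))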

Maha El Choubassi and Yi Wu, "Augmented Reality on Mobile Internet Devices Based on Intel Atom Technology," Intel Technology Journal, Vol. 14, No. 1, August 2010. [link]

Graduate Students’ Research

In his Master’s project, Hani implemented an image-based tour guide system for the AUB campus. The system lets people roaming the campus capture an image of a building with their mobile devices, identify the building of interest from visual information alone, and instantly retrieve valuable historical information.

Lama’s thesis is in progress and investigates the accuracy of image matching algorithms within a rigorous mathematical framework, using Bayesian decision theory and large deviations.
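
To give a flavor of that framework (a generic textbook illustration on my part, not the thesis’s actual formulation): if matching is cast as a binary hypothesis test on feature observations, large-deviations theory characterizes how quickly the Bayes error probability decays.

    % Matching as binary hypothesis testing on n feature observations:
    %   H_0: X ~ P_0 (images do not match)  vs.  H_1: X ~ P_1 (they match).
    % Large deviations give the exponential decay of the Bayes error:
    \[
      P_e^{(n)} \doteq e^{-n\, C(P_0, P_1)}, \qquad
      C(P_0, P_1) = -\min_{0 \le \lambda \le 1}
        \log \sum_{x} P_0(x)^{\lambda} P_1(x)^{1-\lambda},
    \]
    % where C(P_0, P_1) is the Chernoff information between the two
    % feature distributions.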

Huda and Lea are working on 3D reconstruction from Kinect data and on object recognition and tracking in video, respectively. These projects are in collaboration with Dr. Bernard Ghanem from King Abdullah University of Science and Technology (KAUST). In particular, Huda is spending the fall 2012 semester at KAUST.