16-824 Visual Learning and Recognition Project
https://sites.google.com/andrew.cmu.edu/vlr-team11-zero-shot-detection/
For real-world tasks, computer vision systems must be able to reason about their environment beyond what their underlying models were trained on. Consider a human warehouse worker who picks items off a conveyor belt and places them into their corresponding packages, given only a single target image of the object of interest. Even if the worker has never seen the object before, they can still identify it on the conveyor belt and pick it up correctly. An object recognition system for a warehouse robot should be able to perform the same task. We frame this task as zero-shot object detection: we seek to train models that can locate a target object they have never seen before, given only an image of it at test time. In computer vision, attention has led to large performance gains in reasoning about images as a whole and across time. Much prior work has focused on textual descriptions of the target object; we instead provide a visual description and use attention to help the model identify salient features shared between the target image and the camera view.
Factory robots often need to grasp objects from a pile without a priori knowledge of which object is to be picked.
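To make the attention idea concrete, here is a minimal NumPy sketch of scaled dot-product cross-attention between a single target-image embedding (the query) and per-location scene features (the keys). This is an illustrative toy, not the project's actual model: the function name, feature dimensions, and the assumption that features come from some pretrained extractor are all hypothetical.

```python
import numpy as np

def cross_attention_saliency(target_feat, scene_feats):
    """Score each scene location by scaled dot-product attention
    between a target-image embedding and per-location scene features.

    target_feat: (d,) embedding of the target image (query)
    scene_feats: (n, d) embeddings of n scene locations (keys)
    Returns a softmax-normalized saliency map over the n locations.
    """
    d = target_feat.shape[0]
    # Scaled dot-product scores: one query against n keys.
    scores = scene_feats @ target_feat / np.sqrt(d)
    # Softmax over scene locations (shifted for numerical stability).
    weights = np.exp(scores - scores.max())
    return weights / weights.sum()

# Toy example: 4 scene locations with 3-d features; by construction,
# location 2 has the feature vector most aligned with the target.
target = np.array([1.0, 0.0, 0.0])
scene = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0],
                  [1.0, 0.0, 0.0],
                  [0.5, 0.5, 0.0]])
saliency = cross_attention_saliency(target, scene)
print(int(saliency.argmax()))  # index of the most salient location
```

In a full detector, the query would come from a feature extractor applied to the target image and the keys from a spatial feature map of the camera view, with the resulting saliency guiding box proposals.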