Dataset

News & Announcements

  • 7/18/2017 - We've released version 4.0.1 of the API and made some small changes to trainval_merged.json. To update:
    1. run git pull in the detail-api directory,
    2. delete your existing trainval_merged.json,
    3. re-run python3 download.py trainval_withkeypoints /your/directory, and
    4. re-run python3 download.py pascal /your/directory to download the test images in addition to the trainval ones.
  • 7/17/2017 - We have published trainval_merged, which includes 8 tasks in one JSON file, and we've released version 4.0 of the Detail API. Now you can access image classification, object detection, semantic segmentation, instance segmentation, part segmentation, and objectness estimation with one consistent API!
  • 7/11/2017 - The next dataset, trainval_merged, is in the works. It will provide data for image classification, object detection, semantic segmentation, instance segmentation, part segmentation, objectness estimation, occlusion recognition and boundary detection in a single JSON file. Stay tuned, and continue contacting pascalindetail@gmail.com with your questions.
  • 7/6/2017 - Another two tasks, instance segmentation and part segmentation, have been released with Python and MATLAB APIs.
  • 6/27/2017 - The competition has been officially announced! The Detail API has been updated to version 2.2, the list of tasks has been finalized, and we've updated the website. Good luck!
  • 6/16/2017 - The first dataset for 2 tasks, semantic segmentation and object detection, has been released with Python and MATLAB APIs. Evaluation scripts and a submission system are not yet available, but are on the way. Stay tuned!

DOWNLOADS

The Detail API

Our Python API for working with the raw JSON data, called the Detail API, is available on GitHub. The current version, 4.0.1, does not provide an evaluation script for the dataset, but one will be added in the future. Any issues with the API may be reported as a GitHub issue.

To download the API to a directory named detail-api:

To upgrade from an older version:

git pull

See the README of the repo for instructions on how to get set up with the API in Python 2 or 3.


Images

We use about 10,100 of the training and validation JPEG images from the PASCAL VOC 2010 challenge and about 4,100 of the test images. These images are available on the official PASCAL VOC website as part of the challenge's toolkit, VOCdevkit. We will keep the test annotations a secret in order to prevent participants from overfitting to the set we use for evaluation.

The Detail API provides a Python script for downloading and extracting this toolkit to a directory of your choice. To download VOCdevkit to /your/directory/VOCdevkit, do the following:

cd detail-api
python ./download.py pascal /your/directory


Data

We provide the raw annotation data as JSON, structured similarly to that of the MS COCO challenge. As new data becomes available, we will link to it here. To download a dataset, for example, "trainval_example.json", to /your/directory/trainval_example.json, you may use the Detail API's download script:

python3 ./download.py trainval_example /your/directory
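
Once a file has been downloaded, a quick way to get oriented is to load it and count the entries under each top-level key. The following is a minimal sketch, not part of the Detail API: the helper names are hypothetical, and the keys in the example ("images", "annotations", "categories") are borrowed from the MS COCO layout as an assumption about what a file may contain.

```python
import json


def load_annotations(path):
    """Read an annotation file, e.g. /your/directory/trainval_example.json."""
    with open(path, "r") as f:
        return json.load(f)


def summarize_annotations(data):
    """Return a {top-level key: item count} summary of a loaded annotation dict."""
    summary = {}
    for key, value in data.items():
        # Lists and dicts report their length; scalar entries count as one.
        summary[key] = len(value) if isinstance(value, (list, dict)) else 1
    return summary


# Tiny in-memory stand-in for a real file. The keys follow MS COCO and are an
# assumption about this dataset, not a documented schema.
example = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [{"id": 10, "image_id": 1, "category_id": 3}],
    "categories": [{"id": 3, "name": "cat"}],
}
print(summarize_annotations(example))
```

For a real file, summarize_annotations(load_annotations("/your/directory/trainval_example.json")) would list whatever keys that file actually contains.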


trainval_merged.json (JSON download link)

  • Provides data for 8 tasks: instance segmentation, part segmentation, semantic segmentation, object detection, image classification, objectness estimation, boundary detection, and occlusion recognition.
  • Though boundary and occlusion data are in the JSON, the Detail API does not yet provide visualizations for them. The boundary data is provided as a set of binary masks in which nonzero pixels correspond to boundaries, so visualizing it is as simple as calling matplotlib.pyplot.imshow() on a mask.
  • Released July 17, 2017. Updated on July 18, 2017.
  • Use API version 4.0.1 or later
  • To download: python3 ./download.py trainval_merged /your/directory
  • Demo
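
As a rough sketch of the boundary visualization mentioned above: given an RGB image and a binary boundary mask as NumPy arrays, the mask can be shown directly or painted onto the image. How the mask is extracted from the JSON is not covered here; boundary_overlay and show_boundaries are hypothetical helpers, and the arrays below are synthetic stand-ins.

```python
import numpy as np


def boundary_overlay(image, mask, color=(255, 0, 0)):
    """Return a copy of an RGB image with nonzero mask pixels painted `color`."""
    overlay = np.array(image, dtype=np.uint8, copy=True)
    overlay[np.asarray(mask) != 0] = color
    return overlay


def show_boundaries(image, mask):
    """Display the raw mask and the overlay side by side with matplotlib."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for display

    fig, (ax_mask, ax_overlay) = plt.subplots(1, 2)
    ax_mask.imshow(mask, cmap="gray")
    ax_mask.set_title("boundary mask")
    ax_overlay.imshow(boundary_overlay(image, mask))
    ax_overlay.set_title("overlay")
    plt.show()


# Synthetic stand-ins: a gray 4x4 image and a mask with one vertical boundary.
image = np.full((4, 4, 3), 128, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[:, 2] = 1
overlay = boundary_overlay(image, mask)
```

Calling show_boundaries(image, mask) opens a window with both views; the overlay is computed on a copy, so the input image is left untouched.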

trainval_parts.json

  • Provides data for: instance segmentation and part segmentation.
  • Released July 6, 2017.
  • Use API version 3.0

trainval_preview1.json

  • Provides data for: semantic segmentation and object detection.
  • Released June 16, 2017.
  • Use API version 2.2


Upcoming:

trainval_all.json

    • Provides data for: human action recognition and human keypoint estimation.
    • Estimated release: July 20, 2017

Looking for data for the taster tasks? See the tasters page.

Task Summary

The challenge will provide detailed per-pixel annotations of the PASCAL VOC 2010 images for 10 computer vision tasks:

Challenges:

    1. Image Classification
    2. Object Detection
    3. Semantic Segmentation
    4. Instance Segmentation
    5. Objectness Estimation
    6. Object Part Segmentation
    7. Boundary Detection
    8. Occlusion Recognition
    9. Human Keypoint Estimation
    10. Human Action Recognition

Taster challenges:

    1. Saliency Estimation
    2. Line Segment Detection
    3. Symmetry Detection

The data of this competition was collected from several research publications. We elaborate on each task, and its corresponding publication, below.

Instance Segmentation

Instance segmentation is the task of labeling individual object instances in an image. Instance segmentation for the original 20 PASCAL VOC classes will be provided.

For details, see also Roozbeh Mottaghi, Xianjie Chen, Xiaobai Liu, Nam-Gyu Cho, Seong-Whan Lee, Sanja Fidler, Raquel Urtasun, and Alan L. Yuille. "The role of context for object detection and semantic segmentation in the wild." In CVPR, 2014. [PDF][Project][Citations]

Primary creators and maintainers: Xiaochen Lian (left, now at Baidu Research) and Xiang Xiang (right, JHU).

Semantic Segmentation

Semantic segmentation is the classic visual task of assigning a class label to every pixel in an image. Semantic segmentation for 59 classes, including the original 20 PASCAL VOC classes, will be provided.

For details, see Roozbeh Mottaghi, Xianjie Chen, Xiaobai Liu, Nam-Gyu Cho, Seong-Whan Lee, Sanja Fidler, Raquel Urtasun, and Alan L. Yuille. "The role of context for object detection and semantic segmentation in the wild." In CVPR, 2014. [PDF][Project][Citations]

Primary creators and maintainers: Xianjie Chen (left, now at Facebook Research) and Xiaochen Lian (right, now at Baidu Research).

Part Segmentation

Part segmentation is the task of splitting object instances into parts based on their semantic classes. This dataset provides semantic part segmentation for 16 out of the 20 original PASCAL semantic categories.

For a list of the semantic parts, see the project page.


For details, see Xianjie Chen, Roozbeh Mottaghi, Xiaobai Liu, Sanja Fidler, Raquel Urtasun, and Alan L. Yuille. "Detect what you can: Detecting and representing objects using holistic models and body parts." In CVPR, 2014. [PDF][Project][Citations]

Primary creators and maintainers: Xianjie Chen (left, now at Facebook Research) and Peng Wang (right, now at Baidu Research).

Boundary Detection

Boundary detection is the task of labeling boundaries between semantic classes. We provide ground truth boundaries for 59 semantic classes - namely, the same 59 classes used for the semantic segmentation task, described above.
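
To make the notion of a boundary between semantic classes concrete, the sketch below marks every pixel whose right or lower neighbor carries a different class label. This is only an illustration of the definition on a toy label map; it is not a description of how the ground truth boundaries in this dataset were produced, and label_boundaries is a hypothetical helper.

```python
import numpy as np


def label_boundaries(labels):
    """Mark pixels whose right or lower neighbor has a different class label."""
    labels = np.asarray(labels)
    boundary = np.zeros(labels.shape, dtype=bool)
    # Compare each pixel with its right neighbor.
    boundary[:, :-1] |= labels[:, :-1] != labels[:, 1:]
    # Compare each pixel with its lower neighbor.
    boundary[:-1, :] |= labels[:-1, :] != labels[1:, :]
    return boundary.astype(np.uint8)


# Toy label map with three classes.
labels = np.array([
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 3],
])
print(label_boundaries(labels))
```

The result is a binary mask in the same spirit as the provided boundary data: nonzero pixels lie on a transition between two classes.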

For details, see also Vittal Premachandran, Boyan Bonev, and Alan L. Yuille. "PASCAL Boundaries: A Class-Agnostic Semantic Boundary Dataset." In WACV, 2017. [PDF]

Primary creator and maintainer: Vittal Premachandran (Johns Hopkins Univ.)

Occlusion Recognition

Occlusion recognition is the task of determining border ownership. In other words, the task is to determine whether one object is "causing" the boundary between itself and another object by occluding the other. In the figures above, this is visualized with the "left hand rule": the object to the left of the arrow is the occluder, while the object to the right of the arrow is the occludee.
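
The "left hand rule" can be expressed with a 2D cross product. The sketch below decides on which side of a directed arrow a point lies and maps that side to a role. It assumes (x, y) pixel coordinates with y growing downward, as is usual for images; how arrows are actually encoded in the dataset is not specified here, and both function names are hypothetical.

```python
def side_of_arrow(origin, direction, point, y_axis_down=True):
    """Return 'left', 'right', or 'on' for `point` relative to a directed arrow.

    Coordinates are (x, y). With y_axis_down=True (image convention) the sign
    of the cross product is flipped so that 'left' matches what a viewer sees.
    """
    dx, dy = direction
    px, py = point[0] - origin[0], point[1] - origin[1]
    cross = dx * py - dy * px
    if y_axis_down:
        cross = -cross
    if cross > 0:
        return "left"
    if cross < 0:
        return "right"
    return "on"


def role_at(origin, direction, point):
    """Apply the left hand rule: left of the arrow occludes, right is occluded."""
    side = side_of_arrow(origin, direction, point)
    return {"left": "occluder", "right": "occludee", "on": "boundary"}[side]


# An arrow at (5, 5) pointing right; (5, 2) is above it on screen, (5, 8) below.
print(role_at((5, 5), (1, 0), (5, 2)))
print(role_at((5, 5), (1, 0), (5, 8)))
```

Facing along an arrow that points right, the region above it on screen is on the left-hand side, so under this rule it belongs to the occluder.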

For details, see also Peng Wang and Alan L. Yuille. "DOC: Deep OCclusion Estimation from a Single Image." In ECCV, 2016. [PDF][GitHub]

Primary creator and maintainer: Peng Wang (now at Baidu Research)

Keypoint Estimation

Keypoint estimation is the task of identifying keypoints of the objects in an image.

For details, see Fangting Xia, Peng Wang, Xianjie Chen, and Alan L. Yuille. "Joint Multi-Person Pose Estimation and Semantic Part Segmentation in a Single Image." In CVPR, 2017. [PDF]

Primary creator: Fangting Xia (now at Google)

Remaining Tasks

Human action recognition is the task of identifying the action of every person in an image. Data for this task will be collected by the PASCAL in Detail team and published when ready.

Image classification is the task of identifying which recognizable objects are present in an image. We will provide ground truth for the 59 classes. Data for this task will be generated from the semantic segmentation data. Note that our image classification task is a multi-label recognition problem: each image is associated with 59 booleans, one per class, and several of them may be true at once.
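
As an illustration of how per-image class booleans can be derived from a semantic segmentation, the sketch below scans a label map and flags every class that appears in at least one pixel. It assumes class ids run from 1 to 59 and that any other value (such as 0) means background or unlabeled; that numbering is an assumption for the example, not the dataset's documented encoding.

```python
def labels_from_segmentation(segmentation, num_classes=59):
    """Return one boolean per class: True if any pixel carries that class id.

    Assumes class ids run from 1 to num_classes; other values are ignored.
    """
    present = [False] * num_classes
    for row in segmentation:
        for class_id in row:
            if 1 <= class_id <= num_classes:
                present[class_id - 1] = True
    return present


# Toy 2x3 label map containing classes 7 and 12 plus background (0).
segmentation = [
    [0, 0, 7],
    [7, 12, 12],
]
flags = labels_from_segmentation(segmentation)
print([i + 1 for i, flag in enumerate(flags) if flag])
```

The returned list always has 59 entries, and any number of them can be true for the same image.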

Object detection is the task of finding and identifying objects in an image. We will provide ground truth for the PASCAL 20 categories. Data for this task will be generated from the instance segmentation data.
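
Deriving a detection box from an instance segmentation amounts to taking the tightest rectangle around an instance's pixels. A minimal sketch, assuming each instance is available as a binary mask (a list of rows) and that boxes are reported as (x, y, width, height), a COCO-style convention chosen here for illustration:

```python
def bbox_from_mask(mask):
    """Return (x, y, width, height) of the nonzero pixels, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    x_min, x_max = min(cols), max(cols)
    y_min, y_max = min(rows), max(rows)
    return (x_min, y_min, x_max - x_min + 1, y_max - y_min + 1)


# Toy instance mask covering columns 1-3 and rows 1-2.
instance_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(bbox_from_mask(instance_mask))
```

An empty mask yields None rather than a degenerate box, which lets callers skip instances without any pixels.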

Objectness is the task of identifying whether an image window contains an object of any non-background class. Ground truth for the PASCAL 20 classes will be provided. Data for this task will be generated from the instance segmentation data.

TASTER TASKS

The following tasks aren't part of the competition because their datasets are too small. Nonetheless, we'll make these datasets available for download, in case some participants find them useful in their research.

Symmetry Detection

Symmetry detection is the task of identifying the symmetry axes of object instances in an image.

For details, please see

  • Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Xiang Bai, Alan Yuille. DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images. arXiv:1609.03659. (PDF)
  • Wei Shen, Kai Zhao, Yuan Jiang, Yan Wang, Zhijiang Zhang, and Xiang Bai. Object skeleton extraction in natural images by fusing scale-associated deep side outputs. In CVPR, 2016. (PDF)

Primary creator: Wei Shen (Johns Hopkins Univ.)

Saliency Detection

Saliency detection is the task of recording which parts of an image "catch the eye", using an eye tracker. We will provide free-viewing fixation data for 850 images from PASCAL VOC 2010.

For details, see Li, Yin, Xiaodi Hou, Christof Koch, James M. Rehg, and Alan L. Yuille. "The secrets of salient object segmentation." In CVPR, 2014. [PDF][Project][Citations]

Primary creators and maintainers: Yin Li (left, Georgia Tech) and Xiaodi Hou (right, now at TuSimple)

Line Segment Detection

Line segment detection is the task of annotating any line segments present in an image. This track will provide line segments for 102 images (not from PASCAL).

For details, please see Cho, Nam-Gyu, Alan L. Yuille, and Seong-Whan Lee. "A Novel Linelet-based Representation for Line Segment Detection". Accepted by IEEE Trans. PAMI, vol. PP, iss. 99, 2017 (DOI: 10.1109/TPAMI.2017.2703841).

Primary creator and maintainer: Nam-Gyu Cho (Korea Univ.).