Aerial imagery plays an important role in land-use planning, population analysis, precision agriculture, and unmanned aerial vehicle tasks. However, existing aerial image datasets generally suffer from inaccurate labeling, a single type of ground truth, and few categories. In this work, we implement a simulator that can simultaneously acquire diverse visual ground-truth data in a virtual environment. Based on it, we collect a comprehensive Virtual AeriaL Image Dataset named VALID, consisting of 6690 high-resolution images, all annotated with panoptic segmentation on 30 categories, object detection with oriented bounding boxes, and binocular depth maps, collected in 6 different virtual scenes under 5 ambient conditions (sunny, dusk, night, snow and fog). To our knowledge, VALID is the first aerial image dataset that provides panoptic-level segmentation and complete dense depth maps. We analyze the characteristics of VALID and evaluate state-of-the-art methods on multiple tasks to provide reference baselines. The experimental results demonstrate that VALID is well constructed and challenging.
From top left to bottom right: original image, detection bounding boxes, depth map, panoptic segmentation, instance segmentation, and semantic segmentation.
The full VALID dataset is separated into VALID-Depth, VALID-Seg and VALID-Det.
The dataset split file can be accessed at:
Baiduyun: https://pan.baidu.com/s/1ZvG1ENSP_HVHfl7_d--OGQ code: y8kq
Google Drive: https://drive.google.com/open?id=1v4mWL8qwSTj3RvzMuxoXrS3Eoti-A1d9
We provide a 16-bit ".png" depth map for each image in the dataset. Please read the depth PNG file as follows (taking OpenCV as an example):
img = cv2.imread("/path/to/file", -1)
The real depth value = the value stored in the file / 256.
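As a minimal sketch of this decoding step, with a simulated raw array standing in for an actual VALID depth PNG:

```python
import numpy as np

# Simulated raw 16-bit values, as cv2.imread("/path/to/file", -1) would
# return them for a VALID depth PNG (the -1 / IMREAD_UNCHANGED flag keeps
# the 16-bit values instead of converting to 8-bit).
raw = np.array([[2560, 12800],
                [256, 65535]], dtype=np.uint16)

# The stored value is the real depth scaled by 256, so divide to decode.
depth = raw.astype(np.float32) / 256.0
print(depth[0, 0])  # 2560 / 256 = 10.0
```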
Download link:
Baiduyun: https://pan.baidu.com/s/1W1CI2Q4Jp8yY_OQDugK9bg code: orop
Google Drive: https://drive.google.com/open?id=16VIzmBRu3zOEKEog4seUUxnXPWEOX68x
For the object detection task, we provide the original ".jpg" RGB images, a ".json" label file for each image, and a category file for the whole dataset.
Baiduyun: https://pan.baidu.com/s/13UOz_cIlWwCjTvGwqiKLbQ code: 9wth
Google Drive: https://drive.google.com/open?id=1Q1j_OgNlnJG2RlzQniFSIpLjozGAJ1D_
Baiduyun: https://pan.baidu.com/s/1rERaaXRt4GsRBLi6P3eNjw code: u58q
Google Drive: https://drive.google.com/open?id=1F6Xp5QLmE9vwjwzh82Gsq1jUyzV8m7ds
Baiduyun: https://pan.baidu.com/s/1uYjADj5fS0UHEZwPaicbcA code: 58n2
Google Drive: https://drive.google.com/open?id=1Av9tHAamg2oH_JpOpqNH1Db9C5TmKfH2
{
"id": "06689", # image unique id
"width": 1024, # image width
"height": 1024, # image height
"file_name": "images/seaside/50/img_2_0_1552116963150118100.png",
"detection": [ # list of all instances
{
"id": 14764361, # instance id
"category_id": 15, # category id
"category_name": "building", # category name
"segmentation": polygon-style contour, # contours are generated with cv2.findContours()
"hbbox": [39, 0, 316, 180], # top-left (x, y) and w, h
"obbox": [[354, 179], [39, 179], [39, 0], [354, 0]] # four vertices (x, y)
},
{ instance 2 },
...
],
"segmentation":
{
"panoptic_filename": "panoptic/seaside/50/img_2_5_1552116963148972700.png", # panoptic ground truth file
"semantic_filename": "semantic/seaside/50/img_2_5_1552116963148972700.png" # semantic ground truth file
}
}
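As a sanity check on the fields above, the horizontal box can be recovered as the axis-aligned envelope of the oriented-box vertices. This is a sketch based on our reading of the sample values, which appear to use inclusive pixel coordinates (hence the +1); it is not part of an official toolkit:

```python
import numpy as np

# Oriented-box vertices from the sample label above.
obbox = np.array([[354, 179], [39, 179], [39, 0], [354, 0]])

x, y = obbox[:, 0].min(), obbox[:, 1].min()
# The +1 assumes inclusive pixel coordinates, which makes the sample
# values consistent: 354 - 39 + 1 = 316, 179 - 0 + 1 = 180.
w = obbox[:, 0].max() - x + 1
h = obbox[:, 1].max() - y + 1
print([int(x), int(y), int(w), int(h)])  # [39, 0, 316, 180], matching "hbbox"
```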
Note: For each instance on the panoptic ground truth image, suppose its color is (R, G, B); then instance_id = 256*256*R + 256*G + B.
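The formula above can be vectorized over a whole panoptic image. A minimal sketch, with a toy 1x2 RGB array standing in for a real panoptic PNG (note that cv2.imread returns BGR, so reorder the channels to RGB first, e.g. with `img[..., ::-1]`):

```python
import numpy as np

# Toy 1x2 RGB panoptic image standing in for a real ground-truth PNG.
panoptic = np.array([[[0, 0, 17],        # encodes instance_id 17
                      [89, 121, 72]]],   # the "tree" color from the category file
                    dtype=np.uint8)

# instance_id = 256*256*R + 256*G + B, computed per pixel.
# Cast to int64 first so the multiplications do not overflow uint8.
rgb = panoptic.astype(np.int64)
instance_id = 256 * 256 * rgb[..., 0] + 256 * rgb[..., 1] + rgb[..., 2]
print(instance_id.tolist())  # [[17, 5863752]]
```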
[
...
{"id": 1, "category_name": "tree", "color": [89, 121, 72], "isthing": 0},
...
{"id": 17, "category_name": "person", "color": [242, 107, 146], "isthing": 1},
...
]
Note: For each "thing" category, the color is the semantic color used in the semantic segmentation ground truth.
For the instance segmentation task, the polygon-style mask segmentation is already included in the label json, so please download the image, label and category files above.
For panoptic and semantic segmentation, we provide the panoptic and semantic ground truth for each image.
Baiduyun: https://pan.baidu.com/s/1VaGDR-zOS_G7ElqQmoQk4g code: 3uzq
Google Drive: https://drive.google.com/open?id=1z2jWB3MOfRmI9-nTXHxkqTEMnfLH55nh
Baiduyun: https://pan.baidu.com/s/1eHXwa-fkmE6EgbgGIrIKCQ code: rctc
Google Drive: https://drive.google.com/open?id=1QpJywG6fVtk_a2xxwDZJVxc6LEOgunvm
For the semantic segmentation ground truth files, you can generate the label map from the colors listed in the category file.
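A minimal sketch of that generation step, using the two sample categories shown earlier (a real script would load the full category file and the semantic PNG; leaving unmatched pixels as 0 is our assumption here, not a documented ignore label):

```python
import numpy as np

# Two sample entries from the category file (colors are RGB).
categories = [
    {"id": 1, "category_name": "tree", "color": [89, 121, 72]},
    {"id": 17, "category_name": "person", "color": [242, 107, 146]},
]

# Toy 1x2 RGB semantic ground-truth image.
semantic = np.array([[[89, 121, 72], [242, 107, 146]]], dtype=np.uint8)

# Assign each pixel the id of the category whose color it matches.
label = np.zeros(semantic.shape[:2], dtype=np.int32)
for cat in categories:
    mask = np.all(semantic == np.array(cat["color"], dtype=np.uint8), axis=-1)
    label[mask] = cat["id"]
print(label.tolist())  # [[1, 17]]
```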
Neighborhood
Downtown
Airport
Night Street
Snow Mountain
Seaside Town
In each scene, images at three altitudes (20 m, 50 m, 100 m) are presented with their corresponding ground truth data (detection bounding boxes, panoptic segmentation, depth map). For the object detection annotation, both the blue horizontal bounding box and the red oriented bounding box of each instance are displayed. Note that the images in the night street scene are all at the same altitude because we only collected data at a height of 20 meters in this environment.