
Zurich Summer Dataset

This page contains links to download an *updated version* of the dataset (Zurich Summer Dataset v1.0), previously used in:


Creative Commons License
Unless otherwise noted, all images on this site are the property of DigitalGlobe, Inc. and are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.


We all strive for new benchmarks. So we managed, thanks to a great guy working at DigitalGlobe, to obtain a permissive license for some chips of a large QuickBird scene of the city of Zurich acquired in 2002.
The "Zurich Summer v1.0" dataset is a collection of 20 chips (crops) taken from a QuickBird acquisition of the city of Zurich (Switzerland) in August 2002. QuickBird images are composed of 4 channels (NIR-R-G-B) and were pansharpened to the PAN resolution of about 0.62 m GSD. We manually annotated 8 different urban and periurban classes: Roads, Buildings, Trees, Grass, Bare Soil, Water, Railways and Swimming pools. The cumulative number of samples per class is highly unbalanced, to reflect real-world situations. Note that the annotations are not perfect: they are not ultradense (not every pixel is annotated) and they may contain some errors. We annotated by jointly selecting superpixels (SLIC) and drawing freehand over regions to which we could confidently assign an object class.
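Since the classes are unbalanced and not every pixel is annotated, it can help to measure the class frequencies over annotated pixels only before training. The following is a minimal, dependency-free sketch rather than part of the distributed scripts. It assumes that index masks have already been loaded as lists of rows of integers, that unannotated pixels are coded as 0, and that classes are numbered 1 to 8 in the order listed above; all three points are assumptions to check against the data.

```python
from collections import Counter

# Class names in the order listed above. The index given to each class
# (1..8) is an assumption of this sketch, not a documented encoding.
CLASS_NAMES = ["Roads", "Buildings", "Trees", "Grass",
               "Bare Soil", "Water", "Railways", "Swimming pools"]
UNLABELED = 0  # assumed value of pixels that carry no annotation


def class_frequencies(masks):
    """Return {class_index: fraction of annotated pixels} over all masks.

    `masks` is an iterable of 2-D index masks, each a list of rows of ints.
    Unannotated pixels are ignored, so the fractions sum to 1.
    """
    counts = Counter()
    for mask in masks:
        for row in mask:
            counts.update(v for v in row if v != UNLABELED)

    classes = range(1, len(CLASS_NAMES) + 1)
    total = sum(counts[c] for c in classes)
    if total == 0:
        return {c: 0.0 for c in classes}
    return {c: counts[c] / total for c in classes}
```

The resulting fractions can be used, for instance, to derive class weights for a loss function or to check that a chosen train / test split contains every class.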



The dataset is composed of 20 image / ground truth pairs, in GeoTIFF format. Images are distributed as raw DN values. We provide a quick and dirty MATLAB script (preprocess.m) to:
i) Extract basic statistics from the images (min, max, mean and average std), which should be used to normalize the data globally (note that the class distribution across chips is highly uneven, so normalizing each chip separately would shift the distribution of the classes).
ii) Visualize the raw DN images (with unsaturated values) and a corresponding stretched version (good for illustration purposes). It also saves a raw and an adjusted version of each image in MATLAB format (.mat) in a local subfolder.
iii) Convert RGB annotations to an index mask (class index in {1, ..., C}) (via rgb2label.m provided).
iv) Convert an index mask to georeferenced RGB annotations (via rgb2label.m provided). Useful if you want to view the final maps of the tiles in GIS software (the coordinate system is copied from the original GeoTIFFs).
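To illustrate the global normalization in step i), here is a minimal Python sketch that accumulates per-channel statistics over all chips and then standardizes each chip with those shared values. It is not a port of preprocess.m: the function names are made up for this example, each chip is assumed to be already loaded as a list of channels holding flat lists of DN values, and the standard deviation is pooled over all pixels, which may differ from the "average std" the MATLAB script reports.

```python
import math


def global_channel_stats(chips):
    """Per-channel min, max, mean and pooled std over ALL chips.

    `chips` is an iterable of images; each image is a list of channels and
    each channel is a flat list of DN values (layout assumed for this sketch).
    """
    acc = None
    for chip in chips:
        if acc is None:
            acc = [{"min": math.inf, "max": -math.inf,
                    "n": 0, "sum": 0.0, "sumsq": 0.0} for _ in chip]
        for a, channel in zip(acc, chip):
            a["min"] = min(a["min"], min(channel))
            a["max"] = max(a["max"], max(channel))
            a["n"] += len(channel)
            a["sum"] += sum(channel)
            a["sumsq"] += sum(v * v for v in channel)

    stats = []
    for a in acc or []:
        mean = a["sum"] / a["n"]
        # Clamp tiny negative values caused by floating-point rounding.
        var = max(a["sumsq"] / a["n"] - mean * mean, 0.0)
        stats.append({"min": a["min"], "max": a["max"],
                      "mean": mean, "std": math.sqrt(var)})
    return stats


def normalize(chip, stats):
    """Standardize one chip with the global (not per-chip) statistics."""
    return [[(v - s["mean"]) / (s["std"] or 1.0) for v in channel]
            for channel, s in zip(chip, stats)]
```

Because every chip is shifted and scaled by the same values, a chip dominated by water and a chip dominated by buildings keep their relative radiometric differences, which per-chip normalization would remove.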

Some requests from you

We encourage researchers to report the IDs of the images used for training / validation / test (e.g. train: zh1 to zh7, validation: zh8 to zh12 and test: zh13 to zh20). The purpose of distributing datasets is to encourage reproducibility of experiments.
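One simple way to report a split is to store the chip IDs in a small machine-readable file next to the results. The sketch below builds the example split mentioned above and prints it as JSON; the helper name and the JSON layout are only suggestions, not a format required by the dataset.

```python
import json


def make_split(train, val, test):
    """Build a split from inclusive (first, last) chip-number ranges.

    Chip IDs follow the zh<N> pattern used above. Raises ValueError if
    a chip would end up in more than one subset.
    """
    ranges = (("train", train), ("validation", val), ("test", test))
    split = {name: [f"zh{i}" for i in range(first, last + 1)]
             for name, (first, last) in ranges}

    all_ids = [chip_id for ids in split.values() for chip_id in ids]
    if len(all_ids) != len(set(all_ids)):
        raise ValueError("chip IDs overlap between subsets")
    return split


if __name__ == "__main__":
    # The example split from the text: zh1-zh7, zh8-zh12, zh13-zh20.
    example = make_split(train=(1, 7), val=(8, 12), test=(13, 20))
    print(json.dumps(example, indent=2))
```

Publishing such a file together with a paper or code release lets others rerun an experiment on exactly the same chips.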


We release this data following a kind agreement with DigitalGlobe, Inc. The data can be redistributed freely, provided that this document and the corresponding license are part of the distribution. Ideally, since the dataset could be updated over time, I suggest distributing it via the official link from which this archive was downloaded.

We would like to thank (a lot) Nathan Longbotham @ DigitalGlobe and the whole DG team for their help in granting the distribution of the dataset.
We release this dataset hoping that it will help researchers working on semantic classification / segmentation of remote sensing data, both in comparing against other state-of-the-art methods that use this dataset and in testing models on a larger and more complete set of images (with respect to most benchmarks available in our community). As you can imagine, preparing everything has been tedious work. Just for you.

If you are using the data, please cite the following work (a journal paper is slowly coming):


For any bug, comment or request, feel free to contact me at michele ( a dot ) volpi ( an at ) ed ( a dot ) ac ( a dot ) uk