TiSeLaC : Time Series Land Cover Classification Challenge

Modern earth observation programs produce huge volumes of satellite image time series (SITS) that can be used to monitor geographical areas through time. How to analyze this kind of information efficiently is still an open question in the remote sensing field. In the context of land cover classification, exploiting a time series of satellite images, instead of a single image, can help distinguish among classes because they have different temporal profiles.

The objective of this challenge is to bring the Machine Learning and Remote Sensing communities closer together to work on this kind of data. The Machine Learning community has the opportunity to validate and test its approaches on real-world data in an application context that is getting more and more attention due to the increasing availability of SITS data, while the challenge offers Remote Sensing experts a way to discover and evaluate new data mining and machine learning methods for SITS data.

The challenge involves a multi-class, single-label classification problem in which the examples to classify are pixels described by their time series across the satellite images, and the prediction is the land cover class associated with each pixel. A more detailed description follows.


Organizers

Discovery Challenge Chairs

  • Dino Ienco, UMR TETIS - IRSTEA, Montpellier, France

Challenge Organizers

  • Dino Ienco, UMR TETIS - IRSTEA, Montpellier, France
  • Raffaele Gaetano, UMR TETIS - CIRAD, Montpellier, France


Task & Dataset


The task consists in predicting the Land Cover class of a set of pixels given their time series extracted from the satellite image time series. Both training and test data come from the same time series of satellite images: they span the same time period and both sets of samples are generated by the same distribution. Figure 1 shows a picture of the study area.


Figure 1: The Reunion Island site (a) and the corresponding classification according to the considered Land Cover Classes (b)




The dataset has been generated from an annual time series of 23 Landsat 8 images acquired in 2014 over Reunion Island (2866 x 2633 pixels at 30 m spatial resolution), provided at level 2A. The source data have been further processed to fill cloudy observations via pixel-wise multi-temporal linear interpolation on each multi-spectral band (OLI) independently, and to compute complementary radiometric indices (NDVI, NDWI and brightness index - BI). A total of 10 features (7 surface reflectances plus 3 indices) are considered for each pixel at each timestamp.
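
The gap-filling step described above can be illustrated with a minimal sketch. This is not the organizers' processing chain: the function name fill_cloudy, the cloud mask and the sample values are hypothetical, and the acquisition dates are assumed to be evenly spaced (the date index is used as the time axis).

```python
import numpy as np

def fill_cloudy(series, cloudy):
    """Linearly interpolate the cloudy observations of one band of one pixel.

    series: 1-D array with one value per acquisition date.
    cloudy: 1-D boolean array, True where the observation is cloudy.
    Assumption: dates are evenly spaced, so the date index serves as time.
    """
    series = np.asarray(series, dtype=float)
    cloudy = np.asarray(cloudy, dtype=bool)
    t = np.arange(series.size)
    filled = series.copy()
    if cloudy.all():
        return filled  # no clear observation to interpolate from
    # np.interp repeats the nearest clear value before the first and
    # after the last clear observation (no extrapolation).
    filled[cloudy] = np.interp(t[cloudy], t[~cloudy], series[~cloudy])
    return filled

# Hypothetical red-band reflectances for 6 dates; 3 of them are cloudy.
red = np.array([0.10, 0.90, 0.20, 0.85, 0.80, 0.50])
mask = np.array([False, True, False, True, True, False])
filled = fill_cloudy(red, mask)
print(filled)
```

Each band is processed independently, so the same function would be applied once per band and per pixel.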

Reference land cover data has been built using two publicly available datasets, namely the 2012 Corine Land Cover (CLC) map and the 2014 farmers' graphical land parcel registration (Régistre Parcellaire Graphique - RPG). The most significant classes for the study area have been retained, and a spatial processing (aided by photo-interpretation) has also been performed to ensure consistency with image geometry. Finally, a pixel-based random sampling of this dataset has been applied to provide an almost balanced ground truth. The final reference training dataset consists of a total of 81714 pixels distributed over 9 classes.
In more detail, the training dataset is composed of three different files:
- A file containing the pixel values
- A file containing the pixel coordinates w.r.t. the 2866 x 2633 pixel grid
- A file containing the class values
Each file contains 81714 rows (one for each pixel).

The first file contains 230 columns (10 features x 23 dates). The columns are temporally ordered: features 1 to 10 correspond to the first timestamp, features 11 to 20 correspond to the second timestamp, ..., and features 221 to 230 correspond to the last timestamp. The feature order is the same for every timestamp: 7 surface reflectances (Ultra Blue, Blue, Green, Red, NIR, SWIR1 and SWIR2) plus 3 indices (NDVI, NDWI and BI).
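
The column layout described above maps directly onto a 3-D array. The following sketch uses a small synthetic array in place of the real pixel-values file; the names X, cube, FEATURES and column_index are illustrative only.

```python
import numpy as np

N_DATES, N_FEATURES = 23, 10
# Feature order within each timestamp, as listed in the description.
FEATURES = ["UltraBlue", "Blue", "Green", "Red", "NIR",
            "SWIR1", "SWIR2", "NDVI", "NDWI", "BI"]

# Synthetic stand-in for the pixel-values file: 4 rows of 230 columns.
X = np.arange(4 * N_DATES * N_FEATURES, dtype=float).reshape(4, -1)

# Columns are grouped by date, so a row-major reshape gives
# an array indexed as (pixel, date, feature).
cube = X.reshape(-1, N_DATES, N_FEATURES)

# NDVI time series of every pixel: shape (n_pixels, 23).
ndvi = cube[:, :, FEATURES.index("NDVI")]

def column_index(date, feature):
    """0-based column of `feature` at 0-based `date` in the flat file."""
    return date * N_FEATURES + FEATURES.index(feature)

print(cube.shape, ndvi.shape, column_index(22, "BI"))
```

Note that the description numbers columns from 1, while the indices in this sketch start from 0.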

The second file has 2 columns (row, column) giving the position of the corresponding pixel time series, i.e., the spatial coordinates of the pixel on the image grid. This file contains as many rows as the previous file.

The third file contains the Land Cover Classes for the training set. The class file contains as many rows as the other files. The value in a row is the class of the corresponding pixel (at the same row) in the other two files.
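
Because the three files are row-aligned, the coordinates and class values can be combined into a sparse label image. The sketch below relies on two assumptions that the description does not state: the grid is taken to have 2866 rows and 2633 columns, and coordinates are taken to be 0-based. The sample coordinates and labels are made up.

```python
import numpy as np

# Assumption: 2866 is the number of rows and 2633 the number of columns.
GRID_SHAPE = (2866, 2633)

# Made-up stand-ins for the coordinate and class files (one row per pixel).
coords = np.array([[0, 0], [10, 20], [2865, 2632]])  # assumed 0-based (row, column)
labels = np.array([1, 3, 9])                          # class IDs from 1 to 9

# The files must describe the same pixels, row by row.
assert coords.shape[0] == labels.shape[0]

# 0 marks pixels without a reference label; class IDs start at 1.
label_map = np.zeros(GRID_SHAPE, dtype=np.uint8)
label_map[coords[:, 0], coords[:, 1]] = labels

print(np.count_nonzero(label_map))
```

If the real coordinates turn out to be 1-based, subtracting 1 from coords before indexing would be needed.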


Details about class distribution in the training data are reported in Table 1.


 Class ID   Class Name                 # Instances
 1          Urban Areas                16000
 2          Other built-up surfaces    3236
 3          Forests                    16000
 4          Sparse Vegetation          16000
 5          Rocks and bare soil        12942
 6          Grassland                  5681
 7          Sugarcane crops            7656
 8          Other crops                1600
 9          Water                      2599

 Table 1: Class distribution in the training data
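
The counts in Table 1 show that the classes are only approximately balanced. One common way to account for this, shown here as an optional sketch and not as part of the challenge protocol, is to derive inverse-frequency class weights with the heuristic n_samples / (n_classes * count), the same rule scikit-learn applies for class_weight='balanced'.

```python
# Instance counts per class ID, copied from Table 1.
counts = {1: 16000, 2: 3236, 3: 16000, 4: 16000, 5: 12942,
          6: 5681, 7: 7656, 8: 1600, 9: 2599}

total = sum(counts.values())   # should match the 81714 training pixels
n_classes = len(counts)

# Inverse-frequency weights: rare classes get larger weights.
weights = {c: total / (n_classes * n) for c, n in counts.items()}

for c in sorted(weights):
    print(c, round(weights[c], 3))
```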


The source data are provided by the French Pôle Thématique Surfaces Continentales - THEIA and preprocessed by the Multi-sensor Atmospheric Correction and Cloud Screening (MACCS) level 2A processor developed at the French National Space Agency (CNES) to provide accurate atmospheric, environmental and geometric corrections as well as precise cloud masks.



Submission and Important Dates

1/07  - The Challenge starts: participants need to register and the training dataset is made available online
20/07 - The Test dataset is released to participants
24/07 - The classification results must be sent to the Challenge chairs
26/07 - The results of the challenge are available on the web site

Important: No dashboard is available during the competition to monitor performances. One submission per team is allowed at the end of the Challenge (24/07).


Evaluation Criteria

The challenge is evaluated using the F-Measure. To compute the F-Measure we will use the function available in scikit-learn described at this page, with the option average='weighted'.
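
To make the weighted average concrete, here is a small dependency-free sketch of the computation: a per-class F-Measure, averaged with weights equal to the number of true instances of each class. The function name weighted_f_measure and the toy labels are illustrative; with scikit-learn installed, the presumably equivalent call is sklearn.metrics.f1_score(y_true, y_pred, average='weighted').

```python
from collections import Counter

def weighted_f_measure(y_true, y_pred):
    """Per-class F-Measure averaged with weights equal to class support."""
    support = Counter(y_true)
    score = 0.0
    for cls, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_true - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n_true
    return score / len(y_true)

# Toy example with three of the nine class IDs.
y_true = [1, 1, 1, 3, 3, 9]
y_pred = [1, 1, 3, 3, 3, 9]
score = weighted_f_measure(y_true, y_pred)
print(score)
```

Classes that appear only among the predictions have zero support and therefore do not contribute to the weighted average, except through the false positives they cause for other classes.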

Prizes

- First place: 1 Free Registration (per team) for the ECML/PKDD 2017 Conference
- Second place: 1 Free Registration (per team) for the ECML/PKDD 2017 Conference

Download

The following material:

Note that participants are not asked to submit any runs during the challenge, only a final submission (by e-mail).

The mail address for the submission is: tiselac@gmail.com

Important: The submission format should be the same as the Class_labels format for the training data.
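
As a sketch of what such a submission could look like, the snippet below writes one class ID per row, in the same order as the test pixels. It assumes that the Class_labels file is plain text with a single integer per line, which should be checked against the downloaded training file; the function name write_labels is hypothetical.

```python
import io

VALID_CLASSES = set(range(1, 10))  # class IDs 1 to 9, as in Table 1

def write_labels(predictions, handle):
    """Write one integer class ID per line, keeping the input order."""
    for label in predictions:
        label = int(label)
        if label not in VALID_CLASSES:
            raise ValueError(f"unexpected class ID: {label}")
        handle.write(f"{label}\n")

# Toy predictions; an in-memory buffer stands in for the output file.
predictions = [1, 3, 3, 9]
buffer = io.StringIO()
write_labels(predictions, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

To write a real file, the buffer can be replaced by a handle obtained from open() with a file name of your choice.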


To register for the challenge, fill in the form HERE.
When the challenge starts, you will receive a link (by e-mail) to download the data.


Contacts

For any request or clarification please contact us at:  tiselac@gmail.com

