Semantic Segmentation (CRF baseline)

Semantic segmentation is the task of labelling each pixel of an image with a semantic category. The framework proceeds through the following phases:
  1. database preparation: extraction of the filenames and a few other details that ease the process;
  2. super-pixel extraction: pixels are clustered into coarser patches that will be classified;
  3. feature extraction: for each super-pixel, texture, colour and position features are extracted;
  4. graph creation: the minimum spanning tree (MST) that connects the patches is created;
  5. dataset writing: a compact xml description is written for each image, to be used by the classifier;
  6. CRF training and testing: the CRF learns the appearance and the pairwise connections.
Each phase is described below and the corresponding software is provided. As a running example, the software is applied to the MSRC database, which can be downloaded from the Microsoft Research website.

1. Database Preparation

For the sake of this document, I will assume that the database has been decompressed in ~/Images/msads/:
cd ~/Images/msads/

First, after extracting the images, I rename the files so that they have the form n_nn_s.bmp (images) and n_nn_s_GT.bmp (ground truth); that is, for the sake of simplicity, I want the file 1_1_s.bmp to become 1_01_s.bmp. In the images folder (this uses the Perl-based rename utility):
rename  's/_(.*?)_/_0$1_/'  *_?_s*
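
Where the Perl rename utility is not available, the same zero-padding can be sketched in Python. The function names pad_name and rename_all are assumptions made for this example; only a single-digit image number is padded, mirroring the *_?_s* pattern used above.

```python
import os
import re


def pad_name(filename):
    """Zero-pad a single-digit image number: 1_1_s.bmp -> 1_01_s.bmp."""
    # Only the first "_<digit>_" group is rewritten, so names that are
    # already padded (e.g. 1_12_s.bmp) are returned unchanged.
    return re.sub(r"_(\d)_", r"_0\1_", filename, count=1)


def rename_all(directory):
    """Apply pad_name to every file in the directory that needs it."""
    for name in sorted(os.listdir(directory)):
        new_name = pad_name(name)
        if new_name != name:
            os.rename(os.path.join(directory, name),
                      os.path.join(directory, new_name))
```

Calling rename_all on the images folder would then play the role of the rename command.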

Then we need a list of filenames for the dataset, without extension. For the MSRC dataset, in the directory with the BMPs, run:
find -name '*_s.bmp' -printf %f\\n | sort |  awk -F.  '{ print $1 }' > filenames.txt
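
The same list can be built without find and awk. The sketch below is only an equivalent under stated assumptions: the names list_stems and write_list are invented for this example, and the ordering is Python's plain string ordering, which may differ from a locale-aware sort.

```python
import os


def list_stems(directory):
    """Return the sorted base names (no extension) of the *_s.bmp images."""
    stems = []
    for _root, _dirs, files in os.walk(directory):
        for name in files:
            # Ground-truth files end in _s_GT.bmp and are skipped here.
            if name.endswith("_s.bmp"):
                stems.append(name[: -len(".bmp")])
    return sorted(stems)


def write_list(directory, out_name="filenames.txt"):
    """Write one base name per line, as the shell pipeline does."""
    with open(os.path.join(directory, out_name), "w") as handle:
        for stem in list_stems(directory):
            handle.write(stem + "\n")
```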

2. Super-pixel extraction

This step produces a file containing a segmentation map, that is, an image in which the value of each pixel is the ID of the region it belongs to.
For the time being I have used Normalised Cuts (it is slow, but it can be replaced). The code is from Berkeley and can be downloaded here. Additional information is available on the BSE website.
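
To make the idea of a segmentation map concrete, here is a toy example in Python. The 4x6 map and the helper names are made up for illustration and do not reflect the on-disk format written by BSE.

```python
from collections import Counter

# A toy 4x6 segmentation map: each entry is the ID of the region
# (super-pixel) that the pixel belongs to.
SEG_MAP = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 2, 2, 1, 1],
    [3, 3, 2, 2, 2, 1],
    [3, 3, 3, 2, 2, 2],
]


def region_sizes(seg_map):
    """Count the pixels assigned to each region ID."""
    return Counter(label for row in seg_map for label in row)


def region_pixels(seg_map, region_id):
    """List the (row, col) coordinates that belong to one region."""
    return [(r, c)
            for r, row in enumerate(seg_map)
            for c, label in enumerate(row)
            if label == region_id]
```

Every later step (features, graph, xml description) works on regions selected from such a map rather than on individual pixels.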

The BSE software should be quite straightforward to extract and compile (it requires g++, gfortran and some libraries).
Once the code is compiled, the segmentation is performed by the segment program, which requires JPG images.

To obtain JPG images from the BMPs in your database (e.g., for the MSRC dataset), use ImageMagick:
mogrify -format jpg *_s.bmp
mkdir jpegs
mv *.jpg jpegs


Once you have the JPG files, the segmentation is obtained with a short script, segment_script.py.
This simple Python script assumes that your images are in the subdirectory jpegs, puts the results in segmented, and segments each image into 300 super-pixels.
After copying the segment binary from BSE into the dataset directory, the script can be used as follows (it requires Python to be installed):
chmod +x segment_script.py
mkdir segmented
./segment_script.py filenames.txt

This will produce a set of .seg files in the segmented/ subdirectory. They can be read in Matlab via the scripts that ship with BSE.
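
The directory layout used by this step can be sketched as follows. This is not the content of segment_script.py: the function name plan_jobs is invented, the output naming (one .seg file per listed image) is an assumption, and the call to the segment binary is deliberately left out because its command-line arguments are not described in this document.

```python
import os


def plan_jobs(list_path, in_dir="jpegs", out_dir="segmented"):
    """Pair each listed image with its JPG input and .seg output path."""
    jobs = []
    with open(list_path) as handle:
        for line in handle:
            stem = line.strip()
            if not stem:
                continue  # skip blank lines in filenames.txt
            jobs.append((os.path.join(in_dir, stem + ".jpg"),
                         os.path.join(out_dir, stem + ".seg")))
    return jobs
```

Each returned pair names the file the segmenter would read and the file it would be expected to write.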

The segmentation is now complete. For structure selection, the images containing the probability of boundary pixels are required. These images contain, for each pixel, the probability that the pixel lies on a boundary between two objects; they are calculated by the BSE suite and used in the segmentation as well. After downloading pb_script.py into the base directory and copying the pixelfeatures binary from BSE into the dataset directory, the script is used as follows:
chmod +x pb_script.py
mkdir pb
./pb_script.py filenames.txt
The pixel boundary probability images will be in the pb/ subdirectory.
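
One plausible way in which such maps could inform the graph structure is to average the boundary probability along the border shared by two regions. The sketch below is an assumption made for illustration, not the method of the attached Matlab code; both maps are plain lists of rows and the function name is hypothetical.

```python
def boundary_strength(seg_map, pb_map):
    """Average the boundary probability along each pair of touching regions.

    seg_map and pb_map are equally sized lists of rows. The result maps
    (region_a, region_b), with region_a < region_b, to a mean value.
    """
    totals, counts = {}, {}
    height, width = len(seg_map), len(seg_map[0])
    for r in range(height):
        for c in range(width):
            # Look only right and down so each neighbouring pair is seen once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr >= height or cc >= width:
                    continue
                a, b = seg_map[r][c], seg_map[rr][cc]
                if a == b:
                    continue  # both pixels lie in the same region
                key = (min(a, b), max(a, b))
                # Use the stronger of the two pixel responses at the crossing.
                value = max(pb_map[r][c], pb_map[rr][cc])
                totals[key] = totals.get(key, 0.0) + value
                counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}
```

A low value suggests that two super-pixels probably belong to the same object, which makes the pair a natural candidate for an edge in the graph.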

3. Feature Extraction, graph creation and dataset writing

We'll use texture, colour and position features. For the texture features, the program plainfilter in the BSE suite associates to each pixel a feature vector that is a histogram of oriented differential filter responses. These coefficients can be obtained through the script plainfilter_script.py:
chmod +x plainfilter_script.py
mkdir filtered/
./plainfilter_script.py
This will write a set of (large) files of filter coefficients to the subdirectory filtered/. These hold the plain frequency response of the filterbank: for each image and each pixel, a list of coefficients over 39 channels. The feature vectors for each patch are calculated in Matlab as part of the Matlab code attached to this document. A Gaussian mixture (GM) model is used, and its code is obtained from the MathWorks on-line file collection.
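
To illustrate how per-pixel coefficients turn into one vector per patch, the sketch below pools the responses over each region with a plain mean. This is a deliberate simplification: the attached code uses a GM model rather than a mean, and the function name and the in-memory layout are assumptions.

```python
def pool_by_region(seg_map, responses):
    """Average per-pixel feature vectors over each region.

    seg_map[r][c] is a region ID; responses[r][c] is that pixel's list of
    filter coefficients (39 channels in the text, fewer in a toy example).
    """
    sums, counts = {}, {}
    for seg_row, resp_row in zip(seg_map, responses):
        for label, vector in zip(seg_row, resp_row):
            if label not in sums:
                sums[label] = [0.0] * len(vector)
                counts[label] = 0
            sums[label] = [s + v for s, v in zip(sums[label], vector)]
            counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}
```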

The colour features are calculated directly in Matlab as part of the Matlab code. The calculation of the graph structure and the storage of the xml description are performed in Matlab as well. This is done in a single batch file, feature_extract.m. In particular, extracting the archive matlab.tar.gz in the base directory should create another directory named matlab/. In this directory, the main script is do_all.m, and it contains:
  1. allocation of the database parameters data structure (dbinfo);
  2. calculation of the texture features (texmog);
  3. calculation of the hue features (hues);
  4. calculation of the neighbourhood metrics for MST and pyramids (the latter are not used for the baseline);
  5. writing of the xml files with all this information.
By simply running the script, all the relevant xml files should be generated in the xml/ subdirectory.
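
As an illustration of the graph-creation step, a minimum spanning tree over the super-pixels can be computed with Kruskal's algorithm. The sketch below is a generic Python version, not a port of the attached Matlab code; the edge weights stand for whatever neighbourhood metric is chosen and the function name is hypothetical.

```python
def minimum_spanning_tree(num_nodes, edges):
    """Kruskal's algorithm. edges is a list of (weight, node_a, node_b)."""
    parent = list(range(num_nodes))

    def find(node):
        # Follow parent links to the set representative, halving the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate components
            parent[root_a] = root_b
            tree.append((weight, a, b))
    return tree
```

For a connected graph of N super-pixels the result has N - 1 edges, so the pairwise terms of the CRF are defined on a tree.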
Attachments:
  BSE-1.2.tar.gz (338k), Giuseppe Passino, 13 Apr 2010, 08:05
  (file name missing), Giuseppe Passino, 14 Apr 2010, 01:29
  (file name missing), Giuseppe Passino, 13 Apr 2010, 07:55