Map Similarity

Map Similarity

JOSE LOPEZ-COLLADO

July 2015

Frequently, we need to determine how maps are similar to each other. A program that do just that is the Map Comparison Kit MCK, and is free. However, for those who want to delve into the guts of the computations, here are two ways to compare maps, or actually matrices representing maps.

The most elementary comparison is by performing a Pearson correlation between two rasters by sampling the target regions. Given that an asc file contains null values, is necessary to obtain valid values. In addition, sometimes the rasters do not contain valid values in the same position so is necessary to exclude those points as well.

A set of different similarity indices (computed as differences, the complement) are described in the MCK; to do this, we need to standardize the values so the differences are meaningful. The indices computed by the MCK are (where r1 and r2 are maps 1 and 2):

Difference: r2-r1
Absolute difference: Abs(r2-r1)
Scaled difference: (r2-r1)/Max(Abs(r2-r1))
Scaled absolute difference: Abs(r2-r1)/Max(Abs(r2-r1))
Relative difference: r2/r1
Absolute relative difference: Abs(r2/r1)

here, we illustrate computing the difference between raster values. The next lines of code perform the operations:

Define sample size of pairs

Read two asc files

Define the functions to: standardize the rasters, get a valid sample of pairs, and compute the difference.

Standardize the asc files

Obtain a valid sample

Compute the correlation coefficient

Compute the difference raster

Make a 3d histogram and a scatter plot

Display the difference map

Cluster the values to improve the difference map

Notice:

The example files are in the File Box page

(* ===============code start ============================== *)

SetDirectory[NotebookDirectory[]];

nsize = 3000;

SeedRandom[830333];

(* asc files are supposed to be in the working directory *)

bio1 = Drop[Import["asc1fin.asc", "Table"], 6];

bio2 = Drop[Import["asc12fin.asc", "Table"], 6];

(*stdMap is used to standardize the raster, Note the use of N[] to force Mathematica to evaluate the operations, otherwise they may be produced as a symbolic expression, 5 is the number of significant digits *)

stdMap[xmap_, minx_, maxx_, nullv_] :=If[xmap == nullv, nullv, If[(xmap <= maxx) && (xmap >= minx),N[(xmap - minx)/(maxx - minx), 5], 0]];

SetAttributes[stdMap, Listable];

(*stdRaster get the min and max values from valid values obtained from the original raster and then adjust those values between 0 and 1*)

stdRaster[xmap_, nullv_] := Module[{v1, vect, minv, maxv},

v1 = Flatten[xmap];

vect = Select[v1, # > nullv &];

minv = Min[vect];

maxv := Max[vect];

stdMap[xmap, minv, maxv, nullv]

]

(*get a valid sample of pairs *)

getValidSample[b1_, b2_, invalid_] :=

Module[{ids, wid, fig, s1, s2, validId},

validId = Position[b1, x_ /; (x > invalid)];

ids = RandomSample[validId, nsize];

s1 = Extract[b1, ids];

s2 = Extract[b2, ids];

wid = Position[s2, invalid];

If[Dimensions[wid, 1] > 0, s1 = Delete[s1, wid];

s2 = Delete[s2, wid]];

{{Dimensions[wid, 1]}, Dimensions[s1], Thread[{s1, s2}]}

]

(*Write the difference function *)

fDiffValues[v1_, v2_, nullv_] :=

If[(v1 == nullv) || (v2 == nullv), nullv, v2 - v1];

SetAttributes[fDiffValues, Listable];

(*===========Apply functions to data ============*)

(* standardize rasters *)

r1 = stdRaster[bio1, -9999];

r2 = stdRaster[bio2, -9999];

(*get the sample of pairs*)

vs = getValidSample[r1, r2, -9999];

(*Check invalid pairs and final sample size*)

Flatten[{vs[[1]], vs[[2]]}]

{0, 3000}

(*Get the list of pairs*)

x1x2 = vs[[3]];

(* Compute correlation, not very similar... *)

Correlation[x1x2[[All, 1]], x1x2[[All, 2]]]

0.159819

(* Compute the difference map *)

difMap = fDiffValues[r1, r2, -9999];

(*=======Do some plots and maps =======*)

(*Create a 3d histogram with the pairs*)

Histogram3D[x1x2, ChartElementFunction -> "SegmentScaleCube", BaseStyle -> {FontSize -> 16, FontFamily -> "Arial"}, ImageSize -> 450]

(*Create a list of points*)

lp = ListPlot[x1x2, PlotStyle -> {Opacity[0.6]}, PlotMarkers -> {\[GrayCircle], 16}, BaseStyle -> {FontSize -> 16, FontFamily -> "Arial"}]

(* Display the difference map *)

figd = ArrayPlot[difMap, ColorFunction -> "Rainbow", PlotRange -> {Full, Full, {-1, 1}}, ImageSize -> 350]

Cluster the value to improve presentation of differences.

Tally the values to color each class, null values are also included. Mathematica 8 uses the K-means clustering algorithm as default

cMap = ClusteringComponents[difMap, 5];

Tally[Flatten[cMap]]

{{1, 303006}, {2, 41966}, {3, 16079}, {4, 18905}, {5, 11054}}

(* display the map in clusters, red shows the most difference, green the least difference *)

clustMap = ArrayPlot[cMap, ColorRules -> {1 -> White, 2 -> Red, 3 -> Orange, 4 -> Yellow, 5 -> Green}, ImageSize -> 350]

(* ==================end of code ========================= *)

Map similarity is an active research subject, here we just browse the basics.