Symbolizing vector features
Before tackling this tutorial, you will need to download and install a dataset following these instructions:
Create a folder called vector_symbology somewhere under your personal directory (e.g. C:\Users\jdoe\Documents\Tutorials\vector_symbology\).
Download the data for this exercise then extract the contents of symbolizing_vectors.zip into your newly created vector_symbology folder.
Open the map document
Open the Symbolizing.aprx ArcGIS project.
The map document consists of four data layers: Cities (a point layer), Interstate (a polyline layer), Water (a polygon layer) and northeast (another polygon layer).
We will first focus on symbolizing a choropleth layer. A choropleth map is one where polygons are symbolized by an interval or ratio scale field. We'll look at symbolizing two variables: income and election results. We'll also learn how to symbolize polygons by categorical values. We'll then explore symbology schemes for polylines (Interstate layer) and points (Cities layer).
Exploring a variable
Before we jump straight into any of the layers' symbology, it's good practice to explore a variable using basic exploratory tools such as a histogram or boxplot. Such an exercise can be helpful in defining an appropriate classification scheme.
In the Contents pane, select the northeast layer. This will activate the Data tab.
In the Data tab, click on the Create Chart pull-down menu and select Histogram.
In the Chart Properties pane (this should show up on the right-hand side of the window), select Income in the Number field, and play with the number of bins.
Notice how the look and feel of the data distribution changes with differing bin counts. For example, a 10-bin scheme suggests a roughly normal distribution once the outliers above $79,000 are ignored. Yet, when bumped up to 19 bins, the core distribution looks far from normal, with a prominent peak at $60,000. We could explore higher bin counts, but doing so may not offer much more insight into the distribution of income because the number of observations in each bin diminishes.
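The effect of the bin count can also be sketched outside ArcGIS Pro. The snippet below is a minimal illustration only: the income values are hypothetical random draws (not the tutorial's data), and histogram_counts is a helper name introduced here for the example.

```python
import random

def histogram_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a bin of its own.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical income values (assumption): 56 draws from a normal
# distribution, seeded so that the result is reproducible.
random.seed(1)
incomes = [random.gauss(60000, 12000) for _ in range(56)]

for n_bins in (10, 19):
    print(n_bins, "bins:", histogram_counts(incomes, n_bins))
```

With more bins, each bin holds fewer observations, which is why very high bin counts tend to add noise rather than insight.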
Learning about the distribution of a value can help guide our choice of classification breaks if the goal is to convey interesting aspects of the data. For example, we might want to assign a dedicated class to the upper income values (above $91,600 for example) as well as a separate class for the lower income values (below $41,800, for example) and another dedicated class for values around $60,000.
However, in this exercise, we'll explore some semi-automated approaches to classifying the breaks.
Symbolizing income using sequential color schemes
We'll adopt a sequential color scheme in the examples that follow. A sequential color scheme usually consists of a single hue whose lightness varies from light to dark across the full range of color classes.
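To make the idea of a single hue with varying lightness concrete, here is a small sketch that builds such a ramp with Python's standard colorsys module. The hue, saturation and lightness limits are arbitrary assumptions chosen for illustration; they are not the values behind ArcGIS Pro's built-in color schemes, and HLS lightness is only a rough stand-in for perceived lightness.

```python
import colorsys

def sequential_ramp(hue, n_classes, lightest=0.9, darkest=0.3, saturation=0.6):
    """Return n_classes hex colors sharing one hue, ordered light to dark."""
    colors = []
    for i in range(n_classes):
        # Step the lightness evenly from lightest to darkest.
        t = i / (n_classes - 1) if n_classes > 1 else 0.0
        lightness = lightest + t * (darkest - lightest)
        r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors

# Hypothetical hue (0.45 is a blue-green on the 0 to 1 hue circle).
ramp = sequential_ramp(hue=0.45, n_classes=6)
print(ramp)
```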
A choropleth map makes use of the Graduated color scheme. While making sure that the northeast layer is selected in the Contents pane, click on the Feature Layer tab, then click on the Symbology pull-down menu.
Select Graduated Colors from the Symbology pull-down menu.
This should activate the Symbology pane on the right side of the ArcGIS Pro window. Next, we'll explore 3 built-in classification methods: Jenks, Equal Interval and Quantile.
Jenks (natural breaks) classification
By default, ArcGIS Pro will adopt a Jenks (natural breaks) classification method.
In the Symbology pane, set the Field to Income.
Make sure that the Natural Breaks method is selected.
Set the number of classes to 6.
Change the color scheme to Blue-Green.
At the bottom of the Symbology pane, click on the Histogram tab to view the distribution of values vis-a-vis the classification scheme adopted by the Jenks method.
You'll recognize the rotated histogram we explored earlier in this exercise. The graphic enables us to see where the breaks are defined along the distribution. A Jenks classification method attempts to find natural clusters in the data. You can learn more about this method here.
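The goal of finding natural clusters can be illustrated with a brute-force sketch that tries every way of splitting the sorted values into contiguous classes and keeps the split with the smallest within-class sum of squared deviations. This is an illustration of the objective usually associated with natural breaks, not the optimized algorithm ArcGIS Pro runs; the sample values and function names are hypothetical, and the approach is only practical for small inputs.

```python
from itertools import combinations

def within_class_ssd(classes):
    """Sum of squared deviations of each value from its own class mean."""
    total = 0.0
    for members in classes:
        mean = sum(members) / len(members)
        total += sum((v - mean) ** 2 for v in members)
    return total

def natural_breaks_brute_force(values, n_classes):
    """Return the upper value of each class for the best split found."""
    data = sorted(values)
    best_score, best_classes = None, None
    # Each combination of cut positions defines one candidate classification.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        edges = (0,) + cuts + (len(data),)
        classes = [data[a:b] for a, b in zip(edges, edges[1:])]
        score = within_class_ssd(classes)
        if best_score is None or score < best_score:
            best_score, best_classes = score, classes
    return [members[-1] for members in best_classes]

# Hypothetical values with three visible clusters (not the tutorial's data).
sample = [41, 43, 44, 58, 60, 61, 62, 90, 93, 95]
print(natural_breaks_brute_force(sample, 3))
```

For this sample the breaks fall in the two wide gaps between clusters, which is the behavior the histogram view lets us verify visually.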
Equal interval method
The equal interval classification method may seem to be the most intuitive method in that each class width is identical across all classes. This is analogous to having equal width histogram bins.
In the Symbology pane, change the method to Equal Interval.
You'll note that any changes made in the Symbology pane are instantly reflected in the map window as well as in the Histogram graphic.
Now, you might be tempted to increase the number of classes to match the 19 histogram bins we explored earlier in this exercise. However, this will prove ineffective in that our eyes can only discern so many different shades of the same hue. A good guide on properly symbolizing features (with suggested color schemes and class counts) is the Color Brewer website. Note that it does not recommend going above 9 classes for the green hue.
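The equal interval breaks are simple enough to compute by hand. The sketch below assumes that each class's upper value is the minimum plus a whole number of equal class widths; the input values are hypothetical and the function name is introduced here for illustration.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper value of each of n_classes equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    breaks = [lo + width * k for k in range(1, n_classes)]
    breaks.append(hi)  # use the exact maximum for the last class
    return breaks

# Hypothetical income values (not taken from the tutorial's attribute table).
print(equal_interval_breaks([30000, 45000, 62000, 90000], 6))
```

Only the minimum and maximum influence the breaks, which is why a few outliers can leave some equal interval classes nearly empty.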
Quantile method
If the goal is to ensure the equal occurrence of each color class in a map (i.e. each color is represented roughly the same number of times), then the quantile method should be the classification method of choice.
In the Symbology pane, change the method to Quantile.
Stick to the 6 class count.
Each color swatch should be represented N/c times, where N is the total number of polygons and c is the number of classes. Here, we have 56 features and 6 classes, so the number of features per class is 56/6 ≈ 9.3. This is not a whole number, implying that we will have a slight imbalance in the counts per color swatch. A scan of the attribute table sorted on income suggests that the first color swatch, covering all values up to and including $47,371, encompasses 10 polygons while the other 5 color swatches encompass 9 polygons each.
Ties are another factor that can influence the distribution of classes. This data layer has several ties (e.g. $95,668, $73,533, $61,242, etc.), many of which are a result of counties split across multiple polygons. These ties are distributed such that they do not contribute to the imbalance in the classes, but they do influence our perception of income distribution by artificially inflating certain income value counts. In practice, dissolving this layer by county would probably be a prudent thing to do, but such an exercise will be left up to the reader.
You will probably have noticed that differences in classification schemes can influence the "look" of a map much like the differences in bin counts can influence the "look" of a histogram. You are therefore encouraged to toggle back and forth between different classification schemes when exploring a choropleth map to ensure that you are teasing out as much valuable information from the data as possible.
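The N/c arithmetic can be checked with a short sketch that assigns each ranked feature to a class. This is one simple rank-based scheme, stated here as an assumption; how the remainder is spread across classes depends on the implementation and on ties, so the per-class counts may not match what ArcGIS Pro reports.

```python
def quantile_class_counts(n_features, n_classes):
    """Split n_features ranked values into n_classes groups of near-equal size."""
    counts = [0] * n_classes
    for rank in range(n_features):
        # Map each rank (0 = lowest value) to a class index.
        counts[rank * n_classes // n_features] += 1
    return counts

counts = quantile_class_counts(56, 6)
print(counts, sum(counts))
```

Whatever the scheme, class sizes can differ by at most one feature when there are no ties, because 56 is not a multiple of 6.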
Symbolizing voting margins using divergent color schemes
Ordinal and ratio data can also be symbolized using a divergent color scheme. This scheme does not only convey a gradient of values, but it also conveys a measure of centrality about which the values diverge. Hence, we glean two bits of information from the data.
In this example, we will map the distribution of the 2016 presidential election results. More specifically, we will explore the margin of victory by which Clinton or Trump carried a county. The field of interest is Winner and it provides us with the fraction by which a candidate carried a county. A fraction of 0 would suggest a tie between candidates; a positive value gives us the fraction by which Clinton carried that county; a negative value gives us the fraction by which Trump carried that county. We'll adopt a six-class symbology scheme with the bins delimiting the 5% and 10% margins of victory for each candidate.
Most classification schemes offered by the software will only provide you with sequential color scheme options. You can, however, force it to display a divergent color scheme by assigning a "critical break" to one of the classes as shown next.
Check that the northeast layer is still highlighted in the Contents pane.
If the Symbology pane is no longer open, follow earlier instructions to open it.
In the Symbology pane, set the Field to Winner.
Select either the Quantile or the Natural Breaks option from the Method pull-down menu. The choice does not matter since we will be changing the class intervals shortly.
Set the number of classes to 6.
Under the Upper value column, change the upper values for each break to -0.10, -0.05, 0, 0.05, 0.10 and 0.7.
This last upper value of 0.7 was chosen to ensure that all values greater than 0.1 in the Winner field are included in that last class (the largest value in the dataset should be 0.633).
Right-click on the third class (upper bound of 0) and select Set as critical break.
At this point, the divergent color schemes should be available. Select the Red-Blue scheme.
You'll note that the divergent colors assigned to our classes are unbalanced. However, this is fixable:
Right-click on the Upper value cell of the third break and select Remove as critical break.
Doing so should render a balanced color scheme by swapping the light blue color swatch with a light red one.
Finally, we'll change the Labels to something more meaningful (by default, the labels reflect the range of values associated with each class).
Modify the labels as shown in the accompanying figure.
Your final map should look something like this.
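The manual class breaks used in this section can be expressed as a small lookup. The sketch assumes that a class includes its upper value (as with the quantile class described earlier) and uses hypothetical label text, since the tutorial's own labels appear only in its figure.

```python
from bisect import bisect_left

# Upper values entered in the Symbology pane.
UPPER_VALUES = [-0.10, -0.05, 0, 0.05, 0.10, 0.7]

# Hypothetical labels (assumption), one per class, from Trump to Clinton.
LABELS = [
    "Trump by 10% or more",
    "Trump by 5% to 10%",
    "Trump by less than 5%",
    "Clinton by up to 5%",
    "Clinton by 5% to 10%",
    "Clinton by more than 10%",
]

def classify_margin(winner):
    """Return the label of the first class whose upper value is >= winner."""
    if winner > UPPER_VALUES[-1]:
        raise ValueError("value exceeds the last upper value")
    return LABELS[bisect_left(UPPER_VALUES, winner)]

for margin in (-0.2, -0.03, 0.02, 0.633):
    print(margin, "->", classify_margin(margin))
```

The check against the last upper value mirrors the reason 0.7 was chosen: any value above the final break would otherwise fall outside every class.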
Symbolizing categorical features
In this next exercise, you will symbolize the northeast counties layer by State name. But first, we'll make a copy of the data layer so as not to wipe out the election margins symbology.
Right-click the northeast layer in the Contents pane and select Copy.
Right-click on the Map data frame name (in the Contents pane) and select Paste.
Select the newly pasted northeast layer.
Bring up its Symbology pane (via the Feature Layer tab).
Change the Primary symbology to Unique Values.
Set State for Field 1.
Keep the default color scheme.
All unique state names should automatically be added to the Classes table. Note that the colors automatically assigned to each swatch may differ on your PC from those shown here.
ArcGIS Pro allows for a wide range of customization options. Next, we will modify the polygon outline color for all swatches.
Modifying outlines for all symbols
In the Classes tab, click on the More pull-down menu and select Format all symbols.
This will bring up a new Symbology page whereby you can customize the look of all symbols in your map.
Click on the Properties tab.
Set the outline color to "arctic" white.
Click the Apply button at the bottom of the pane.
The changes should be reflected in the map window.
Click on the back arrow to return to the primary Symbology pane.
Your categorical map should look something like this. Note that the colors may differ on your computer.
Symbolizing all polygons the same way
So far we've learned how to symbolize polygons based on an attribute value. You can, of course, assign the same symbology to all polygons in a feature class. In this short exercise, you will apply the same symbology to all of the Water layer polygons.
Move the Water layer above the topmost northeast data layer in the Contents pane. Make sure to check the box next to the layer name. This should make the layer visible in the map window.
With the Water layer selected, bring up its Symbology pane.
In the Symbology pane, click on the (pink) swatch symbol to bring up the symbol gallery.
Make sure that the Gallery tab is active.
In the Gallery tab search window, type Water then press the Enter key.
Several water symbols should be returned. Select the third option under the 2D section.
All water polygons should now be symbolized with the same color scheme.
Symbolizing point features
In this next exercise, you will modify the point symbols by assigning different point sizes based on that point layer's estimated maximum population count value stored in the POP_MAX field.
Make sure that the point layer is at the top of the layer stack in the Contents pane.
With the Cities layer selected, bring up its Symbology pane.
You'll note that the Symbology pane's content will change depending on the feature type.
If you want to apply the same customized symbol to all points, click on the point symbol in the Symbology pane.
Make sure that the Properties tab is selected.
Here, you can modify the point symbol shape, color, angle and size. You can also add a halo to the point symbol (this can be useful when you need to distinguish the point from a busy background).
However, we want the point symbol to change as a function of the point feature's POP_MAX value. This will be our next step.
Click on the back arrow to return to the primary Symbology tab.
Select Proportional Symbols from the pull-down menu.
Set the Field to POP_MAX.
Set the minimum and maximum point sizes to 6 and 30 respectively. (You may need to check the Maximum size checkbox to enable the point size field).
The latter options are somewhat arbitrary in that they are usually chosen to render an aesthetically pleasing range of point sizes. But note that this can distort a viewer's actual perception of the real underlying values.
You'll note a Histogram tab at the bottom of the Symbology pane. If it's not active, click on the Histogram tab.
The figure plots the histogram showing the range of POP_MAX values. You'll note that the data are strongly right-skewed, with a long tail toward the higher values. This results in a range of point sizes not being represented in the map.
If you want to maximize the use of different point sizes in the map, you can assign the maximum point size to a different POP_MAX value.
Grab the little arrow next to the value at the bottom of the histogram and slide it up the scale such that the full range of point symbol sizes is limited to the lower end of the POP_MAX range (around 1,000,000).
This action is similar to changing the class intervals in a choropleth map whereby each class interval size can be modified to maximize the frequency of a given set of classes.
The difference in point size distribution is shown in the accompanying figure.
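The effect of capping the maximum symbol size at a lower POP_MAX value can be sketched as a clamped linear mapping. This is a simplified assumption for illustration: ArcGIS Pro's own scaling of proportional symbols may differ, and the population values below are hypothetical.

```python
def symbol_size(value, data_min, data_max, size_min=6.0, size_max=30.0):
    """Linearly map value onto [size_min, size_max], clamping out-of-range values."""
    clamped = max(data_min, min(value, data_max))
    fraction = (clamped - data_min) / (data_max - data_min)
    return size_min + fraction * (size_max - size_min)

# Hypothetical POP_MAX values; 1,000,000 is the cap suggested in the text.
for pop in (50_000, 500_000, 1_000_000, 8_000_000):
    print(pop, symbol_size(pop, data_min=0, data_max=1_000_000))
```

Every city at or above the cap is drawn at the maximum size, so the full range of sizes is spent on the many smaller cities rather than on a handful of very large ones.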
Next, we'll change the point symbol color.
Click on the Template icon.
In the next window, make sure that the Properties tab is selected.
Change the Color to a Tulip Pink.
Click Apply to see the changes reflected in the map.
We'll make one more modification to the point symbols. Given that we have several overlapping points, we'll reduce the point symbols' opacity to make them partially transparent.
The transparency setting is available on the Feature Layer ribbon.
Change the transparency setting to about 50% by sliding the transparency slider to the right.
Your map should look something like this. The background county polygons may differ in color.
Depending on your zoom level, the points may or may not overlap.
Symbolizing line features
In this last section, you will learn how to symbolize line features.
Turn off the Cities point layer in the Contents pane.
Turn on the Interstate layer and make sure that it appears near the top of the layers stack in the Contents pane.
With the Interstate layer selected, bring up its Symbology pane (see earlier steps for guidance on accessing the Symbology pane).
Click on the line symbol to bring up the Format Line Symbol pane.
We can change the color and width of the line feature, but we can also adopt a more elaborate line symbol from the Gallery tab.
Click on the Gallery tab.
Select the Highway template.
The Highway symbol provides a thin outline to the line symbol, which makes the line features stand out a bit more than a simpler line symbol. However, the outlines overlap line segments, taking away the sense that these line features should be connected. We'll change this in the next step.
In the Format Line Symbol pane, click on the back arrow to return to the primary Symbology pane.
Click on the Symbol Layer drawing button.
Enable the symbol layer drawing switch.
This should render the line features seamless.
Labeling line features
Next, we'll add the interstate shield.
Click on the Labeling tab to bring up its ribbon.
Set the field to FULLNAME (this is the interstate number column) then click on the Label button.
By default, the map will label the text as it appears in the FULLNAME column. It will also place the labels parallel to the line segments. Next, we will tweak the label properties.
In the Labeling ribbon, click on the Text Symbol Style pull-down menu and select Shield 2. (Note that you may need to scroll halfway down the list to see this symbol).
Note that the Text Symbol Style button may not show up on your desktop if your ArcGIS Pro window is expanded, in which case you will see a pull-down menu instead.
In the Labeling ribbon, click on Shield under the Label Placement Style group.
Finally, we'll increase the distance between the labels.
In the Label Placement group, click on the small lower-right icon to bring up the label placement properties pane.
In the Label Class pane, click on the conflict resolution button (this is the third button).
Expand Remove duplicate labels.
Select Remove within fixed distance.
Set the search radius to 100 points.
Your final line feature should look something like this.
Note that the placement and number of highway shields will change depending on your zoom level. However, regardless of zoom level, the labels should not overlap.
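The idea behind removing duplicate labels within a fixed distance can be sketched with a simple greedy filter. This is not how ArcGIS Pro's label engine is implemented; it is a hypothetical illustration in which label positions are given in page points and a label is dropped when an identical label has already been kept within the search radius.

```python
import math

def thin_duplicate_labels(candidates, radius):
    """Keep a label only if no kept label with the same text lies within radius.

    candidates: list of (text, x, y) tuples, with x and y in page points.
    """
    kept = []
    for text, x, y in candidates:
        too_close = any(
            text == k_text and math.hypot(x - k_x, y - k_y) <= radius
            for k_text, k_x, k_y in kept
        )
        if not too_close:
            kept.append((text, x, y))
    return kept

# Hypothetical label candidates along two interstates.
candidates = [("I-95", 0, 0), ("I-95", 40, 30), ("I-95", 150, 0), ("I-90", 10, 10)]
print(thin_duplicate_labels(candidates, radius=100))
```

Because the radius is expressed in page units rather than map units, the number of surviving shields changes with the zoom level.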
A note about color blind safe colors
Roughly 8 percent of men and 0.5 percent of women are believed to have some degree of color blindness. This may, for example, make it difficult for some members of the population to distinguish between red and green hues of similar perceived lightness and saturation, as shown in the following example.
In the figure below, the left image adopts a divergent color scheme using green and red hues. For someone with deuteranopia (a form of red-green color blindness), the map will actually look like the one on the right.
ArcGIS Pro offers a tool that can simulate color blindness. You can access the Color Vision Simulator tool from the View tab.
The tool offers three different color vision simulations. Clicking on one of the simulators will convert your map to one as would be perceived by a person suffering from that color vision impairment.
To exit the Color Vision Simulator, click on the button itself (note that clicking on the individual color vision simulators will not deactivate this mode).
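Since red and green hues are hardest to tell apart when their lightness is similar, one rough complementary check is to compare the relative luminance of two candidate colors. The sketch below uses the standard sRGB relative luminance formula; the two hex colors are hypothetical, and a luminance difference is only a partial indicator that does not replace the simulator.

```python
def relative_luminance(hex_color):
    """Relative luminance (0 = black, 1 = white) of an sRGB color like '#rrggbb'."""
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (1, 3, 5)]
    # Undo the sRGB gamma encoding before weighting the channels.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

# Hypothetical red and green swatches from a divergent scheme.
for name, color in (("red", "#d7191c"), ("green", "#1a9641")):
    print(name, round(relative_luminance(color), 3))
```

Two swatches with nearly equal luminance rely on hue alone to be told apart, which is the situation the simulator helps reveal.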
This wraps up this tutorial.