Symbolizing vector features

 Before tackling this tutorial, you will need to download and install a dataset following these instructions:

Open the map document

The map document consists of four data layers: Cities (a point layer), Interstate (a polyline layer), Water (a polygon layer) and northeast (another polygon layer). 

We will first focus on symbolizing a choropleth layer. A choropleth map is one where polygons are symbolized by an interval or ratio scale field. We'll look at symbolizing two variables: income and election results.  We'll also learn how to symbolize polygons by categorical values. We'll then explore symbology schemes for polylines (Interstate layer) and points (Cities layer).

Exploring a variable

Before we jump straight into any of the layers' symbology, it's good practice to explore a variable using basic exploratory tools such as a histogram or boxplot. Such an exercise can be helpful in defining an appropriate classification scheme.

Notice how the look and feel of the data distribution changes with differing bin counts. For example, a 10-bin scheme suggests a somewhat normally distributed set of values when ignoring the outliers above $79,000. Yet, when bumped up to 19 bins, the core distribution looks far from normal, with a prominent peak at $60,000. We could explore higher bin counts, but doing so may not offer much more insight into the distribution of income because the number of observations in each bin diminishes.
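The effect of bin count can be reproduced outside of ArcGIS Pro with a few lines of Python. The sketch below is only an illustration: the `histogram_counts` helper is hypothetical, and the 56 income values are synthetic stand-ins, not the tutorial's data.

```python
import random

def histogram_counts(values, n_bins):
    """Count how many values fall in each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value would index one past the end, so clamp it into the last bin.
        i = min(int((v - lo) / width), n_bins - 1)
        counts[i] += 1
    return counts

# Hypothetical stand-in for the income field: 56 synthetic values (assumption).
rng = random.Random(0)
incomes = [rng.gauss(60000, 12000) for _ in range(56)]

for n_bins in (10, 19):
    print(n_bins, "bins:", histogram_counts(incomes, n_bins))
```

With only 56 observations, the 19-bin counts are small and noisy, which is why adding still more bins is unlikely to reveal much.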

Learning about the distribution of a value can help guide our choice of classification breaks if the goal is to convey interesting aspects of the data. For example, we might want to assign a dedicated class to the upper income values (above $91,600 for example) as well as a separate class for the lower income values (below $41,800, for example) and another dedicated class for values around $60,000. 

However, in this exercise, we'll explore some semi-automated approaches to classifying the breaks.

Symbolizing income using sequential color schemes

We'll adopt a sequential color scheme in the examples that follow.  A sequential color scheme usually consists of a single hue whose lightness varies from light to dark across the full range of color classes. 
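To make the idea concrete, here is a minimal Python sketch that builds a single-hue ramp by holding hue and saturation fixed while stepping lightness from light to dark. The function name and the default lightness and saturation values are assumptions chosen for illustration; they do not reproduce the color ramps shipped with ArcGIS Pro.

```python
import colorsys

def sequential_ramp(hue_degrees, n_classes, lightest=0.90, darkest=0.25, saturation=0.60):
    """Return n_classes hex colors sharing one hue, ordered light to dark."""
    colors = []
    for i in range(n_classes):
        # Step lightness evenly from lightest (first class) to darkest (last class).
        t = i / (n_classes - 1) if n_classes > 1 else 0.0
        lightness = lightest + t * (darkest - lightest)
        r, g, b = colorsys.hls_to_rgb(hue_degrees / 360.0, lightness, saturation)
        colors.append("#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255)))
    return colors

print(sequential_ramp(120, 6))  # hue 120 degrees is green
```

Evenly spaced HLS lightness is not perceptually uniform, so a published scheme will usually give better results than this sketch.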

This should activate the Symbology pane on the right side of the ArcGIS window. Next, we'll explore three built-in classification methods: Jenks, Equal Interval and Quantile.

Jenks (natural breaks) classification

By default, ArcGIS Pro will adopt a Jenks (natural breaks) classification method.

You'll recognize the rotated histogram we explored earlier in this exercise. The graphic enables us to see where the breaks are defined along the distribution. A Jenks classification method attempts to find natural clusters in the data. You can learn more about this method here.
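One way to picture "natural clusters" is as the grouping of sorted values that minimizes the squared deviations within each class. The brute-force Python sketch below tries every possible set of cut points under that assumption. It is not ArcGIS Pro's implementation, the function names and sample values are hypothetical, and it is only practical for small inputs (production implementations typically use dynamic programming instead).

```python
from itertools import combinations

def within_class_ssd(classes):
    """Sum of squared deviations of each value from its own class mean."""
    total = 0.0
    for cls in classes:
        mean = sum(cls) / len(cls)
        total += sum((v - mean) ** 2 for v in cls)
    return total

def natural_breaks_bruteforce(values, n_classes):
    """Try every way of cutting the sorted values into n_classes contiguous groups."""
    data = sorted(values)
    best_classes, best_score = None, float("inf")
    # Each combination picks the positions where one class ends and the next begins.
    for cuts in combinations(range(1, len(data)), n_classes - 1):
        bounds = (0,) + cuts + (len(data),)
        classes = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        score = within_class_ssd(classes)
        if score < best_score:
            best_classes, best_score = classes, score
    return best_classes

# Hypothetical values with three visible clusters (not the tutorial's income data).
sample = [41, 43, 44, 58, 60, 61, 62, 90, 93]
print(natural_breaks_bruteforce(sample, 3))
```

The breaks land in the gaps between clusters, which is the behavior the rotated histogram lets you verify visually.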

Equal interval method

The equal interval classification method may seem to be the most intuitive method in that each class width is identical across all classes. This is analogous to having equal width histogram bins.
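The arithmetic behind equal interval breaks is simple enough to sketch in Python. The helper name and the sample values below are hypothetical; the point is only that every class spans the same width, (max - min) / number of classes.

```python
def equal_interval_breaks(values, n_classes):
    """Return the upper bound of each class when the value range is split evenly."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    # The last break is set to the maximum itself to avoid floating-point drift.
    return [lo + width * i for i in range(1, n_classes)] + [hi]

# Hypothetical income values used only for illustration.
print(equal_interval_breaks([40000, 55000, 61000, 100000], 6))
```

Note that the breaks depend only on the minimum and maximum, so a single outlier can leave several classes nearly empty.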

You'll note that any changes made in the Symbology pane are instantly reflected in the map window as well as in the Histogram graphic.

Now, you might be tempted to increase the number of classes to match the 19 histogram bins we explored earlier in this exercise. However, this will prove ineffective because our eyes can only discern so many different shades of the same hue. A good guide on properly symbolizing features (with suggested color schemes and class counts) is the Color Brewer website. Note that it does not recommend going above 9 classes for the green hue.

Quantile method

If the goal is to ensure the equal occurrence of each color class in a map (i.e. each color is represented roughly the same number of times), then the quantile method should be the classification method of choice.

Each color swatch should be represented N/c times, where N is the total number of polygons and c is the number of classes. Here, we have 56 features and 6 classes. The number of features per class is calculated as 56/6 ≈ 9.3. This is not a whole number, implying that we will have a slight imbalance in the counts per color swatch. A scan of the attribute table sorted on income suggests that the first color swatch covering all values up to and including $47,371 encompasses 10 polygons while the other 5 color swatches encompass 9 polygons each.
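The N/c arithmetic can be checked with a short Python sketch. The rank-based assignment rule used here is an assumption made for illustration; it ignores ties and is not necessarily how ArcGIS Pro positions its quantile breaks, so the exact per-class counts reported by the software may differ.

```python
def quantile_class_counts(n_features, n_classes):
    """Count features per class when sorted features are assigned by rank."""
    counts = [0] * n_classes
    for rank in range(n_features):
        # Integer arithmetic spreads the leftover features across the classes.
        counts[rank * n_classes // n_features] += 1
    return counts

n_features, n_classes = 56, 6          # values taken from the text
print(n_features / n_classes)          # ideal (non-integer) count per class
print(divmod(n_features, n_classes))   # whole count per class and leftover features
print(quantile_class_counts(n_features, n_classes))
```

Whatever rule is used, the class counts can differ by at most one when there are no ties, which is the "slight imbalance" described above.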

Ties can also influence the distribution of classes. This data layer has several ties (e.g. $95,668, $73,533, $61,242, etc.), many of which result from counties split across multiple polygons. These ties are distributed such that they do not contribute to the imbalance in the classes, but they do influence our perception of income distribution by artificially inflating certain income value counts. In practice, dissolving this layer by county would probably be prudent, but that exercise is left to the reader.

You will probably have noticed that differences in classification schemes can influence the "look" of a map much like differences in bin counts can influence the "look" of a histogram. You are therefore encouraged to toggle back and forth between different classification schemes when exploring a choropleth map to ensure that you are teasing out as much valuable information from the data as possible.

Symbolizing voting margins using divergent color schemes

Ordinal and ratio data can also be symbolized using a divergent color scheme. This scheme not only conveys a gradient of values, but also conveys a central value about which the values diverge. Hence, we glean two pieces of information from the data.

In this example, we will map the distribution of the 2016 presidential election results. More specifically, we will explore the margin of victory by which Clinton or Trump carried a county. The field of interest is Winner, which provides the fraction by which a candidate carried a county. A fraction of 0 would suggest a tie between candidates; a positive value gives us the fraction by which Clinton carried that county; a negative value gives us the fraction by which Trump carried that county. We'll adopt a four-class symbology scheme with the bins delimiting the 10% margin of victory for each candidate.

Most classification schemes offered by the software  will only provide you with sequential color scheme options. You can, however, force it to display a divergent color scheme by  assigning a "critical break" to one of the classes as shown next.



This last upper value of 0.7 was chosen to ensure that all values greater than 0.1 in the Winner field are included in that last class (the largest value in the dataset should be 0.633).
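The four-class scheme can be sketched in Python as a lookup against class upper values. The upper values of -0.1, 0 and 0.1 are assumptions inferred from the 10% margin described earlier, only 0.7 is quoted in the text, and the class labels are hypothetical.

```python
from bisect import bisect_left

# Upper value of each class. -0.1, 0 and 0.1 are assumed from the 10% margin
# described in the text; 0.7 is the upper value quoted in the text.
UPPER_VALUES = [-0.1, 0.0, 0.1, 0.7]
# Hypothetical labels, not taken from the tutorial.
LABELS = ["Trump by more than 10%", "Trump by 10% or less",
          "Clinton by 10% or less", "Clinton by more than 10%"]

def classify_margin(winner):
    """Return the label of the first class whose upper value is >= winner."""
    index = bisect_left(UPPER_VALUES, winner)
    if index == len(UPPER_VALUES):
        raise ValueError(f"{winner} exceeds the last upper value")
    return LABELS[index]

for value in (-0.25, -0.04, 0.08, 0.633):
    print(value, "->", classify_margin(value))
```

Because each upper value is treated as inclusive in this sketch, an exact tie of 0 would fall in the second class. Any value above 0.7 would be left unclassified, which is why the last upper value must exceed the largest value in the field.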



You'll note that the divergent colors assigned to our classes are unbalanced. However, this is fixable:

Doing so should render a balanced color scheme by swapping the light blue color swatch with a light red one.

Finally, we'll change the Labels to something more meaningful (by default, the labels reflect the range of values associated with each class).

Your final map should look something like this.

Symbolizing categorical features

In this next exercise, you will symbolize the northeast counties layer by  State name. But first, we'll make a copy of the data layer so as not to wipe out the election margins symbology.

All unique state names should automatically be added to the Classes table. Note that the colors automatically assigned to each swatch may differ on your PC from those shown here.

ArcGIS Pro allows for a wide range of customization options.  Next, we will modify the polygon outline color for all swatches.

Modifying outlines for all symbols


This will bring up a new Symbology page whereby you can customize the look of  all symbols in your map.

The changes should be reflected in the map window.

Your categorical map should look something like this. Note that the colors may differ on your computer.

Symbolizing all polygons the same way

So far we've learned how to symbolize polygons based on an attribute value. You can, of course, assign the same symbology to all polygons in a feature class. In this short exercise, you will apply the same symbology to all of the water layer polygons.

All water polygons should now be symbolized with the same color scheme.

Symbolizing point features

In this next exercise, you will modify the point symbols by assigning different point sizes based on that point layer's estimated maximum population count value stored in the POP_MAX field.

You'll note that the Symbology pane's content will change depending on the feature type. 


However, we want the point symbol to change as a function of the point feature's POP_MAX value. This will be our next step.

The latter options are somewhat arbitrary in that they are usually chosen to render an aesthetically pleasing range of point sizes. But note that this can distort a viewer's actual perception of the real underlying values.

You'll note a Histogram tab at the bottom of the Symbology window. If it's not active, click on the Histogram tab.

The figure plots the histogram showing the range of POP_MAX values.  You'll note that the data are strongly skewed toward higher values. This results in a range of point sizes not being represented in the map.

If you want to maximize the use of different point sizes in the map, you can  assign the maximum point size to a different POP_MAX value.

This action is similar to changing the class intervals in a choropleth map whereby each class interval size can be modified to maximize the frequency of a given set of classes.
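The effect of lowering the value tied to the maximum point size can be illustrated with a small Python sketch. It assumes that symbol size scales linearly between a minimum and maximum size. The function name, the size range and the population values are all hypothetical, and the software's own scaling may differ.

```python
def symbol_size(value, data_min, size_max_value, min_size=4.0, max_size=50.0):
    """Linearly scale a value to a point size, capping at size_max_value."""
    # Values at or above size_max_value all receive the maximum point size.
    clamped = max(data_min, min(value, size_max_value))
    fraction = (clamped - data_min) / (size_max_value - data_min)
    return min_size + fraction * (max_size - min_size)

# Hypothetical POP_MAX values; most are small relative to the largest one.
populations = [50_000, 500_000, 2_000_000, 19_000_000]

for cap in (19_000_000, 2_000_000):
    sizes = [round(symbol_size(p, 50_000, cap), 1) for p in populations]
    print("maximum size assigned at", cap, "->", sizes)
```

With the cap at the largest value, most points cluster near the minimum size. Lowering the cap spreads the smaller cities across more of the size range, at the cost of drawing every city above the cap at the same size.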



The difference in point size distribution is shown in the accompanying figure.


Next, we'll change the point symbol color.


We'll make one more modification to the point symbols. Given that we have several overlapping points, we'll change the point symbols' opacity to make them partially transparent.

The transparency setting is available on the Feature Layer ribbon. 

Your map should look something like this. The background county polygons may differ in color.

Depending on your zoom level, the points may or may not overlap.

Symbolizing line features

In this last section, you will learn how to symbolize line features. 


We can change the color and width of the line feature, but we can also adopt a more elaborate line symbol from the Gallery tab.

The Highway symbol adds a thin outline to the line symbol, which makes the line features stand out a bit more than a simpler line symbol would. However, the outlines overlap where line segments meet, taking away the sense that these line features should be connected. We'll change this in the next step.


This should render the line features seamless.

Labeling line features

Next, we'll add the interstate shield.

By default, the map will label the text as it appears in the FULLNAME column. It will also place the labels parallel to the line segments. Next, we will tweak the label properties.

Note that the Text Symbol Style button may not show up on your desktop if your ArcGIS Pro window is expanded, in which case you will see a pull-down menu instead.


Finally, we'll increase the distance between the labels.

Your final line feature should look something like this.

Note that the placement and number of highway shields will change depending on your zoom level. However, regardless of zoom level, the labels should not overlap.

A note about color blind safe colors

About 8 percent of men and 0.5 percent of women are commonly estimated to have some form of color vision deficiency. This may, for example, make it difficult for some members of the population to distinguish between red and green hues of similar perceived lightness and saturation, as shown in the following example.

In the figure below, the left image adopts a divergent color scheme using green and red hues. For someone with deuteranopia (red-green color blindness), the map will look like the one on the right.

The tool offers three different color vision simulations. Clicking on one of the simulators will convert your map to one as would be perceived by a person suffering from that color vision impairment. 

This wraps up this tutorial.