Joining tables

Before tackling this tutorial, you will need to download and install a dataset following these instructions:

Create a folder called joining_tables somewhere under your personal directory (e.g. C:\Users\jdoe\Documents\Tutorials\joining_tables\).
Download the data for this exercise then extract the contents of joining_tables.zip into your newly created joining_tables folder.

Loading the CSV data table

Viewing the data table

Joining the data table

Making the join permanent

Another join example

Accessing attribute properties (aka field views)

Did the table join properly?

Data types should match!

Creating a new "numeric" field

Attempting the join once more

Exporting this new join

The project folder consists of an .aprx file, a shapefile of the Maine counties and two data tables: a CSV formatted data table and an Excel formatted data table. You will first learn how to load and view a data table in ArcGIS Pro, then you will learn how to join the table to the shapefile.

Loading the CSV data table

Open the Joining_tables.aprx project file.
The project consists of a map of the Maine counties (USA).

Open the Catalog pane (if it's not already open).

The project folder should already be mapped as a folder connection.

Expand the Joining_Tables project folder.

ArcGIS Pro will list all files in a folder that it recognizes as being valid files for use in your project. This includes non-spatial data files like Alcohol_use_2008.csv and Estimated_median_income_2009.xlsx.

In the next step, you will load the CSV file.

Drag and drop the Alcohol_use_2008.csv file from the Catalog pane into the Map or Contents pane.

The table should appear in the contents pane . The data consists of the percentage of young people between the grades 6 and 12 who have consumed alcohol at least once (source: www.maine.gov/maineosa/survey/).

Recall that this is not a spatial dataset, hence, its contents will not be drawn in the map.

Viewing the data table

Before attempting a join, you should check the contents of the table.

Right-click on the Alcohol_use_2008.csv layer and select Open to bring up its table.

The data file consists of two fields (or attributes in ArcGIS lingo): The county names and the percentage of youth having consumed alcohol at least once.

A join can only be performed when both the spatial object being joined to, and the data table to be joined have a common field that will be used to properly assign the data table rows to the spatial features in the spatial object.

Let's now open the attributes table associated with the Counties layer to help identify the matching field.

Right-click the Counties layer and bring up its Attribute table.

The Counties attribute table will appear as a separate tab in the Tables pane.

You can toggle back and forth between both tables to compare their contents.

Alternatively, you can split the Tables pane into two side-by-side tables to facilitate comparison as shown in the next step.

Click on the Counties table tab and undock it by dragging it away from its location but keeping the mouse somewhere inside the Tables pane. At some point, you will see a cross shaped docking indicator unto which you can drop the table tab. Select the right-most indicator (this will dock it to the right of the other table).

Note that you can always dock the table back to its original location, or dock it somewhere else in the project window.

The common fields that will be used to join the records are County (in the CSV table) and COUNTY (in the Counties shapefile).

Notes:

The fields used to perform the join do not need to have their names match letter for letter or case for case.
The number of records do not need to match between the tables since we are not appending one table to the other by row order. In this example, the CSV file has just 16 records (one record for each Maine county) whereas the shapefile has 6,251 records. The large number of records reflects the number of individual polygons making up each county in the shapefile. Many counties in Maine cannot be represented by a single polygon--this is especially true for coastal counties that are made up of hundreds of island polygons.

Joining the data table

Right-click the Counties layer and select Joins and Relates >> Add Join.

Make sure that you right-click on the layer you are joining a table to since this is a unidirectional operation. (I.e. you are NOT joining the shapefile table to the CSV table).

Set the Input Table to Counties, the Input Field to COUNTY, the Join Table to Alcohol_use_2008.csv, and the Join Field to County.
Click OK to run the geoprocess.

If you expand the Counties Attribute table, you should now see two additional columns from the CSV file added to the layer's table.

At this point, you can symbolize the Counties layer using any one of the newly joined columns.

In the accompanying figure, the Percent field is symbolized.

Making the join permanent

The join you created in the last step will only persist in the current ArcGIS Pro .aprx document. In other words, if you were to load the Counties shapefile into a new map document, the join would no longer be present.

To make the join permanent, you need to export the layer (with the join) to a new GIS data file. In the next example, you will export the layer to a dedicated shapefile.

Make sure that none of the features (polygons) are selected. If a selection is present, only the selected features will be exported.
Right-click the Counties layer and select Data >> Export Features.
Set Counties as the Input Features.
Name the output file Alcohol_consumption.shp and save it in the Joining_tables project folder.
Click OK.

Notes:

Shapefiles limit the field names to 10 characters. This means that long field names will be truncated. If you need to preserve column names that go beyond 10 characters, you can either export the layer to a geodatabase or you can export the layer to a geopackage.
If both tables share a common filed name (regardless of case), the joined table's common field name will be appended with a "_1" suffix. In this example, the exported shapefile has a County_1 column name.

You now have a dedicated shapefile with a permanently joined data table. This will probably be automatically added to the Contents pane. If so, you can remove it from the the map document since it will not be needed for the remainder of the tutorial.

Another join example

For our next exercise, we will repeat the join operation with an Excel file. But first, we will remove the current join from the Counties layer.

Right-click the Counties layer and select Joins and Relates >> Remove All Joins.

This will remove all joins performed on this layer. In doing so, it will generate an empty map if you symbolized the features using one of the joined columns. You might need to revert the symbology back to its original state (i.e. a single symbol Symbology).

Next, we'll load the Excel file. Note that an Excel file can consist of more than one sheet. ArcGIS Pro will only allow you to load one sheet at a time. So when opening the file, you will be asked to select the sheet of interest. In this example, we only have one sheet in the Excel file.

In the catalog pane, expand the Estimated_median_income_2009.xlsx file and drag and drop the Estimated_median_income_2009$ sheet into the Map pane or the Content pane.

Following steps outlined earlier in this exercise, open the Excel file's attribute table and generate a side-by-side table layout showing the Counties table on the left and the Excel table on the right. Note that you will need to close any other table no longer needed in this step.

This time, we will join by a county code ID and not by county name (you'll note that this excel file does not have a county name column). US counties are assigned a numeric code by the federal government (FIPS). These codes are stored in the CNTYCODE field for the Counties layer and in the COUNTY ID field for the Excel file. These columns will be used to perform the join.

But before we do, let's take a look at the data types associated with these fields.

Accessing attribute properties (aka field views)

For a GIS file layer (such as a shapefile or geodatabase) , click on the table's upper right-hand icon (the three horizontal lines) and select Fields View from the pull-down menu.

For a non-GIS file (aka Standalone Table) such as a CSV or Excel file, you might find the Fields View option greyed out.

The workaround is to open the Fields tool from the Standalone Table tab. If you don't see the Standalone Table tab, you probably do not have the Excel table selected.

You'll note that the shapefile's CNTYCODE column is stored as a Text and not as a number (even though the values appear numeric) and Excel's County_ID column is stored as a Double (a numeric data type). The difference in data type may influence the join. So let's see what happens if we ignore this discrepancy in data type.

Following the example outlined earlier, join the Excel file to the Counties layer. Don't forget to click OK to run the geoprocess.

Open the Counties' layer attribute table. (You might need to click on the Counties tab to get out of the Fields tab).

Did the table join properly?

At first glance, the join seems to have performed as expected. We see the Income_2009 column from the Excel file joined to the Counties layer. This allows us to symbolize the layer using that column. However, in doing so, you will note that some counties are blank.

It's possible that the Excel file is missing records for those counties, but after further examination, you will note that the Excel file has 16 records, one for each of the 16 Maine counties.

So missing records is not the problem.

After a bit of detective work, you might have noted that the CNTYCODE values that start with 0 do not have matching Excel records (i.e. 01, 03, 05, 07 and 09). Yet, those values do exist in the Excel file without leading 0s as shown in the following figure.

Data types should match!

So why the NULL matches? Recall that the data types in both tables do not match. When ArcGIS Pro is asked to combine or compare different data types, it will convert one data type to match that of the other data type. It does so by applying a data type hierarchy whereby anytime numbers are mixed with text, the numbers will be changed to text data type. In our example, County_ID values are converted to text. So for the mismatched records, ArcGIS Pro is comparing the text string "01" to "1" , the text string "02" to "2", etc...

There are several workarounds to this problem. One could involve converting the County_ID column to text (with padded "0"s) inside of an Excel session, then re-importing the file into ArcGIS Pro. Another solution is to convert the CNTYCODE text column to a numeric column in Pro. In this exercise, we'll adopt the latter solution.

Creating a new "numeric" field

At this point, it's probably a good idea to remove the join from the Counties layer before proceeding with the creation and computation of the new field. Revert to earlier steps in this tutorial for a refresher on how to remove a join.

Click on the Fields: Counties tab in the Table pane.
At the bottom of the Fields table, click on the "Click here to add a new field" link.

This will add a new row to the Fields table.

Name the new field cnty_id.
For Data Type, you can choose anyone of the numeric options (ArcGIS Pro will usually properly join any numeric data type). But for sake of consistency, we'll make this a Double since this is the data type adopted by the Excel file.

If this was a very large file and memory management was an issue, you would probably want to be judicious in your choice of data type. In such a scenario, you would want to adopt the smallest data type that can accommodate the full range of values in that field. In our example, you could choose a Short (integer) data type.

Once you've defined the new field, you need to save the changes to the shapefile.

Navigate to the top of the ArcGIS Pro window and select Save under the Fields tab.

Navigate back to the Counties attribute table by clicking its tab.
Right-click on the cnty_id column and select Calculate Field.

If you do not see the newly created cnty_id column, you probably did not properly save the changes as outlined in the previous step.

Next, you will assign the text values in the CNTYCODE field to the cnt_id column using the Python syntax. Note that you could do the same using the Arcade syntax.

In the Calculate Field window, double-click on the CNTYCODE field in the Fields box. This will automatically add it to the Expression box at the bottom of the window.

Alternatively, you could manually type the expression !CNTYCODE! in the expression box, but be sure to respect all cases since the operation is case sensitive. If you use the Arcade syntax, the expression box will look like $feature.CNTYCODE.

Click OK to run the operation.

The expression we just created copies the values in the CNTYCODE to the cnty_id column. It will automatically convert the data type as needed (here, it converts the text representation of numbers to a numeric data which we defined for the cnty_id column).

Scan the newly computed field. You should now see numbers without the "0" padding.

Attempting the join once more

Now, on your own, attempt another join using the cnty_id column instead of the CNTYCODE column.
Symbolize the income column to check that all counties have a matching Excel record.

Exporting this new join

If you are using a version of ArcGIS Pro greater than 3.3 (version 3.5.2 is used for this exercise), the same series of operations will need to be performed to export this join as was conducted for the export of the CSV joined layer earlier in this tutorial.

Right-click on the Counties layer and select Data >> Export Features.
Name the output Income.shp
Click OK.

To confirm that the join was properly exported, bring up Income's attribute table.

You'll note that the Income_2009 field name was truncated to Income_200 (note the missing "9"). This is a result of the shapefile's character name length limit.

To avoid unexpected or undesirable truncation of field names, you can rename the output fields prior to completing the export.

In the following steps, we will re-run the export process.

Right-click on the Counties layer and select Data >> Export Features.
Name the output shapefile Income.shp (this will overwrite the existing shapefile).
Expand Fields.
Right-click the Income_2009 field name and select Rename.
Rename the field Income2009. This reduces the field name length to the 10 character shapefile limit.
Click OK to recreate the Income.shp file.

Save your project.

This wraps up this tutorial.

Page updated

Report abuse