Characterizing Ann Arbor's Urban Tree Canopy

For my master's capstone project, I worked with a team of three other students to characterize the urban tree canopy in Ann Arbor, Michigan. Ann Arbor is home to a rich canopy of nearly 1.5 million trees that provide an estimated $4.6 million in annual benefits. The City of Ann Arbor needed this canopy and its associated benefits identified, described, and quantified. Of particular interest were the many fragments of old-growth forest on private land, where it is difficult for conservation planners to identify and document trees.

Our team developed three data layers to support decision-makers in Ann Arbor’s Environmental Commission and Office of Sustainability and Innovations: a map of native forest fragment locations; a map of turf grass to aid in identifying tree planting sites and areas for incentivizing sustainable lawns; and a map of Ann Arbor’s urban canopy classified by tree genus.

Our final report can be read for free here.

Skills and software:

Training Data Collection

We began the project by using 100 ft. x 100 ft. sample plots to locate and identify trees within many of Ann Arbor's natural areas. This served two purposes: to gather additional training data for use in machine learning models, and to familiarize the team with the different forest communities found in the area.

In total, we recorded over 1,000 trees in 38 plots in 15 locations around Ann Arbor. We also incorporated a database of street trees and trees on the University of Michigan's campus.

Genus Classification

To classify tree genera, we first used the lidR package for R to segment individual tree crowns from LiDAR point clouds.
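As a rough illustration of what crown segmentation starts from, the sketch below finds treetops as local maxima in a small canopy height model. It is written in Python rather than R, does not use lidR, and is not the algorithm the project ran. The grid values, the 20 ft height threshold, and the one-cell search window are all hypothetical.

```python
# Hypothetical canopy height model (CHM) in feet; each value is one raster cell.
CHM = [
    [10, 12, 11,  9,  8],
    [12, 45, 14, 10,  9],
    [11, 14, 13, 12, 30],
    [ 9, 10, 12, 52, 31],
    [ 8,  9, 11, 30, 29],
]


def find_treetops(chm, min_height=20, window=1):
    """Return (row, col, height) for cells taller than every neighbor in the window."""
    tops = []
    n_rows, n_cols = len(chm), len(chm[0])
    for r in range(n_rows):
        for c in range(n_cols):
            h = chm[r][c]
            if h < min_height:  # skip ground, shrubs, and low vegetation
                continue
            neighbors = [
                chm[rr][cc]
                for rr in range(max(0, r - window), min(n_rows, r + window + 1))
                for cc in range(max(0, c - window), min(n_cols, c + window + 1))
                if (rr, cc) != (r, c)
            ]
            if all(h > n for n in neighbors):
                tops.append((r, c, h))
    return tops


treetops = find_treetops(CHM)
print(treetops)
```

A full segmentation would then grow a crown outward from each treetop, which is the step a dedicated package handles.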

After segmentation, we used Python and ArcGIS Pro to compute the mean value of each predictor raster within each crown. The predictors included digital number (DN) values for all bands of NAIP and Nearmap aerial imagery, NDVI, texture, canopy height, soil type, and principal components derived from LiDAR (see below).
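The per-crown aggregation can be sketched in plain Python as follows. NDVI is the standard ratio (NIR - Red) / (NIR + Red). The zonal mean averages a raster over the cells belonging to each crown. The tiny rasters and crown IDs are hypothetical, and the project itself did this step with Python and ArcGIS Pro tooling rather than with this code.

```python
from collections import defaultdict


def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def zonal_mean(values, zones):
    """Average raster values within each zone ID; zone 0 means 'no crown'."""
    sums, counts = defaultdict(float), defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if zone != 0:
                sums[zone] += value
                counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical 2 x 3 rasters: near-infrared DN, red DN, and crown IDs.
NIR = [[200, 180, 90], [210, 100, 80]]
RED = [[50, 60, 90], [70, 100, 40]]
CROWNS = [[1, 1, 0], [1, 2, 2]]

ndvi_raster = [[ndvi(n, r) for n, r in zip(nir_row, red_row)]
               for nir_row, red_row in zip(NIR, RED)]
crown_ndvi = zonal_mean(ndvi_raster, CROWNS)
print(crown_ndvi)
```

Repeating the zonal mean for each predictor raster yields one row of features per crown.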

After joining the segmentation polygons, aggregated predictors, and training/testing points, we used the resulting dataset to train and test three models: a random forest, a support vector machine, and a multinomial model.
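The join that produces the modeling dataset can be pictured as a point-in-polygon lookup: each field-identified tree lends its genus label to the crown polygon that contains it. The sketch below uses a standard ray-casting test. The crown outlines, predictor values, and points are hypothetical, and the "first matching point wins" rule is an assumption rather than the project's documented choice.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's (x, y) vertex list."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_crowns(crowns, points):
    """Attach a field point's genus to the crown polygon that contains it."""
    labeled = []
    for crown_id, crown in crowns.items():
        for x, y, genus in points:
            if point_in_polygon(x, y, crown["polygon"]):
                labeled.append({"crown_id": crown_id, "genus": genus, **crown["predictors"]})
                break  # assumption: keep only the first matching point per crown
    return labeled


# Hypothetical crowns (square outlines, coordinates in feet) and field points.
CROWNS = {
    1: {"polygon": [(0, 0), (10, 0), (10, 10), (0, 10)],
        "predictors": {"ndvi": 0.53, "height_ft": 45.0}},
    2: {"polygon": [(20, 0), (30, 0), (30, 10), (20, 10)],
        "predictors": {"ndvi": 0.41, "height_ft": 52.0}},
    3: {"polygon": [(40, 0), (50, 0), (50, 10), (40, 10)],
        "predictors": {"ndvi": 0.38, "height_ft": 30.0}},
}
POINTS = [(5, 5, "Quercus"), (25, 4, "Acer"), (100, 100, "Carya")]

training_rows = label_crowns(CROWNS, POINTS)
print(training_rows)
```

Crowns without a field point (crown 3 here) stay unlabeled and would be the ones a trained model predicts.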

Native Fragment Classification

Manual Delineation

For the native forest fragment map, we began by using present-day and historical aerial imagery to manually delineate old-growth forests. We worked closely with our stakeholders and local experts to ensure that our polygons matched conditions on the ground. In the above image, forested areas in 1940 are outlined in red, and present-day fragments in blue.

Unsupervised Classification

To identify fragments we missed during manual delineation, we ran an ISO clustering algorithm on the imagery, LiDAR principal components, texture, and NDVI. After we manually merged classes, the resulting layer agreed strongly with the delineated fragments (outlined in black) and known shrub areas.
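The cluster-then-merge workflow can be illustrated with a plain k-means loop. This is a deliberate simplification: ISO clustering also splits and merges clusters between iterations, which is omitted here. The two-feature samples, the starting centers, and the merge table are hypothetical.

```python
def kmeans(samples, initial_centers, iterations=10):
    """Plain k-means: assign each sample to its nearest center, then recompute centers."""
    centers = [list(c) for c in initial_centers]  # copy so the caller's list is untouched
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(s, centers[k])))
            for s in samples
        ]
        for k in range(len(centers)):
            members = [s for s, label in zip(samples, labels) if label == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers


# Hypothetical pixels described by (NDVI, texture).
SAMPLES = [(0.82, 0.70), (0.80, 0.65), (0.78, 0.72),
           (0.60, 0.30), (0.62, 0.28),
           (0.20, 0.10), (0.22, 0.12)]
START = [(0.8, 0.7), (0.6, 0.3), (0.2, 0.1)]

labels, centers = kmeans(SAMPLES, START)

# Manual merge step: an analyst decides which spectral clusters belong together.
MERGE = {0: "native fragment", 1: "other vegetation", 2: "other vegetation"}
merged = [MERGE[label] for label in labels]
print(labels, merged)
```

The merge table is where expert judgment enters: the algorithm only groups similar pixels, and a person decides what each group represents.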

LiDAR Principal Components

To derive more robust canopy descriptors from LiDAR, we followed the methodology in Ciuti et al. (2017) and performed a principal component analysis on five-foot voxels derived from our point clouds. The resulting principal components proved effective at highlighting native areas.
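The voxelization step that precedes the principal component analysis might look like the sketch below, which bins LiDAR returns into five-foot voxels and summarizes each ground column as the share of returns in each height bin. It covers only the binning, not the PCA. The sample points, the four height bins, and the choice to clamp tall returns into the top bin are assumptions, not details from the project or the cited methodology.

```python
import math
from collections import Counter

VOXEL_FT = 5.0  # voxel edge length stated in the text


def voxel_profiles(points, n_height_bins):
    """Return {(col_x, col_y): [share of returns per height bin]} for each ground column."""
    counts = Counter()
    for x, y, z in points:
        ix, iy = math.floor(x / VOXEL_FT), math.floor(y / VOXEL_FT)
        # Assumption: clamp returns below 0 ft or above the top bin into the end bins.
        iz = max(0, min(math.floor(z / VOXEL_FT), n_height_bins - 1))
        counts[(ix, iy, iz)] += 1

    columns = {}
    for (ix, iy, iz), n in counts.items():
        columns.setdefault((ix, iy), [0] * n_height_bins)[iz] += n
    return {col: [n / sum(bins) for n in bins] for col, bins in columns.items()}


# Hypothetical returns as (x, y, height above ground), all in feet.
POINTS = [(1.0, 2.0, 0.5), (2.5, 3.0, 12.0), (4.0, 1.0, 13.5), (3.0, 4.9, 41.0),
          (7.0, 2.0, 1.0), (8.0, 3.0, 2.0)]

profiles = voxel_profiles(POINTS, n_height_bins=4)
print(profiles)
```

Each column's profile becomes one row of the matrix that a principal component analysis would then reduce to a few structural descriptors.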

Results and Conclusions

We found that 82% of the 6,900-acre urban canopy in Ann Arbor is on land not owned by the City. Our manual fragment delineation covered 14% of the total canopy, with two-thirds of that area located outside City-owned property. The clustering classification identified twice as much fragment area (28% of the total canopy), with most of the additional area falling outside City-owned land.
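Converting these percentages to acres gives a sense of scale. The short calculation below uses only the figures stated above. Because the reported percentages are themselves rounded, the acreages are approximate.

```python
TOTAL_CANOPY_ACRES = 6900

shares = {
    "canopy on land not owned by the City": 0.82,
    "manually delineated fragments": 0.14,
    "fragments identified by clustering": 0.28,
}

# Approximate acreage for each share of the total canopy.
acres = {name: round(TOTAL_CANOPY_ACRES * share) for name, share in shares.items()}

# Two-thirds of the manually delineated fragment area lies outside City-owned property.
manual_outside_city = round(acres["manually delineated fragments"] * 2 / 3)

for name, value in acres.items():
    print(f"{name}: about {value:,} acres")
print(f"manual fragments outside City-owned property: about {manual_outside_city:,} acres")
```

That works out to roughly 5,658 acres of canopy on land not owned by the City, about 966 acres of manually delineated fragments (about 644 of them outside City-owned property), and about 1,932 acres of fragments identified by clustering.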


Our efforts at genus classification were less successful: the highest accuracy we achieved was 54%. However, the project has laid the groundwork for more useful results once more accurate ground-truth data are acquired. Furthermore, although genera were frequently misclassified, much of the confusion was between genera common to a single forest community type (e.g., oak and hickory), which strengthens the rationale for remotely sensed classification of forest communities.
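To see why within-community confusion matters, the sketch below scores the same predictions twice, once by genus and once after grouping genera into community types. The labels and predictions are invented for illustration and are not project results. The oak and hickory grouping follows the example above, while the beech-maple grouping is an assumption.

```python
# Hypothetical genus-to-community lookup (Quercus = oak, Carya = hickory,
# Acer = maple, Fagus = beech).
COMMUNITY = {
    "Quercus": "oak-hickory",
    "Carya": "oak-hickory",
    "Acer": "beech-maple",
    "Fagus": "beech-maple",
}


def accuracy(actual, predicted):
    """Share of predictions that match the reference labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


# Invented reference labels and model predictions for eight crowns.
ACTUAL = ["Quercus", "Carya", "Quercus", "Acer", "Fagus", "Acer", "Carya", "Quercus"]
PREDICTED = ["Carya", "Carya", "Quercus", "Fagus", "Fagus", "Acer", "Quercus", "Acer"]

genus_accuracy = accuracy(ACTUAL, PREDICTED)
community_accuracy = accuracy([COMMUNITY[g] for g in ACTUAL],
                              [COMMUNITY[g] for g in PREDICTED])
print(genus_accuracy, community_accuracy)
```

In this toy case, half of the genus predictions are wrong, yet all but one land in the correct community, which is the pattern that makes community-level mapping attractive.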