Accurately representing the spatial distribution and relative abundance of tree species is fundamental to forest inventory, regeneration planning, and climate-informed management. A wide range of species distribution models and spatial forest inventory products already exist, but leading inventory-based products are developed within national boundaries. Because climate-driven shifts in suitable habitat are not constrained by political borders, there is value in modelling frameworks that integrate forest inventory data consistently across jurisdictions while incorporating climate, topographic, and land-cover information at continental extents.
In this study, I develop a deep learning framework to model tree species frequencies across North America by integrating forest inventory and ecological plot data with environmental predictors. Forest inventory data from the United States, Canada, and Mexico were harmonized to produce proportional species frequency estimates, which were paired with historical climate normals (1951–1980), derived topographic indices, and probabilistic land-cover predictions generated using a separate deep neural network trained on remotely sensed land-cover data.
To address the zero-inflated nature of species frequency data, a two-stage modelling approach was implemented, consisting of a presence–absence classifier followed by a conditional frequency regression model. Additional preprocessing steps, including spatial aggregation of plots, filtering of observations using buffered historical species ranges, and the introduction of pseudo-plots in non-forested regions, were applied to improve computational efficiency and ecological realism.
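The two-stage logic can be illustrated with a minimal sketch. It assumes a plain logistic classifier and a least-squares regression standing in for the deep networks used in this study, and it runs on synthetic data; the function names, the 0.5 presence threshold, and the three stand-in predictors are all hypothetical choices made for illustration, not the study's configuration.

```python
import numpy as np


def add_bias(X):
    """Append a column of ones so the models learn an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])


def fit_presence_classifier(X, present, lr=0.1, steps=2000):
    """Stage 1: logistic regression for presence (1) vs absence (0)."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * Xb.T @ (p - present) / len(present)  # gradient of log loss
    return w


def fit_conditional_regression(X, freq, present):
    """Stage 2: least-squares fit of frequency, using presence plots only."""
    mask = present.astype(bool)
    w, *_ = np.linalg.lstsq(add_bias(X[mask]), freq[mask], rcond=None)
    return w


def predict_two_stage(X, w_cls, w_reg, threshold=0.5):
    """Return 0 where absence is predicted, else the conditional frequency."""
    Xb = add_bias(X)
    p_present = 1.0 / (1.0 + np.exp(-(Xb @ w_cls)))
    cond_freq = np.clip(Xb @ w_reg, 0.0, 1.0)  # frequencies are proportions
    return np.where(p_present >= threshold, cond_freq, 0.0)


# Synthetic, zero-inflated example (hypothetical data, not inventory plots).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))              # stand-in environmental predictors
present = (X[:, 0] > 0.5).astype(float)    # species absent on most plots
freq = np.where(present == 1, np.clip(0.3 + 0.1 * X[:, 1], 0.01, 1.0), 0.0)

w_cls = fit_presence_classifier(X, present)
w_reg = fit_conditional_regression(X, freq, present)
pred = predict_two_stage(X, w_cls, w_reg)
```

Fitting the regression only on plots where the species is present keeps the many zeros from pulling the conditional frequency estimates toward zero, which is the motivation for separating the two stages.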
Model performance was evaluated using withheld inventory data and spatial comparisons with historical species range maps for a regionally diverse subset of tree species. Results show that incorporating topographic variables and probabilistic land-cover information improves model performance relative to climate-only formulations and produces spatially coherent, ecologically plausible species frequency patterns across different forest regions of the continent.
The framework presented here provides a foundation for generating consistent, continent-wide species frequency surfaces that complement existing forest inventory products and support applications in forest inventory, regeneration planning, and future climate-informed analyses.