Methods

Data collection 

Data sources

Site index data

The site index data come from the Provincial Growth and Yield Initiative (PGYI) dataset. PGYI is a collaborative effort to obtain tree growth data through repeated measurements of Permanent Sample Plots (PSPs). These data are used to develop, calibrate, and validate growth models that support Forest Management Plan (FMP) yield estimation and the Reforestation Standard of Alberta (RSA) assessment process. The long-term goal of PGYI is to establish a database of natural and managed stand PSPs that represents the range of forest conditions in Alberta. The target is for 1800 natural stand PSPs and 1200 managed stand PSPs to be submitted to the database.


Table 1: Example of PGYI dataset.

Table 1 (continued).

Photo 1: Lodgepole pine in Permanent Sample Plots

Photo 2: Measuring the location of plot center in Permanent Sample Plots

Climate data

The climate data come from ClimateAB v3.21, a standalone MS Windows application written in Visual Basic 6.0. Using an elevation lapse rate adjustment, it extracts and downscales ANUSPLIN-interpolated monthly normal data (2.5 x 2.5 arcmin) to any resolution (Hamann & Wang 2005). The application computes historical monthly, seasonal, and annual climate variables for specific years and periods between 1901 and 2006 using monthly anomaly data (Mitchell & Jones 2005). It also downscales and integrates 83 future projections produced by different global circulation models (Barrow & Yu 2005). The output includes both directly calculated and derived climate variables. Wang et al. (2006) describe the downscaling of PRISM monthly data, including bilinear interpolation and elevation adjustment, as well as the calculation of climate variables and the estimation of derived climate variables.
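The idea behind an elevation lapse rate adjustment can be sketched in a few lines. The Python snippet below is an illustration only, not ClimateAB's actual procedure: the function name, the example elevations, and the constant lapse rate of 6.5 °C per km (a commonly cited average environmental lapse rate) are all assumptions made for this sketch.

```python
def adjust_temperature(grid_temp_c, grid_elev_m, site_elev_m, lapse_rate_c_per_m=0.0065):
    """Shift a gridded temperature to a site's elevation.

    Assumption: temperature falls linearly with elevation at a constant
    lapse rate. The default rate is a placeholder, not a value taken
    from ClimateAB.
    """
    elevation_difference_m = site_elev_m - grid_elev_m
    return grid_temp_c - lapse_rate_c_per_m * elevation_difference_m


# Hypothetical example: the grid cell reports 2.0 °C at 1000 m,
# while the plot sits 200 m higher, at 1200 m.
plot_temp_c = adjust_temperature(2.0, 1000.0, 1200.0)
print(round(plot_temp_c, 2))  # roughly 0.7 °C, i.e. 1.3 °C cooler than the grid value
```

A plot above the grid cell's reference elevation receives a cooler value and a plot below it receives a warmer one, which is why the same gridded data can be downscaled to any plot location.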

Table 2: Example of climate data.

Table 3: Acronyms of climate variables.

Site location

The PGYI dataset provided a total of 2095 plots with lodgepole pine site index. The data are divided into two parts: managed stands (450 plots) and natural stands (1645 plots). Site index is further divided into four classes: Class IV (5 to 10), Class III (10 to 15), Class II (15 to 20), and Class I (20 to 25). The two maps below show the locations of the managed and natural plots, respectively.
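The class assignment can be written as a small lookup. The Python sketch below is illustrative only. The text does not say which class a value exactly on a shared boundary (10, 15, or 20) belongs to, so the sketch assumes the lower bound is inclusive and the upper bound exclusive, with 25 included in Class I. The function name is hypothetical.

```python
# Class limits as given in the text: (lower, upper, label).
SITE_INDEX_CLASSES = [
    (5, 10, "Class IV"),
    (10, 15, "Class III"),
    (15, 20, "Class II"),
    (20, 25, "Class I"),
]


def site_index_class(site_index):
    """Return the class label for a site index, or None if it is outside 5 to 25.

    Assumption: boundary values go to the higher-productivity class
    (for example, 10 is treated as Class III), and 25 counts as Class I.
    """
    for lower, upper, label in SITE_INDEX_CLASSES:
        if lower <= site_index < upper:
            return label
    if site_index == 25:
        return "Class I"
    return None


print(site_index_class(7.5))   # Class IV
print(site_index_class(10))    # Class III (by the boundary assumption above)
print(site_index_class(21.3))  # Class I
```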

Figure 2: Location of the 450 managed stand plots with lodgepole pine site index across Alberta. The red, orange, gray, and black dots represent Class IV, Class III, Class II, and Class I, respectively. The map colors represent the different subregions of Alberta.

Figure 3: Location of the 1645 natural stand plots with lodgepole pine site index across Alberta. The red, orange, gray, and black dots represent Class IV, Class III, Class II, and Class I, respectively. The map colors represent the different subregions of Alberta.

Data Visualization and Analysis

Data were visualized and analyzed in RStudio using R Statistical Software (v4.3.3; R Core Team, 2024). The analysis consisted of four steps:

(1) Histograms were used to examine the distribution of each climate variable.

(2) A correlation matrix and heatmap were used to examine the correlation between site index and the climate variables.

(3) Principal component analysis (PCA) was used to explore the relationships among the variables.

(4) A random forest model was built to predict lodgepole pine site index in Alberta.
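As an illustration of step (2), the sketch below computes the Pearson correlation between site index and each climate variable and ranks the variables by the strength of the correlation. It is written in plain Python for readability, whereas the analysis itself was run in R. All numbers are made up for the example and do not come from PGYI or ClimateAB, and the variable names MAT (mean annual temperature) and MAP (mean annual precipitation) are assumed here as typical climate variables.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Made-up values for five hypothetical plots (illustration only).
site_index = [8.0, 12.0, 15.0, 18.0, 22.0]
climate = {
    "MAT": [0.5, 1.0, 1.8, 2.1, 3.0],            # assumed: mean annual temperature
    "MAP": [620.0, 540.0, 560.0, 480.0, 450.0],  # assumed: mean annual precipitation
}

# Correlation of each climate variable with site index.
correlations = {name: pearson(values, site_index) for name, values in climate.items()}

# Rank variables by absolute correlation, strongest first.
ranked = sorted(correlations, key=lambda name: abs(correlations[name]), reverse=True)

for name in ranked:
    print(f"{name}: r = {correlations[name]:.2f}")
```

The same pairwise coefficients, computed for every pair of variables, are what a correlation matrix contains and what a heatmap then colors.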