Yearly datasets were gathered in a single Excel sheet to prepare a clear and standard dataset, given that the different people who collected the data were not systematic in their recordings and terminology.
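As an illustration of this kind of harmonization, the following is a minimal sketch of mapping inconsistent treatment labels onto the three standard names. The alias spellings and the function name `normalize_treatment` are hypothetical; the variants actually found in the yearly spreadsheets are not listed here.

```python
# Hypothetical label variants; the real spreadsheets may use different spellings.
TREATMENT_ALIASES = {
    "control": "control", "ctrl": "control", "c": "control",
    "hilled": "hilled", "hill": "hilled", "hills": "hilled",
    "ridged": "ridged", "ridge": "ridged", "ridges": "ridged",
}


def normalize_treatment(raw: str) -> str:
    """Map a free-text treatment label onto one of the three standard names."""
    key = raw.strip().lower()
    if key not in TREATMENT_ALIASES:
        # Fail loudly so unknown spellings are reviewed rather than silently kept.
        raise ValueError(f"Unrecognized treatment label: {raw!r}")
    return TREATMENT_ALIASES[key]


if __name__ == "__main__":
    for label in [" Control", "HILL", "ridges "]:
        print(repr(label), "->", normalize_treatment(label))
```

Raising an error on unknown labels, rather than guessing, keeps ambiguous entries visible for manual checking.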
The data used in this study includes the two mentioned measurements for three native tree species (see Table 1) at two different sites with three different micro-topographical treatments from 2015 to 2018. The tree species (Picea glauca, Pinus banksiana, and Populus tremuloides), sites (south and east), treatments (control, hilled, and ridged), and years are all predictor variables. The sites and years were fully controlled variables, the treatments were organized following a randomized block design, and the species were randomly planted at the start of the experiment. All predictor variables are categorical and nominal variables. The response variables are the height measurements (in cm) taken from the surface to the terminal bud and the root collar diameter (RCD in mm) measured only for the last two years of the study. Both response variables are continuous with numeric values.
Table 1: A sample of the raw data for tree height and RCD measurements used in this study with all other variables.
The first step was to visualize as much information as possible using multiple boxplots grouped by the predictor variables. These initial boxplots (see Figures 6 and 7) displayed all the raw data at the two sites by treatment and species. They served as a first check of the data and of potential patterns in the study, and they helped identify outliers and decide the final timeline. However, the resolution of these graphs was too low, so individual plots per species and year were used for outlier removal.
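For reference, the points that a conventional boxplot draws beyond its whiskers can be reproduced numerically. The sketch below assumes the common 1.5 × IQR rule and uses made-up heights; it is not necessarily the rule or the software settings used to draw the figures in this study.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) fences at k * IQR beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Return the values that fall outside the boxplot fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    heights_cm = [20, 22, 23, 25, 26, 28, 30, 95]  # made-up example values
    print(flag_outliers(heights_cm))
```

Flagged values are candidates for inspection only; whether a record is removed still depends on context, such as a likely data entry error or regeneration.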
Figure 6: Boxplots of the average height change from 2015 to 2018 at the south site based on the raw data. Each row corresponds to one of the analyzed species: (1) Picea glauca, (2) Pinus banksiana, and (3) Populus tremuloides. The colors represent the treatment types: yellow for control, green for hilled, and blue for ridged.
Figure 7: Boxplots of the average height change from 2016 to 2018 at the east site based on the raw data. Each row corresponds to one of the analyzed species: (1) Picea glauca, (2) Pinus banksiana, and (3) Populus tremuloides. The colors represent the treatment types: yellow for control, green for hilled, and blue for ridged.
The raw data required modifications because of gaps and outliers. Initially, data collection was intended to capture measurements at the start and end of each growing season to evaluate yearly growth and ensure comparability between the end of one season and the beginning of the next. In the end, only 2015 and 2016 had both measurements. To ensure comparability, the end-of-season heights were therefore used, so that the difference between consecutive measurements approximates one year of growth.
This project focuses solely on the growth of planted trees and excludes the natural regeneration considered in the overall study. Since the soil mix likely contained seeds, various small individuals emerged over time and were measured. Because the focus of this project is on the planted trees, many smaller heights, appearing as outliers in later years, were removed when they were clearly due to regeneration. This was particularly challenging for Populus tremuloides, given its rapid growth rate. For example, the number of individuals measured increased from 497 in August 2015 to 602 in 2016 and further to 794 in 2017. Although some variation is expected due to transect variability, human error, and tree mortality, the increase of nearly 300 individuals suggests regeneration. As shown in Figure 6, the first quartiles and lower whiskers in the Populus tremuloides boxplots remained fairly constant over time despite an increasing mean. Because it is unlikely that planted trees did not grow, the individuals in this lower range were considered regenerating trees. Therefore, around 100 of the smallest individuals were removed per year, proportionally across treatments, to minimize the influence of regeneration.
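The proportional removal described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the largest-remainder rounding, and the example records are hypothetical, and the exact allocation used in the study is not specified beyond "proportionally across treatments".

```python
from collections import defaultdict


def proportional_quota(counts, n_remove):
    """Split n_remove across groups in proportion to group size.

    Uses largest-remainder rounding so the quotas sum exactly to n_remove.
    """
    total = sum(counts.values())
    exact = {g: n_remove * c / total for g, c in counts.items()}
    quota = {g: int(x) for g, x in exact.items()}
    leftover = n_remove - sum(quota.values())
    # Give the remaining removals to the groups with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda g: exact[g] - quota[g], reverse=True)
    for g in by_fraction[:leftover]:
        quota[g] += 1
    return quota


def drop_smallest(records, n_remove):
    """records: list of (treatment, height_cm) for one species and year.

    Returns the records left after removing the smallest trees per treatment.
    """
    by_treatment = defaultdict(list)
    for treatment, height in records:
        by_treatment[treatment].append(height)
    counts = {t: len(h) for t, h in by_treatment.items()}
    quota = proportional_quota(counts, n_remove)
    kept = []
    for treatment, heights in by_treatment.items():
        kept += [(treatment, h) for h in sorted(heights)[quota[treatment]:]]
    return kept


if __name__ == "__main__":
    # Made-up records, not study data.
    demo = [("control", 5), ("control", 50), ("hilled", 4),
            ("hilled", 60), ("hilled", 70), ("hilled", 80)]
    print(drop_smallest(demo, n_remove=3))
```

Allocating the removals in proportion to group size keeps the relative sample sizes of the treatments roughly unchanged after cleaning.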
With the clean data, boxplots were used to illustrate the average height of the three different species per treatment and per site. The measured growth for Picea glauca can be seen in Figure 8, for Pinus banksiana in Figure 9 and for Populus tremuloides in Figure 10.
Figure 8: Height change throughout the study for Picea glauca, based on the cleaned data. The first row corresponds to the south site (filled), measured from 2015 to 2018, and the second row corresponds to the east site (striped), measured from 2016 to 2018. Each color corresponds to a different treatment: yellow for control, green for hilled, and blue for ridged.
Figure 9: Height change throughout the study for Pinus banksiana, based on the cleaned data. The first row corresponds to the south site (filled), measured from 2015 to 2018, and the second row corresponds to the east site (striped), measured from 2016 to 2018. Each color corresponds to a different treatment: yellow for control, green for hilled, and blue for ridged.
Figure 10: Height change throughout the study for Populus tremuloides, based on the cleaned data. The first row corresponds to the south site (filled), measured from 2015 to 2018, and the second row corresponds to the east site (striped), measured from 2016 to 2018. Each color corresponds to a different treatment: yellow for control, green for hilled, and blue for ridged.
After that, average heights and their corresponding standard errors were calculated for each species, treatment, site, and year. These were plotted as a line chart in Figure 11 for both sites.
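A minimal sketch of this summary step is given below, assuming the standard error is the sample standard deviation divided by the square root of the group size. The row layout and function names are assumptions made for illustration, and the example values are made up.

```python
import math
import statistics
from collections import defaultdict


def mean_and_se(values):
    """Return (mean, standard error), with SE = sample SD / sqrt(n)."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n) if n > 1 else float("nan")
    return statistics.mean(values), se


def summarize(rows):
    """rows: iterable of (species, treatment, site, year, height_cm)."""
    groups = defaultdict(list)
    for species, treatment, site, year, height_cm in rows:
        groups[(species, treatment, site, year)].append(height_cm)
    return {key: mean_and_se(vals) for key, vals in groups.items()}


if __name__ == "__main__":
    # Made-up heights for one group, not study data.
    rows = [("Picea glauca", "control", "south", 2018, h) for h in (10, 12, 14, 16)]
    for key, (mean, se) in summarize(rows).items():
        print(key, round(mean, 2), round(se, 2))
```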
Figure 11: Average heights throughout the study for the south (top) and east (bottom) sites based on species, treatment and year. The columns divide the species: Picea glauca, Pinus banksiana and Populus tremuloides. Each color corresponds to a different treatment: yellow for control, green for hilled, and blue for ridged. Error bars represent standard errors.
Tree volumes were estimated from the RCD and height measurements collected in 2017 and 2018. In addition to the initial height data processing, further individuals were removed due to inconsistent RCD values. For example, two records with unrealistically large diameters (such as an RCD of 791.1 mm for a 3-year-old aspen), likely caused by data entry errors, were removed. Tree stem volume was calculated using the cone volume formula*:
Volume = π × (RCD/2)² × Height/3
*Note: This method is general but should provide an adequate estimate for this analysis.
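As a worked illustration of the formula, the sketch below computes a cone volume in cm³. It assumes that RCD (recorded in mm) is converted to cm before use, since the figures report volume in cm³; the conversion step is not stated explicitly in the text, and the function name is hypothetical.

```python
import math


def cone_volume_cm3(rcd_mm: float, height_cm: float) -> float:
    """Cone volume: pi * (RCD / 2)^2 * height / 3, returned in cm^3."""
    # Assumption: RCD is converted from mm to cm so both inputs share one unit.
    radius_cm = (rcd_mm / 10.0) / 2.0
    return math.pi * radius_cm ** 2 * height_cm / 3.0


if __name__ == "__main__":
    # Made-up tree: RCD of 20 mm (radius 1 cm) and height of 150 cm.
    print(round(cone_volume_cm3(20.0, 150.0), 2))
```

For this made-up tree the radius is 1 cm, so the volume reduces to π × 150 / 3 = 50π cm³.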
As with height measurements, the initial step in analyzing tree volumes was to visualize distributions using boxplots grouped by the predictor variables: species, treatment, and site.
These boxplots (Figures 12 and 13) were used to identify outliers and evaluate the overall distribution of the data, which revealed that the volume data was positively skewed and not normally distributed. To address the lack of normality, the data was transformed to a logarithmic scale to reduce the influence of extreme values and allow for more robust statistical analyses.
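The effect of such a transformation on skewness can be illustrated with a short sketch. The natural logarithm is assumed here because the base is not stated in the text, and the volumes below are made-up values chosen to be right-skewed.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Made-up, right-skewed volumes in cm^3 (not study data).
volumes_cm3 = [2.0, 3.0, 4.0, 5.0, 8.0, 15.0, 60.0]
# Assumption: natural logarithm; volumes must be strictly positive.
log_volumes = [math.log(v) for v in volumes_cm3]

if __name__ == "__main__":
    print("raw skewness:", round(skewness(volumes_cm3), 2))
    print("log skewness:", round(skewness(log_volumes), 2))
```

In this example the log-transformed values are still somewhat right-skewed but far less so than the raw values, which is the behaviour the transformation is intended to achieve.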
Figure 12: Boxplot showing the distribution of raw volume data for each species, treatment, and year in the south site. Each column represents a species (Picea glauca, Pinus banksiana, Populus tremuloides) under three treatments (control - yellow, hilled - green, and ridged - blue) from 2017 and 2018. The raw data is highly positively skewed and is log-transformed in subsequent analyses.
Figure 13: Boxplot showing the distribution of raw volume data for each species, treatment, and year in the east site. Each column represents a species (Picea glauca, Pinus banksiana, Populus tremuloides) under three treatments (control - yellow, hilled - green, and ridged - blue) from 2017 and 2018. The raw data is highly positively skewed and is log-transformed in subsequent analyses.
After the data transformation, mean log-transformed volumes were plotted for each species, treatment, and site using bar charts, with error bars showing standard errors. Figure 14 shows the averages for the south site, and Figure 15 shows those for the east site.
It is also important to note that the south slope trees were 1 to 2 years older than the east slope trees, which accounts for their larger volumes.
Figure 14: Bar chart of average log-transformed volumes (cm³) for each species and treatment on the south site. Each bar chart section provides the increase over 1 year (from 2017 to 2018) and compares the different treatments (control - yellow, hilled - green, and ridged - blue) for Picea glauca, Pinus banksiana and Populus tremuloides. Error bars represent standard error.
Figure 15: Bar chart of average log-transformed volumes (cm³) for each species and treatment through time on the east slope. Each bar chart section provides the increase over 1 year (from 2017 to 2018) and compares the different treatments (control - yellow, hilled - green, and ridged - blue) for Picea glauca, Pinus banksiana and Populus tremuloides. Error bars represent standard error.
A multifactor ANOVA was conducted to evaluate the interaction effects of site and treatment. Blocks were included as fixed effects to account for spatial variability across the study area. Given that species
When results were significant, post hoc pairwise comparisons using estimated marginal means (EMMs) were conducted to understand the differences between treatments and sites for each species. The analysis was done for each species separately because of their different growth rates and because the study aims to compare treatment effects on each native species in order to select the best restoration technique. Additionally, confidence intervals were calculated to assess the estimated differences between treatments. Both height and volume data were analyzed using these methods.
For all species, the ANOVA revealed significant interaction effects between treatment and site (p-value < 0.05). Subsequent pairwise comparisons highlighted significant differences between treatments and sites, with confidence intervals providing further insight into these variations (Tables 2, 3, and 4). Interaction plots (Figures 16, 17, and 18) illustrate these trends, showing how treatment effects differ across sites, with error bars showing confidence intervals.
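To make the idea of a treatment-by-site interaction concrete, the sketch below computes a "difference of differences" from cell means. The means and the function name are made up for illustration and are not results from this study. The sketch gives only a point estimate, whereas the ANOVA tests whether such contrasts differ from zero given the residual variation.

```python
def interaction_contrast(cell_means, treatment_a, treatment_b,
                         site_1="south", site_2="east"):
    """Return (a - b at site_1) minus (a - b at site_2).

    A value of zero means the treatment difference is the same at both sites.
    """
    diff_1 = cell_means[(treatment_a, site_1)] - cell_means[(treatment_b, site_1)]
    diff_2 = cell_means[(treatment_a, site_2)] - cell_means[(treatment_b, site_2)]
    return diff_1 - diff_2


if __name__ == "__main__":
    # Made-up mean heights in cm, keyed by (treatment, site).
    means = {
        ("control", "south"): 100.0, ("hilled", "south"): 130.0,
        ("control", "east"): 80.0, ("hilled", "east"): 90.0,
    }
    print(interaction_contrast(means, "hilled", "control"))
```

In this made-up case the hilled treatment exceeds the control by 30 cm at the south site but by only 10 cm at the east site, so the contrast is 20 cm; non-parallel lines in an interaction plot correspond to a non-zero contrast of this kind.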
Table 2: Pairwise comparisons between treatments and sites for Picea glauca for height. P-values for interactions between treatments with statistical significance (p < 0.05) are highlighted in blue. The 95% confidence intervals for each treatment and site are shown by the lower and upper limits.
Figure 16: Interaction plot showing the combined effects of treatment and site on the mean height (cm) of Picea glauca at the end of the experiment. The x-axis represents the two sites (East and South) and the colors represent the different treatments (control - yellow, hilled - green, and ridged - blue). Error bars indicate the 95% confidence intervals for the mean height estimates.
Table 3: Pairwise comparisons between treatments and sites for Pinus banksiana for height. P-values for interactions between treatments with statistical significance (p < 0.05) are highlighted in blue. The 95% confidence intervals for each treatment and site are shown by the lower and upper limits.
Figure 17: Interaction plot showing the combined effects of treatment and site on the mean height (cm) of Pinus banksiana at the end of the experiment. The x-axis represents the two sites (East and South) and the colors represent the different treatments (control - yellow, hilled - green, and ridged - blue). Error bars indicate the 95% confidence intervals for the mean height estimates.
Table 4: Pairwise comparisons between treatments and sites for Populus tremuloides for height. P-values for interactions between treatments with statistical significance (p < 0.05) are highlighted in blue. The 95% confidence intervals for each treatment and site are shown by the lower and upper limits.
Figure 18: Interaction plot showing the combined effects of treatment and site on the mean height (cm) of Populus tremuloides at the end of the experiment. The x-axis represents the two sites (East and South) and the colors represent the different treatments (control - yellow, hilled - green, and ridged - blue). Error bars indicate the 95% confidence intervals for the mean height estimates.
Similarly, when analyzing volumes, the ANOVA revealed significant interaction effects between treatment and site (p-value < 0.05). Differences between treatments and sites were further analyzed using pairwise comparisons and confidence intervals for each species (Tables 5, 6, and 7). Interaction plots of the log-transformed volumes (Figures 19, 20, and 21) show these interaction trends, with error bars showing confidence intervals.
Table 5: Pairwise comparisons between treatments and sites for Picea glauca for volume. P-values for interactions between treatments with statistical significance (p < 0.05) are highlighted in blue. The 95% confidence intervals for each treatment and site are shown by the lower and upper limits.
Figure 19: Interaction plot showing the combined effects of treatment and site on the mean volume (cm³) of Picea glauca at the end of the experiment. The x-axis represents the two sites (East and South) and the colors represent the different treatments (control - yellow, hilled - green, and ridged - blue). Error bars indicate the 95% confidence intervals for the mean volume estimates.
Table 6: Pairwise comparisons between treatments and sites for Pinus banksiana for volume. P-values for interactions between treatments with statistical significance (p < 0.05) are highlighted in blue. The 95% confidence intervals for each treatment and site are shown by the lower and upper limits.
Figure 20: Interaction plot showing the combined effects of treatment and site on the mean volume (cm³) of Pinus banksiana at the end of the experiment. The x-axis represents the two sites (East and South) and the colors represent the different treatments (control - yellow, hilled - green, and ridged - blue). Error bars indicate the 95% confidence intervals for the mean volume estimates.
Table 7: Pairwise comparisons between treatments and sites for Populus tremuloides for volume. P-values for interactions between treatments with statistical significance (p < 0.05) are highlighted in blue. The 95% confidence intervals for each treatment and site are shown by the lower and upper limits.
Figure 21: Interaction plot showing the combined effects of treatment and site on the mean volume (cm³) of Populus tremuloides at the end of the experiment. The x-axis represents the two sites (East and South) and the colors represent the different treatments (control - yellow, hilled - green, and ridged - blue). Error bars indicate the 95% confidence intervals for the mean volume estimates.