Data Quality and Preparation
This project uses a dataset of 688 individual aspen trees measured over a 22-year period in Alberta, Canada, comprising height data and several contextual variables such as clone, provenance, and sex. Initial cleaning was performed to ensure data relevance and accuracy, focusing only on height growth data. Height growth data follow a normal distribution throughout the 22-year period (Figure 10). In year 22, tree height appears to vary between clones (Figure 11). When all clones are pooled, height differs less between male and female trees, whereas non-blooming trees appear concentrated at the lower end of the height distribution (Figure 12). Mean spring and summer temperature and cumulative precipitation (winter, spring, summer) both show an increasing trend over time, although with fluctuations (Figure 13). Those fluctuations were used to identify cold & dry (CD, from 2002 to 2005), warm & dry (WD, from 2005 to 2007) and cold & wet (CW, from 2010 to 2014) periods. The changing climate conditions during growth intervals are shown in Figure 14.
Rows labelled with “need to confirm it is poplar” or “willow” in the comment section of the original dataset were removed, as were any rows where the height in the first year (H1Y) was either missing or recorded as zero. Additionally, entries from the “filler” provenance were removed to maintain consistency within the provenances of interest. Columns were reordered to streamline analysis, and only the columns relevant to this project (ID, height data, clone, provenance, and sex) were retained. An individual tree ID was created, composed of Order-Trial-Rep-Block-Stake-Prov-Clone (e.g. tree 55-G813B-1-3-2012-AIN-8016).
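The filtering and ID construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual script. The column names Comment, Prov, Order, Trial, Rep, Block, Stake and Clone are assumptions inferred from the text (only H1Y is named explicitly), as is the choice to match comments exactly after lower-casing.

```python
# Assumed (hypothetical) column names; the source names only H1Y and the
# components of the tree ID.
EXCLUDED_COMMENTS = {"need to confirm it is poplar", "willow"}
ID_PARTS = ("Order", "Trial", "Rep", "Block", "Stake", "Prov", "Clone")
MISSING_MARKERS = {"", "NA", "NAN"}


def has_valid_first_year_height(row):
    """Return True when H1Y is present and not zero."""
    raw = row.get("H1Y")
    if raw is None:
        return False
    text = str(raw).strip()
    if text.upper() in MISSING_MARKERS:
        return False
    try:
        return float(text) != 0
    except ValueError:
        return False  # unparseable values are treated as missing


def keep_row(row):
    """Apply the three exclusion rules: comment label, filler provenance, H1Y."""
    comment = str(row.get("Comment") or "").strip().lower()
    if comment in EXCLUDED_COMMENTS:
        return False
    if str(row.get("Prov") or "").strip().lower() == "filler":
        return False
    return has_valid_first_year_height(row)


def make_tree_id(row):
    """Join the ID components as Order-Trial-Rep-Block-Stake-Prov-Clone."""
    return "-".join(str(row[part]) for part in ID_PARTS)


def clean_rows(rows):
    """Return the retained rows, each with an added 'ID' field."""
    return [{**row, "ID": make_tree_id(row)} for row in rows if keep_row(row)]
```

Given a row with Order 55, Trial G813B, Rep 1, Block 3, Stake 2012, Prov AIN and Clone 8016, make_tree_id produces the ID format quoted above.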
Data cleaning included verifying data consistency across clones and provenances and managing missing data appropriately. Missing height values, primarily due to tree death, were recorded as "NA" and excluded from calculations in the years following death to avoid skewing results.
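As an illustration of how "NA" heights can be kept out of summary calculations, the sketch below computes a mean height for one measurement year while skipping missing values. It assumes rows are dictionaries keyed by the height column names (e.g. H22Y). The helper names parse_height and mean_height are hypothetical and not taken from the project code.

```python
def parse_height(value):
    """Convert a raw cell to float, or None when it is missing ("NA" or empty)."""
    if value is None:
        return None
    text = str(value).strip()
    if text == "" or text.upper() == "NA":
        return None
    return float(text)


def mean_height(rows, column):
    """Mean of the non-missing heights in `column`; None if no tree has a value."""
    heights = [parse_height(row.get(column)) for row in rows]
    observed = [h for h in heights if h is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)
```

With this approach a dead tree contributes to the means of the years in which it was measured and is left out of the later ones, rather than entering as zero.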
Data Table
The final data table includes columns for the unique tree identifier (ID), height measurements taken in various years (e.g., H1Y, H3Y, up to H22Y), provenance, clone, and sex. Table 1 shows an abbreviated version of the data table with sample rows:
Figure 13: Change in mean temperature (spring & summer) from 2002 to 2023 (top) and change in cumulative precipitation (winter, spring, summer) from 2002 to 2023 (bottom). The marked time periods represent cold & dry (CD), warm & dry (WD) and cold & wet (CWe) conditions.