The raw data used in this analysis are presented in Table 1. All datasets in this study include measured concentrations of trace elements (TEs), reported in nanograms per liter (ng/L), which serve as the response variables. Samples were collected at four nests (N1 to N4), each with three sampling points, and each point was sampled at two depths (S = shallow, D = deep). Nest is treated as a categorical grouping factor, while sampling date (June, July, and August 2023) and sampling depth serve as predictor variables, with date treated as a temporal variable and depth as a categorical variable.
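The sampling design described above can be sketched as a long-format data frame in R. This is only an illustration under stated assumptions, not the study's data: the column names (`nest`, `point`, `depth`, `date`, `Cd`), the point labels, and the simulated concentrations are hypothetical, and coding the date as an ordered factor is one possible way to represent a temporal variable.

```r
# Hypothetical long-format layout of the sampling design (simulated values only)
set.seed(1)

design <- expand.grid(
  nest  = paste0("N", 1:4),        # four nests, N1 to N4
  point = paste0("P", 1:3),        # three sampling points per nest (labels assumed)
  depth = c("S", "D"),             # S = shallow, D = deep
  date  = c("Jun", "Jul", "Aug"),  # sampling dates in 2023
  stringsAsFactors = FALSE
)

# Nest and depth as categorical factors; date as an ordered factor (assumption)
design$nest  <- factor(design$nest)
design$depth <- factor(design$depth, levels = c("S", "D"))
design$date  <- factor(design$date, levels = c("Jun", "Jul", "Aug"), ordered = TRUE)

# Placeholder response variable: simulated Cd concentrations in ng/L
design$Cd <- rlnorm(nrow(design), meanlog = 3, sdlog = 0.5)

# Inspect the structure: one row per nest x point x depth x date combination
str(design)
```

In a layout like this, each additional trace element would be a further response column (or a further level of an `element` column if the table is stacked).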
To visualize the concentrations of trace elements across all nests, box-and-whisker plots (Fig. 7) were prepared for each sampling date. These plots allowed the detection of errors or outliers in the data, which could arise from mistakes during data collection or data entry, or from instrument error. The x-axis shows the TEs and the y-axis shows their concentrations on a log scale.
Fig 7. Box plots showing mean concentration of trace elements in soil solutions across all nests, collected in Jul, Aug and Sep 2023.
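A box plot of this kind, with a log-scaled concentration axis, could be produced in base R roughly as follows. This is a minimal sketch: the element names, the simulated values, and the object names `te` and `bp` are assumptions for illustration and do not reproduce Fig. 7.

```r
# Simulated concentrations (ng/L) for three elements; all values are placeholders
set.seed(42)
te <- data.frame(
  element = rep(c("Cd", "Pb", "Zn"), each = 24),
  conc    = c(rlnorm(24, 2, 0.6), rlnorm(24, 4, 0.8), rlnorm(24, 7, 0.5))
)

# Box-and-whisker plot with TEs on the x-axis and a log-scaled y-axis
bp <- boxplot(conc ~ element, data = te, log = "y",
              xlab = "Trace element",
              ylab = "Concentration (ng/L, log scale)")

# Values plotted beyond the whiskers (default: 1.5 x IQR) are returned in bp$out,
# with bp$group giving the index of the box each one belongs to
data.frame(element = bp$names[bp$group], conc = bp$out)
```

Note that `log = "y"` only changes the axis scaling; the box statistics are still computed on the untransformed concentrations, so flagged points should be reviewed rather than removed automatically.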
Boxplots were also used to visualize the concentrations of trace elements at Nest 1 (Fig. 8) and, to examine depth as a main effect, the concentrations of select trace elements at each depth (Fig. 9). These boxplots also allow a visual assessment of trends, distributions, and potential outliers in the dataset across the variables and sampling dates.
Fig 8. Box plots showing mean concentrations of trace elements at Nest 1.
Fig 9. Box plots showing mean concentrations of select trace elements at varying depths.
Extreme outliers can bias the results of the analysis. Therefore, the extreme outliers associated with each TE were identified and removed. The initial distribution of values for each TE was checked, and log transformations, adjusted for each TE, were applied. Excluding the extreme outliers improved the normality of the data and simplified the subsequent analysis. As an example, the transformation for Cd concentrations using R software is shown below: