Data and Statistics

Statistical Analysis

Objective 1

Independent variables

Categorical: Sites (wet and dry); Soil layers (top and bottom)

Continuous: Soil pH; Organic matter; NO3 (Nitrate); NH4 (Ammonium); PO4 (Phosphate); Ergosterol (Fungal biomass)

Response variables

ECM fungal diversity (ASVs/reads)

Since I am interested in comparing soil cores taken from individual lodgepole pine trees, a one-way ANOVA followed by Fisher's least significant difference (LSD) post hoc test will be carried out to explore differences in ECM fungal diversity between the two soil layers at the 95% confidence level. A linear regression analysis will be conducted to study the relationship between the soil parameters (independent variables) and fungal diversity.
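To make the two planned analyses concrete, the following is a minimal Python sketch (the project analysis itself is planned in R) of how the one-way ANOVA F statistic and a simple least-squares regression line are computed. The function names and all numbers are hypothetical illustrations, not project data, and the sketch stops at the test statistic; it does not compute p-values or the LSD comparisons.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def simple_linear_regression(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical diversity values for the two soil layers (not project data)
top_layer = [12.0, 15.0, 14.0, 13.0]
bottom_layer = [8.0, 9.0, 11.0, 10.0]
f_stat, df_b, df_w = one_way_anova_f([top_layer, bottom_layer])
print(f_stat, df_b, df_w)

# Hypothetical soil pH versus diversity values (not project data)
soil_ph = [5.0, 5.5, 6.0, 6.5]
diversity = [10.0, 12.0, 13.0, 15.0]
slope, intercept = simple_linear_regression(soil_ph, diversity)
print(slope, intercept)
```

The F statistic would then be compared with the critical value of the F distribution for the reported degrees of freedom at the chosen significance level.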

Objective 2

Independent variables

  1. Drought stress: 3 levels - optimal (CTRL); moderate drought (MD); severe drought (SD)

  2. ECM fungal communities: 3 levels - No inoculation (NI); ECM drought (ECMD); ECM non-drought (ECMND)

Response variables

  1. Shoot fresh and dry biomass

  2. Root fresh and dry biomass

  3. Plant height

  4. Antioxidant activity (Proline (leaf and roots); Superoxide dismutase (leaf and roots); Peroxidase (leaf and roots))

  5. Fungal biomass (Ergosterol content)

All experiments were structured following a completely randomized design. The experimental data sets (each comprising 15 technical replicates) were pooled for analysis in R version 4.2.1. The data were analyzed by one-way ANOVA followed by Tukey’s multiple means comparison to determine statistical differences between treatments at the 95% confidence level. Differences significant at p < 0.05 were used for response description. Where a treatment was not statistically significant but differed numerically from the control treatment, the percentage difference relative to the control was calculated to support the description of treatment responses with regard to biological differences.
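The percentage difference relative to the control can be sketched as below. This is a minimal Python illustration; the function name and the shoot dry weight values are hypothetical, and the formula (difference of treatment and control means, divided by the control mean) is my assumption about how the comparison is expressed.

```python
from statistics import mean


def percent_difference(treatment, control):
    """Percentage change of the treatment mean relative to the control mean."""
    control_mean = mean(control)
    return (mean(treatment) - control_mean) / control_mean * 100.0


# Hypothetical shoot dry weights in grams (not project data)
ctrl = [2.0, 2.2, 1.8]
moderate_drought = [1.5, 1.7, 1.6]

change = percent_difference(moderate_drought, ctrl)
print(f"{change:.1f}% relative to control")
```

A negative value indicates a reduction compared with the control treatment.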

Data Description and Exploration

Raw data

In this project, simulated data have been used for the statistical analysis and project outcomes. The data were generated in MS Excel using the "RAND()" function, scaled as "RAND()*(top-bottom)+bottom" so that each value is a decimal number between the chosen lower (bottom) and upper (top) limits.
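For readers who want to reproduce this kind of simulation outside Excel, the same scaling can be sketched in Python. The function name, the seed, and the pH range are hypothetical choices for illustration; only the formula itself comes from the description above.

```python
import random


def simulate_uniform(bottom, top, n, seed=None):
    """Draw n values with the same scaling as RAND()*(top-bottom)+bottom."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return [rng.random() * (top - bottom) + bottom for _ in range(n)]


# Hypothetical example: 15 simulated soil pH values between 4.5 and 6.5
soil_ph = simulate_uniform(bottom=4.5, top=6.5, n=15, seed=1)
print([round(v, 2) for v in soil_ph])
```

Because every value in the range is equally likely, data simulated this way follow a uniform distribution rather than the roughly normal distribution that ANOVA assumes.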

Objective 1

Data set components:

SITES: Dry (lodgepole pine stands under drought conditions); Wet (lodgepole pine stands under non-drought conditions)

LAYERS: Top (0-15 cm); Bottom (15-30 cm)

NH4 = Ammonium; NO3 = Nitrate; PO4 = Phosphate; OM = Soil organic matter; Reads = ECM fungal ASVs (fungal abundance)

Objective 1 (RAW Data).xlsx

Objective 2

The data set comprises two factors (independent variables, 3 levels each) and 12 response variables used to estimate the treatment effects.

Data set components

CMN = Fungal community (NI = No inoculation; ECMND = ECM fungal communities from non-drought lodgepole pine stands; ECMD = ECM fungal communities from drought-affected lodgepole pine stands)

STRESS = Drought stress (CTRL = Control / unstressed; MD = Moderate drought; SD = Severe drought)

SOD = Superoxide dismutase (L = leaf, R = root); PRO = Proline content (L = leaf, R = root); POD = Peroxidase (L = leaf, R = root); F_Col = Fungal colonization
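The two factors above give 3 x 3 = 9 treatment combinations. A minimal Python sketch of how such a factorial could be laid out in a completely randomized design is shown below. The function name and the seed are hypothetical, and treating the 15 replicates as 15 units per treatment combination is my assumption for illustration.

```python
import itertools
import random

STRESS_LEVELS = ["CTRL", "MD", "SD"]
CMN_LEVELS = ["NI", "ECMND", "ECMD"]


def crd_layout(replicates, seed=None):
    """Return a randomly ordered list of (stress, cmn) unit assignments."""
    combos = list(itertools.product(STRESS_LEVELS, CMN_LEVELS))
    # Repeat every treatment combination once per replicate
    units = [combo for combo in combos for _ in range(replicates)]
    # Shuffle so that position carries no systematic treatment pattern
    random.Random(seed).shuffle(units)
    return units


# Assumption: 15 replicates per treatment combination
layout = crd_layout(replicates=15, seed=580)
print(len(layout))  # 9 combinations x 15 replicates = 135 units
print(layout[:3])
```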


RENR 580.xlsx

Data distribution and visualization

To examine the distribution of the data and, importantly, any outliers that may exist, box plots have been used to show the location, spread, and skewness of the data.
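The quantities a box plot draws can be sketched in Python as follows. This is an illustration only: the function name and data are hypothetical, outliers are flagged with the common 1.5 x IQR rule, and plotting functions may use a slightly different quartile definition, so values can differ a little from those in the figures.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return quartiles, whisker ends, and 1.5 x IQR outliers for a sample."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers extend to the most extreme points that are still inside the fences
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = sorted(v for v in values if v < lower_fence or v > upper_fence)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical organic matter values (not project data)
organic_matter = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 30.0]
summary = boxplot_summary(organic_matter)
print(summary)
```

In this hypothetical sample the value 30.0 lies above the upper fence (8 + 1.5 x 4 = 14) and would be drawn as an individual outlier point.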

  1. Data demonstration of soil parameters across sites (dry and wet) at different soil depths (top and bottom)

Fig 1: Box plots to visualize data distribution and identify outliers; (A) Soil pH (B) Soil phosphate (C) Soil organic matter content (D) Ergosterol content in the soil (E) Ammonium content (F) Nitrate content

The box plots show the data distribution for each continuous variable (y-axis) at the different sites (x-axis: sites; fills: soil layers/depths). Most of the continuous variables appear symmetric; however, the data appear asymmetric and skewed for soil organic matter and ergosterol content. A number of outliers are also present in both of these variables (box plots C and D).

  2. Data demonstration of plant antioxidant activity following the inoculation of ECM fungal communities under drought and optimal growth conditions

Fig 2: Box plots to visualize data distribution and identify outliers; (A) Proline content in leaves (B) Proline content in roots (C) Peroxidase activity in the leaves (D) Peroxidase activity in the roots (E) Superoxide dismutase activity in the leaves (F) Superoxide dismutase activity in the roots.

The box plots show the data distribution for each response variable (y-axis) by treatment (x-axis: stress; fills: CMN, fungal communities). There is reasonable variance in the distribution of the data. However, in a few places the data are skewed either positively or negatively (the median is closer to the bottom or to the top of the box, respectively), indicating that the data may not be symmetric.
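The position of the median within the box can be put into a single number with the quartile (Bowley) skewness coefficient, (Q3 + Q1 - 2 x median) / (Q3 - Q1). The Python sketch below is an illustration only; it is not part of the analysis described above, and the function name and values are hypothetical.

```python
from statistics import quantiles


def quartile_skewness(values):
    """Bowley skewness: > 0 if the median sits nearer Q1, < 0 if nearer Q3."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    if q3 == q1:
        raise ValueError("interquartile range is zero; skewness is undefined")
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Hypothetical leaf proline values with a long upper tail (not project data)
proline_leaf = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 10.0, 12.0, 20.0]
skew = quartile_skewness(proline_leaf)
print(skew)

# A symmetric hypothetical sample gives a coefficient of zero
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
print(quartile_skewness(symmetric))
```

A positive coefficient corresponds to a median drawn closer to the bottom of the box, and a negative one to a median closer to the top.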

  3. Data demonstration of plant growth parameters following the inoculation of ECM fungal communities under drought and optimal growth conditions

Fig 3: Box plots to visualize data distribution and identify outliers; (A) Shoot fresh wt (B) Shoot dry wt (C) Root fresh wt (D) Root dry wt (E) Plant height (F) Fungal colonization.