An Exploration of the Effect of Polluted and Non-polluted Soil on Bacterial Content

Sydney Brown, Eliza Luttman, and Sydney Parpart

Purdue University

Purpose: As the human population grows and more land is developed for commercial use, stormwater runoff increases. This is one reason rain gardens are installed. Rain gardens prevent water pollution, reduce flooding, create habitats, and much more (“Soak Up the Rain: What’s the Problem?”, 2023). Rain gardens in different areas can vary in effectiveness, even within a mile of each other. In this study, the soil of a rain garden by Ross-Ade, which is a polluted area, will be compared to the soil of a non-rain garden area in Cary Quad, where there is much less pollution, if any. If high levels of bacteria and pollutants are found in the soil at the rain garden near Ross-Ade, this would suggest that polluted stormwater runoff is reaching the waterways. Urban stormwater runoff severely affects the quality of receiving water bodies by carrying a significant load of many different pollutants that have accumulated on urban surfaces (Hong et al., 2006). Testing the soil's pH and moisture content can further indicate how the soil affects the waterways and the growth of the plants in it.

General Soil Collection Methods: The soil was collected on Thursday, January 25th, around 4 PM from a grassy area in the middle of Cary Quadrangle at Purdue University (non-polluted) and from near a stormwater drain on the edge of a parking lot near Ross-Ade Stadium at Purdue University (polluted). The soil was collected by digging about six inches into the ground, pulling out the soil, and tapping it into a Ziploc bag.

Figure 1

pH Data Analysis of Polluted and Non-Polluted Soil

Method: Three grams of soil were weighed into a tube, 12 mL of deionized water was added, and the tube was vortexed. The pH values of the polluted and non-polluted samples were then measured using a pH meter, which measures the concentration of hydrogen ions in the solution.

Legend: Average pH of the polluted (Ross-Ade) and non-polluted (Cary Quad) samples. The white bars represent the averages of the three soil samples from each location. The black bars represent the standard deviations of the three samples from each location. The average pH was 8.853 for the polluted samples and 8.223 for the non-polluted samples. The standard deviation was about 0.4922 for the three polluted samples and about 0.2857 for the three non-polluted samples. The p-value from the unpaired t-test assuming unequal variance was 0.144963; because it is above 0.05, the difference between the two groups is not statistically significant.

Evidence: The pH values of the polluted (Ross-Ade) and non-polluted (Cary Quad) areas were similar, differing by 0.55. In the whole class's data, the difference was also small, at 0.63. The standard deviation was 0.49 for the polluted area and 0.29 for the non-polluted area. This shows that there is more variation in the polluted area, but the overall values are still very similar to each other. Lastly, we performed an unpaired t-test assuming unequal variance and got a p-value of 0.145. Because this value is greater than 0.05, the data do not show a statistically significant effect of location on the soil’s pH level.
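The unpaired t-test assuming unequal variance (often called Welch's t-test) can be sketched from the summary values given in the Figure 1 legend. This is a minimal illustration, not the tool the class used: the function names are hypothetical, the sample size of three per location is taken from the legend, and the p-value is approximated by numerically integrating the t density rather than by a statistics library.

```python
import math

def _t_pdf(x, df):
    """Density of Student's t distribution with (possibly fractional) df."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def _t_area(upper, df, steps=2000):
    """Area under the t density from 0 to `upper` (Simpson's rule, even steps)."""
    h = upper / steps
    total = _t_pdf(0.0, df) + _t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    return total * h / 3

def welch_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t statistic, Welch-Satterthwaite df, two-tailed p-value)."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    p = 1.0 - 2.0 * _t_area(abs(t), df)
    return t, df, p

# Summary values from the Figure 1 legend: polluted first, then non-polluted.
t, df, p = welch_t_test(8.853, 0.4922, 3, 8.223, 0.2857, 3)
print(f"t = {t:.3f}, df = {df:.2f}, p = {p:.3f}")
```

If the legend's means and standard deviations are the values the test was run on, the printed p-value should fall close to the reported 0.145, well above the 0.05 threshold.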

Conclusion: Condition 1 (Polluted- Ross-Ade) and Condition 2 (Nonpolluted- Cary Quad) do not have significantly different pH values. The individual measurements are all similar. To confirm this, we performed an unpaired t-test assuming unequal variance and got a p-value of 0.145, which is greater than 0.05, indicating that the difference is not statistically significant. Given this, our group is extremely confident that the two locations (Polluted and Nonpolluted) do not differ significantly in soil pH.

Explanation: There could be many reasons why the soil in both locations is rather basic. According to Zhang, decaying organic matter produces hydrogen ions, which contribute to acidity. Since both locations have rather basic soil, this could indicate that not much decaying organic matter is present. Ross-Ade (polluted) had a more basic pH than Cary Quad (non-polluted), which could be a sign that there is even less decaying organic matter there. One possible reason is that the Ross-Ade soil sample was taken from a patch of land between a parking lot and a road, a location where it would be hard for organisms to thrive and take shelter. This is very different from the Cary Quad sample, which was taken from the middle-edge of a larger patch of land with more trees and plants in the area. That setting would provide a better ecosystem for organisms and therefore more decaying organic matter. Another cause of the basicity could be that more runoff occurs at Ross-Ade than at Cary Quad. The soil sample was taken from an area where many people congregate on game days, and the resulting runoff into the stormwater drain could contain basic compounds that raise the soil's pH. According to Halcyon et al., stormwater runoff can contain nitrogen, phosphorus, aluminum, and chromium. These elements can all occur in basic compounds and could therefore contribute to the more basic pH of the Ross-Ade sample.

Figure 2

Moisture Content Data Analysis of Polluted and Non-Polluted Soil

Method: After baking the soil samples in an oven at 105 °C to 110 °C, we measured the mass of the dried soil. We determined the percent moisture content by subtracting the dried soil mass from the original soil mass, dividing the difference by the dried soil mass, and multiplying by 100.
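Read this way, the calculation is (original mass - dried mass) / dried mass × 100. A minimal sketch follows; the function name and the example masses are hypothetical and are not measurements from this lab.

```python
def percent_moisture(wet_mass_g, dry_mass_g):
    """Percent moisture on a dry-mass basis: (wet - dry) / dry * 100."""
    if dry_mass_g <= 0:
        raise ValueError("dry mass must be positive")
    if wet_mass_g < dry_mass_g:
        raise ValueError("wet mass should not be less than dry mass")
    return (wet_mass_g - dry_mass_g) / dry_mass_g * 100.0

# Hypothetical masses for illustration only (not data from this report):
# 10.0 g of soil before drying, 8.0 g after drying.
print(percent_moisture(10.0, 8.0))  # 25.0
```

Because the dried mass is the denominator, a very wet sample can exceed 100% moisture, so a single waterlogged sample can strongly raise a group's average and standard deviation.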

Legend: Average moisture content of the polluted (Ross-Ade) and non-polluted (Cary Quad) samples. The graph depicts the lab section’s average moisture content percentage for the polluted and non-polluted samples, with standard deviation error bars. The p-value from the unpaired t-test assuming unequal variance was 0.27019; because it is above 0.05, the difference between the two groups is not statistically significant.

Evidence: For our group, the moisture content percentages of Ross-Ade (polluted) and Cary Quad (non-polluted) differed by 30.68 percentage points. In the whole class’s data, the polluted and non-polluted averages differed by 35.6 percentage points. The standard deviation was 40.82% for the polluted samples and 1.24% for the non-polluted samples. This signifies that the polluted results varied widely across groups, while the non-polluted results did not. While these differences seem large, an unpaired t-test assuming unequal variance gave a p-value of 0.270, which is greater than 0.05, signifying that the difference is not statistically significant and that the data do not show an effect of location on moisture content.

Conclusion: Condition 1 (Polluted-Ross-Ade) and Condition 2 (Non-polluted- Cary Quad) do not have significantly different moisture content values. Although the average moisture content percentages appear to differ, the large variation among the polluted samples means this difference could be due to chance. We performed an unpaired t-test assuming unequal variance and got a p-value of 0.270, which is greater than 0.05, indicating that the difference is not statistically significant. Given this, our group is very confident that the two locations (Polluted and Non-polluted) do not differ significantly in soil moisture content.

Explanation: The average moisture content of Condition 1 (Polluted; Ross-Ade) is higher than that of Condition 2 (Non-polluted; Cary Quad); however, based on the p-value, there isn’t a significant difference between the two. The higher average in the polluted sample is likely due to the source area. According to JoEV, “If all the pores become filled with water, excess water will now leach downward through continuous soil pores, until the rain or irrigation ceases” (JoEV, n.d.), so the water should seep farther into the ground over time as the soil drains. ScientistLive also states that moisture content depends on temperature, humidity conditions, and environment. Since the polluted area was located by a parking lot, it is likely that water cannot drain as far into the soil there as in the non-polluted area, which is located in a grassy area. Since the water cannot travel far into the soil in the polluted area, some spots could have held more water than others; this may explain the large outlier, which made the standard deviation quite high and possibly affected the p-value. The standard deviation for the non-polluted samples is quite low, since the samples were taken from a flat grassy area where rain likely drained evenly, giving each group’s sample a similar moisture content. The high standard deviation for the polluted samples could have raised the p-value, resulting in no statistically significant difference between the two conditions.

Figure 3

Functional Biodiversity Data Analysis of Polluted and Non-Polluted Soil

3A: Shannon Diversity Index

3B: Richness

3C: Evenness

3D: Carbon Source Utilization Efficiency

Method: In separate tubes, we combined each soil sample with 9 mL of sterile water and mixed by flicking the tube. We then pipetted 1 mL of the mixture into a new tube containing another 9 mL of sterile water, creating a 1:10 dilution. Next, we pipetted 10 µL of each 1:10 diluted soil mixture onto separate agar plates/Ecoplates with glass mixing beads and gently swirled each plate so that the beads coated the agar surface. We let the plates dry with the lids closed for several minutes and then placed the two plates in a 25 °C incubator for a week. After the week, we removed the Ecoplates and compared the two samples' bacterial growth, colonies, colors, and consistencies.

Legend: Data for all four figures were obtained by filling an Ecoplate with control, polluted, and non-polluted solutions. Figure A is the average Shannon Diversity Index and standard deviation of the polluted and non-polluted conditions. Figure B is the average richness and standard deviations for the polluted and non-polluted conditions. Figure C is the average evenness and standard deviations for the polluted and non-polluted conditions. The p-values from the t-tests assuming unequal variance for Figures A-C were all above 0.05, the threshold for significance, so the differences are not statistically significant. Figure D is the relative utilization efficiency for the polluted and non-polluted conditions, displayed as the percentages of carboxylic acids, amino acids, carbohydrates, polymers, and amines.

Evidence: When comparing condition one (polluted) and condition two (non-polluted), there are both similarities and differences. In the total class data for richness, the averages of the two conditions differ by only 1.3, meaning that the groups in the class averaged similar richness values. The standard deviations differed by 0.35, and the p-value is 0.44, which is greater than 0.05, signaling that there is no significant difference between the richness of the polluted and non-polluted conditions.

In the total class data for evenness, the averages of the polluted and non-polluted conditions differed by only 0.001. The standard deviations of the two conditions differed by 0.005, and the p-value was 0.76, which is greater than 0.05 and shows that there is no significant difference between the evenness of the polluted and non-polluted conditions.

In the total class data for the Shannon Diversity Index, the averages of the polluted and non-polluted conditions differed by only 0.038. The standard deviations of the two conditions differed by 0.035. The p-value for the entire class's data was 0.45, which is greater than 0.05 and shows that there is no significant difference between the Shannon index values of the polluted and non-polluted conditions.

In Figure 3D, the relative utilization efficiency is similar between the polluted and non-polluted sites, except for the amines and polymers. Polymers account for a bit over 10% of utilization at the polluted site and under 10% at the non-polluted site. The polluted site also appears to have a slightly lower percentage of amines utilized than the non-polluted site.

Conclusion: When comparing the polluted and non-polluted conditions, there are no significant differences in the richness, evenness, and Shannon Diversity Index values of the soils. Our group is very confident in this conclusion because of the calculated p-values: all three are greater than 0.05, showing that there is no significant difference between the conditions.

Explanation: In this lab, the class compared polluted and non-polluted soil samples to observe the effect of pollution on the soil’s biodiversity. Factors that may affect the soil’s biodiversity include pH, moisture content, temperature, and land usage. To test our hypothesis, we measured the Shannon Diversity Index, richness, evenness, and carbon source utilization efficiency of the soil. The Shannon Diversity Index measures the diversity of species in the community, richness measures the number of unique species in a given area, evenness refers to the relative abundance of species, and carbon source utilization indicates how effectively carbon-based resources are used and converted during various processes. The p-values for Figures 3A-C are all greater than 0.05, indicating no statistically significant difference between the two conditions. Although the samples were taken from two different areas, polluted and non-polluted, the locations were fairly close and experienced the same environmental conditions. According to Cell Press, “if a site is studied for longer, or a wider area is surveyed, more species are likely to be encountered” (Magurran, 2021), so if we had studied the soil for longer or in larger quantities, we might have observed more variability between the polluted and non-polluted samples. Likewise, if the sampling areas had differed more, there would probably be a clearer distinction between the two samples in the Shannon Diversity Index, richness, and evenness values. Another variable is the number of species in a given area; due to the proximity of the two areas, the number and type of species are likely similar, and while the non-polluted area has less human contact, it does not have the resources to support species different from those at the polluted site.
As for carbon source utilization, both samples were taken from areas that were rich with plants and experienced the same weather conditions, which could affect the usage of carbon and may explain the similar data.
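The three diversity measures described above can be sketched from a list of abundances (for an Ecoplate, these could be per-well readings). This is an illustration under stated assumptions: the report does not give the formulas the class used, so the natural-log Shannon index and Pielou's evenness (Shannon index divided by the log of richness) are assumed here, and the function name and input values are hypothetical.

```python
import math

def diversity_metrics(abundances):
    """Return (richness, Shannon index, Pielou evenness) for a list of abundances.

    Assumed definitions (not confirmed by the report):
      richness S = number of categories with abundance > 0
      Shannon H  = sum of p_i * ln(1 / p_i) over present categories
      evenness J = H / ln(S), defined as 0 when S < 2
    """
    present = [a for a in abundances if a > 0]
    richness = len(present)
    if richness == 0:
        return 0, 0.0, 0.0
    total = sum(present)
    shannon = sum((a / total) * math.log(total / a) for a in present)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return richness, shannon, evenness

# Hypothetical abundances for illustration only (not data from this report).
even_community = [10, 10, 10, 10]   # four equally abundant categories
skewed_community = [37, 1, 1, 1]    # same richness, one dominant category
print(diversity_metrics(even_community))
print(diversity_metrics(skewed_community))
```

Both hypothetical communities have the same richness, but the skewed one yields a lower Shannon index and evenness, which is why the three measures are reported separately.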

Figure 4

Genetic Diversity Data Analysis of Polluted and Non-Polluted Soil

4A: Shannon Diversity Index

4B: Richness

4C: Evenness

4D: Taxa Bar Plot

Method: To produce these four figures, we extracted DNA from samples from our two conditions (polluted and non-polluted), and the contents of the samples were analyzed at Rush University. The data were then uploaded to Nephele, where we ran the QC, DADA, and QIIME diagnostic tests to obtain the values used for the Shannon Diversity Index, richness, evenness, and taxa bar plots.

Legend: Figure A is the average Shannon Diversity Index and standard deviation of the polluted and non-polluted conditions. The p-value from the unpaired t-test assuming unequal variance for the Shannon Diversity Index is 0.058342, which is above 0.05, meaning that the difference is not statistically significant. Figure B is the average richness and standard deviations for the polluted and non-polluted conditions. The p-value from the t-test assuming unequal variance is 0.051682, which is above 0.05, meaning that the difference is not statistically significant. Figure C is the average evenness and standard deviations for the polluted and non-polluted conditions. The p-value from the t-test for evenness is 0.411492, and therefore the difference is not statistically significant. Figure D is the taxa bar plot of the phyla found in the three polluted and three non-polluted samples.

Evidence: In Figure A, the average Shannon Diversity Index of the non-polluted sample is only slightly greater than that of the polluted sample (a difference of 0.216633). The standard deviations for Figure A also differ only slightly, with the polluted being 0.087287 and the non-polluted being 0.109665. The Figure B richness averages differ more, with the non-polluted average being about 28% greater (130 for non-polluted compared to 101.3333 for polluted). The standard deviations of the richness differ by about 4.5, with the non-polluted (14.17745) being greater than the polluted (9.712535). The Figure C evenness averages also differ, with the polluted being 1.57062 and the non-polluted being 0.898895. The standard deviations of the two conditions varied greatly, with the polluted being 1.130203 and the non-polluted being only 0.022542 (about 50 times smaller than the polluted). In Figure D, there are some noticeable differences between the three polluted samples and the three non-polluted samples in the relative frequency of the phyla. For the phyla with smaller percentages (the colorful small bars near the bottom of the graph), the percentages are about the same in the polluted and non-polluted samples. The larger bars at the top show a more noticeable difference, specifically the yellow bar (Crenarchaeota). The percentage of this phylum is almost 20% higher in the polluted samples than in the non-polluted samples.

Conclusion: When comparing condition one (polluted) and condition two (non-polluted), there are no significant differences in the richness, evenness, and Shannon index values. Our group is very confident that these soils do not differ significantly in the listed values because of the calculated p-values. All the p-values are greater than 0.05, which signifies no significant difference between the polluted and non-polluted conditions.

Explanation: After having our DNA samples analyzed at Rush University, we concluded that there is no statistically significant difference in genetic diversity between the polluted and non-polluted samples for the Shannon Diversity Index, richness, and evenness values, because the p-values for all three were above the 0.05 threshold. The taxonomic bar plot also suggests that the taxa found in the polluted and non-polluted samples are relatively similar. However, there is a noticeable difference in the relative frequencies of Proteobacteria (green), Crenarchaeota (yellow), and Verrucomicrobiota (brown) between the polluted and non-polluted samples, which may be related to the sample site. The amount of Verrucomicrobiota may be higher in the polluted sample because this phylum does well in marine environments, and there are “Verrucomicrobiota meta-genome assembled genomes (MAGs) from freshwater” (Orellana, Francis, Ferraro, Hehemann, Fuchs & Amann, 2021).

Collaboration Statements:

Sydney Brown completed the Figure 1 and 2 conclusion and evidence paragraphs; the Figure 3 evidence, conclusion, and references; the Figure 4 conclusion, explanation, and collaboration statement; and the acknowledgement. Eliza Luttman completed the Figure 2 graph and its accompanying legend, explanation, and references; the Figure 3 graphs, explanation, and references; the Figure 4 graphs, explanation, and references; the Figures 1-4 methods; and the discussion. Sydney Parpart completed the Figure 1 graph, legend, explanation, and references; the Figure 3 graphs and legend; the Figure 4 legend and evidence; the purpose; and the general soil collection methods, and contributed to the Figure 3 evidence. All group members contributed to the Figure 3 and 4 collaboration statements.

Discussion: 

This study was conducted to check the effectiveness of rain gardens and to ensure that dangerous bacteria and pollutants are not contaminating the water that flows through the city’s sewage system and into the Wabash River. The purpose of rain gardens is to “capture the rain that would usually runoff your property and allow it to soak into the ground. This helps minimize runoff and helps reduce the amount of pollution that enters our waterways.” (Steeb, 2010). Rain gardens are effective against pollutants since “they are constantly being broken down (biodegraded) by microorganisms in the soil.” and “[s]ome plants actually absorb pollutants and use them as food.” (Purdue University, n.d., p. 25). This study aimed to determine whether the environment surrounding a rain garden affects the amount and type of bacteria in the soil that are then transferred to the stormwater entering the sewers. To test this, we took soil samples from polluted and unpolluted areas and compared their pH, moisture content, functional biodiversity, and genetic diversity. Based on the results shown above, there was no statistically significant difference between the polluted and non-polluted soil samples in any of the measures tested. Therefore, our results do not indicate that the environment of a rain garden influences the amount or type of bacteria found in the stormwater.

References:

Acknowledgement:

We greatly appreciate the help we received through this lab process. Thank you to our group members as we collaborated and communicated to get our work done. Thank you to our lab TIs, Olivia Vigilante, Abigail Smith, and Twesha Ray, and our TA, Madi Reid, for overseeing our entire lab section. Thank you to Dr. Adler who has been a great help this semester, and lastly, thank you to Purdue University for overseeing this lab and funding it.