After collecting data, ecologists use statistical tests to infer information about samples or populations. In this lab, we will cover how to use one of these tests by examining data from an experiment that explored if penguins were able to use scent to find fish (their primary source of prey). We will also examine similar data from other seabird species.
This lab will also demonstrate an example of the complicated relationships that make up food webs. It is not always as simple as one organism consuming another; sometimes these trophic (pertaining to food) dynamics can be indirect and have multiple levels, such as the one shown here.
Students should be able to:
Understand the concept of statistical significance and interpret the meaning of a p-value.
Use given biological information in addition to statistical evidence to draw conclusions about differences between groups.
Since the measurements ecologists collect exhibit variation due to both differences in the real world and sampling error, common questions focus on determining the true distribution of a parameter (a trait we are measuring) in a given population. Oftentimes we are focused on the average value for a given trait, but understanding the spread and shape may also be important. For example, we know that there is likely some variation in "normal" human body temperature and that thermometers may not always be accurate, but using data collection and statistics we have come to find the average human body temperature is 98.6 degrees Fahrenheit under typical conditions. We also observe that most healthy people should be close (within 1-2 degrees) to that value.
Statistics is a broad field of data analysis methods that allow us to use data to make inferences about the distribution of a parameter. Using statistics, we can answer any number of questions about the natural world by looking for signals that are related to the parameter of interest (like the average, or mean, value) that may be hidden in data among the noise, or variation we always observe. When the same traits are measured in different groups or under different conditions, ecologists can also use statistics to consider if groups actually differ in some way or if trait values are impacted by external factors.
Statistical analysis connects to experimental design through the focus on a null hypothesis (H0). This is a hypothesis of no difference. Depending on what you are focused on, it may state no difference between two values, no difference among groups, or no relationship between two measured variables. While starting with a hypothesis of no difference may seem odd, it means we have to collect data to justify new accepted values, differences among groups, or important relationships.
For example, assume someone told you that the normal human body temperature was really 102 degrees Fahrenheit. Would you believe them? Probably not, but we could use data collection and statistics to actually test that assertion. In this case, our null hypothesis might be that the true average human body temperature is 98.6 degrees, since that is the generally accepted value. Notice we are testing the average temperature here! This is often the signal we are focused on. This means we realize human body temperature may vary among individuals, but we expect the average normal human body temperature is 98.6 degrees. An alternative hypothesis (HA) is that the true average human body temperature is not equal to 98.6 degrees (for example, it might be 102!). Statistical analysis allows us to see whether or not we have the evidence to reject the null hypothesis.
The scientific process usually requires a fair amount of evidence to reject a null hypothesis. That's because we know variation exists in our measurements. For example, if you found the temperature of one person to be equal to 101 degrees, would you immediately decide the true average "normal" human body temperature isn't 98.6? Probably not, because you know if we measured the temperature of 100 people we might expect some different results (maybe ranging from 98 to 100 degrees) based on factors like activity or health level, base metabolism, and error introduced by the measurement device. However, if we collected many temperatures and most of them were closer to a number other than 98.6 (like 102!), we might begin to suspect our null hypothesis was not correct (or that our thermometer is broken!). Statistics offers a variety of approaches to use the data we collect as evidence to test the null hypothesis. The ability to test hypotheses is a key part of the scientific method.
We often describe the strength of the evidence against a null hypothesis using a p-value. The p-value considers both the signal and the noise we observed in the collected data. It does this by asking how likely we would be to observe a signal at least as strong as the one in our sample if the null hypothesis were true and we sampled the noisy population many times. In other words, the p-value is the probability of observing a signal at least as strong as ours by chance alone, assuming the null hypothesis is true. As a result, a high p-value means the signal we found could easily have arisen by chance. However, if the p-value is low, we would be unlikely to observe such a signal by chance if the null hypothesis were true, and thus we have evidence against the null hypothesis. This is what is known as significance in statistics.
Scientists typically consider a p-value of less than 0.05 to be significant evidence against the null hypothesis. This means there is less than a 5% chance that a signal as strong as the one we observed would arise by chance under the null hypothesis. If you run a statistical test and end up with a p-value that is <0.05, you can reject the null hypothesis.
One major source of questions in ecology is how groups may differ from one another, whether that may be between individuals, species, or geographic areas. Multiple tests exist to consider this question depending on the data type and number of groups. Each of these tests starts with a null hypothesis that states there is no difference among the groups. If the test returns a p-value of <0.05, you can reject the null hypothesis of no difference. This means there is a significant difference among the groups.
If you’d like more clarification on p-values, watch this video, from StatQuest.
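The idea behind a p-value can be sketched with a small simulation based on the body temperature example. This is only an illustration: the temperatures are invented, and the noise level (`noise_sd`), the number of simulations, and the function name are all assumptions rather than part of the lab. The sketch repeatedly samples from a population where the null hypothesis is true and counts how often chance alone produces a signal at least as strong as the observed one.

```python
import random
import statistics


def simulated_p_value(sample, null_mean, noise_sd, n_sims=10_000, seed=1):
    """Estimate a two-sided p-value by simulating samples under the null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n = len(sample)
    # The "signal" is how far the sample mean sits from the null value.
    observed_signal = abs(statistics.mean(sample) - null_mean)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Draw a same-sized sample from a population where H0 is true
        # (assumed here to be normally distributed noise around null_mean).
        simulated = [rng.gauss(null_mean, noise_sd) for _ in range(n)]
        if abs(statistics.mean(simulated) - null_mean) >= observed_signal:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims


# Hypothetical temperatures (degrees Fahrenheit), invented for illustration only.
typical_sample = [98.2, 98.9, 98.4, 99.0, 98.6, 98.5, 98.8, 98.3]
feverish_sample = [101.8, 102.3, 101.9, 102.4, 102.0, 101.7, 102.2, 102.1]

p_typical = simulated_p_value(typical_sample, null_mean=98.6, noise_sd=0.7)
p_feverish = simulated_p_value(feverish_sample, null_mean=98.6, noise_sd=0.7)
print("p-value, typical sample:", p_typical)
print("p-value, feverish sample:", p_feverish)
```

With these made-up numbers, the first sample has a mean very close to 98.6, so chance easily produces a signal that strong and the p-value is high. The second sample sits far from 98.6, so almost no simulated sample matches it and the p-value is low.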
In this lab, you will analyze a dataset that compares the ability to sense a smell associated with food in two different age groups of king penguins (adults and chicks).
Read the background information below and answer the questions interspersed throughout as you go.
King penguins (Aptenodytes patagonicus) are a species of seabird that can be found at sea or on various islands surrounding the Antarctic. These birds can forage for food broadly, sometimes more than 400 km from the beaches where they raise their young (Bost et al., 2002). Adult penguins spend most of their time foraging at sea, but when breeding or feeding their young they will spend periods of time onshore.
King penguin chicks usually hatch between January and March. Once they have hatched, parents feed their young to prepare them for the coming winter. In the winter months (~June - September since we are thinking about the southern hemisphere), chicks exhibit behavior known as "crèching", where they fast and cluster together on the beach to withstand cold weather and predation under the watchful eye of adult penguins. Once this is over, chicks get fed by their parents again from September to December, reaching maturity in the crèche. When they are ready, the older chicks will join their parents on their first foraging trips (see Descamps et al., 2002).
The prey of penguins is often difficult to find since the ocean is very large and resources are unevenly distributed. Penguins, like other predators, must seek out productive patches of prey. These discrete patches of productivity arise due to variation in abiotic factors such as temperature, nutrient availability, and the patterns of the currents.
It is still not clear how penguins are able to find these patches where food is abundant, but one possible signal is the scent of an airborne gas, dimethyl sulfide (DMS). This gas is what is known as an infochemical, which means any chemical cue that tells organisms something about their environment. DMS is derived from dimethylsulfoniopropionate (DMSP), which is associated with the presence of phytoplankton (plant plankton) that make up the basis of the marine food web (Raina et al., 2013). DMSP gets converted into DMS when phytoplankton are being preyed upon by organisms such as fish, which may be a prey source for penguins. DMS released into the air above the phytoplankton patch may thus serve as an airborne indicator of penguin prey location (Berresheim et al., 1989).
In this experiment, both adult penguins and their chicks were studied to see if they could detect and respond to DMS. Birds were studied when they were asleep to see if they would wake up if they smelled the signal that prey was near (DMS). Each age group (adult and chick) was divided into two treatments: one would be exposed to the scent of DMS (DMS dissolved in vegetable oil), and the other would be exposed to a control odor (vegetable oil alone). The birds' response to the odor was ranked on a scale of 0-2, with 0 being no response and 2 meaning that they were woken up.
You will now determine whether penguins respond to the presence of DMS. We will evaluate chicks and adults separately (Note: if you've taken another statistics course, you may recognize that another approach could be used to assess this data).
First, complete the following steps for the adult data.
1. How many treatments are there? What are they?
2. Given the biological information above, create a null hypothesis about the relationship between king penguins and DMS that you expect to be able to observe and test using the experimental data. Your hypothesis should focus on what you expect to see when comparing the responses of the two groups (the one exposed to DMS and the one exposed to the control) and consider the average response to DMS in the two groups.
3. Then create corresponding alternative hypotheses.
Now open the dataset below (titled "Sleeping") in Google Sheets. Note it has two penguin tabs (at the bottom), one with adult data and one with chick data.
When making your graphs, make sure that you aggregate the data, and keep in mind that you can change what is being counted in your range.
4. Visualize the data by comparing how many of each response score (0, 1, or 2) there are in each treatment. Keep in mind, you are simply visualizing how many times each score occurs in every treatment.
a. What type of graph would you use to visualize this data (examples include scatterplots, bar graphs, etc.)? Explain your choice.
b. Make the chart(s). You can make one graph for the adult data and one for the chick data, or you can put them both on the same graph (ask your instructor about having multiple histograms on one chart or check out the Data Summaries in Google Sheets page).
c. Are the variables being tested categorical or quantitative?
5. Calculate the summary statistics for each treatment group (mean, sample size, standard deviation, and standard error). These are useful numerical summaries of the data.
Use Google Sheets formulas to do this. Use =AVERAGE(RANGE) and =STDEV(RANGE) for computing the mean and standard deviation respectively. To calculate your sample size (n), you can either count up your data points or use =COUNT(RANGE).
Google Sheets does not have a formula for standard error (SE), so you will need to create your own. SE = SD / √n. You can use =(CELL)^0.5 or =SQRT(CELL) to calculate the square root. To find SE, divide the standard deviation by the square root of the sample size, like this: =SD/(n^0.5) or =SD/SQRT(n), where SD and n are the cells holding those values. Enter your formula in the SE cell.
You can also do this using pivot tables! See Data Summaries in Google Sheets.
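For readers who want to check their spreadsheet formulas another way, the same summary statistics can be sketched in Python. The scores below are invented for illustration and are not from the Sleeping dataset, and the function name `summarize` is hypothetical. `statistics.stdev` computes the sample standard deviation, which is also what =STDEV returns in Google Sheets.

```python
import math
import statistics


def summarize(scores):
    """Return sample size, mean, standard deviation, and standard error."""
    n = len(scores)                  # like =COUNT(RANGE)
    mean = statistics.mean(scores)   # like =AVERAGE(RANGE)
    sd = statistics.stdev(scores)    # sample SD, like =STDEV(RANGE)
    se = sd / math.sqrt(n)           # SE = SD / sqrt(n)
    return {"n": n, "mean": mean, "sd": sd, "se": se}


# Hypothetical response scores (0, 1, or 2), invented for illustration only.
hypothetical_scores = [0, 1, 2, 0, 1, 1, 2, 0]
summary = summarize(hypothetical_scores)
for name, value in summary.items():
    print(name, round(value, 3))
```

Because SE divides SD by the square root of n, collecting more data shrinks the standard error even when the spread of the data stays the same.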
6. Then, add an identical table to your spreadsheet and fill it out:
a. What does the mean tell us about this data?
b. How does sample size impact the standard error?
c. Explain what the standard deviation can teach us about a dataset.
7. Next, you will graph the average of the DMS and Control treatments (hint: ask your instructor about creating bar charts or check out the Data Summaries in Google Sheets page).
8. Based on the figure you created, would you assume that there is a statistically significant difference between the Control and DMS treatments?
Now you will see if there is a statistically significant difference between the treatments by doing a statistical test.
You might expect to use a t-test, but a t-test assumes that the data are quantitative and approximately normally distributed. Our response scores, however, are ordinal ranks (0, 1, or 2) with only three possible values, so we cannot assume a normal distribution, even under the null hypothesis. Therefore, you will use a test designed for this type of data, the Mann-Whitney U-test, which compares the ranks of the responses in the two groups rather than their means. (Remember: you will first compare the control treatment to the DMS treatment for the adults and then do the same for the chicks!)
The steps to do this are as follows:
You will be using a free online calculator for t-tests and other statistical analyses found here: http://vassarstats.net
When you get to the main page, click on the tab in the menu box on the left labeled "Ordinal Data". Then choose the option for "Mann-Whitney Test"
A pop-up window will appear for you to enter your sample size for each treatment. The Control treatment is Sample A and the DMS treatment is Sample B. Use the sample size (n) you calculated above.
You will copy and paste each column of data for the Control treatments in the box for Sample A and the data from the DMS treatments in the box for Sample B (the numbers pictured below are fake data to show you how the entry will look--do not use them! Only use the data in the Sleeping spreadsheet).
Once you have your data in the boxes, click “Import data into data cells”. This will autofill the “Raw Data for…” in the “Data Entry” heading.
Then, click the “Calculate from Raw Data” button at the bottom of your autofilled Data. It will give you a series of different outputs. The output we will be using in this exercise is the p(2) value, which is the two-tailed p-value; this corresponds to a null hypothesis that states the two groups are equal to each other, tested against the alternative that they differ in either direction.
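To see what a calculator does behind the scenes in a Mann-Whitney test, here is a minimal sketch in Python. It is not the VassarStats implementation: it assumes a normal approximation with a correction for tied scores and no continuity correction, so its p-value may differ slightly from the online result, especially for small samples. The function names and the scores are invented for illustration.

```python
import math


def midranks(values):
    """Rank values from smallest to largest; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks start at 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample A, approximate two-tailed p-value)."""
    n_a, n_b = len(sample_a), len(sample_b)
    n = n_a + n_b
    combined = list(sample_a) + list(sample_b)
    ranks = midranks(combined)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2  # expected U if the groups do not differ
    # Tied scores reduce the variance of U; sum t^3 - t over each tied group.
    tie_term = sum(combined.count(v) ** 3 - combined.count(v) for v in set(combined))
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, 1.0  # every score is identical, so there is no evidence of a difference
    z = (u_a - mean_u) / math.sqrt(var_u)
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal probability
    return u_a, p_two_tailed


# Hypothetical response scores, invented for illustration only.
control_scores = [0, 0, 0, 1, 0, 0]
dms_scores = [2, 2, 1, 2, 2, 2]
u_value, p_value = mann_whitney_u(control_scores, dms_scores)
print("U for the control group:", u_value)
print("approximate two-tailed p-value:", round(p_value, 4))
```

The test works on ranks instead of the raw scores, which is why it suits ordinal data like the 0-2 response scale. Note that some tools report the smaller of the two possible U values rather than the U for the first sample.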
9. What is the p(2) value for the first test with adults? Is there a significant difference between the groups exposed to DMS and those exposed to the Control?
Now complete steps 1 - 9 for the chick data before you answer the following questions! You may simply click “Clear A” and “Clear B” before entering the raw data again, but from the chick tab of the spreadsheet.
10. What p-values did you get when testing the two groups? Based on your results, can you say that king penguin adults and chicks have different responses to DMS?
11. Why do you think adults and chicks respond differently to DMS? Think back to the process by which penguins end up detecting DMS.
Examine the following image which depicts an evolutionary tree. The different names at the end of each branch describe various taxonomic orders of birds. Each branch of the tree shows where two groups split off from one another earlier in the evolutionary history of birds. Usually, the more related two groups of animals are, the more characteristics they share.
12. King penguins are Sphenisciformes. Based on the above figure, if you had to guess which order of birds would be most likely to also detect DMS in water, which would it be?
Now, you will look at data about DMS response gathered from a different species of seabird, the blue petrel (Halobaena caerulea). This bird is a Procellariiform, an order that includes albatrosses, shearwaters, and petrels. In contrast to penguins, which simply keep their egg warm between their feet, many Procellariiforms, including the blue petrel, incubate their eggs in burrows. Procellariiform parents do, however, forage for food for their chicks, just like Sphenisciformes.
Blue petrel adults have also been shown to respond to DMS, and you will be determining if the chicks are capable of doing the same. The blue petrel data can be found in the third tab of your spreadsheet. This data was gathered from an experiment where chicks were placed in a simple, symmetrical maze and given the option to walk to a corner with DMS (DMS Treatment), or to one without (control). The response of chicks was recorded, even if they made neither choice. You will need to exclude chicks who made no choice from your data analysis. Keep in mind, there are only 24 chicks in this study, and the data is presented differently from the penguin data.
13. Unlike penguins, who learn to hunt with their parents, petrels are left on their own for their first hunt. How do you think this will reflect on the ability of blue petrel chicks to detect DMS?
14. Using the data in the blue petrel tab, create a figure showing how many petrel chicks walked to the DMS, how many walked to the control, and how many made no choice. Use only column B as the range for your graph, as the information in column A is not relevant.
You may sort column B by right clicking column B and clicking “Sort sheet A to Z”. Row 1 is frozen, so it will not be sorted along with the rest of the data. You may also use the COUNTIF function to count how many times each response appears in column B. For counting how many times “Control” appears, you would make a cell with “=COUNTIF(B2:B25, "Control")”. Make sure to exclude the outer quotations when putting this equation into your own spreadsheet. Change the word between the quotations inside the parentheses to change what you are counting.
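The counting step that COUNTIF performs can also be sketched in Python. The list below is invented for illustration and does not come from the blue petrel tab, and the labels "DMS" and "No choice" are assumptions about how the responses might be written. Only "Control" appears in the formula above.

```python
from collections import Counter

# Hypothetical chick responses, invented for illustration only.
choices = ["DMS", "Control", "No choice", "DMS", "DMS", "Control"]

# Count how many times each response appears (like one COUNTIF per response).
counts = Counter(choices)
print(counts["DMS"], counts["Control"], counts["No choice"])

# Keep only the chicks that made a choice, as the analysis requires.
decided = [choice for choice in choices if choice != "No choice"]
print("chicks that made a choice:", len(decided))
```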
We must now conduct a statistical test of this data, despite how the evidence may seem at first glance. We need a p-value. Since our data is binary (only two states), we will perform a binomial test.
15. What is the null hypothesis for this test? How many chicks should we expect to select the DMS and control treatments, respectively, under the null hypothesis?
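To perform our binomial test, we will use the =BINOMDIST function. Our equation will look like this: =BINOMDIST(s, t, 0.5, FALSE). Replace “s” with the number of chicks that chose DMS and “t” with the total number of chicks tested (remember, you are excluding the chicks that made no choice). The “0.5” in the equation is the probability of choosing DMS that we assume under the null hypothesis (a 50/50 split), and “FALSE” tells Google Sheets to return the probability of exactly s chicks choosing DMS rather than the cumulative probability of s or fewer. Strictly speaking, a binomial-test p-value also includes the probabilities of outcomes more extreme than the one observed; in this lab we use the probability of the observed outcome as a simple approximation.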
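The calculation can be sketched in Python to show how the probability of one exact outcome relates to a two-sided p-value. The counts used here (8 of 10) are invented and are not the blue petrel results, and both function names are hypothetical. The first function mirrors =BINOMDIST(s, t, 0.5, FALSE); the second uses one common definition of a two-sided p-value, summing the probabilities of every outcome that is no more likely than the observed one.

```python
import math


def binomial_point_probability(s, t, p=0.5):
    """Probability of exactly s successes in t trials, like =BINOMDIST(s, t, p, FALSE)."""
    return math.comb(t, s) * p**s * (1 - p) ** (t - s)


def binomial_two_sided_p(s, t, p=0.5):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    observed = binomial_point_probability(s, t, p)
    total = 0.0
    for k in range(t + 1):
        prob_k = binomial_point_probability(k, t, p)
        # The small tolerance guards against floating-point rounding in the comparison.
        if prob_k <= observed * (1 + 1e-9):
            total += prob_k
    return min(total, 1.0)


# Hypothetical counts, invented for illustration only: 8 of 10 chicks choose DMS.
point = binomial_point_probability(8, 10)
two_sided = binomial_two_sided_p(8, 10)
print("probability of exactly 8 of 10:", point)
print("two-sided p-value:", two_sided)
```

For these made-up counts, the probability of exactly 8 of 10 is 45/1024 (about 0.044), while the two-sided p-value is 112/1024 (about 0.109). The two numbers can fall on opposite sides of 0.05, which is why it matters which one you report.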
16. What p-value did the binomial test yield? Translate what this result tells you about blue petrel chick attraction to DMS into words.
17. What would you study next to further understand the role that DMS plays in seabird food foraging?
This lab is adapted from:
Kaitlin M. Bonner and Gregory B. Cunningham. March 2018, posting date. The nose knows: How tri-trophic interactions and natural history shape bird foraging behavior. Teaching Issues and Experiments in Ecology, Vol. 13: Practice #8 [online]. http://tiee.esa.org/vol/v13/issues/data_sets/bonner/abstract.html with permission.
Berresheim, H., M.O. Andreae, G.P. Ayers, and R.W. Gillett. 1989. Distribution of biogenic sulfur-compounds in the remote southern-hemisphere. Pp. 352–356 in Saltzman, E. J. and Cooper, W. J. (eds.), Biogenic sulfur in the environment. American Chemical Society.
Bonner, K.M. and G.B. Cunningham. March 2018, posting date. The nose knows: How tri-trophic interactions and natural history shape bird foraging behavior. Teaching Issues and Experiments in Ecology, Vol. 13: Practice #8 [online]. http://tiee.esa.org/vol/v13/issues/data_sets/bonner/abstract.html.
Bost, C.A., T. Zorn, Y. Le Maho, and G. Duhamel. 2002. Feeding of diving predators and diel vertical migration of prey: king penguins’ diet versus trawl sampling at Kerguelen Islands. Marine Ecology Progress Series 227:51–61.
Descamps, S., M. Gauthier-Clerc, J.-P. Gendner, and Y. Le Maho. 2002. The annual breeding cycle of unbanded king penguins Aptenodytes patagonicus on Possession Island (Crozet). Avian Science 2:87–98.
Raina, J.B., D.M. Tapiolas, S. Forêt, A. Lutz, D. Abrego, J. Ceh, F.O. Seneca, P.L. Clode, D.G. Bourne, B.L. Willis, and C.A. Motti. 2013. DMSP biosynthesis by an animal and its role in coral thermal stress response. Nature 502:677–680.