Chapter 1: The Scientific Method and Data Collection

Introduction

The Scientific Method

Steps to the Scientific Method

Statistical Analysis of Data

Reporting Our Data

Exercise 1: Review articles and primary articles.

Exercise 2: Statistical Analysis (Optional!).

Glossary

Introduction

In this chapter, you will learn about all of the steps of the scientific method and practice following those steps to carry out an experiment.

The Scientific Method

The scientific method begins with an observation that prompts a question. For example, we accidentally spill hydrogen peroxide on some cells growing in a tissue culture dish and observe that there are more dead cells in that culture dish than in another dish. Based on this observation, we ask the question “Does hydrogen peroxide treatment result in an increase in cell death?”

Next, we formulate a hypothesis. The hypothesis is a prediction (educated guess) about the answer to our question. It is generally a good idea to do some preliminary literature research about the question before making a hypothesis, to ensure that the prediction actually has a basis.

For statistical reasons, it is necessary to state a null hypothesis and an alternative hypothesis. A null hypothesis (H₀) is a statement of no difference, so in this example, our null hypothesis would state that there is no difference in cell death between cells treated with hydrogen peroxide and cells with no treatment. One alternative hypothesis (H₁) would state that there is more cell death in treated cells than in untreated cells, and the other alternative hypothesis (H₂) would state that there is more cell death in untreated cells than in treated cells. See Figure 1 below.

Upon collecting data, a statistical analysis of that data will allow us to reject the null hypothesis in favor of one of the alternatives, or we will be unable to reject the null hypothesis. Often this means that there is no difference, but it can also mean that we need to collect more data.

The scientific method is a systematic way of asking questions and finding the answers to those questions. It involves five basic steps, which are outlined in this chapter. It is important to understand this method and how to apply it to our experiments in this class. This information will be important in the second half of the semester for designing our own project.

Once we have established a hypothesis, it is necessary to design an experiment or set of experiments to test that hypothesis. The experiments should be well thought out in advance, and the following should be taken into consideration:

1. What will we measure? How?

2. What will our sample size be?

3. How will we ensure that our sample reflects the entire population?

4. How will we keep variables that are not being measured controlled between the groups?

5. How will we analyze our data?

We should always try to keep variables that are not being measured the same between the different groups being studied. However, this is not always possible, so it is important to include controls in our experiments. The untreated sample in the experiment defines a baseline against which treated samples are compared. A positive control ensures the experimental method is functional, and a negative control helps rule out false positives. In this example, the viability of 100 randomly selected treated cells and 100 randomly selected untreated cells will be measured, and the data for each group will be averaged to see if there is a difference.
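The random selection of 100 cells described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dish size of 5,000 cells, the idea of numbering the cells, and the function name `pick_random_cells` are all made up for the example and are not part of the lab protocol.

```python
import random

def pick_random_cells(cell_ids, sample_size, seed=None):
    """Return sample_size cell IDs chosen at random, without repeats."""
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return rng.sample(cell_ids, sample_size)

# Hypothetical dish: 5,000 cells numbered 0 to 4999 (made-up count).
treated_dish = list(range(5000))

# Every cell has the same chance of being chosen, so the sample is
# not biased toward cells that are easy to see or easy to reach.
chosen = pick_random_cells(treated_dish, sample_size=100, seed=1)

print(len(chosen))       # how many cells were selected
print(len(set(chosen)))  # equals the line above when no cell is picked twice
```

Letting a random number generator make the choice, rather than picking cells by eye, is one way to help the sample reflect the entire population.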

Figure 1. Comparing hydrogen peroxide treated cells with untreated cells. The null hypothesis states that there is no difference in the cells treated with hydrogen peroxide and those with no treatment. Hypothesis one states that there is more cell death in treated cells than in untreated cells. Hypothesis two states that there is more cell death in untreated cells than in treated cells.

Steps to the Scientific Method

1. Observation/Question

2. Hypothesis

3. Experimental Design

4. Data Collection/Analysis

5. Conclusions

After designing the experiment, we should perform the experiment and carefully collect and record all of the data in an organized manner. Be sure to include the units for all of the measurements we take and make notes of any problems encountered. It is important to keep very detailed notes of our experiment and data collection because it is easy to forget minor details between performing the experiment and writing the results in a paper. Whenever possible, we will want to repeat the experiment several times to show that the data is reliable. The data can then be analyzed using one of many statistical tests, and an overall conclusion can be derived from the experiment.
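As a rough sketch of how repeated measurements might be summarized before any statistical test is run, the Python below computes the mean, median, standard deviation, standard error of the mean (the standard deviation divided by the square root of the number of observations), and a 95% confidence interval for the mean. The viability values, the function name `summarize`, and the parameter `t_critical` are assumptions made up for this example.

```python
import math
import statistics

def summarize(values, t_critical):
    """Return descriptive statistics for one group of measurements."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    se = sd / math.sqrt(n)         # standard error of the mean
    margin = t_critical * se       # half-width of the confidence interval
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "se": se,
        "ci_95": (mean - margin, mean + margin),
    }

# Made-up percent viability from five repeats of the same experiment.
viability_percent = [88.0, 92.0, 90.0, 86.0, 94.0]

# 2.776 is the two-tailed 5% critical value of t for n - 1 = 4 degrees
# of freedom, taken from a standard t table. It must be looked up again
# for a different number of repeats.
summary = summarize(viability_percent, t_critical=2.776)

for name, value in summary.items():
    print(name, value)
```

With only a handful of repeats, the confidence interval is wide, which is one practical reason to repeat an experiment more than once or twice.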

Statistical Analysis of Data

1. Descriptive analysis is used to show patterns in data.

· Mean - The sum of the observed data values divided by the number of observations.

· Median - The middle value of a data set when there is an odd number of values or the average of the two middle values when the number of values is even.

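· Standard deviation - A measure of how spread out the data values are around the mean. The standard deviation is small when the values are close to the mean and large when there is a wide range of variation in the data.

· Standard error - The standard deviation of the sample mean, calculated as the standard deviation divided by the square root of the number of observations. It estimates how far a sample mean is likely to be from the true population mean.

· Confidence interval - A range of values, calculated from the sample, that is expected to contain the true population value with a stated level of confidence, most commonly 95%. A 95% confidence interval is built so that, if the experiment were repeated many times, about 95% of the intervals would contain the true value.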

2. Comparative analysis is used to compare means of groups of data.

· Paired t-test - t tests are usually used to compare the means of two samples. A paired t test compares two measurements of one dependent variable taken under two conditions of an independent variable, e.g. growth measured in the same individuals before and after treatment, or in matched pairs of untreated control and treated individuals. The independent variable is the variable that is manipulated by the researcher, and the dependent variable is the response measured.

· ANOVA - Analysis of variance compares the means of two or more groups (with exactly two groups it gives the same answer as an unpaired t test) and can handle more than one independent variable. Each dependent variable, e.g. height, weight, or number of leaves in plants which have been grown for a period of time in low light or high light, is tested for differences among the groups.

3. Frequency analysis is used to compare experimental results with expected results.

· Chi squared test - Used to test how well a set of observed counts matches the counts expected under a hypothesis, e.g. the number of red, pink, and white flowers in the second (F₂) generation of a cross between a white-flowered plant and a red-flowered plant, compared with the expected ratio of 1 red:2 pink:1 white.
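Two of the tests in the list above can be worked through by hand in Python. The sketch below computes a paired t statistic and a chi squared statistic and compares each with a critical value from a standard table at the 5% level. All of the measurements and counts are made up for illustration, and the function names are hypothetical. In practice a statistics package would usually report a p-value directly.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """t statistic for two measurements taken on the same individuals."""
    differences = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(differences)
    se_diff = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_diff / se_diff

def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Paired t test: made-up heights (cm) of the same five plants
# measured before and after a treatment.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 13.0, 12.0, 12.0, 16.0]
t_value = paired_t_statistic(before, after)
T_CRITICAL = 2.776  # two-tailed, 5% level, 5 - 1 = 4 degrees of freedom
print("t =", round(t_value, 3), "significant:", abs(t_value) > T_CRITICAL)

# Chi squared test: made-up flower counts compared with 1 red:2 pink:1 white.
observed = [22, 55, 23]  # red, pink, white
ratio = [1, 2, 1]
total = sum(observed)
expected = [total * r / sum(ratio) for r in ratio]
chi2_value = chi_squared_statistic(observed, expected)
CHI2_CRITICAL = 5.991  # 5% level, 3 - 1 = 2 degrees of freedom
print("chi squared =", round(chi2_value, 3),
      "differs from expected:", chi2_value > CHI2_CRITICAL)
```

For the t test, a statistic larger than the critical value suggests a real difference between the paired measurements. For the chi squared test, a statistic smaller than the critical value means the observed counts are consistent with the expected ratio.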

Suppose that in this example, the average viability for the untreated cells was 90% and the viability for treated cells was 85%. When a statistical test was performed on all of the data, this difference was shown to not be statistically significant. However, these results differ from your observation that originally prompted the question. What kinds of conclusions can you make?

You should be sure that the conclusions you draw follow logically from the data that you have provided and also answer the question that you originally asked. In this case, although the untreated cells had a higher average viability than the treated cells in this sample, the difference was not statistically significant, so you cannot reject the null hypothesis or conclude that hydrogen peroxide treatment increases cell death. You may want to offer suggestions about why your data does not support the original idea (maybe there was a problem with the hydrogen peroxide) and also offer suggestions for future research that can address such issues (maybe determine whether using a freshly opened bottle of hydrogen peroxide or varying the concentration gives different results). It is very rare to complete an experiment and leave no questions unanswered.
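To see how a 90% versus 85% difference can fail to reach statistical significance, here is a small Python sketch of a chi squared test on a 2 x 2 table of counts. It assumes, purely for illustration, that the percentages correspond to 90 of 100 untreated cells and 85 of 100 treated cells being viable. The text does not say which statistical test was actually used, so this is one possible check, not the analysis from the example.

```python
def chi_squared_2x2(table):
    """Chi squared statistic for a table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if treatment made no difference.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Assumed counts: [viable, dead] for each group of 100 cells.
counts = [
    [90, 10],  # untreated
    [85, 15],  # treated with hydrogen peroxide
]

chi2_value = chi_squared_2x2(counts)
CHI2_CRITICAL = 3.841  # 5% level, 1 degree of freedom, from a standard table

print("chi squared =", round(chi2_value, 3))
print("reject null hypothesis:", chi2_value > CHI2_CRITICAL)
```

Under these assumed counts the statistic falls below the critical value, so the null hypothesis of no difference is not rejected. A larger sample could change that outcome, which is why failing to reject the null hypothesis can also mean that more data is needed.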

Figure 2. Close up image of a microscope objective and stage. Image courtesy of Unsplash and Michael Longmire. Thank you!

Figure 3. DNA Genotyping and Sequencing. A technician loads DNA samples into a desktop genomic sequencing machine at the Cancer Genomics Research Laboratory, part of the National Cancer Institute's Division of Cancer Epidemiology and Genetics (DCEG). Creator: Daniel Sone. Courtesy of Unsplash.

Reporting Our Data

Whenever a scientific discovery is made, the most important thing to do is to share that information. For this reason, it is important to learn how to report our data so that others can understand and use it. Many future discoveries may be based on our results, so we want to be sure they are accurate and clearly communicated.

Generally, data is shared in the scientific community by publishing in journals. These journal articles are typically written in a way that demonstrates the use of the scientific method.

A journal article usually contains five sections:

1. The Abstract is the summary of the article that is given at the beginning.

2. The Introduction is the part of the paper that tells you some of the work that has been published in the area already, the reasoning for the experiments that were performed and the hypothesis for the experiments.

3. The Experimental section (Methods and Materials) gives a detailed account of how the experiments were performed.

4. The Results section describes all of the results and provides pictures or figures that illustrate these results.

5. The Discussion section gives all of the conclusions that were made based on the results. This section of the article typically offers many alternative interpretations and ideas for future work in the area.

Figure 4. Microcentrifuge tubes in a rack. Some of them are DNA samples while the remainder of them are primers to be used in polymerase chain reaction, or PCR, a laboratory technique used to make multiple copies of a segment of DNA. Courtesy of Unsplash and National Cancer Institute.

Figure 5. Laboratory Researcher. National Cancer Institute researcher Chanelle Case Borden, Ph.D., setting up genetic samples and primers for polymerase chain reaction, or PCR, a laboratory technique used to make multiple copies of a segment of DNA. Photographer Daniel Sone, courtesy of Unsplash and National Cancer Institute.

Exercises 1 and 2

Here are the activities we'll be doing in lab!

Exercise 1: Review articles and primary articles.

There are a number of online resources that will allow us to search for a primary journal article in a given subject. One of the most common search engines is PubMed; another is Google Scholar.

Be sure that you recognize the difference between primary journal articles and review articles. A primary article is written by the people who actually performed the experiments and are presenting their data. A review article, however, generally summarizes the work of several different groups of scientists. Review articles generally do not present new data; they analyze data and conclusions taken from other papers. Because of this, reading a review article “feels” like reading from a textbook. Both are useful, but we should always go to the primary source when looking for information as a basis for our research.

1. Begin by finding both the review article “The HL-60 promyelocytic leukemia cell line: proliferation, differentiation, and cellular oncogene expression” by S. J. Collins (1987) Blood 70, 1233-1244, and the primary article “Fibronectin-mediated Cell Adhesion Is Required for Induction of 92-kDa Type IV Collagenase/Gelatinase (MMP-9) Gene Expression during Macrophage Differentiation” by Bei Xie et al. (1998) J. Biol. Chem. 273 (19), 11576-11582.

2. Save a copy of the files.

3. Read (or at least skim) through the review article (Collins, 1987) before next class for discussion.

Exercise 2: Statistical Analysis (Optional!).

Need more help with statistics? Here it is! This is optional.

Statistical Analysis

Glossary

1. Alternate hypothesis: A statement that there is a difference between the groups being compared. More than one alternate hypothesis may be stated to cover each possible direction of the difference.

2. ANOVA: Analysis of variance compares the means of two or more groups (with exactly two groups it gives the same answer as an unpaired t test) and can handle more than one independent variable. Each dependent variable, e.g. height, weight, or number of leaves in plants which have been grown for a period of time in low light or high light, is tested for differences among the groups.

3. Chi squared test: Used to test how well a set of observed counts matches the counts expected under a hypothesis, e.g. the number of red, pink, and white flowers in the second (F₂) generation of a cross between a white-flowered plant and a red-flowered plant, compared with the expected ratio of 1 red:2 pink:1 white.

4. Conclusions: Interpretation of results and acceptance or rejection of stated hypothesis/hypotheses

5. Confidence Interval: A range of values, calculated from the sample, that is expected to contain the true population value with a stated level of confidence, most commonly 95%.

6. Control: The untreated sample in the experiment defines a baseline against which treated samples are compared. A positive control ensures the experimental method is functional. A negative control helps rule out false positives.

7. Data collection: This requires meticulous documentation of results

8. Hypothesis: Based on your observations and research literature review, propose a plausible outcome.

9. Literature research: Rely only on peer-reviewed scientific publications

10. Mean: The sum of the observed data values divided by the number of observations.

11. Median: The middle value of a data set when there is an odd number of values or the average of the two middle values when the number of values is even.

12. Null hypothesis: A statement within the context of the experiment that there is no difference among the groups being compared

13. Observation: This may be anything based on your previous lab experience and/or preexisting knowledge

14. Paired t-test: t tests are usually used to compare the means of two samples. A paired t test compares two measurements of one dependent variable taken under two conditions of an independent variable, e.g. growth measured in the same individuals before and after treatment, or in matched pairs of untreated control and treated individuals. The independent variable is the variable that is manipulated by the researcher, and the dependent variable is the response measured.

15. Standard Deviation: A measure of how spread out the data values are around the mean. The standard deviation is small when the values are close to the mean and large when there is a wide range of variation in the data.

16. Standard Error: The standard deviation of the sample mean, calculated as the standard deviation divided by the square root of the number of observations. It estimates how far a sample mean is likely to be from the true population mean.

17. Statistical tests: Descriptive statistics include mean, median, mode, standard deviation, standard error, confidence interval; Comparative statistics include paired and unpaired t-tests, ANOVA; Frequency statistics include Chi-squared tests.
