Analysis of experimental results

Raw data

Experimental results often include many data points, with multiple treatments for the independent variable and multiple trials (repetitions) for each treatment. All these data points should be presented as experimental results in a raw data table. A raw data table presents only data collected experimentally (not data obtained from calculations), and its headers always include the units and the uncertainties of the measurements. A very short paragraph may be included under the table, briefly explaining how the uncertainties were calculated. In a raw data table the columns contain the trials and the rows contain the treatments (with the respective uncertainties in the headers). Raw data table:

Descriptive title including the dependent, the independent and one (or more) relevant constant variable(s)
Headers with units, uncertainties (in brackets)
Trials (repetitions) arranged horizontally (one column per trial) and treatments arranged vertically (one row per treatment)
Consistent decimal places between raw data values and uncertainties
Caption text under the table explaining how the uncertainty has been calculated and/or providing some relevant information about the data table
Table is well defined and lined, with information presented clearly; all cells have the same size and data is centered in the cells.
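
The checklist above can be sketched in a few lines of Python. This is only an illustration, not a required tool: the foam-height numbers, the 0.1 cm ruler uncertainty, the 1 °C thermometer uncertainty and the function names are all hypothetical. The sketch puts units and uncertainties in the headers only, gives one row per treatment and one column per trial, and formats every value to the same number of decimal places as the uncertainty.

```python
def decimals_of(uncertainty):
    """Number of decimal places in an uncertainty, e.g. 0.1 -> 1, 0.05 -> 2."""
    text = f"{uncertainty:.10f}".rstrip("0")
    return len(text.split(".")[1])


def build_raw_table(data, dv_unit, dv_uncertainty, iv_header):
    """Return the table as a list of text lines (header first).

    data maps each treatment (IV value) to its list of trial measurements.
    """
    places = decimals_of(dv_uncertainty)
    n_trials = len(next(iter(data.values())))
    # Units and uncertainties appear only in the headers.
    header = [iv_header] + [
        f"Trial {i} / {dv_unit} (±{dv_uncertainty:.{places}f})"
        for i in range(1, n_trials + 1)
    ]
    rows = [header]
    for treatment, trials in data.items():
        # Same number of decimal places as the uncertainty.
        rows.append([str(treatment)] + [f"{v:.{places}f}" for v in trials])
    # Equal column widths, data centered in the cells.
    widths = [max(len(row[c]) for row in rows) for c in range(len(header))]
    return [" | ".join(cell.center(w) for cell, w in zip(row, widths))
            for row in rows]


# Hypothetical measurements, invented for illustration only.
foam_height_cm = {
    10: [1.2, 1.4, 1.3, 1.1, 1.3],
    20: [2.1, 2.3, 2.0, 2.2, 2.4],
    30: [3.6, 3.4, 3.8, 3.5, 3.7],
}

print("Table 1: Foam height produced by catalase at different temperatures "
      "(constant volume of hydrogen peroxide)")
for line in build_raw_table(foam_height_cm, "cm", 0.1, "Temperature / °C (±1)"):
    print(line)
```

The title is printed above the table and separately from it, as the Tables section below asks.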

The raw data should also include specific qualitative observations made during the experimental data collection. Only relevant observations, those that can influence the experimental results, should be included. The relevant qualitative data may answer some of these questions:

Was the data for all the trials collected exactly the same way? Were the trials identical?
Was the data for all the treatments collected the same way? Are the treatments comparable?
Was there any difference in the measuring devices in any of the experimental setups?
Did you observe something unexpected or interesting during the experimental data collection that may be related to the results of the experiment?

Experimental control (control group)

Experimental controls are used to increase the validity of an experimental investigation. An example would be a sample in which the IV is zero. A sample that has zero of your IV cannot show the effect of your IV on your DV, so it works as a baseline: IV = 0 is called an experimental control, and the other samples are compared to it. For more info read this.

Qualitative data

Qualitative data should address observations that are not captured in the quantitative data. In an experiment that measures the activity of catalase by measuring the height of the foam layer formed, qualitative observations would address the details of the data collection process, such as the speed at which the foam layer formed or the differences between the foam layers at different temperatures. Were all the foam layers identical? Probably not. Did all the bubbles form at exactly the same speed? Probably not. Were the bubbles still forming after one minute in all the trials and all the treatments? The answer is likely to be NO, and that's why qualitative data is important!

Qualitative data should focus on the differences between trials and between treatments. Trial samples are supposed to be identical but they often are NOT, as there are tiny differences between them. Those differences should be observed and recorded, and used to support other statements in the report, such as the identification of outliers or the limitations presented in the evaluation.

Tables

Table title - Each table is titled by the table number AND a descriptive title (explained later). Title is typed before the table is shown (above) and separated from the actual table.
Organization - The IV (independent variable) should be in the left column and the DV (dependent variable) in the right column.
Headers - Every column should have a header at the top that includes the name of the variable AND the unit AND the uncertainty. Units and uncertainties should be included only in the headers.
Precision - Each variable in the table (DV and IV) should have consistent precision (all data should be to the same number of decimal places). The number of decimal places must be consistent with the uncertainty of the measuring device.
Range - The IV values should vary consistently over the range tested.
Formatting - Tables should never be split over pages. Words should not be broken across lines. Gridlines should be included.

Graphs

Title - Each graph is titled by the graph number AND a descriptive title (explained later). The graph title should be typed before the graph is shown and separated from the actual graph.
Type - The appropriate type (bar or scatter plot) of graph should be used according to the type of data collected (continuous vs categorical data).
Axes - The IV (independent variable) should be on the x-axis and the DV (dependent variable) on the y-axis. Both axes should be labeled with the variable AND units.
Scale - Scatter graphs should (almost always) start at (0,0). The maximum scale on each axis should be just beyond the maximum values of the data.
Trendline - Scatter graphs should include an appropriate trendline labeled with the linear equation.

Descriptive Titles

Titles always begin with the Table Number or the Graph Number
Includes the IV (independent variable)
Includes the DV (dependent variable)
Includes some context that describes the specific situation of the experiment (often controlled variables are included).

Making up data (quantitative or qualitative) is a violation of scientific conduct.

I strongly recommend reading this little booklet

My advice: before presenting your lab report or your IA, read this booklet, especially pages 5 to 25!

Statistics-Teacher-Guide.pdf

Mathematics and statistics in Biology

Data processing and interpretation

Raw experimental data may sometimes be abundant and difficult to comprehend. To make the raw data easier to understand and to obtain more information from it, scientists apply mathematical formulas and statistical tests. The result is called processed data. Because processed data is created to further understand the raw data, processed data must be interpreted.

Data processing should be presented in a different data table than the raw data, because it involves the use of formulae and it is a different type of data than the raw data, which is collected experimentally. Data processing must include the formulae used and a sample calculation (demonstrating the understanding of the formulae).

There are many different ways to process experimental raw data, and the method used must be consistent with the research question (RQ) that the experiment aims to answer. Keep this in mind: there are many ways of processing data and many of them may be valid, as long as they address the RQ. Each RQ requires a specific approach to the data processing. So before processing raw data it is critical to think about and define the goal of the data processing. Data processing must be interpreted. What does the processing add to the raw data? What can be concluded from the data processing? It is critical to understand the formulas and why they are used. A lab report must demonstrate full understanding of the data processing and its interpretation.

The data processing aids the presentation of data and some of the processed data is often presented in a graphical form. But the graph is not enough: the processed data must be interpreted.

Formulae and sample calculations used in data processing

For each type of processed data, the formula used for processing and a sample calculation should be provided, right after the processed data table. Use the RAW data presented before (quantitative raw data table) and show how each PROCESSED value is obtained by applying the formula provided, step by step. Demonstrate understanding of the data processing presented. No written steps are needed, just the formulae and the numbers filling the formulae.

If an uncertainty is shown, an explanation of how it is calculated should also be included.
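
As an illustration of what a sample calculation looks like, here is a minimal Python sketch for the average and the standard deviation of one treatment. The five trial values are hypothetical, and the sketch assumes the sample standard deviation (dividing by n - 1); check which version your course or spreadsheet uses before copying the formula.

```python
import math
import statistics

# Hypothetical trial values for ONE treatment, taken from a raw data table.
trials = [4.2, 4.5, 4.1, 4.4, 4.3]
n = len(trials)

# Formula: mean = (sum of all trial values) / n
mean = sum(trials) / n

# Formula: SD = sqrt( sum of (x - mean)^2 / (n - 1) )   (sample SD, assumed)
squared_deviations = [(x - mean) ** 2 for x in trials]
sample_sd = math.sqrt(sum(squared_deviations) / (n - 1))

print(f"mean = ({' + '.join(str(x) for x in trials)}) / {n} = {mean:.2f}")
print(f"SD   = sqrt({sum(squared_deviations):.3f} / {n - 1}) = {sample_sd:.2f}")

# Cross-check against the standard library, which also uses n - 1.
assert math.isclose(sample_sd, statistics.stdev(trials))
```

Every number entering the formulae comes from the raw data table, which is what makes the calculation checkable by the reader.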

Data interpretation

Interpretation is the action of explaining the meaning of something. All processed data must be interpreted. What does it mean in your experiment?

Average
Standard deviation
Errors and error bars
R-squared (graphical)
Statistical analysis: Pearson / Anova / T-test / Chi-squared test (...)

Graphic presentation of data

Processed experimental data is often presented visually in a graph. The independent variable is shown on the x-axis and the dependent variable on the y-axis, both of them including labels with units and uncertainties. The type of graph should be consistent with the type of data (continuous or discrete data?).

Graphs often show the averages of the trials for each treatment and the standard deviation values for each treatment are used as error bars, as an indicator of reliability of the specific average measurements. If the graph is a scatter plot, a trend line can be calculated. The equation and the r-squared value from the trend line can be obtained from the graph. If the r-squared value is calculated, an interpretation should also be included.
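
The numbers behind such a graph can be sketched as follows. This is an illustrative example with hypothetical data and function names: it computes the average and standard deviation for each treatment (the points and their error bars), then a least-squares linear trendline through the averages and its r-squared value. A linear trend is an assumption here; other relationships need a different trendline.

```python
import statistics

def summarize(data):
    """Return IV values, treatment means and sample SDs (error bars)."""
    xs = sorted(data)
    means = [statistics.mean(data[x]) for x in xs]
    sds = [statistics.stdev(data[x]) for x in xs]
    return xs, means, sds


def linear_trendline(xs, ys):
    """Least-squares line y = slope * x + intercept, plus r-squared."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # r-squared = 1 - (unexplained variation / total variation)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical data: treatment (IV value) -> trial measurements (DV).
data = {
    10: [1.2, 1.4, 1.3],
    20: [2.1, 2.3, 2.0],
    30: [3.6, 3.4, 3.8],
    40: [4.4, 4.9, 4.6],
    50: [5.9, 5.6, 6.1],
}

xs, means, sds = summarize(data)
slope, intercept, r_squared = linear_trendline(xs, means)
for x, m, s in zip(xs, means, sds):
    print(f"IV = {x}: mean = {m:.2f}, error bar (SD) = {s:.2f}")
print(f"trendline: y = {slope:.3f}x + {intercept:.3f}, r-squared = {r_squared:.3f}")
```

An r-squared value close to 1 means the straight line accounts for most of the variation in the averages; a value close to 0 means it accounts for very little. If you plot with matplotlib, `matplotlib.pyplot.errorbar(xs, means, yerr=sds)` draws the points with their error bars.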

Graphs

Discrete variables should be graphed using bar charts.

Continuous variables should be graphed using line charts.

A trendline should be calculated and interpreted, according to its r-squared value and its formula.

Error bars must be shown and interpreted. If average data is plotted, the error bar (variability of data) may be shown by the standard deviation (the uncertainty must also be considered).

Always graph the IV on the X-axis and the DV on the Y-axis.

Graph and interpretation

In science, the goal of an experiment is to study the possible relationship between two variables, the dependent and the independent. Measurements of the dependent variable are recorded as the independent variable is changed (intentionally). Raw data is collected for the changes observed in the DV as the IV changes, and data processing is done later to obtain as much information as possible from the data collected. The final goal is to make a visual representation of the changes observed and measured. This is done by plotting a graph. In general, the DV is shown on the Y-axis and the IV on the X-axis. The processed data is displayed on the graph and an interpretation is made based on the visual representation of the relationship between the two variables defined for the experiment.

The graph shown here and more info can be found here.

Statistical significance

Imagine you observe a trend in your results and it seems that the differences in your DV are caused by your IV... how can you be sure that the changes in the DV are caused by the changes in the IV? What if the alterations in the DV are due to chance and not due to the changes in the IV? One way to address this issue is to run a statistical test. Pearson's correlation is often used when both variables are continuous, and ANOVA (analysis of variance) when a continuous DV is compared across discrete treatment groups. See this for more info.
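
To show what these two tests actually compute, here is a minimal sketch in plain Python with hypothetical data and function names. It calculates Pearson's correlation coefficient r for paired IV and DV values, and the F statistic of a one-way ANOVA across treatment groups. It stops at the test statistic: deciding significance still requires a p-value or a critical value from a table.

```python
import math

def mean(values):
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between paired values (-1 to +1)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def one_way_anova_f(groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    groups is a list of lists, one list of DV measurements per treatment.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation BETWEEN treatment means vs variation WITHIN each treatment.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data, invented for illustration only.
temperature = [10, 20, 30, 40, 50]
mean_foam = [1.3, 2.1, 3.6, 4.6, 5.9]
print(f"Pearson r = {pearson_r(temperature, mean_foam):.3f}")

treatments = [[1.2, 1.4, 1.3], [2.1, 2.3, 2.0], [3.6, 3.4, 3.8]]
f_stat, df_b, df_w = one_way_anova_f(treatments)
print(f"ANOVA F({df_b}, {df_w}) = {f_stat:.1f}")
```

A large F means the differences between treatment averages are large compared with the scatter between trials of the same treatment. Libraries such as SciPy (`scipy.stats.pearsonr`, `scipy.stats.f_oneway`) report the p-value directly.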

How much raw data do I need? - Wisdom of Crowds by Tommie Hennard

Today we did a moles and solutions introductory lab in which the students all made 5 different solutions that were all supposed to be 0.400 mol/L in concentration. Hardly any of them were, as it was their first time making solutions.

After they were finished, we had about 2 L of "waste", which was all of the solutions made during the day. At the end of the day I brought the students together, told them the idea of the wisdom of crowds and then we tested the waste. It was 0.401 mol/L, a mere 0.25% different from what it ideally should have been.

And that, as I explained to them, is why scientists take lots of data.
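
The arithmetic of this story can be sketched in Python. The 0.400 and 0.401 mol/L values come from the text; the individual student concentrations and volumes below are hypothetical, and the sketch assumes volumes simply add when the solutions are mixed.

```python
def percent_difference(measured, expected):
    """Percent difference of a measured value from the expected value."""
    return abs(measured - expected) / expected * 100


def pooled_concentration(concentrations, volumes):
    """Concentration of a mixture: total moles divided by total volume."""
    total_moles = sum(c * v for c, v in zip(concentrations, volumes))
    return total_moles / sum(volumes)


TARGET = 0.400  # mol/L, the intended concentration

# The pooled "waste" measured in the story.
print(f"waste: {percent_difference(0.401, TARGET):.2f}% off target")

# Hypothetical individual solutions (mol/L), each 0.100 L.
student_conc = [0.372, 0.431, 0.405, 0.389, 0.418, 0.396, 0.410, 0.385]
student_vol = [0.100] * len(student_conc)

pooled = pooled_concentration(student_conc, student_vol)
worst = max(percent_difference(c, TARGET) for c in student_conc)
print(f"worst single solution: {worst:.2f}% off target")
print(f"pooled mixture:        {percent_difference(pooled, TARGET):.2f}% off target")
```

Errors above and below the target partly cancel when the solutions are pooled, which is the same reason an average of many trials is usually closer to the true value than a single trial.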

Why is data processing important? A personal note from your teacher.

I have been asked multiple times why I find data processing so interesting. To answer that question, I may need to blame my dad, who is a mathematician and a professor of statistics. Here is what I learned from him:

Processing allows you to better understand what's in your data: it gives meaning to numbers :-)
There are multiple conclusions that can be obtained from the same data set, depending on how it is processed.
It is difficult to understand large sets of data, unless we do some data processing: data processing reveals meaning in raw data
The goal is to summarize information so that we can understand it in its full potential
Formulas allow us to gather information that would otherwise take a huge amount of time to obtain

Analysis conventions

Data must cover an appropriate number and range of values for the independent variable (five values is often enough).
Data must consist of (at least) five repeated trials for each independent data point.
Raw data should be both quantitative and qualitative (two different tables).
Qualitative data is recorded for each trial when appropriate (table similar to quantitative data).
A sample calculation and the formula used for each calculation are included. Numbers used in the sample calculation must appear in the raw quantitative data table.
