1.2 Recording Raw Data

How can a scientist determine if two variables are related to one another? First she must collect the data from an experiment. Raw data is recorded in a data table immediately as it is collected in the lab. It is important to build a well-organized data table such as the example shown to the right. Use a ruler to lay it out neatly. You should always record your raw data in ink so that you will not accidentally erase a piece of data that you later decide you need. If you think that a given piece of data is in error, draw a single line through it and collect the data point again. Later, if you decide that the original point was really the correct one, you will still be able to read it.

You will notice that the experimenter has decided to run three trials for each mass value. Is this a good idea? Why? You should also notice that the experimenter has chosen to measure the time for 10 swings instead of the time for 1 swing. Why might she want to do this? Why might she not?

The independent variable, mass, is given in the first column. Scientists have agreed to consistently place the independent variable in the leftmost column. A practice that is followed as an agreed-upon standard is called a convention. It is conventional, therefore, to place the independent variable in the leftmost column of the data table.

Each column is labeled with the name of the variable being measured, with the units of measurement in parentheses below the variable name. Notice that each data entry in a given column is written to the same number of decimal places. The measuring device and the technique used in the experiment determine the number of decimal places. In the mass column the experimenter recorded mass to the nearest 0.1 g because her balance was calibrated to the nearest 0.1 g. In the "time for 10 swings" column the time was reported to the nearest 0.01 s because the stopwatch gave times to that precision. A case could be made for reporting the time only to the nearest 0.1 s because of human reaction time. It is important to exercise good judgment when recording data so as to honestly report how certain you are of your measurements.

It is a good idea to construct the data table before collecting the data. Too often, students write down data in a disorderly fashion and only afterward try to build their data table. This defeats the purpose of a data table, which is to organize the data and make certain that it is clear and consistent.

Once the raw data has been collected, the next step is to prepare it for graphing. In many experiments, the data must be manipulated before it can be graphed. For instance, this experiment was designed to test the effect of changing the mass of the pendulum on the period. The period is defined as the time for one swing. The experimenter, in an effort to reduce the error associated with timing the pendulum, decided to measure the time for 10 swings instead of for only one swing. A calculation must therefore be done before the period can be determined: the time for 10 swings is divided by 10. Also, the experimenter took three trials for each mass value. It is unnecessary to plot each of the times for each mass since there should be one representative time for each mass. Multiple trials are usually taken when a measurement is difficult to make reliably. Making multiple measurements (three to five) can help the experimenter determine whether the data is representative of the actual value or might be in error.
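The conversion from a timed set of swings to a period can be sketched in a few lines of Python. This is only an illustration: the function name period_from_swings is made up for this example, and the 15.86 s reading is borrowed from the 40.0 g example in this section on the assumption that it is a time for 10 swings.

```python
def period_from_swings(total_time_s, swings=10):
    """Return the time for one swing, given the time for several swings."""
    return total_time_s / swings

# ASSUMPTION: 15.86 s is a time for 10 swings of the 40.0 g pendulum.
period = period_from_swings(15.86)
print(round(period, 3))  # about 1.586 s per swing
```

The same function works for any number of swings, for example period_from_swings(t, swings=20) if 20 swings were timed.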

For instance, if you were to measure the time for a 40.0 g pendulum five times and the times you recorded were 15.86 s, 15.53 s, 15.47 s, 16.55 s, and 15.72 s, you might suspect that something was wrong with one of your data points. The wise experimenter would then carefully collect another data point for the 40.0 g pendulum. If its value were closer to the other four points than to the 16.55 s measurement, it would be used in place of the point that seemed out of place. If the new value were instead in line with the 16.55 s measurement, you might need to perform still more trials for the 40.0 g pendulum to determine the appropriate time for its swings.
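One informal way to see which trial stands out is to find the reading that lies farthest from the middle (median) value. The Python sketch below does this for the five times quoted above. It is not a formal statistical test and it is not a method prescribed by this text; the function name farthest_from_median is invented for the illustration, and the result only points to a trial worth checking with another measurement.

```python
from statistics import median

def farthest_from_median(times_s):
    """Return the trial that lies farthest from the median of all trials."""
    middle = median(times_s)
    return max(times_s, key=lambda t: abs(t - middle))

trials = [15.86, 15.53, 15.47, 16.55, 15.72]  # times from the text, in seconds
suspect = farthest_from_median(trials)
print(suspect)  # 16.55
```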

Usually, when multiple trials are collected for a data point, the trials are averaged to determine a representative value for that data point. This should be done only if the trials seem consistent enough to warrant an average. If you have one or more trials that are significantly different from the others, you need to look for an error in your technique or equipment setup that might be causing the problem. If a problem is found, the data should be collected again for any trials that the error might have affected.

Let's say that in the previous example, the retrial yields a time of 15.68 s. Since this falls within the range of the other four trials (15.47 s to 15.86 s), you might be justified in replacing the value of 16.55 s with 15.68 s and then using an average of the five consistent trials to represent the data point.
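The replacement and averaging described here can be checked with a short Python sketch. The variable names are chosen only for this illustration, and the final step assumes that each time is for 10 swings, as in the experiment's data table.

```python
from statistics import mean

others = [15.86, 15.53, 15.47, 15.72]  # the four consistent trials, in seconds
retrial = 15.68                         # the new measurement, in seconds

# Check that the retrial lies between the smallest and largest of the others.
in_range = min(others) <= retrial <= max(others)

# Average the five consistent trials, then convert to a period.
average_time = mean(others + [retrial])
period = average_time / 10  # ASSUMPTION: each time is for 10 swings

print(in_range)                # True
print(round(average_time, 2))  # 15.65
print(round(period, 3))        # 1.565
```

The average is rounded to 0.01 s to match the precision of the recorded times.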