Getting to know your data by exploring the distribution of responses on each survey question is the first step in quantitative analysis. Descriptive statistics help us organize information from our survey so that we can explain our data and display trends graphically. We will begin by reviewing scales of measurement in our dataset. Then we'll take a look at the shape of the distribution of scores on variables in the data and go on to describe the distribution's center and variation.
Bell-shaped distribution: a distribution in which most scores fall near a central value, producing a symmetrical shape.
Frequency distribution: a summary that describes how often a specific score appears in a dataset.
Measures of central tendency: single numbers that each tell us something about a typical score in a distribution (e.g., mean, median, mode).
Skewed distribution: a distribution in which most scores fall near the high or low end of the values possible, producing a long tail on one end.
Statistics: a broad range of techniques and procedures for gathering, organizing, analyzing, and displaying numeric data.
Variables: characteristics of individuals that differ, or vary.
Variation: the dispersion, or spread, of scores around the central, typical value in a distribution.
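To make the terms above concrete, here is a minimal sketch in Python. It is only an illustration, not part of the PSPP workflow used in this class. The `responses` list is made-up data for a hypothetical survey item scored from 1 to 5, and the sketch uses the sample standard deviation as one common way to express variation.

```python
from collections import Counter
import statistics

# Hypothetical responses to one survey item on a 1-5 scale (made-up data).
responses = [4, 5, 3, 4, 4, 2, 5, 4, 3, 4]

# Frequency distribution: how often each score appears.
frequencies = Counter(responses)
for score in sorted(frequencies):
    print(f"score {score}: {frequencies[score]} response(s)")

# Measures of central tendency: three ways to describe a typical score.
mean_score = statistics.mean(responses)
median_score = statistics.median(responses)
mode_score = statistics.mode(responses)

# Variation: the sample standard deviation describes the spread around the mean.
spread = statistics.stdev(responses)

print(f"mean={mean_score}, median={median_score}, mode={mode_score}")
print(f"standard deviation={spread:.2f}")
```

In this made-up example most scores cluster around 4, so the mean, median, and mode land close together and the standard deviation is under one scale point.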
If you're feeling stressed about studying statistics, stop right now and check out the video at right about managing math anxiety. You, too, can do statistics!
Welcome to PSPP
For analyzing data in this class we will be using an open-source (free!) statistical software package called PSPP. We will provide paper copies of lab guides in class, and digital back-up copies will also be available on this site. For ease of learning, we will be using the graphical user interface (GUI), which provides drop-down menus. Follow the slide presentation at left to get started.
One of the first things we do when we begin analyzing data is to examine the characteristics of each individual variable we're interested in using in our analysis. The distribution of scores on a variable may take on a normal or skewed shape, and there may be a lot of variation in responses or very little variation at all. These characteristics give us a better understanding of how people responded to survey questions, and they tell us whether or not the variable is a good candidate to use as we move forward with our analysis. Learn more in the presentation at right.
Read more about the use and interpretation of descriptive statistics from the Towards Data Science blog, below.
Identifying the characteristics of frequency distributions is an important first step in a quantitative data analysis. Distributions that are highly skewed and those that have little variation will not be suitable for subsequent analyses, in which we'll be interested in comparing groups and exploring associations. If there is not enough variation in scores on an item, there won't be differences to observe or compare. Frequency explorations help you get familiar with your data and refine the variable list your analysis will include.
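The screening idea described above can be sketched in Python. This is a hedged illustration rather than the procedure used in the lab guides: the function names, the made-up data, and the cutoffs (`max_abs_skew=1.0`, `max_modal_share=0.9`) are all assumptions chosen for the example, and your own analysis may call for different limits.

```python
from collections import Counter
import statistics

def skewness(scores):
    """Moment-based skewness: about 0 for a symmetric distribution,
    negative for a long left tail, positive for a long right tail."""
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        return 0.0  # every score is identical, so there is no skew to measure
    n = len(scores)
    return sum((x - mean) ** 3 for x in scores) / (n * sd ** 3)

def modal_share(scores):
    """Proportion of responses that fall in the most common category."""
    return Counter(scores).most_common(1)[0][1] / len(scores)

def screen(scores, max_abs_skew=1.0, max_modal_share=0.9):
    """Flag a variable that looks highly skewed or has little variation.
    Both cutoffs are illustrative assumptions, not fixed rules."""
    skew = skewness(scores)
    share = modal_share(scores)
    flag = abs(skew) > max_abs_skew or share > max_modal_share
    return {"skewness": skew, "modal_share": share, "flag": flag}

# Made-up data: one roughly symmetric item and one where nearly everyone answered 5.
balanced = [1, 2, 2, 3, 3, 3, 4, 4, 5]
lopsided = [5] * 19 + [1]

print(screen(balanced))
print(screen(lopsided))
```

The second item is flagged because almost all respondents chose the same answer, which leaves few differences to compare across groups.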
Photo Credit: Kendra Stanley-Mills