Key Terms and Definitions of Probability and Statistics

In statistics, we generally want to study a population. You can think of a population as a collection of persons, things, or objects under study. To study the population, we select a sample. The idea of sampling is to select a portion (or subset) of the larger population and study that portion (the sample) to gain information about the population. Data are the result of sampling from a population.

Because it takes a lot of time and money to examine an entire population, sampling is a very practical technique. If you wished to compute the overall grade point average at your school, it would make sense to select a sample of students who attend the school. The data collected from the sample would be the students’ grade point averages. You could use your sample data to make generalizations about the grade point average of the entire school. In presidential elections, opinion poll samples of 1,000–2,000 people are taken. The opinion poll is supposed to represent the views of the people in the entire country.

From the sample data, we can calculate a statistic. A statistic is a number that represents a property of the sample. For example, if we consider one math class to be a sample of the population of all math classes, then the average number of points earned by students in that one math class at the end of the term is an example of a statistic. The statistic is an estimate of a population parameter. A parameter is a number that is a property of the population. Since we considered all math classes to be the population, the average number of points earned per student over all the math classes is an example of a parameter.
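The distinction between a statistic and a parameter can be sketched in a few lines of Python. This is only an illustration: the point values are randomly generated stand-ins for real class records, and the names `population`, `sample`, `parameter_mean`, and `statistic_mean` are hypothetical.

```python
import random
import statistics

# Assumption: made-up end-of-term points for every student in all math classes.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(40, 100) for _ in range(500)]

# Parameter: a number that describes the whole population.
parameter_mean = statistics.mean(population)

# Sample: 30 students drawn at random from the population.
sample = rng.sample(population, 30)

# Statistic: a number that describes only the sample; it estimates the parameter.
statistic_mean = statistics.mean(sample)

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
```

In practice the parameter is usually unknown, because we rarely have data on the whole population; here it can be computed only because the population was invented for the example.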

One of the main concerns in the field of statistics is how accurately a statistic estimates a parameter. The accuracy depends on how well the sample represents the population. To be a representative sample, the sample must reflect the characteristics of the population. In inferential statistics, we are interested in both the sample statistic and the population parameter. In a later chapter, we will use the sample statistic to test the validity of the established population parameter.
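To see why representativeness matters, the following sketch compares a random sample with a sample drawn from only one subgroup. Everything here is an assumption made for illustration: the school, the split into honors and other students, and the GPA ranges are invented.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Assumption: a hypothetical school with 100 honors students and 400 others.
honors_gpas = [round(rng.uniform(3.5, 4.0), 2) for _ in range(100)]
other_gpas = [round(rng.uniform(1.5, 3.0), 2) for _ in range(400)]
population = honors_gpas + other_gpas

population_mean = statistics.mean(population)

# Random sample: every student has the same chance of being selected.
random_sample = rng.sample(population, 30)

# Non-representative sample: only honors students are surveyed.
honors_only_sample = rng.sample(honors_gpas, 30)

random_mean = statistics.mean(random_sample)
honors_only_mean = statistics.mean(honors_only_sample)

print(f"population mean GPA:   {population_mean:.2f}")
print(f"random sample mean:    {random_mean:.2f}")
print(f"honors-only mean:      {honors_only_mean:.2f}")
```

Because the honors-only sample leaves out most of the school, its mean must overestimate the school-wide GPA in this setup, no matter which 30 honors students are chosen.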

A variable, notated by capital letters such as X and Y, is a characteristic that takes on different values for different individuals in a population or a sample. It is something that varies. Variables may be numerical or categorical. Numerical variables take on values with equal units such as weight in pounds and time in hours. Categorical variables place the person or thing into a category. If we let X equal the number of points earned by one math student at the end of a term, then X is a numerical variable. If we let Y be a person’s party affiliation, then possible values of Y include Republican, Democrat, and Independent. Y is a categorical variable. We could do some math with values of X (calculate the average number of points earned, for example), but it makes no sense to do math with values of Y (calculating an average party affiliation makes no sense). As statisticians, we may also be interested in measuring constants. A constant is a characteristic whose value does not change in a given context, whether in a sample or a population. For example, the number of days in a week is a constant.
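The difference between the two kinds of variables shows up in what we can compute with them. The sketch below uses made-up values for X and Y; the lists `x_points` and `y_party` are hypothetical and exist only for illustration.

```python
import statistics
from collections import Counter

# Assumption: invented values of X, the points earned by five math students.
x_points = [86, 92, 75, 92, 80]

# Assumption: invented values of Y, the party affiliation of five people.
y_party = ["Republican", "Democrat", "Independent", "Democrat", "Republican"]

# Arithmetic is meaningful for a numerical variable, so an average makes sense.
mean_points = statistics.mean(x_points)

# For a categorical variable, we count how often each category occurs instead.
party_counts = Counter(y_party)

print("average points:", mean_points)
print("party counts:", dict(party_counts))
```

Counting categories is one reasonable way to summarize Y; averaging its values is not, since the categories are labels rather than quantities.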

Data are the actual values of the variables and constants we measure. They may be numbers or they may be words. A datum is a single value.

Determine what the key terms refer to in the following study.

A study was conducted at a local college to analyze the average cumulative GPAs of students who graduated last year. Fill in the letter of the phrase that best describes each of the items below.

1._____ Population

2._____ Statistic

3._____ Parameter

4._____ Sample

5._____ Variable

6._____ Data

  • a) all students who attended the college last year

  • b) the cumulative GPA of one student who graduated from the college last year

  • c) 3.65, 2.80, 1.50, 3.90

  • d) a group of students who graduated from the college last year, randomly selected

  • e) the average cumulative GPA of students who graduated from the college last year

  • f) all students who graduated from the college last year

  • g) the average cumulative GPA of students in the study who graduated from the college last year

For a brief introduction to statistics, watch the video "Introduction to Statistics" by Mathispower4u (https://youtu.be/zgcx1bs_uVo).

Before we finish this section, we want to focus more closely on studies that investigate a relationship between two variables. One type of study is an experiment. The goal of an experiment is to establish a causal relationship between two variables. In experiments, the variable we manipulate to cause a change is called the independent variable. The variable we think will change as a result of our manipulating the independent variable is called the dependent variable. To establish a cause-and-effect relationship, we want to make sure the independent variable is the only thing that impacts the dependent variable. Therefore, in an experiment, we try to eliminate or hold constant all other factors that might affect the dependent variable. Then we manipulate, or change, the independent variable. Our goal is to see if the change in the independent variable causes the dependent variable to change as well.

Sometimes, we do not have the ability to get rid of all of the other factors that may impact our dependent variable. In these cases, we can do a correlational study. Because we cannot get rid of all the other factors, we cannot test for causation in correlational studies. There may be something causing a change that we, as researchers, do not know about! However, correlational studies are still important because they allow us to observe how two variables are related. In correlational studies, the researchers do not manipulate one variable to see how it impacts another. They just collect data and look for an association between the two variables. In correlational studies, instead of calling one variable an independent variable, we call it a predictor. Similarly, instead of using the term dependent variable, we use the term criterion. The terms change because we are doing a different type of study, but the roles the variables play are analogous.
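One common way to quantify an association between two numerical variables is the Pearson correlation coefficient, a number between -1 and 1. The text above does not name a specific measure, so treat the choice as an assumption of this sketch; the data, the helper `pearson_r`, and the variable names are likewise hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

# Assumption: invented observations; nothing here was manipulated by a researcher.
hours_studied = [1, 2, 3, 4, 5, 6]      # predictor
exam_points = [52, 58, 61, 70, 74, 83]  # criterion

r = pearson_r(hours_studied, exam_points)
print(f"correlation between predictor and criterion: {r:.2f}")
```

A value of r close to 1 indicates a strong positive association, but, as noted above, it does not show that the predictor causes the criterion to change.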

LICENSES AND ATTRIBUTIONS

ALL RIGHTS RESERVED CONTENT

  • Introduction to Statistics. Authored by: Mathispower4u. Located at: https://youtu.be/zgcx1bs_uVo. License: All Rights Reserved. License Terms: Standard YouTube License