Stats Vocabulary (Quarter 1)
One Variable Stats
Categorical Variable - A variable that can be classified into two or more categories; this variable does not have a quantity. Ex: yes/no, red/blue, made/missed, etc.
Individuals- items or subjects that are in a study
Subject- an individual that is a human
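Qualitative Variable- a variable whose values are words or descriptions rather than numbers, such as the answer to an open response type question
Quantitative Variable- a variable whose values are numbers that measure or count something. Ex: age, weight, a rating on a number scale, or an amount of money
Relative Frequency- the number of times a value or category occurs divided by the total number of observations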
Variable- a characteristic that is measured or recorded for each individual (in future modules, we will see independent vs dependent variables and explanatory vs response variables)
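As a quick illustration of relative frequency, here is a minimal Python sketch. The function name `relative_frequencies` and the free-throw data are made up for this example; they are not from the class notes.

```python
from collections import Counter

def relative_frequencies(values):
    """Return {category: count / total} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}

# Hypothetical data: results of five free throws (a categorical variable)
shots = ["made", "missed", "made", "made", "missed"]
freqs = relative_frequencies(shots)
print(freqs)  # {'made': 0.6, 'missed': 0.4}
```

The relative frequencies of all categories always add up to 1, which is a handy check.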
Quantitative Distributions
Bimodal- having two peaks
Box Plot- a graph of the minimum, lower quartile, median, upper quartile, and maximum of a data set, with outliers plotted as separate points
Histogram- plot that summarizes how data is distributed; this kind of graph puts data into groups and graphs them like a bar graph (ex: 0-9, 10-19, 20-29, etc)
interquartile range- upper quartile minus the lower quartile
mean- the average of a set of numbers: add all the numbers together and divide by how many numbers there are
median- the middle number of a set of numbers listed in order (with an even count, the average of the two middle numbers)
outlier- a data point that falls far outside the range of the rest of the data set
percentile- the percent of data values at or below a given score
quartile - one of the three values that split the ordered data into four equal parts: the lower 25%, middle/lower 25%, middle/upper 25%, and top 25%
stem plot- type of graph that separates the tens place from the ones place by a "stem" in order to organize the data. Ex: 2 | 5 = 25
standard deviation- a measure of how spread out the data are, based on how far the values typically fall from the mean
symmetrical- having the same shape on both sides of the center
trimodal- having three peaks
unimodal- having one peak
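The summary measures defined above (mean, median, quartiles, interquartile range, standard deviation, outlier) can be sketched with Python's built-in `statistics` module. The data set is made up, and the 1.5 x IQR rule used to flag outliers is a common convention assumed here, not something stated in these notes. Different textbooks and calculators compute quartiles in slightly different ways, so results for other data sets may not match exactly.

```python
import statistics

# Hypothetical data set, already in order
data = [2, 4, 4, 5, 7, 9, 30]

mean = statistics.mean(data)        # add everything up, divide by the count
median = statistics.median(data)    # middle number of the ordered data
q1, q2, q3 = statistics.quantiles(data, n=4)  # lower quartile, median, upper quartile
iqr = q3 - q1                       # interquartile range
sd = statistics.stdev(data)         # sample standard deviation

# Assumed convention: a point is an outlier if it is more than
# 1.5 * IQR below the lower quartile or above the upper quartile.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(median, q1, q3, iqr, outliers)  # 5 4.0 9.0 5.0 [30]
```

Notice that the outlier 30 pulls the mean (about 8.7) well above the median (5), which is why the median is often preferred for skewed data.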
Sampling
census- a survey given to all of a population
cluster sampling- the population is divided into small, evenly mixed groups (clusters); some clusters are picked by SRS, and everyone in the picked clusters serves as the sample
non-response - when individuals chosen for a survey do not respond
parameter- a number that describes a population, such as the population mean or median
population- the entire group of individuals (who or what) being studied
sample- a small portion from a large population
simple random sample (SRS)- a sample chosen at random so that every individual (and every possible group of the same size) has an equal chance of being picked
strata- "layers" of a population, divided up by a shared characteristic. Each layer (stratum) is a group of individuals who are alike in that characteristic, and each layer is different from the others
stratified sample- a sample made by splitting the population into strata and then doing an SRS within each stratum (ex: a proportion from each grade)
systematic sample- first you estimate the population size, decide how many people you want to sample, and divide the first number by the second to get n; then you sample every nth person
undercoverage- when some members of the population have no chance of being surveyed; for example, a person who was gone the day of the survey
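The four sampling methods above can be compared side by side in a short Python sketch. Everything here is an assumption made for illustration: the population of 40 students in grades 9 through 12, the sample sizes, and the way the clusters are formed.

```python
import random

rng = random.Random(1)  # seeded so the example is repeatable

# Hypothetical population: 40 students, 10 in each of grades 9-12
population = [(f"student{i}", 9 + i // 10) for i in range(40)]

# Simple random sample: every student has an equal chance of being picked
srs = rng.sample(population, 8)

# Stratified sample: the strata are the grades; take an SRS of 2 from each
strata = {}
for student in population:
    strata.setdefault(student[1], []).append(student)
stratified = [s for group in strata.values() for s in rng.sample(group, 2)]

# Systematic sample: population size / sample size gives n,
# then take every nth student after a random starting point
n = len(population) // 8
start = rng.randrange(n)
systematic = population[start::n]

# Cluster sample: 8 mixed-grade clusters of 5 students each;
# pick 2 clusters by SRS and keep everyone in them
clusters = [population[i::8] for i in range(8)]
cluster_sample = [s for c in rng.sample(clusters, 2) for s in c]

print(len(srs), len(stratified), len(systematic), len(cluster_sample))  # 8 8 8 10
```

Only the stratified sample is guaranteed to include exactly two students from every grade; the other methods can over- or under-represent a grade by chance.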
Confidence Intervals
bootstrapping- a method used for hypothesis tests and confidence intervals; it treats your set of data as if it were the population and draws many new samples from it (with replacement), calculating the statistic each time to see how much it varies
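Here is a minimal sketch of bootstrapping a confidence interval for the mean. It uses the percentile method, which is one common approach and is assumed here; the function name `bootstrap_ci`, the number of resamples, and the data are all made up for illustration.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    n = len(data)
    # Treat the data as the population: resample n values WITH replacement,
    # compute the statistic each time, and sort the results.
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    # Cut off the lowest and highest 2.5% (for a 95% interval)
    tail = round((1 - level) / 2 * n_resamples)
    return stats[tail], stats[n_resamples - tail - 1]

# Hypothetical sample of 8 measurements
sample = [12, 15, 9, 20, 14, 11, 17, 13]
low, high = bootstrap_ci(sample)
print(low, high)
```

The interval describes a range of plausible values for the population mean, based only on the variation seen in the one sample.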
Hypothesis Testing
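Alternative Hypothesis- the claim that is supported if the null hypothesis is rejected
Null Hypothesis- the hypothesis that is assumed to be true unless the data give strong evidence against it; usually a claim of no effect or no difference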
p-value- the probability of obtaining a statistic at least as extreme as the one that was observed, assuming that the null hypothesis is indeed true
type one error- rejecting the null hypothesis when it is actually true
type two error- failing to reject the null hypothesis when it is actually false
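To make the p-value definition concrete, here is a small worked example in Python. The scenario is made up: a coin is flipped 20 times and lands heads 15 times. The null hypothesis is that the coin is fair, and the alternative is that it lands heads more than half the time. The function name `binomial_p_value` and the 0.05 cutoff are assumptions for illustration.

```python
from math import comb

def binomial_p_value(n, observed, p_null=0.5):
    """One-sided p-value: chance of `observed` or more successes in n tries,
    assuming the null hypothesis (success probability p_null) is true."""
    return sum(
        comb(n, k) * p_null**k * (1 - p_null) ** (n - k)
        for k in range(observed, n + 1)
    )

p_value = binomial_p_value(n=20, observed=15)
print(round(p_value, 4))  # 0.0207

# Assumed cutoff of 0.05: reject the null hypothesis if the p-value is smaller
reject_null = p_value < 0.05
print(reject_null)  # True
```

If the coin really is fair, rejecting the null hypothesis here would be a type one error; if the coin really is unfair and the test had failed to reject, that would be a type two error.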