The Outline...
The foundational class for critically analyzing and describing data for a specific objective.
(Reference: Statistical Techniques in Business and Economics, 14th Edition. Lind, Marchal, Wathen. 2010)
Statistics
Types of Statistics
Descriptive Statistics: organizing, summarizing and presenting data in an informative way.
Inferential Statistics: methods used to estimate a property of a population on the basis of a sample.
Population
Sample
Types of Variables
Qualitative
Quantitative
Discrete
Continuous
Levels of Measurement
Nominal: data can only be classified and counted, with no natural order to the outcomes
Ordinal: represented by sets of labels or names (high, medium, low) that have relative order
Interval: equal differences in the characteristic are represented by equal differences in the measurements; the zero point is arbitrary
Ratio: same as interval, but with a meaningful zero point, so the ratio between two numbers is meaningful.
Describing Data- pictorial
Constructing a Frequency Table
Relative Class Frequencies: classes are mutually exclusive; each class frequency is expressed as a percentage of the whole
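A minimal sketch of a frequency table with relative class frequencies. The vehicle-type data and the variable names are made up for illustration and are not from the course text.

```python
from collections import Counter

# Hypothetical qualitative data: vehicle types sold in one week (made-up values).
sales = ["SUV", "Sedan", "SUV", "Truck", "Sedan", "SUV", "Coupe", "SUV"]

counts = Counter(sales)        # class frequencies
total = sum(counts.values())   # total number of observations

# Relative class frequency = class frequency / total number of observations.
relative = {cls: n / total for cls, n in counts.items()}

print(f"{'Class':<6} {'Freq':>4} {'Rel.':>7}")
for cls, n in counts.most_common():
    print(f"{cls:<6} {n:>4} {relative[cls]:>7.1%}")
```

Each observation falls in exactly one class, so the relative frequencies sum to 100%. The same table feeds a bar chart or pie chart directly.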
Graphic Presentation of Qualitative Data
Bar Chart
Pie Chart
Describing Data- numerical measures
Population Mean
Parameter- characteristic of a population
Sample Mean
Statistic- a characteristic of a sample
Median- midpoint of the values after they have been ordered
Mode- most frequently occurring observation
Relative positions of the mean, median, and mode
Symmetric distribution (e.g., normal): the mean, median, and mode are equal
Skewed distribution is NOT symmetrical
Positive Skewness: the mean is the largest of the three, then the median, and the mode is the smallest
Negative Skewness: the mean is the smallest, then the median, and the mode is the largest
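The ordering under positive skewness can be checked on a small data set. This is only a sketch: the data values are invented so that one large observation pulls the mean to the right.

```python
import statistics

# Hypothetical positively skewed data: most values are small, one is large.
data = [2, 3, 3, 3, 4, 5, 6, 8, 20]

mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the ordered data
mode = statistics.mode(data)      # most frequently occurring value

print(f"mode={mode}, median={median}, mean={mean}")

# With a long right tail we expect mode < median < mean.
is_positively_skewed = mode < median < mean
print("Positive skew pattern:", is_positively_skewed)
```

Reversing the tail (one very small value instead of one very large value) would give the negative-skew ordering instead.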
Measures of Dispersion
Range
Mean Deviation
Variance
Standard Deviation
Other Measures
Quartiles
Deciles
Percentiles
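The measures above can be computed with Python's standard `statistics` module. The data set is hypothetical, and "mean deviation" is taken here to mean the average absolute deviation from the mean.

```python
import statistics

# Hypothetical data set (made-up values).
data = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(data) - min(data)   # largest value minus smallest value
mean = statistics.mean(data)

# Mean deviation: average absolute distance of each value from the mean.
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

# Population versions divide by N; sample versions divide by n - 1.
pop_variance = statistics.pvariance(data)
pop_std_dev = statistics.pstdev(data)
sample_variance = statistics.variance(data)
sample_std_dev = statistics.stdev(data)

# Cut points that split the ordered data into 4 and 10 equal-sized groups.
quartiles = statistics.quantiles(data, n=4)
deciles = statistics.quantiles(data, n=10)

print("range:", value_range)
print("mean deviation:", mean_deviation)
print("population variance / std dev:", pop_variance, pop_std_dev)
print("sample variance / std dev:", sample_variance, sample_std_dev)
print("quartiles:", quartiles)
```

Percentiles follow the same pattern with `n=100`. Note that `statistics.quantiles` uses one of several common interpolation conventions, so textbook answers may differ slightly.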
Interpretation and Uses of the Standard Deviation
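Chebyshev's Theorem (applies to any distribution)
At least 75% of data lie within ±2 standard deviations of the mean
At least 88.9% of data lie within ±3 standard deviations of the mean
At least 96% of data lie within ±5 standard deviations of the mean
Empirical Rule (applies to symmetric, bell-shaped distributions)
About 68% of data lie within ±1 standard deviation of the mean
About 95% of data lie within ±2 standard deviations of the mean
About 99.7% of data lie within ±3 standard deviations of the mean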
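Both sets of percentages can be reproduced numerically. Chebyshev's bound is 1 - 1/k² for k standard deviations (k > 1), and the Empirical Rule figures come from the normal curve. The function names below are illustrative choices, not from the course text.

```python
from statistics import NormalDist


def chebyshev_lower_bound(k: float) -> float:
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2


def normal_share_within(k: float) -> float:
    """Share of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


for k in (2, 3, 5):
    print(f"Chebyshev, k={k}: at least {chebyshev_lower_bound(k):.1%}")

for k in (1, 2, 3):
    print(f"Empirical Rule, k={k}: about {normal_share_within(k):.1%}")
```

Chebyshev's figures are guaranteed minimums for any shape of data, which is why they are lower than the Empirical Rule figures for the same k.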
Continuous Probability Distributions
Types of Distributions
Random variable is discrete (countable values, such as 0 and 1)
Binomial, hypergeometric, Poisson
Probability distribution is a table that links each possible value that a random variable can assume with its probability of occurrence
Continuous probability distributions
random variable is continuous
types:
uniform
normal
Student's t
chi-square
F distributions
Infinite number of values
Probability distribution is the graph of an equation
Standard Normal Probability Distribution
normal distribution
AKA the z-distribution
z-value (also called z-score) is the signed distance between a selected value, designated X, and the population mean µ, divided by the population standard deviation, σ.
typically used when the sample size is large (n > 30)
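The z-value definition translates directly into z = (X - µ) / σ. A short sketch follows; the population mean, standard deviation, and selected value are hypothetical numbers.

```python
from statistics import NormalDist


def z_value(x: float, mu: float, sigma: float) -> float:
    """Signed distance between x and the population mean, in standard deviations."""
    return (x - mu) / sigma


# Hypothetical population: mean 100, standard deviation 15.
mu, sigma = 100.0, 15.0
x = 130.0

z = z_value(x, mu, sigma)

# Area under the standard normal curve to the left of z.
p_below = NormalDist().cdf(z)

print(f"z = {z:.2f}")
print(f"P(X < {x}) = {p_below:.4f}")
```

A positive z means the value lies above the mean and a negative z means it lies below.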
t-Distribution
symmetrical
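typically used when the sample size is small (n < 30) and the population standard deviation is unknown
Standard deviation differs according to sample size (the curve is flatter and more spread out for small samples).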
defined by the degrees of freedom (df = n-1)
F-Distribution
values cannot be negative
positively skewed
usually used to compare two variances
Chi-Square (χ²) Distribution
positively skewed
defined by df (df = n-1)
usually used to compare expected versus observed value
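A sketch of comparing expected versus observed values with the chi-square statistic, χ² = Σ (O - E)² / E. The counts are made up. The sketch assumes a goodness-of-fit setting, where df is the number of categories minus 1, and takes the critical value from a standard chi-square table because the standard library has no chi-square distribution.

```python
# Hypothetical observed counts in four categories (made-up values).
observed = [30, 20, 25, 25]
# Expected counts if all four categories were equally likely.
expected = [25, 25, 25, 25]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumption: goodness-of-fit test, so df = number of categories - 1.
df = len(observed) - 1

# Table value for df = 3 at the 0.05 significance level.
critical_value = 7.815

reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.3f}, df = {df}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The statistic can never be negative because each term is a squared difference, which is why the distribution is positively skewed.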
Estimation and Confidence Intervals
Point Estimate
A single value (point) derived from a sample and used to estimate a population value.
It is an estimate of the population parameter; from that estimate, we build our confidence interval.
The sample mean is usually the best point estimate of the population mean.
Confidence Interval Estimates
a range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability.
The specified probability is called the level of confidence.
C.I. = point estimate ± margin of error.
Factors affecting confidence interval estimates:
The sample size, n.
The variability in the population, σ, usually estimated by s.
Desired level of confidence.
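A minimal sketch of C.I. = point estimate ± margin of error for a mean. It assumes a large sample, so the z-distribution is used and σ is estimated by s. The function name and the sample figures are hypothetical.

```python
import math
from statistics import NormalDist


def confidence_interval(mean: float, s: float, n: int, confidence: float = 0.95):
    """Return (low, high) for a population mean, using z (large-sample assumption)."""
    # z-value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin_of_error = z * s / math.sqrt(n)
    return mean - margin_of_error, mean + margin_of_error


# Hypothetical sample summary: mean 50, standard deviation 10, n = 100.
low, high = confidence_interval(mean=50.0, s=10.0, n=100, confidence=0.95)
print(f"95% C.I.: {low:.2f} to {high:.2f}")

# The three factors in action: higher confidence widens the interval,
# and a larger sample narrows it.
low99, high99 = confidence_interval(50.0, 10.0, 100, confidence=0.99)
low_big, high_big = confidence_interval(50.0, 10.0, 400, confidence=0.95)
print(f"99% C.I.: {low99:.2f} to {high99:.2f}")
print(f"95% C.I. with n=400: {low_big:.2f} to {high_big:.2f}")
```

For a small sample with σ unknown, the z-value would be replaced by a t-value with n - 1 degrees of freedom, which this sketch does not cover.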
One-Sample Test of Hypothesis
The Hypothesis: A statement made about a population of interest, developed for the purpose of testing.
Testing: A procedure based on sample evidence and probability theory to determine whether the hypothesis is a reasonable statement.
NULL HYPOTHESIS (H0) - A statement about the value of a population parameter developed for the purpose of testing; the statement assumed to be true unless sufficient evidence from sample data show that it is false.
ALTERNATE HYPOTHESIS (Ha or H1) - A statement that is accepted if the sample data provide sufficient evidence that the null hypothesis is false.
Rules about the two
H0: null hypothesis
Ha: alternate hypothesis.
H0 and Ha (H1) are mutually exclusive and collectively exhaustive.
H0 is always presumed to be true
Ha (H1) has the burden of proof
“Reject H0” or “Fail to reject H0”
Never “Accept H0”
Population is always changing and we never know what it is.
Philosophical: we can only take the current data today; can’t really say for sure that something works for all of the population.
Equality is always part of H0 (e.g. “=” , “≥” , “≤”)
Errors
Type I Error: Rejecting the null hypothesis when it is actually true
Its probability is denoted by the Greek letter “α”
Also known as the significance level of a test, or “alpha level”
10%- accepting a 10% risk of rejecting a true H0, i.e., a 90% level of confidence (surveys & behavioral health)
5%- accepting a 5% risk of rejecting a true H0, i.e., a 95% level of confidence (management studies)
1%- accepting a 1% risk of rejecting a true H0, i.e., a 99% level of confidence (clinical studies)
Type II Error: Failing to reject the null hypothesis when it is actually false.
Its probability is denoted by the Greek letter “β”
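The two error rates can be illustrated with a small simulation. This is a sketch under stated assumptions: a two-tailed one-sample z-test with known σ, and made-up values for the population, sample size, and seed. When H0 is true, the share of samples that reject it should be close to α. When H0 is false, the share that fail to reject it estimates β.

```python
import math
import random
from statistics import NormalDist, fmean


def rejects_h0(sample, mu0, sigma, alpha):
    """Two-tailed one-sample z-test (sigma assumed known)."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


def rejection_rate(true_mu, mu0=100.0, sigma=15.0, n=36,
                   alpha=0.05, trials=2000, seed=1):
    """Share of simulated samples in which H0: mu = mu0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if rejects_h0(sample, mu0, sigma, alpha):
            rejections += 1
    return rejections / trials


# H0 is true (true mean equals mu0): every rejection is a Type I error.
type_i_rate = rejection_rate(true_mu=100.0)

# H0 is false (true mean is 105): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mu=105.0)

print(f"Estimated Type I error rate:  {type_i_rate:.3f}")
print(f"Estimated Type II error rate: {type_ii_rate:.3f}")
```

Lowering α makes Type I errors rarer but, for a fixed sample size, makes Type II errors more likely.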
Hypothesis Testing Steps
Step 1: State the null (H0) and alternate (Ha) hypotheses
Step 2: Select level of significance (α), the probability of rejecting the null hypothesis when it is true
Step 3: Determine which statistical test to use (e.g., one sample t-test, ANOVA, etc.)
Step 4: Formulate decision criteria
Calculate critical values for the selected α using appropriate distribution
State decision rule for when to reject H0
Step 5: Evaluate sample, make decision and interpret results
Evaluate sample statistic against the critical values
Reject H0 and accept Ha OR Fail to reject H0
Interpret results
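The five steps can be sketched for a two-tailed one-sample z-test. All numbers are illustrative, and the sketch assumes the population standard deviation is known, so the z-distribution applies. With σ unknown and a small sample, a t-test would be chosen in Step 3 instead.

```python
import math
from statistics import NormalDist

# Step 1: State the hypotheses. H0: mu = 200, Ha: mu != 200 (two-tailed).
mu0 = 200.0

# Step 2: Select the level of significance.
alpha = 0.05

# Step 3: Choose the test. Assumption: sigma is known, so use a z-test.
sigma = 16.0

# Step 4: Formulate the decision criteria.
# Two-tailed critical value: reject H0 if |z| exceeds it.
critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: Evaluate the sample and make a decision.
sample_mean, n = 203.5, 50  # hypothetical sample results
z = (sample_mean - mu0) / (sigma / math.sqrt(n))
decision = "Reject H0" if abs(z) > critical else "Fail to reject H0"

print(f"critical values: ±{critical:.2f}")
print(f"test statistic z = {z:.2f}")
print(decision)
```

Here the test statistic falls inside the critical values, so the sample does not provide enough evidence that the mean differs from 200. This is reported as "Fail to reject H0", never as "Accept H0".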
Figure: Decision Tree to Determine the Appropriate Test Statistic in Step 3 of the Hypothesis Testing Steps
Other Helps for Statistics
1. What statistical analysis tool should I use?