The independent samples t-test compares the mean of a continuous variable between two groups. For example, it could be used to test if activity levels are different between those who work in-person in an office compared to those who work remotely at home.
Step 1: Set up the data. There is one IV that designates the two groups that will be compared. In the example below, the variable “Work_setting” is coded as 1 = office, and 2 = remote. The other variable is the DV, which in this case is number of steps.
Step 2: Use the menus to select the proper test. Click on “Analyze”, highlight “Compare Means and Proportions”, and “Independent Samples T-Test”.
Step 3: Assign DV, IV. Put the DV(s) in the "Test Variable(s)" box and the IV in the "Grouping Variable" box. Then click on “Define Groups.” More than one t-test can be run at the same time by putting in more than one DV.
Step 4: Define the levels of the IV. Use the coding defined in Step 1 to specify the groups for this test. For example, the numbers 1 (in-person) and 2 (remote) could designate the two groups. Click “Continue”.
Step 5: Click “OK” to run the test. The output will include descriptive statistics (means and standard deviations) and the results of a t-test that indicates whether the two groups differ on the dependent variable.
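The same test can be cross-checked outside SPSS. Below is a minimal Python/SciPy sketch using made-up step counts (not the data from the screenshots); it also computes Cohen's d from the pooled standard deviation, as reported in the Effect sizes note that follows:

```python
import numpy as np
from scipy import stats

# Made-up step counts for illustration (not the data from the screenshots).
office = np.array([6100, 7200, 5800, 6900, 6150])
remote = np.array([2100, 2400, 1900, 2600, 2300])

# Independent samples t-test, equal variances assumed (as in this class).
t, p = stats.ttest_ind(office, remote, equal_var=True)

# Cohen's d from the pooled standard deviation; report the absolute value.
n1, n2 = len(office), len(remote)
pooled_sd = np.sqrt(((n1 - 1) * office.var(ddof=1) + (n2 - 1) * remote.var(ddof=1))
                    / (n1 + n2 - 2))
d = abs(office.mean() - remote.mean()) / pooled_sd

df = n1 + n2 - 2  # degrees of freedom reported as t(df)
```

Here `equal_var=True` corresponds to the "Equal variances assumed" row of the SPSS output.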
Effect sizes. The output will also include estimates of effect sizes. Cohen's d is a common choice. You should report the "Point Estimate", and as with the t-statistic, the sign should be dropped (always report a positive Cohen's d).
Last Steps and APA writeup. For this class, we will assume equal variances. When reporting significance levels (p-values), be sure to use the "Two-Sided p" found under the “Significance” section of the table.
If the results are significant (which they are, based on the output above):
Individuals working in-person were found to have a higher number of steps (M = 6430.72) than those working remotely (M = 2260.00), t (8) = 2.87, p = .021.
If the results are not significant (not based on the output above):
There was no significant difference in steps between those working in-person (M = 3650.00) and those working remotely (M = 3175.00), t (8) = 1.60, p = .16.
Notes about reporting statistics:
The “t” and “p” should always be italicized (or underlined if writing it out by hand) when reporting statistics.
The 8 corresponds to the degrees of freedom of your test. You can get this information from the output next to the “t” result.
Give the exact p-value to three decimal places. If the output shows p = .000, report p < .001.
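The two reporting rules above can be captured in a small helper function. This is a hypothetical illustration in Python, not anything SPSS produces:

```python
def format_p(p):
    """Format a p-value per the reporting rules above (hypothetical helper)."""
    if p < 0.001:
        return "p < .001"
    # three decimal places, with the leading zero dropped per APA style
    return "p = " + f"{p:.3f}".lstrip("0")
```

For example, `format_p(0.0213)` gives `"p = .021"`.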
Below is a template for an APA-style table to go along with one or more independent samples t-tests (sample data not from this example).
It is also common to make a bar plot to display the results.
The paired samples t-test compares the mean of a continuous variable before and after an intervention (think pre/post treatment), or between two groups where each member of one group is linked to a single member of the other group. For example, a dependent samples t-test could be used to test if a dietary supplement increased heart rate by measuring heart rate before and after taking the supplement. Or, it could be used to test if husbands are less concerned about healthy eating than wives, where “concern about healthy eating” is compared between wives and their husbands (i.e., linked husband-wife pairs).
Step 1: Set up the data. Make sure that each pair of before-after measurements, or each of the two measurements from a linked pair are aligned on the same row. In the example below, the data represent anxiety scores before and after viewing a clip from Saving Private Ryan.
Step 2: Use the menus to select the proper test. Click on “Analyze”, highlight “Compare Means”, and “Paired Samples T-Test”.
Step 3: Select the variables to be paired. In this example, it will be “Before” and “After.” Click on the variable to highlight it, and then click on the arrow to move it to the box on the right. Repeat for your second variable. The order of the variables does not matter.
Step 4: Click “OK” to run the test. The output will include descriptive statistics (mean and standard deviation of the difference scores) and the results of the t-test.
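As a rough cross-check outside SPSS, the paired test can be sketched in Python with SciPy. The before/after anxiety scores below are made up, not taken from the screenshots:

```python
import numpy as np
from scipy import stats

# Made-up anxiety scores, one row (pair) per participant.
before = np.array([7, 6, 8, 5, 7, 6, 7, 5, 8, 6, 5])
after  = np.array([5, 4, 6, 5, 5, 4, 6, 3, 5, 4, 3])

t, p = stats.ttest_rel(before, after)  # paired samples t-test
df = len(before) - 1                   # df = number of pairs - 1
```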
Effect sizes. The output will also include estimates of effect sizes. Cohen's d is a common choice. You should report the "Point Estimate", and as with the t-statistic, the sign should be dropped (always report a positive Cohen's d).
Last Steps and APA writeup.
If the results are significant (which they are, based on the output above):
Participants were more anxious before watching a clip from Saving Private Ryan (M = 6.36) than after the clip was shown (M = 4.55), t (10) = 2.43, p = .036.
If the results are not significant (not based on the output above):
There was no significant difference in anxiety scores before watching the movie clip (M = 5.21) compared to after the clip was shown (M = 5.32), t (10) = 0.36, p = .73.
Below is a template for an APA-style table to go along with one or more paired samples t-tests:
The one-way between subjects ANOVA is used to test if the mean of a continuous dependent variable is significantly different between three or more levels of an independent variable. For example: Do apartment rents differ between Manhattan, Brooklyn, and Queens? In this case, the continuous dependent variable is each apartment's rent and the levels of the independent variable are Manhattan, Brooklyn, and Queens.
Step 1: Set up the data. In the example shown, rent is the DV and borough is the IV. In the screenshot below there is another IV, income, but it is not being used for this example.
Step 2: Assign the independent variable and dependent variable names and labels (see Basic Data Setup)
Step 3: Assign values to the groups that will be compared. In this case, the variable “Borough” will be coded 1 = Manhattan, 2 = Brooklyn, and 3 = Queens (see Basic Data Setup)
Step 4: Select “Analyze,” and then highlight “Compare Means” and click on “One Way ANOVA”:
Step 5: First, select the dependent variable, then click on the top arrow to move it into the text box labeled “Dependent List”. Next, select the independent variable and click on the bottom arrow to move it to the text box labeled “Factor”:
Step 6: To run post-hoc analysis, click on “Post Hoc”. Here, we will be using a Tukey test. First, click on the box next to “Tukey,” and then click “Continue”. Next, click on “Options” and the square labeled, “Descriptives” to get the mean and standard deviation for the dependent variable at each level of the independent variable.
Different post-hoc tests are used for different situations. Be sure to check that you are using the appropriate post-hoc test for your data. A list of post-hoc tests can be found here.
Step 7: Click “OK” to run the test. The output will include descriptive statistics (means and standard deviations) and the results of the F-test if any groups (defined by your independent variable) differ on your dependent variable.
The results of Tukey’s test are also shown for all possible pairwise comparisons (Manhattan vs. Brooklyn, Queens vs. Manhattan, and Queens vs. Brooklyn).
Last Steps and APA writeup.
For a one-way ANOVA, the degrees of freedom are (df between groups, df within groups). These numbers can be found in the ANOVA output.
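To illustrate where those degrees of freedom come from, here is a Python/SciPy sketch with made-up rents (not the data from the screenshots):

```python
from scipy import stats

# Made-up monthly rents by borough (not the screenshot data).
manhattan = [4800, 5100, 4600, 4900, 4750]
brooklyn  = [4700, 5200, 4900, 5000, 5050]
queens    = [7000, 7300, 6900, 7200, 7100]

f, p = stats.f_oneway(manhattan, brooklyn, queens)

k = 3                                             # number of groups
n = len(manhattan) + len(brooklyn) + len(queens)  # total observations
df_between, df_within = k - 1, n - k              # reported as F(df_between, df_within)
```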
If the results are significant:
There was a significant difference in rent between Manhattan (M = $4830.00), Brooklyn (M = $4966.67), and Queens (M = $7100.00), F (2,26) = 39.18, p < .001.
If the results are not significant (not based on the output above):
There was no significant difference in rent between Manhattan (M = $4930.00), Brooklyn (M = $4666.67), and Queens (M = $4100.00), F (2,26) = 1.11, p = .345.
Reporting the post-hoc analysis
If the results are significant, you generally should also report the results of a post-hoc analysis.
The “Sig” column indicates which groups are significantly different from each other. You will notice some redundancy because the mean differences are reported in both directions (the test reports Brooklyn vs. Manhattan as well as Manhattan vs. Brooklyn). Also pay attention to the mean difference column; it indicates which group mean is greater.
For any significant results, report:
Post-hoc analyses utilizing Tukey's HSD indicated that rent per month in Queens is significantly higher than rent per month in Manhattan and Brooklyn, p < .05.
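If you want to reproduce Tukey-style pairwise comparisons outside SPSS, recent SciPy versions provide scipy.stats.tukey_hsd. The rents below are made up for illustration:

```python
from scipy.stats import tukey_hsd  # available in recent SciPy releases

# Made-up monthly rents by borough (not the screenshot data).
manhattan = [4800, 5100, 4600, 4900, 4750]
brooklyn  = [4700, 5200, 4900, 5000, 5050]
queens    = [7000, 7300, 6900, 7200, 7100]

res = tukey_hsd(manhattan, brooklyn, queens)
# res.pvalue[i, j] is the p-value for the comparison of group i vs. group j;
# the matrix is symmetric, mirroring the redundancy in the SPSS table.
```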
Below is a template for an APA-style table to go along with one or more ANOVAs:
The 2-way between-subjects ANOVA is used for designs with two "crossed" IVs. Every participant is characterized by their level on IV1 (e.g., “below $80,000” for the IV “income”) and their level on IV2 (e.g., NYC Borough of residence). Both IVs can have two or more levels. The 2-way ANOVA tests for the main effect of each IV, and for an interaction between the two IVs.
Step 1: Set up the data. In this example, the IVs are NYC Borough (1 = Manhattan, 2 = Brooklyn, 3 = Queens) and annual income category (1 = below $80,000, 2 = above $80,000). The DV is rent:
Step 2: Select the proper test:
Step 3: Move the variables into the proper boxes. The IVs should go into the box "Fixed Factor(s)":
Step 4: To automatically generate a plot of your data, click on "Plots", then move one IV to the box "Horizontal Axis" and the other IV to the box "Separate Lines". Click "Add", then "Continue":
Step 5: Click “OK” to run the test. The output will include the results of three F-tests, one for each IV, and one for the interaction of the two IVs (Borough * Income). If one or more of your independent variables has 3 or more groups, click “Post Hoc” to run Tukey's test for each IV, as in the 1-way ANOVA.
Step 6: Interpret the output:
Last Steps and APA writeup
Any combination of main effects and interactions may be significant. Here is an example write up (based on results above):
There was a main effect of Borough on monthly rent, F (2,23) = 39.16, p < .001. There was no main effect of income on monthly rent, F (1,23) = 1.04, p = .32. There was no significant interaction between Borough and annual income, F (2,23) = 1.16, p = .33 (see Figure 1). Tukey’s post-hoc analysis indicated that renters paid significantly higher rent in Queens (M = 7100.00) than Brooklyn (M = 4992.50) or Manhattan (M = 4830.00).
Note 1: For a two-way ANOVA, the two degrees of freedom for each IV and for the interaction are the df for that “Source” and the df for “Error”. These numbers can be found in the ANOVA output.
Note 2: Tukey post-hoc results can be included in the write-up if one or more of your independent variables has 3 or more groups.
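To make Note 1 concrete, the sketch below computes a balanced two-way ANOVA by hand in Python. The data are randomly generated stand-ins (not the example's rents); the point is how each F pairs the df of its "Source" with the df of "Error", and how the sums of squares partition:

```python
import numpy as np
from scipy import stats

# Balanced 2 (income) x 3 (borough) design, n = 4 observations per cell.
# Randomly generated stand-in values, not the data from the example.
rng = np.random.default_rng(1)
data = rng.normal(loc=5000, scale=400, size=(2, 3, 4))

a, b, n = data.shape
grand = data.mean()

ss_total = ((data - grand) ** 2).sum()
ss_a = b * n * ((data.mean(axis=(1, 2)) - grand) ** 2).sum()   # income main effect
ss_b = a * n * ((data.mean(axis=(0, 2)) - grand) ** 2).sum()   # borough main effect
ss_cells = n * ((data.mean(axis=2) - grand) ** 2).sum()
ss_ab = ss_cells - ss_a - ss_b                                 # interaction
ss_error = ss_total - ss_cells

df_a, df_b = a - 1, b - 1       # df for each "Source"
df_ab = df_a * df_b
df_error = a * b * (n - 1)      # df for "Error"

# Each F pairs its Source df with the Error df, e.g. for the interaction:
f_ab = (ss_ab / df_ab) / (ss_error / df_error)
p_ab = stats.f.sf(f_ab, df_ab, df_error)
```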
Here is the plot generated for the example above:
If there were an interaction, your writeup should explain it. Here is an example writeup (not based on the results above) of a significant interaction:
There was a significant interaction between Borough and annual income, F (2,23) = 11.16, p < .001, such that the effect of Borough on monthly rent differed between renters earning less than $80,000 annually and those earning more than $80,000 annually.
A chi-square test of independence tests if the observed frequencies in groups defined by two categorical variables differ significantly from the frequencies expected under the assumption that the two variables are independent.
An example of a research question that would require a chi-square test would be: Are golfers or basketball players more likely to suffer a knee injury at some point in the athletic season? In this scenario, there are two categorical variables, sport (golf/basketball) and knee injury (yes/no).
In the example below, we test whether introverts have different color preferences than extroverts. The variables are personality type (1 = introvert, 2 = extrovert) and color preference (1 = red, 2 = yellow, 3 = green, 4 = blue).
Step 1: Set up the data. Enter variable names and labels in the tab labeled “Variable View”; this will help with interpretation of the output:
Step 2: Click “Analyze”, highlight “Descriptive Statistics,” and then select “Crosstabs”:
Step 3: Move one variable to the “Row(s)” box, and the other to the “Column(s)” box:
Step 4: Click on “Statistics”, check the box labeled “Chi-square”, and then click “Continue”:
Step 5: Click “Cells.” Under “Counts,” the “Observed” box should already be checked; if it is not, check it. This will give you the number of counts in each cell (the n of each cell). Under “Percentages,” check each of the boxes “Row”, “Column”, and “Total”. Click “Continue”:
Step 6: Click “OK” to run the test.
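As an illustration of the same test outside SPSS, SciPy's chi2_contingency takes the observed crosstab directly; the counts below are made up. Note that the returned df follows the (rows − 1) × (columns − 1) rule:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Made-up crosstab: rows = personality (introvert, extrovert),
# columns = color preference (red, yellow, green, blue).
observed = np.array([[ 4,  6, 12, 18],
                     [14, 12,  8,  6]])

chi2, p, dof, expected = chi2_contingency(observed)
# dof = (rows - 1) * (columns - 1) = (2 - 1) * (4 - 1) = 3
```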
Last Steps and APA writeup
If the results are significant (based on the results above):
Color preference was associated with introvert/extrovert status, χ2 (1) = 5.05, p = .025, with introverts more likely to prefer brighter colors (70.00%) and extroverts more likely to prefer darker colors (80.00%).
Note: df = (number of levels of IV one – 1) x (number of levels of IV two – 1)
If the results are not significant (not based on stats shown above):
Color preference was not associated with introvert/extrovert status, χ2 (1) = 1.04, p = .31.
Another example:
Basketball players were more likely to have sustained a knee injury during the season (85.00%) than golfers (15.00%), χ2 (1) = 11.52, p < .001.
Below is a template for an APA-style table to go along with one or more chi-square tests. The frequencies and percentages are available in the SPSS output (see above):
Correlation characterizes the strength and direction of a linear relationship between two continuous variables. The most common correlation “coefficient” is Pearson’s r, which ranges from -1.00 to 1.00.
Correlation always involves two continuous values measured on each subject in the dataset, e.g., the lengths and weights of babies. If r is positive, then both variables tend to go in the “same” direction. That is, as one value increases, the other tends to increase as well. If r is negative, then when one variable increases, the other tends to decrease. E.g., the number of years a car is owned and its value will tend to be negatively correlated.
Step 1: Set up the data. Enter at least two variables for each subject. Here, the variables are top speed and price. The “subjects” are car models:
Step 2: Click "Analyze", then highlight "Correlate" and click "Bivariate":
Step 3: Select the variables (in this example, “Speed” and “Price”). Be sure “Pearson” is checked under the Correlation Coefficients options, and that the “Two-tailed” test of significance is selected. You can run many correlations at once by selecting more than two variables. If “Flag significant correlations” is checked, an asterisk will appear next to the correlations that are significant at the p < .05 level.
Step 4: Click “Options”, and check the box labeled “Means and standard deviations”. Then click “Continue,” and then click “OK”.
The results will appear in a table like the one below. To find the correlation coefficient, find the column of one variable (e.g., Price) and the row of the other variable (e.g., Speed), and look for the value on the line for "Pearson Correlation". The p-value for that correlation will be just below it. A few tips:
- Ignore the correlations that are equal to 1. This is just the correlation of each variable with itself.
- Each correlation appears twice: e.g., the correlation of Price and Speed also appears as the correlation of Speed and Price, which is the same value.
- The degrees of freedom for a Pearson correlation are N - 2.
Note: You should always make a scatterplot of your variables before interpreting the results of a Pearson correlation.
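The same coefficient, p-value, and df can be sketched in Python with SciPy. The speeds and prices below are made up (not the screenshot data):

```python
import numpy as np
from scipy import stats

# Made-up top speeds (mph) and prices ($) for nine car models.
speed = np.array([120, 130, 140, 150, 155, 160, 145, 135, 167])
price = np.array([30000, 35000, 42000, 55000, 60000, 68000, 48000, 39000, 75000])

r, p = stats.pearsonr(speed, price)
df = len(speed) - 2  # df for Pearson's r is N - 2, here 7
```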
Last Steps and APA writeup
If the results are significant:
There was a significant positive relationship between top speed (M = 144.67) and car price (M = $47,200), r(7) = .88, p = .002. As top speed increased, car price tended to increase as well.
If the results are not significant (not based on stats shown above):
There was no significant relationship between top speed (M = 130.15) and car price (M = $55,000), r(7) = .13, p = .74.
Below is a template for an APA-style table to go along with a Pearson correlation with multiple variables:
Regression is a statistical tool for predicting the value of one variable from the values of one or more other variables measured on the same subjects. In “simple linear” regression, there is one continuous (or approximately continuous) predictor variable and one to-be-predicted variable. The relationship between the predictor and predicted variables is assumed to be linear.
In the example below, number of siblings will be used to predict happiness ratings (on a scale from 1-10).
Step 1: Set up the data. Data are arranged with two variables measured for each subject. In this example, each subject has reported their number of siblings and subjective happiness:
Step 2: Use the menus to select the proper test:
Step 3: Select a variable as the Dependent (predicted) and Independent (predictor) variable:
Step 4: Click “Statistics” and check “Descriptives” so means and standard deviations of the variables will appear in the output:
Step 5: Click “OK” to run the test. The output will include the Pearson correlation and “R-squared”, which indicates how well the DV can be predicted by the IV:
There is also output of an ANOVA, which indicates if the regression is statistically significant:
Step 6: Get the regression equation from the output:
The simple linear regression equation has the form:
y = a + bx
"a", the y-intercept and "b", the slope are listed under the column “Unstandardized B,”. The y-intercept is the Constant, and the slope appears next to the IV, in this example, “siblings.”
The regression equation is:
y = 3.913 + .978x
or
Happiness = 3.913 + .978 (Number of Siblings)
Therefore, the regression analysis would predict that a person with two (2) siblings would have a happiness rating of 5.87:
Happiness = 3.913 + .978(2) = 5.869
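The worked prediction above can be checked with a few lines of Python. The siblings/happiness numbers below are made up; only the coefficients 3.913 and .978 come from the example's output:

```python
from scipy import stats

# Made-up siblings/happiness data (not the dataset from the screenshots).
siblings  = [0, 1, 2, 3, 1, 4, 2, 5]
happiness = [4.1, 5.0, 5.8, 7.1, 4.9, 7.9, 6.2, 8.8]

res = stats.linregress(siblings, happiness)
# res.intercept plays the role of "a" and res.slope the role of "b" above.

# Reproducing the worked example with the output's coefficients:
predicted = 3.913 + 0.978 * 2   # a + b * (number of siblings)
```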
Note: You should always make a scatterplot of your variables before interpreting the results of a simple linear regression.
Last Steps and APA writeup
If the results are not significant:
Number of siblings is not a significant predictor of subjective happiness, F (1,6) = 3.16, p = .13.
If the results are significant (not based on stats shown above):
Number of siblings is a significant predictor of subjective happiness, F (1,6) = 6.53, p = .023, accounting for 61.40% of variance. On average, each additional sibling is associated with a 0.98-point increase on the subjective happiness measure.
Note: The degrees of freedom come from the "regression" and "residual" degrees of freedom in the ANOVA table.
Multiple regression is used for predicting a single variable from a set of two or more predictor variables. In this example, using real estate sales data, the predictor variables (IVs) are number of bedrooms, number of bathrooms, square footage of the home, and lot size. The single predicted variable (DV) is listed sale price:
Step 1: Set up the data. In this example, each "subject" is a house:
Step 2: Click “Analyze”, highlight “Regression”, and select “Linear”:
Step 3: Select a variable as the Dependent (predicted) and the Independent (predictor) variables:
Step 4: Click “Statistics” and check “Descriptives” so means and standard deviations of the variables will appear in the output:
Step 5: Click “OK” to run the test. The output will include R and “R-squared”, which indicates how well the DV can be predicted by the set of IVs:
There is also output of an ANOVA, which indicates if the regression is statistically significant:
Step 6: Get the regression equation from the output:
The multiple linear regression equation has the same form as a linear regression equation, but with more predictor variables:
y = a + bx1 + cx2 + dx3 + ex4
where “a” is the y-intercept and “b”, “c”, “d”, and “e” are the slopes for each predictor variable. “y” is the predicted variable; x1, x2, x3, and x4 are the predictor variables. The slopes and y-intercept are listed under the column “Unstandardized B”. In this example, the regression equation is:
Price = -297,047.42 + 20,942.72 (# of bedrooms) + 281,162.405 (# of bathrooms) + 36.650 (square feet) + .081 (lot size).
Note: The "Standardized Coefficients Beta" allow for a comparison of the relative influence of each predictor variable on the predicted variable, but cannot be used in the regression equation. The significance of each predictor is presented on the far right of the Coefficients table.
Last Steps and APA writeup
If the results are significant:
Number of bedrooms, number of bathrooms, square footage, and lot size significantly predicted home price, F (4,466) = 133.54, p < .001, accounting for 53.00% of variance. Bathrooms (t = 14.03, p < .001) and square footage (t = 4.06, p < .001) significantly predicted home price.
If the results are not significant (not based on stats shown above):
Number of bedrooms, number of bathrooms, square footage, and lot size did not significantly predict home price, F (4,466) = 2.06, p = .62.
Note: The degrees of freedom come from the "regression" and "residual" degrees of freedom in the ANOVA table. The t-statistics come from the Coefficients table.