Module 20

Reliability & EFA

General Introduction

  • In behavioral research, some measurement tools are more reliable (i.e., consistent) than others. Reliability means that a measure yields similar results across comparable contexts. When a test is administered at different time points under similar conditions and yields similar results, we call it test-retest reliability. For a multi-item scale, we expect each item to behave similarly to the other items; this is internal consistency (or simply, the reliability of a scale). Such similarity is reflected in the inter-item correlations, and we use Cronbach's α to quantify the level of internal consistency.

  • Reliability analysis is conducted when you already have a scale and would like to test it. If you have a pool of items and would like to explore what lies behind those items, or you want to develop a scale, you will need Exploratory Factor Analysis (EFA).

  • EFA can help you extract potential factor(s). For example, if a single dimension captures a large share of the variance in the data and all items load on that factor (e.g., loading > .3), we may conclude that the items most likely belong to a single dimension (1-factor solution).

1. Reliability

1.1 Reliability analysis & Cronbach's α

When we develop a new scale or apply an established scale to measure a construct, we want to confirm from the collected data that it is a reliable measure. If the items measure the same construct, the responses should be similar, in other words, consistent across items. If the responses are dissimilar, the items might not be a reliable measure of the construct.

To test whether the items in a scale consistently measure the same construct, a commonly used index is Cronbach's α (alpha).

There are commonly used rule-of-thumb descriptions for ranges of Cronbach's α (the exact labels vary between sources).

Cronbach's alpha     Internal consistency
α ≥ 0.9              Perfect
0.8 ≤ α < 0.9        Excellent
0.7 ≤ α < 0.8        Good
0.6 ≤ α < 0.7        Acceptable
0.5 ≤ α < 0.6        Questionable
α < 0.5              Unacceptable
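The ranges above can be turned into a small lookup. The following is a minimal Python sketch, not part of jamovi; the function name `describe_alpha` is an assumption made for illustration, and the labels are copied from the table in this module.

```python
def describe_alpha(alpha: float) -> str:
    """Map a Cronbach's alpha value to the descriptive label used in this module."""
    # (lower bound, label) pairs, checked from the highest band downwards.
    bands = [
        (0.9, "Perfect"),
        (0.8, "Excellent"),
        (0.7, "Good"),
        (0.6, "Acceptable"),
        (0.5, "Questionable"),
    ]
    for lower_bound, label in bands:
        if alpha >= lower_bound:
            return label
    # Anything below 0.5, including negative values, falls in the last band.
    return "Unacceptable"


for value in (0.92, 0.67, -0.0508):
    print(f"alpha = {value}: {describe_alpha(value)}")
```

Note that a negative α also falls in the "Unacceptable" band, which usually signals a coding problem rather than a genuinely poor scale.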



To check whether the items in a scale consistently measure the same construct, we assess internal reliability (also called internal consistency). Cronbach's α (alpha) is the most commonly used index of internal reliability.
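For reference, the standard formula for Cronbach's α is shown below. The symbols are introduced here for illustration and do not appear elsewhere in this module: k is the number of items, s_i^2 is the variance of item i, and s_T^2 is the variance of the total (summed) score.

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} s_i^{2}}{s_T^{2}}\right)
```

When the items covary strongly, the variance of the total score is large relative to the sum of the item variances, so α moves towards 1. When the items are unrelated, the two quantities are about equal and α is close to 0.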

For example, we want to know whether the 9 items in the Irrational Procrastination Scale (IPS) consistently measure the same construct, procrastination. We need to analyse the collected data to test the internal reliability (Cronbach’s α).

Q: How do we measure the internal reliability of a scale in jamovi?

A: We use the “Reliability Analysis” in the “Factor” module.
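To see what the reliability analysis computes, here is a minimal Python sketch of Cronbach's α using only the standard library. It is not jamovi's implementation; the function name `cronbach_alpha` and the small data set are hypothetical, and the sketch assumes complete data (no missing responses).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # one tuple of scores per item
    item_variances = [variance(item) for item in items]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 3 items on a 1-5 scale.
responses = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 4],
    [5, 4, 5],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

In this toy data set the three items rise and fall together across respondents, so α is high. Real scales with more items and more respondents are analysed the same way.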


Example 3.1 Reliability_IPS.mp4

1.2 Low Reliability & Reverse Items

Another example is the Basic Self-Control (BSC) Scale. We want to know whether the 13 items in the BSC consistently measure the same construct, self-control. We need to analyse the collected data to test the internal reliability (Cronbach’s α).

Example 3.2 Reliability_BSC.mp4

However, the internal reliability (Cronbach’s α) of the BSC comes out as -0.0508, which is unacceptable. The main reason is that the analysis included reverse-worded items whose scores had not yet been reversed.

Q: How do we correct the reliability analysis with the reverse-coded items?

A: We specify those items under “Reverse Scaled Items” in the reliability analysis. Cronbach’s α then becomes 0.67.
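The effect of reverse scoring can be sketched outside jamovi as well. In the minimal Python example below, a score x on a scale from low to high is reversed as (low + high) - x, so on a 1-5 scale a 5 becomes a 1. The helper names `cronbach_alpha` and `reverse_items` and the data are hypothetical, not the BSC data used in the video.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_variances = [variance(item) for item in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def reverse_items(rows, reversed_columns, low=1, high=5):
    """Return a copy of rows in which the listed columns are reverse-scored."""
    return [
        [(low + high - score) if col in reversed_columns else score
         for col, score in enumerate(row)]
        for row in rows
    ]


# Hypothetical raw responses (1-5 scale); the third item is negatively worded,
# so high raw scores on it mean LOW standing on the construct.
raw = [
    [1, 2, 5],
    [2, 2, 3],
    [3, 3, 3],
    [4, 5, 2],
    [5, 4, 1],
]

alpha_before = cronbach_alpha(raw)
recoded = reverse_items(raw, reversed_columns={2})  # column index 2 = third item
alpha_after = cronbach_alpha(recoded)

print(f"alpha before reversing: {alpha_before:.2f}")
print(f"alpha after reversing:  {alpha_after:.2f}")
```

Before recoding, the negatively worded item cancels out the other items in the total score, which drives α below zero. After recoding, all items point in the same direction and α becomes high.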

Example 3.2 Reliability_BSC (Con).mp4

Report and Interpretation:

  • The Basic Self-Control Scale (Tangney, Baumeister, & Boone, 2004) is a 13-item, 5-point Likert scale (1 = not at all; 5 = very much) capturing participants' basic self-control. Sample items include "I am good at resisting temptation" and "I have a hard time breaking bad habits" (reversed item). The reliability in the current sample was acceptable, with Cronbach's alpha = .67.

2. Exploratory Factor Analysis (EFA)

  • EFA is a data-driven approach that is commonly used in scale development. We explore whether there is a latent dimension (i.e., factor) behind a group of variables. Specifically, we ask:

    1. Can we extract factor(s) from the item pool? (by applying an extraction method, such as principal components analysis, PCA)

    2. If so, how many factors can be extracted? (by evaluating the eigenvalues)

    3. What do the factor(s) look like? Which items load on which factor(s)? (by choosing a factor rotation and then interpreting the post-rotation solution)
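The second question is often answered with the "eigenvalue greater than 1" rule applied to the eigenvalues of the item correlation matrix. The Python sketch below illustrates that rule on a hypothetical 4-item correlation matrix. The function `symmetric_eigenvalues` is a plain cyclic Jacobi routine written for this example, not what jamovi uses internally, and the correlations are made up.

```python
import math


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a small symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]        # work on a copy
    n = len(a)
    for _ in range(sweeps):
        # Stop once the off-diagonal part is numerically zero.
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):        # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):        # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlations: items 1-2 and items 3-4 form two clusters.
correlations = [
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
]

eigenvalues = symmetric_eigenvalues(correlations)
n_factors = sum(1 for value in eigenvalues if value > 1)

print("eigenvalues:", [round(value, 3) for value in eigenvalues])
print("factors with eigenvalue > 1:", n_factors)
```

The eigenvalues of a correlation matrix sum to the number of items, so an eigenvalue above 1 means the factor accounts for more variance than a single item would on its own. In this toy matrix, two eigenvalues exceed 1, matching the two clusters of items.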

2.1 Example: EFA on Big Five Model

Suppose Sophie is not convinced by the original Big Five Model (the 5-factor solution from the item pool) and would like to explore alternative factor solutions for the given list of items.

Q: How many factors can be extracted from the current BFM item pool? Do the items meet the EFA assumptions? How should the result be interpreted?

A: Perform EFA in jamovi, checking the assumptions and requesting the additional output as shown in the video.

Example_1_EFA_BFM.mp4

Results Interpretation

  • (assumption part) Both Bartlett’s test of sphericity and the KMO measure of sampling adequacy were used to assess whether the current sample was appropriate for exploratory factor analysis. Bartlett’s test of sphericity was significant, p < .001, indicating that at least some inter-item correlations were not zero. The overall KMO measure was .65, indicating marginal to middling adequacy.

  • The maximum likelihood method was selected for factor extraction and Oblimin rotation for interpretation. The number of factors extracted was based on eigenvalues (greater than 1) and the point of inflexion in the scree plot. The cut-off for item loadings was set at 0.4*; items with loadings below 0.4 were excluded from the final analysis.

  • Three factors were extracted based on these criteria. All items were rated on a 5-point Likert scale. Factor 1 comprised 5 items and explained 16% of the variance, with factor loadings from |.45| to |.86|. Factor 2 comprised 5 items and explained 14% of the variance, with factor loadings from |.46| to |.92|. Factor 3 comprised 4 items and explained 12% of the variance, with factor loadings from |.70| to |.72|.

* In the video, we hid loadings below 0.3. To avoid interpreting double-loaded items and to make the result more interpretable, we adopted a more rigorous cut-off here and interpreted only heavily loaded items (i.e., loadings of at least 0.4).
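Applying a loading cut-off amounts to a simple filter over the rotated loading matrix. The Python sketch below uses a hypothetical loading table (not the BFM output from the video) and an assumed helper `assign_items` to show how items are retained, flagged as double-loaded, or dropped at a cut-off of 0.4.

```python
def assign_items(loadings, cutoff=0.4):
    """For each item, list the factors (numbered from 1) where |loading| >= cutoff."""
    return {
        item: [factor for factor, value in enumerate(row, start=1)
               if abs(value) >= cutoff]
        for item, row in loadings.items()
    }


# Hypothetical rotated loadings: one row per item, one column per factor.
loadings = {
    "item_1": [0.72, 0.05, -0.10],
    "item_2": [0.45, 0.12, 0.08],
    "item_3": [0.10, -0.63, 0.02],
    "item_4": [0.35, 0.22, 0.31],
    "item_5": [0.48, 0.02, 0.51],
    "item_6": [0.03, 0.09, 0.70],
}

assignment = assign_items(loadings)

retained = {item: f[0] for item, f in assignment.items() if len(f) == 1}
double_loaded = [item for item, f in assignment.items() if len(f) > 1]
dropped = [item for item, f in assignment.items() if not f]

print("retained (item -> factor):", retained)
print("double-loaded:", double_loaded)
print("dropped:", dropped)
```

The absolute value matters because a strong negative loading (item_3 here) still indicates that the item belongs to the factor, only in the opposite direction.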

2.2 Alternative EFA parameters

  • We can also run EFA with alternative settings: a different rule for the number of factors, such as parallel analysis or a fixed number of factors (when there is a specific reason), or a different factor rotation (Varimax vs. Oblimin).

  • The most commonly used factor rotations are Varimax (an orthogonal rotation, which keeps the factors uncorrelated) and Oblimin (an oblique rotation, which allows the factors to correlate).

Module Exercise (4% of total course assessment)

Complete the exercise!

    • Now, if you think you're ready for the exercise, you can check your email for the link.

    • Remember to submit your answers before the deadline in order to earn the credits!