It is a measurement model specification in which the indicators are assumed to be caused by the underlying construct.
Changes in the latent variable directly cause changes in the assigned indicators. When the construct score is high, the answers to all indicators will be high; conversely, when the construct score is low, the answers to all indicators will be low.
The indicators are interchangeable and expected to be highly correlated; deleting any single item will not change the meaning of the construct.
Recall... Measurement model assessment aims at ensuring construct validity & reliability.
For a reflective model, the assessment covers convergent validity, internal consistency reliability, and discriminant validity.
The extent to which the items of the specific construct converge together.
Reflects correlation between items measuring the same construct.
High outer loadings of measurement items indicate that the items converge together on a common construct.
All indicators' outer loadings should be statistically significant. Because a significant outer loading could still be fairly weak, a common rule of thumb is that the (standardized) outer loadings should be 0.708 or higher. The rationale behind this rule is that the square of a standardized outer loading (the indicator reliability) should be at least 0.50.
Source: Hair et al. (2022)
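The arithmetic behind the 0.708 cut-off is simply a restatement of the rationale above, namely that the indicator reliability (the squared standardized loading) should reach 0.50:

```latex
\text{indicator reliability} = \ell^{2} \geq 0.708^{2} \approx 0.50
```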
Indicates how much variation in the multiple items is explained by the latent variable.
Is comparable to the proportion of variance explained in factor analysis.
Values range from 0 to 1.
AVE should exceed 0.5 to suggest adequate convergent validity (Bagozzi & Yi, 1988; Fornell & Larcker, 1981).
AVE is equivalent to the communality of a construct. An AVE value of 0.50 or higher indicates that the construct explains more than half of the variance of its indicators.
An AVE of less than 0.50 indicates that, on average, more error remains in the items than variance explained by the construct.
If AVE < 0.50, the item with the lowest outer loading for that particular construct should be deleted.
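A minimal sketch, using hypothetical standardized outer loadings (not output from the exercise datasets), of how AVE follows from the loadings:

```python
# AVE = average of the squared standardized outer loadings of a construct.
def average_variance_extracted(loadings):
    return sum(l ** 2 for l in loadings) / len(loadings)

loadings = [0.82, 0.75, 0.71, 0.58]   # hypothetical 4-item construct
ave = average_variance_extracted(loadings)
print(f"AVE = {ave:.3f}")             # ~0.519, just above the 0.50 cut-off

if ave < 0.50:
    # per the guideline above: consider deleting the lowest-loading item,
    # then re-estimate the model
    print("Candidate for deletion: item with loading", min(loadings))
```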
Indicates consistency of measurement items to measure a common construct.
The traditional criterion for internal consistency is Cronbach's alpha, which provides an estimate of the reliability based on the inter-correlations of the observed indicator variables.
It assumes that all indicators are equally reliable (i.e., all the indicators have equal outer loadings on the construct).
It is sensitive to the number of items in the scale & generally tends to underestimate the internal consistency reliability.
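For reference, the standard formula for a scale of k items, where \(\sigma^2_i\) is the variance of item i and \(\sigma^2_t\) is the variance of the summed scale:

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{i}}{\sigma^{2}_{t}}\right)
```

Because every item enters the total score with equal weight, the formula cannot reflect differences in the indicators' outer loadings, which explains the limitation noted above.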
To overcome the limitations of Cronbach's alpha, Composite Reliability (CR) is suggested as a replacement for the traditional criterion.
If CR < 0.70, the item with the lowest outer loading for that particular construct should be considered for deletion.
CR values of 0.60 to 0.70 are acceptable in exploratory research, while in more advanced stages of research, values between 0.70 and 0.90 can be regarded as satisfactory (Nunnally & Bernstein, 1994).
Values above 0.90 (and definitely above 0.95) are not desirable because they indicate that all the indicator variables are measuring the same phenomenon and are therefore unlikely to be a valid measure of the construct.
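A minimal sketch with hypothetical standardized loadings, assuming uncorrelated measurement errors (so the error variance of a standardized item is 1 minus its squared loading):

```python
# rho_c = (sum of loadings)^2 / [(sum of loadings)^2 + sum of error variances]
def composite_reliability(loadings):
    sum_l = sum(loadings)
    error_var = sum(1 - l ** 2 for l in loadings)
    return sum_l ** 2 / (sum_l ** 2 + error_var)

loadings = [0.82, 0.75, 0.71, 0.58]                     # hypothetical construct
print(f"CR = {composite_reliability(loadings):.3f}")    # ~0.810, within 0.70-0.90
```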
It is helpful that these reliability measures are reported together: on the output report they appear side by side, with rho_A between Cronbach's alpha and CR, so it is easy to check whether rho_A falls between the two values, which is a good indication of reliability.
Cronbach’s alpha is the lower bound, and the composite reliability rho_c is the upper bound for internal consistency reliability.
The reliability coefficient rho_A usually lies between these bounds and may serve as a good representation of a construct’s internal consistency reliability.
Minimum of 0.70 (or 0.60 in exploratory research).
Maximum of 0.95 to avoid indicator redundancy, which would compromise content validity.
The recommended values for all measures of internal consistency reliability are 0.80 to 0.90.
Indicates the uniqueness of a construct from other constructs.
A latent variable should explain the variance of its own indicators better than it explains the variance of other latent variables' indicators.
For this purpose, a researcher can use the Heterotrait-Monotrait Ratio (HTMT), the Fornell-Larcker criterion, or cross-loadings.
The loadings of an item on its assigned latent variable should be higher than its loadings on all other latent variables.
Fails to indicate a lack of discriminant validity when 2 constructs are perfectly correlated, which renders this criterion ineffective for empirical research (Hair et al., 2017; Henseler et al., 2015).
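A minimal sketch of this check, using hypothetical construct names and loadings (in practice the cross-loadings table comes from the PLS software output):

```python
# Rows = items, columns = constructs; an item should load highest on its own construct.
cross_loadings = {
    "q1": {"QUAL": 0.81, "SAT": 0.42},   # hypothetical constructs QUAL and SAT
    "q2": {"QUAL": 0.77, "SAT": 0.39},
    "s1": {"QUAL": 0.45, "SAT": 0.84},
}
assigned = {"q1": "QUAL", "q2": "QUAL", "s1": "SAT"}

for item, loads in cross_loadings.items():
    own = loads[assigned[item]]
    highest_other = max(v for c, v in loads.items() if c != assigned[item])
    print(f"{item}: own loading = {own:.2f}, "
          f"highest cross-loading = {highest_other:.2f}, ok = {own > highest_other}")
```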
The square root of AVE of a latent variable should be higher than the correlations between the latent variable and all other variables (Chin, 2010; Chin, 1998b; Fornell & Larcker, 1981).
Performs very poorly when indicator loadings of the construct differ only slightly (e.g., between 0.6 & 0.8). If the loadings vary more strongly, its performance improves, but is still rather poor overall (Hair et al., 2017; Henseler et al., 2015; Voorhees et al., 2016).
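A minimal sketch of the Fornell-Larcker check with hypothetical AVE values and latent variable correlations:

```python
import math

ave = {"QUAL": 0.62, "SAT": 0.58, "LOY": 0.55}           # AVE per construct (hypothetical)
corr = {("QUAL", "SAT"): 0.61, ("QUAL", "LOY"): 0.47,    # latent variable correlations
        ("SAT", "LOY"): 0.70}

# sqrt(AVE) of each construct must exceed its correlation with every other construct
for (c1, c2), r in corr.items():
    ok = math.sqrt(ave[c1]) > abs(r) and math.sqrt(ave[c2]) > abs(r)
    print(f"{c1} vs {c2}: r = {r:.2f}, sqrt(AVE) = "
          f"{math.sqrt(ave[c1]):.2f}/{math.sqrt(ave[c2]):.2f}, ok = {ok}")
```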
The HTMT is the ratio of the average heterotrait-heteromethod correlations to the average monotrait-heteromethod correlations (Hair et al., 2017; Henseler et al., 2015).
HTMT values close to 1 indicate a lack of discriminant validity.
Threshold value:
HTMT.85 (Kline, 2011): More than 0.85 indicates a lack of discriminant validity. This threshold is used when the variables are conceptually dissimilar.
HTMT.90 (Gold et al., 2001): More than 0.90 indicates a lack of discriminant validity. This threshold is used when the variables are conceptually similar.
If HTMT is greater than 0.90, use bootstrapping to test whether HTMT is significantly different from 1 (HTMT_inference). Does the 90% bootstrap confidence interval of HTMT include 1? If yes, discriminant validity is not satisfactory. If no, discriminant validity is satisfactory, and the researcher can proceed with the analysis.
Source: Hair et al. (2017)
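A minimal sketch of the HTMT computation for two constructs, using a hypothetical item correlation matrix (in practice the PLS software reports HTMT directly):

```python
import numpy as np
from itertools import combinations

def htmt(R, items_a, items_b):
    """Average heterotrait-heteromethod correlation divided by the geometric mean
    of the two constructs' average monotrait-heteromethod correlations."""
    hetero = np.mean([abs(R[i, j]) for i in items_a for j in items_b])
    mono_a = np.mean([abs(R[i, j]) for i, j in combinations(items_a, 2)])
    mono_b = np.mean([abs(R[i, j]) for i, j in combinations(items_b, 2)])
    return hetero / np.sqrt(mono_a * mono_b)

# Hypothetical correlations: items 0-2 belong to construct A, items 3-5 to construct B.
R = np.array([
    [1.00, 0.62, 0.58, 0.41, 0.39, 0.44],
    [0.62, 1.00, 0.60, 0.38, 0.36, 0.40],
    [0.58, 0.60, 1.00, 0.42, 0.35, 0.37],
    [0.41, 0.38, 0.42, 1.00, 0.64, 0.59],
    [0.39, 0.36, 0.35, 0.64, 1.00, 0.61],
    [0.44, 0.40, 0.37, 0.59, 0.61, 1.00],
])
print(f"HTMT = {htmt(R, [0, 1, 2], [3, 4, 5]):.3f}")   # ~0.645, below the 0.85/0.90 thresholds
```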
Retain the constructs that cause the discriminant validity problems in the model and aim at increasing the average monotrait-heteromethod correlations and/or decreasing the average heterotrait-heteromethod correlations of the constructs' measures.
To increase the average monotrait-heteromethod correlations, one can eliminate items that have low correlations with other items measuring the same construct.
To decrease the average heterotrait-heteromethod correlations, one can...
eliminate items that are strongly correlated with items in the opposing construct, or
reassign these indicators to the other construct, if theoretically plausible.
Merge the constructs that cause the problems into a more general construct. Again, measurement theory must support this step.
Internal consistency reliability: composite reliability should be higher than 0.70 (in exploratory research, 0.60 to 0.70 is considered acceptable). Consider Cronbach’s alpha as the lower bound and composite reliability as the upper bound of internal consistency reliability.
Indicator reliability: the indicator’s outer loadings should be higher than 0.70. Indicators with outer loadings between 0.40 and 0.70 should be considered for removal only if the deletion leads to an increase in composite reliability and AVE above the suggested threshold value.
Convergent validity: the AVE should be higher than 0.50.
Discriminant validity:
Use the HTMT criterion to assess discriminant validity in PLS-SEM.
The confidence interval of the HTMT statistic should not include the value 1 for all combinations of constructs.
According to the traditional discriminant validity assessment methods, an indicator’s outer loadings on a construct should be higher than all its cross-loadings with other constructs. Furthermore, the square root of the AVE of each construct should be higher than its highest correlation with any other construct (Fornell-Larcker criterion).
If more than one item in a construct fails to achieve the threshold value of outer loading, the researcher should delete only one item at a time for that particular construct, starting from the item with the lowest loading. After each deletion, the model should be re-estimated.
Items with loadings lower than 0.708 can be kept when the AVE is above 0.50.
If a negative outer loading is found in a construct, the researcher should check whether the reverse-coded items have been addressed (recoded). If the negative value remains, the item should be deleted.
Caveat: the researcher should not delete more than 20% of the indicators in the model (Hair, Babin, & Krey, 2017; Hair et al., 2014). Otherwise, the research moves into EFA rather than CFA, and the credibility of the research instrument becomes very questionable.
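A procedural sketch of this purification loop. The `estimate_model` callback is hypothetical (not a real package function); it stands in for re-running the PLS estimation and is assumed to return the construct's outer loadings and AVE:

```python
def purify_construct(items, estimate_model, n_indicators_in_model,
                     loading_cut=0.708, ave_cut=0.50, max_drop_ratio=0.20):
    deleted = 0
    loadings, ave = estimate_model(items)           # {item: loading}, AVE
    # low-loading items are kept if AVE is already above 0.50 (see guideline above)
    while ave < ave_cut and min(loadings.values()) < loading_cut:
        if (deleted + 1) / n_indicators_in_model > max_drop_ratio:
            break                                   # never delete >20% of the model's indicators
        worst = min(loadings, key=loadings.get)     # delete one item at a time,
        items = [i for i in items if i != worst]    # starting with the lowest loading
        deleted += 1
        loadings, ave = estimate_model(items)       # re-estimate after each deletion
    return items
```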
If no: refine and improve the measures, and design a new study.
If yes: proceed to test the structural model.
Click here for convergent validity & internal consistency calculator.
Please download the data HERE, and draw the following model.
Financial Performance: Please download the data HERE, and draw the following model.
Manufacturing Strategy: Please download data-set here, and draw the model below.
Business Competitiveness: Please download data-set here, and draw the model below.
Smartphone Addiction: Please download data-set here, and draw the model below.
Nawanir, G., Fernando, Y., & Teong, L. K. (2018). A second-order model of lean manufacturing implementation to leverage production line productivity with the importance-performance map analysis. Global Business Review, 19(3_suppl), S114-S129. Click here.
Nawanir, G., Lim, K. T., Othman, S. N., & Adeleke, A. Q. (2018). Developing and validating Lean manufacturing constructs: An SEM approach. Benchmarking: An International Journal, 25(5), 1382-1405. Click here.