Researchers typically seek to generalize their findings beyond the sample at hand. To draw generalizable conclusions, they must determine whether the findings apply not only to the data used in the model estimation process (in-sample) but also to other data sets (i.e., out-of-sample) (Hair et al., 2021).
Researchers have traditionally overlooked assessing a model’s out-of-sample predictive power, relying instead mainly on R-squared, which indicates only a model’s explanatory power (Hair et al., 2021). Assessing predictive power demands the use of prediction-oriented evaluation criteria in PLS-SEM.
In-sample data are the data available and used when the model is constructed, whereas out-of-sample data were not used during model estimation and are only encountered at prediction time. Out-of-sample evaluation is generally considered a more reliable gauge of a model’s performance than in-sample evaluation.
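As a minimal illustration of the distinction (using synthetic data and an ordinary linear regression as a stand-in, not a PLS path model), the error a model achieves on the data it was estimated on can be compared with its error on data it has never seen:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([0.5, 0.3, 0.0, 0.0, 0.0]) + rng.normal(scale=1.0, size=200)

# "In-sample": data used to estimate the model; "out-of-sample": held-out data.
X_in, X_out, y_in, y_out = train_test_split(X, y, test_size=0.3, random_state=0)
model = LinearRegression().fit(X_in, y_in)  # estimated on in-sample data only

rmse_in = mean_squared_error(y_in, model.predict(X_in)) ** 0.5
rmse_out = mean_squared_error(y_out, model.predict(X_out)) ** 0.5
print(f"in-sample RMSE: {rmse_in:.3f}, out-of-sample RMSE: {rmse_out:.3f}")
# The out-of-sample RMSE is typically somewhat higher, which is exactly why
# in-sample fit alone can overstate a model's predictive power.
```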
PLSpredict is a prediction-oriented model evaluation approach: a useful and straightforward way to evaluate the out-of-sample predictive capabilities of PLS path models (Shmueli et al., 2019).
PLSpredict splits the sample data into k folds (i.e., subgroups) of roughly equal size and combines k-1 folds into a training sample, which is used to estimate the model.
The remaining fold serves as a holdout sample to test the model’s predictive ability: the holdout cases are predicted using model parameters derived from the estimation of the training sample. This procedure is repeated until each of the k folds has served once as the holdout sample.
Source: Shmueli et al. (2019)
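A minimal sketch of this cross-validation loop is given below. An ordinary least-squares regression stands in for the PLS path model estimation step; in an actual PLSpredict run, the training folds would be used to estimate the PLS-SEM parameters. The function name is ours, chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def cross_validated_predictions(X, y, k=10, random_state=0):
    """Return out-of-sample predictions: each case is predicted by a model
    estimated on the k-1 folds that do not contain it."""
    y_pred = np.empty_like(y, dtype=float)
    kf = KFold(n_splits=k, shuffle=True, random_state=random_state)
    for train_idx, holdout_idx in kf.split(X):
        # Training sample: k-1 folds combined; stand-in estimator here.
        model = LinearRegression().fit(X[train_idx], y[train_idx])
        # Holdout sample: predicted from parameters of the training sample.
        y_pred[holdout_idx] = model.predict(X[holdout_idx])
    return y_pred
```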
(1) PLS-SEM < LM for none of the indicators: If the PLS-SEM analysis yields lower prediction errors in terms of the RMSE (or the MAE) than the naïve linear regression model (LM) benchmark for none of the indicators, this indicates that the model lacks predictive power.
(2) PLS-SEM < LM for a minority of the indicators: If the minority of the dependent construct’s indicators produces lower PLS-SEM prediction errors compared to the naïve LM benchmark, this indicates that the model has a low predictive power.
(3) PLS-SEM < LM for a majority of the indicators: If the majority (or the same number) of indicators in the PLS-SEM analysis yields smaller prediction errors compared to the LM, this indicates a medium predictive power.
(4) PLS-SEM < LM for all indicators: If all indicators in the PLS-SEM analysis have lower RMSE (or MAE) values compared to the naïve LM benchmark, the model has high predictive power.
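These four categories can be expressed as a simple decision rule. The sketch below assumes per-indicator RMSE values from a PLSpredict run are already available; the dictionaries, values, and indicator names shown are hypothetical.

```python
def classify_predictive_power(rmse_pls, rmse_lm):
    """Count how many indicators favour PLS-SEM over the naive LM benchmark
    and map the count to the four categories above."""
    indicators = list(rmse_pls)
    wins = sum(rmse_pls[i] < rmse_lm[i] for i in indicators)
    if wins == 0:
        return "lacks predictive power"
    if wins < len(indicators) / 2:
        return "low predictive power"
    if wins < len(indicators):          # majority (or exactly half)
        return "medium predictive power"
    return "high predictive power"

# Hypothetical per-indicator RMSE values, for illustration only:
print(classify_predictive_power(
    rmse_pls={"y1": 1.10, "y2": 0.95, "y3": 0.88},
    rmse_lm={"y1": 1.05, "y2": 0.99, "y3": 0.92},
))  # -> "medium predictive power" (2 of 3 indicators favour PLS-SEM)
```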
Shmueli et al. (2019) provided rules of thumb for running PLSpredict as follows:
Use ten folds (i.e., k = 10), but ensure that the training sample (the combined k-1 folds) still meets the model’s minimum sample size requirements. If not, choose a higher value for k.
Use ten repetitions (i.e., r = 10) when the aim is to predict a new observation using the average of predictions from multiple estimated models. Alternatively, use one repetition (i.e., r = 1) when the predictions should be based on a single model.
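A sketch of the repetition logic, building on the cross_validated_predictions() helper sketched earlier (that function name is our own, not part of PLSpredict):

```python
import numpy as np

def repeated_predictions(X, y, k=10, r=10):
    # One complete k-fold pass per repetition, each with a different random
    # fold assignment; predictions are then averaged case by case.
    runs = [cross_validated_predictions(X, y, k=k, random_state=rep)
            for rep in range(r)]
    return np.mean(runs, axis=0)
```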
Assessment of a model’s predictive power should primarily rely on one key target construct.
To assess the degree of prediction error, use the RMSE (Root Mean Squared Error) unless the prediction error distribution is highly non-symmetric. In this case, the MAE (Mean Absolute Error) is the more appropriate prediction statistic.
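A sketch of this choice: both statistics are computed from the same prediction errors, and a simple moment-based skewness check (our own assumption for operationalizing “highly non-symmetric”, not part of the formal guidelines) can flag when the MAE is preferable:

```python
import numpy as np

def prediction_error_stats(y_true, y_pred):
    errors = y_true - y_pred
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    mae = float(np.mean(np.abs(errors)))
    z = (errors - errors.mean()) / errors.std()
    skewness = float(np.mean(z ** 3))  # |skewness| far from 0 -> prefer MAE
    return {"RMSE": rmse, "MAE": mae, "skewness": skewness}
```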
Examine each indicator’s Q2predict value from the PLS-SEM analysis. A negative Q2predict value indicates that the model lacks predictive power.
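Q2predict compares the model’s out-of-sample prediction errors against a naïve benchmark that predicts every holdout case with the indicator’s mean from the training sample. A minimal sketch of the statistic (function and argument names are ours):

```python
import numpy as np

def q2_predict(y_holdout, y_pred, y_train_mean):
    sse_model = np.sum((y_holdout - y_pred) ** 2)        # model's errors
    sse_naive = np.sum((y_holdout - y_train_mean) ** 2)  # mean-value benchmark
    return 1.0 - sse_model / sse_naive                   # <= 0 -> lacks predictive power
```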
Compare the RMSE or the MAE value with the LM value of each indicator. Check if the PLS-SEM analysis (compared to the LM) yields lower prediction errors in terms of RMSE (or MAE) for all (high predictive power), the majority (medium predictive power), the minority (low predictive power), or none of the indicators (lack of predictive power).
Examine the distribution of the prediction errors. PLS-SEM-based residuals should be normally distributed; a left-tailed distribution indicates over-prediction, a right-tailed distribution indicates under-prediction. Also compare the distributions of the prediction errors from PLS-SEM with those from LM. The distributions should correspond closely.
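One way to operationalize this check (an illustration, not part of the formal guidelines) is to compute the skewness of both error distributions and verify that they are near zero and close to each other:

```python
from scipy.stats import skew

def compare_error_distributions(errors_pls, errors_lm):
    # errors = actual - predicted; negative skew (left tail) suggests
    # over-prediction, positive skew (right tail) under-prediction.
    return {"skew_pls": float(skew(errors_pls)),
            "skew_lm": float(skew(errors_lm))}
```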
Indicators with low predictive power should be analyzed in terms of data issues (e.g., data distribution and outliers) and measurement model issues (e.g., loadings). Consider deleting problematic indicators, but assess the effect of doing so on measurement model quality.
Shmueli, G., Sarstedt, M., Hair, J. F., Cheah, J.-H., Ting, H., Vaithilingam, S., & Ringle, C. M. (2019). Predictive model assessment in PLS-SEM: Guidelines for using PLSpredict. European Journal of Marketing, 53(11), 2322-2347.