Stacking model
In a nutshell: train multiple models and use those models' outputs to train a final model.
The assumption is that each individual model is good at something, but not everything. The final model combines the merits of all of them.
Input
1 train dataset (80% of the data)
1 test dataset (20% of the data)
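The 80/20 split might look like this in scikit-learn (the synthetic dataset here is made up purely for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data, only for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# 80% train / 20% test, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_test.shape)  # (800, 20) (200, 20)
```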
First, train 3 separate models M1, M2 and M3:
1. M1: XGBoost, using xgboost.cv to find the best parameters
2. M2: logistic regression with cross-validation, tuning its own set of parameters
3. M3: a neural network with a few layers, whose architecture is fixed in advance
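A minimal sketch of the three base models. To keep it self-contained, scikit-learn's GradientBoostingClassifier stands in for xgboost, LogisticRegressionCV covers the cross-validated LR, and a small MLP plays the neural network; the hyperparameters are illustrative, not tuned:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegressionCV
from sklearn.neural_network import MLPClassifier

# Illustrative data; in practice this is the 80% train split.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# M1: gradient boosting (stand-in for xgboost; the notes tune it via xgboost.cv).
m1 = GradientBoostingClassifier(random_state=0)
# M2: logistic regression with built-in cross-validated parameter search.
m2 = LogisticRegressionCV(cv=5, max_iter=1000)
# M3: a small neural network with a fixed architecture.
m3 = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)

for model in (m1, m2, m3):
    model.fit(X, y)
```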
Now there are 3 models. The next step is to have them predict on another set of training data, and use those predictions to train the final stacking model.
However, there is no second train dataset, and reserving one would waste training data.
So simply reuse the same train data as follows.
Divide the train dataset into 5 folds
for each fold i in the 5 folds:
    test_temp  = fold i
    train_temp = the other 4 folds combined
    train Model 1' (xgboost) on train_temp
    train Model 2' (lr) on train_temp
    train Model 3' (NN) on train_temp
    predict test_temp with Model 1', 2' and 3' separately; store the predictions as M1', M2' and M3'
/* Don't train Models 1, 2 and 3 on the whole train dataset, otherwise the predictions on the train dataset would be overfitted.
The 5-fold CV here is not for parameter tuning. It simply provides prediction values M1', M2' and M3' for the train dataset
without ever predicting on exactly the data used for training.
M1', M2' and M3' will be used to train the final stacked model.
Models 1', 2' and 3' are different from Models M1, M2 and M3, which are trained on the full train dataset.
*/
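The 5-fold loop above can be sketched like this, again with scikit-learn stand-ins for the three models. Each row of the train set receives exactly one out-of-fold predicted probability per model:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPClassifier

# Illustrative train data (in practice, the 80% split).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

base_models = [
    GradientBoostingClassifier(random_state=0),  # stand-in for xgboost
    LogisticRegression(max_iter=1000),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
]

# One column per base model: these become M1', M2', M3'.
oof_preds = np.zeros((len(X), len(base_models)))

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    train_temp_X, train_temp_y = X[train_idx], y[train_idx]  # the other 4 folds
    test_temp_X = X[test_idx]                                # fold i
    for j, model in enumerate(base_models):
        # Model j' is trained on 4 folds, never on the fold it predicts.
        model.fit(train_temp_X, train_temp_y)
        oof_preds[test_idx, j] = model.predict_proba(test_temp_X)[:, 1]
```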
Now every record in the train dataset also has the predictions M1', M2' and M3'.
Use M1', M2', M3' and, optionally, any existing features in the train dataset to train another LR model.
This LR model is the final stacked model.
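A sketch of fitting that final LR. Here scikit-learn's cross_val_predict performs the same 5-fold out-of-fold loop described above, so the meta-features can be built in one step (base models are again stand-ins):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPClassifier

# Illustrative train data (in practice, the 80% split).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

base_models = [
    GradientBoostingClassifier(random_state=0),  # stand-in for xgboost
    LogisticRegression(max_iter=1000),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
]

# Columns are the out-of-fold predictions M1', M2', M3'.
meta_X = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])
# Optionally append the original features:
# meta_X = np.hstack([meta_X, X])

# The final stacked model: an LR trained on the base models' predictions.
stacker = LogisticRegression()
stacker.fit(meta_X, y)
```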
Output and Test
Use Models M1, M2 and M3 (trained on the full train dataset) to predict on the test dataset.
Feed the results from M1, M2 and M3 into the stacked LR model to get the final prediction.
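For reference, scikit-learn's StackingClassifier packages essentially this whole recipe: cv=5 produces the out-of-fold predictions for the meta-model, the base models are then refit on the full train set (the M1, M2, M3 step), and prediction on the test set flows through the final LR. A sketch with stand-in base models:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Illustrative data with the 80/20 split from above.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

stack = StackingClassifier(
    estimators=[
        ("gb", GradientBoostingClassifier(random_state=0)),  # stand-in for xgboost M1
        ("lr", LogisticRegression(max_iter=1000)),           # M2
        ("nn", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                             random_state=0)),               # M3
    ],
    final_estimator=LogisticRegression(),  # the final stacked LR
    cv=5,  # out-of-fold predictions for the meta-model, as in the loop above
)
stack.fit(X_train, y_train)
final_pred = stack.predict(X_test)
```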