Intel Compiler Benchmarks
May 2018 Edition
Datasets Used
Higgs
- 11,000,000 training observations (the full HIGGS dataset)
- 406 features (the 28 original features plus their 378 pairwise interaction (multiplication) features; see the construction sketch after this list)
- Metric: time taken to finish 10 boosting iterations
- 4,466,000,000 elements
- 665,073,059 zero values (14.89% sparsity)
- 35,728,026,424 bytes (33.3GB)
- Peak RAM requirement: 82GB
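For reproducibility, here is a minimal R sketch of how such a feature set can be built. The file name HIGGS.csv and the column layout (label first, then the 28 features) follow the UCI repository distribution and are assumptions, not part of the benchmark scripts:

# Minimal sketch: build the 406-column Higgs design matrix.
# Assumes HIGGS.csv has the label in column 1 and the 28 features
# in columns 2 to 29 (the UCI layout).
library(data.table)
higgs <- fread("HIGGS.csv", header = FALSE)
label <- higgs[[1]]
X <- as.matrix(higgs[, 2:29])
# 28 original features + 378 pairwise products = 406 columns
pairs <- combn(28, 2)
X <- cbind(X, X[, pairs[1, ]] * X[, pairs[2, ]])
dim(X)  # 11000000 x 406

At 8 bytes per element, the 4,466,000,000 entries account for the ~35.7 billion bytes listed above; intermediate copies during construction explain the much higher peak RAM.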
Servers Used (Hardware / Software)
1-64 thread runs
- CPU: Dual Xeon Gold 6130 (3.7 GHz turbo / 2.8 GHz all-core, 32 cores / 64 threads total)
- RAM: 384GB 2666MHz
- OS: Windows 10 Enterprise + Windows Subsystem for Linux, Ubuntu 16.04
- Virtualization: None
- R 3.5, compiled from source
- gcc: 5.4 and 8.1
- icc: 2018.2.199 (from Intel Parallel Studio XE 2018 Update 2)
Gradient Boosted Trees Algorithms Used
xgboost
- Version used: commit 8f6aadd
- Flags used:
  - gcc: -O3 -mtune=native
  - icc: -O3 -ipo -qopenmp -xHost -fPIC
LightGBM
- Version used: commit 3f54429
- Flags used:
  - gcc: -O3 -mtune=native
  - icc: -O3 -ipo -qopenmp -xHost -fPIC
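For reference, one way to have R's package compilation pick up a given compiler and flag set is a user-level ~/.R/Makevars. The sketch below (written from R for convenience) shows the icc configuration; which variables a given package honors (e.g. CXX11 vs CXX) depends on its build, and LightGBM's cmake-driven build instead takes its compilers from the cmake_cmd edit shown in the installation section below:

# Minimal sketch: write a user-level ~/.R/Makevars selecting icc/icpc
# with the icc flag set listed above. The choice of variables is an
# assumption; adjust to what the package's build actually reads.
dir.create("~/.R", showWarnings = FALSE)
writeLines(c(
  "CC = icc",
  "CXX = icpc",
  "CXX11 = icpc",
  "CFLAGS = -O3 -ipo -qopenmp -xHost -fPIC",
  "CXXFLAGS = -O3 -ipo -qopenmp -xHost -fPIC",
  "CXX11FLAGS = -O3 -ipo -qopenmp -xHost -fPIC"
), "~/.R/Makevars")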
Installation of Gradient Boosted Trees Algorithms
xgboost
Installing xgboost must be done from bash:
git clone --recursive https://github.com/dmlc/xgboost
cd xgboost
git checkout 8f6aadd
cd R-package
In src/Makevars.in, add -DUSE_AVX=ON at line 11.
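If you prefer to script that edit, a minimal R sketch (assuming the working directory is xgboost/R-package):

# Append -DUSE_AVX=ON to line 11 of src/Makevars.in, as described above.
mk <- readLines("src/Makevars.in")
mk[11] <- paste(mk[11], "-DUSE_AVX=ON")
writeLines(mk, "src/Makevars.in")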
Then compile xgboost, launching R from the R-package directory:
R
install.packages('.', repos = NULL, type = "source")
LightGBM
LightGBM also requires a specific installation from bash:
git clone --recursive https://github.com/Microsoft/LightGBM
cd LightGBM
git checkout 3f54429
cd R-package
In src/Makevars.in, replace the cmake_cmd content at line 50 with: "cmake -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc "
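The same edit can be scripted from R (assuming the working directory is LightGBM/R-package and that the cmake command string sits at line 50 as described above):

# Prefix the cmake call at line 50 with the icc/icpc compiler selection.
f <- "src/Makevars.in"
mk <- readLines(f)
mk[50] <- sub('"cmake ', '"cmake -DCMAKE_C_COMPILER=icc -DCMAKE_CXX_COMPILER=icpc ',
              mk[50], fixed = TRUE)
writeLines(mk, f)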
Then compile LightGBM:
R CMD INSTALL --build . --no-multiarch
Hyperparameters Used (Full list)
xgboost
Hyperparameters, with timings averaged over 5 runs (approximately 48 hours in total); an R sketch of the training call follows the notes below:
- Depth: 8
- Leaves: 255
- Hessian: 1
- Minimum Loss to split: 0
- Column Sampling: 100%
- Row Sampling: 100%
- Iterations: 10
- Learning Rate: 0.25
- Boosting method: gbdt, fast histogram
- Bins: 255
- Loss function: binary:logistic
Note: the timing takes into account the binning construction time, which is approximately 50% to 70% of the xgboost timing.
It takes 13 minutes with 1 thread, 2 minutes with 64 threads.
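For concreteness, a minimal R sketch of how this list could map onto the xgboost R API; dtrain (an xgb.DMatrix holding the 406-feature Higgs data), the lossguide grow policy, and the exact parameter mapping are assumptions rather than the benchmark's verbatim script:

# Minimal sketch of the xgboost call; dtrain is an xgb.DMatrix (assumed).
library(xgboost)
params <- list(
  booster = "gbtree",
  tree_method = "hist",          # gbdt with the fast histogram method
  grow_policy = "lossguide",     # assumption: lets max_leaves take effect
  max_depth = 8,
  max_leaves = 255,
  min_child_weight = 1,          # Hessian
  gamma = 0,                     # minimum loss to split
  colsample_bytree = 1,          # column sampling 100%
  subsample = 1,                 # row sampling 100%
  eta = 0.25,                    # learning rate
  max_bin = 255,
  objective = "binary:logistic",
  nthread = 64                   # varied from 1 to 64 in the benchmark
)
model <- xgb.train(params = params, data = dtrain, nrounds = 10)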
LightGBM
Hyperparameters, with timings averaged over 5 runs (approximately 14 hours in total); an R sketch of the training call follows the notes below:
- Depth: 8
- Leaves: 255
- Hessian: 1
- Minimum Loss to split: 0
- Column Sampling: 100%
- Row Sampling: 100%
- Iterations: 10
- Learning Rate: 0.25
- Boosting method: gbdt
- Bins: 255
- Loss function: binary
Note: the timing does not take into account the binning construction time.
It takes 16 minutes using 1 thread, 23 seconds using 64 threads.
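The equivalent minimal sketch for the LightGBM R API; dtrain (an lgb.Dataset) and the parameter mapping are likewise assumptions:

# Minimal sketch of the LightGBM call; dtrain is an lgb.Dataset built
# with max_bin = 255 (max_bin is a dataset parameter in LightGBM).
library(lightgbm)
params <- list(
  objective = "binary",
  boosting = "gbdt",
  max_depth = 8,
  num_leaves = 255,
  min_sum_hessian_in_leaf = 1,   # Hessian
  min_gain_to_split = 0,         # minimum loss to split
  feature_fraction = 1,          # column sampling 100%
  bagging_fraction = 1,          # row sampling 100%
  learning_rate = 0.25,
  num_threads = 64               # varied from 1 to 64 in the benchmark
)
model <- lgb.train(params = params, data = dtrain, nrounds = 10)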
Performance Analysis (gcc 5.4 vs gcc 8.1 vs icc)
Use this Performance Analysis to compare timing data across compilers.
Explore it interactively on Tableau Public:
Provided dynamic and interactive filters:
- Threads
Performance Analysis (gcc 5.4 vs icc)
Use this Performance Analysis to compare timing data across compilers.
Explore it interactively on Tableau Public:
Provided dynamic and interactive filters:
- Threads