Xgb vs Lgb Benchmarks

February 2017 Edition

Datasets Used

Bosch

1,000,000 training observations (first)
183,747 testing observations (last)
969 features (ID removed, rescaled, NAs as zeros, one feature with count of NAs)
Metric: AUC (because it is a global metric)

Higgs

10,000,000 training observations (first)
1,000,000 testing observations (last)
29 features (rescaled, NAs as zeros, one feature with count of NAs)
Metric: AUC (because it is a global metric)
Reality: dropped from the analysis due to convergence issues, but the Higgs results can still be viewed
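The preprocessing listed for both datasets (rescaling, NAs as zeros, one extra feature counting NAs, first rows for training and last rows for testing) can be sketched as below. This is only an illustration: the page does not say which rescaling was used or in which order the steps ran, so min-max scaling and the function names `preprocess` and `split_first_last` are assumptions.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing (NA)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def preprocess(rows):
    """Rescale each column, set NAs to zero, and append a per-row NA count.

    Assumption: min-max rescaling on observed values; the source only says "rescaled".
    """
    n_cols = len(rows[0])
    lo = [min((r[j] for r in rows if not is_missing(r[j])), default=0.0) for j in range(n_cols)]
    hi = [max((r[j] for r in rows if not is_missing(r[j])), default=0.0) for j in range(n_cols)]
    out = []
    for r in rows:
        na_count = sum(is_missing(v) for v in r)
        scaled = []
        for j, v in enumerate(r):
            if is_missing(v) or hi[j] == lo[j]:
                scaled.append(0.0)  # NAs (and constant columns) become zero
            else:
                scaled.append((v - lo[j]) / (hi[j] - lo[j]))
        out.append(scaled + [float(na_count)])  # extra feature: count of NAs
    return out


def split_first_last(rows, n_train):
    """First n_train rows for training, the remaining rows for testing."""
    return rows[:n_train], rows[n_train:]
```

With Bosch, `n_train` would be 1,000,000 and the remaining 183,747 rows would form the test set.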

Servers Used (Hardware / Software)

Runs with 1 to 6 threads and with 12 threads

CPU: i7-3930K (3.9/3.5 GHz, 6c/12t)
RAM: 64GB 1600MHz (54GB allocated to the child VM)
OS: Ubuntu 16.04 (4.4 kernel)
Virtualization: KVM + Windows Server 2012 R2 Datacenter
Microsoft R Client 3.3.2
Rtools 3.4

Runs with 20 threads, with 40 threads, and on GPU

CPU: Dual Quanta Freedom Ivy Bridge (3.3/2.7 GHz, 20c/40t)
RAM: 96GB 1600MHz (80GB allocated to the child VM)
OS: Ubuntu 16.04 (4.4 kernel)
Virtualization: KVM + Windows Server 2012 R2 Datacenter
Microsoft R Client 3.3.2
Rtools 3.4
GPU: Intel OpenCL on CPU
Reality: dropped from the analysis because too many threads or GPU use are known to hurt performance, but the results can still be viewed

Gradient Boosted Trees Algorithms Used

xgboost

Versions used: commit b4d97d3
Flags used:
- Default: -O2 -mtune=core2
- Extra 1: -O3 -funroll-loops -march=native
- Extra 2: -O3 -funroll-loops -march=native -ffast-math + -Wl,-O1 -O3 -Wl,-ffast-math on linker

LightGBM

Versions used: commit ea6bc0a (v1) and 1bf7bbd (v2)
Flags used:
- Default: -O2 -mtune=core2
- Extra 1: -O3 -march=native
- Extra 2: -O3 -march=native -ffast-math + -Wl,-O1 -O3 -Wl,-ffast-math on linker
- Extra 3: -O2 -march=native
- Extra 4: -Os

Installation of Gradient Boosted Trees Algorithms

xgboost

Manual installation of commit dmlc/xgboost@b4d97d3 (Pull Request 2045, Feb 20 2017) or
devtools::install_github("Laurae2/ez_xgb/R-package@2017-02-15-v1", force = TRUE)

LightGBM

LightGBM v1: devtools::install_github("Microsoft/LightGBM@v1@ea6bc0a", subdir = "R-package", force = TRUE)
LightGBM v2: devtools::install_github("Microsoft/LightGBM@1bf7bbd", subdir = "R-package", force = TRUE)
LightGBM v2 + Intel OpenCL: devtools::install_github("Laurae2/LightGBM@7514fde", subdir = "R-package", force = TRUE) (or make an installation using https://github.com/Microsoft/LightGBM/pull/448)

Hyperparameters Used (Full list)

Bosch

Leaves Hyperparameters:

Depth: ∞
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: {2K, 1.5K, 1K, 1K, 500, 400, 400, 400, 400, 400}
Learning Rate: 0.02

Depth Hyperparameters:

Depth: {3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: {2K, 1.5K, 1K, 1K, 500, 400, 400, 400, 400, 400}
Learning Rate: 0.02
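The Leaves and Depth grids above line up: each leaf count equals 2^depth - 1 for the matching depth (7 for depth 3, up to 4095 for depth 12). The sketch below builds both Bosch grids, assuming the iteration counts pair position by position with the leaf counts, which the page implies but does not state. The name `bosch_grid` and the dictionary keys are hypothetical.

```python
DEPTHS = list(range(3, 13))                      # 3 .. 12
LEAVES = [2 ** d - 1 for d in DEPTHS]            # 7, 15, ..., 4095
BOSCH_ITERATIONS = [2000, 1500, 1000, 1000, 500, 400, 400, 400, 400, 400]


def bosch_grid(depth_limited):
    """Build the 10 Bosch configurations for the Depth runs (depth_limited=True)
    or the Leaves runs (depth_limited=False, i.e. unlimited depth)."""
    grid = []
    for depth, leaves, iterations in zip(DEPTHS, LEAVES, BOSCH_ITERATIONS):
        grid.append({
            "depth": depth if depth_limited else None,  # None stands for "Depth: infinity"
            "leaves": leaves,
            "hessian": 1,
            "column_sampling": 1.0,
            "row_sampling": 1.0,
            "iterations": iterations,
            "learning_rate": 0.02,
        })
    return grid
```

Calling `bosch_grid(False)` gives the Leaves runs and `bosch_grid(True)` the Depth runs; the only difference between the two is the depth cap.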

Pruning Hyperparameters:

Depth: 10
Leaves: 1023
Hessian: {1, 5, 25, 125}
Column Sampling: 100%
Row Sampling: 100%
Iterations: 400
Learning Rate: 0.02

Sampling Hyperparameters:

Depth: 6
Leaves: 63
Hessian: 1
Column Sampling: 100%
Row Sampling: {100%, 80%, 60%, 40%}
Iterations: 1000
Learning Rate: 0.04

Higgs

Leaves Hyperparameters:

Depth: ∞
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: 500
Learning Rate: 0.25

Depth Hyperparameters:

Depth: {3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: 500
Learning Rate: 0.25

Pruning Hyperparameters:

Depth: 10
Leaves: 1023
Hessian: {1, 5, 25, 125}
Column Sampling: 100%
Row Sampling: 100%
Iterations: 500
Learning Rate: 0.25

Sampling Hyperparameters:

Depth: 6
Leaves: 63
Hessian: 1
Column Sampling: 100%
Row Sampling: {100%, 80%, 60%, 40%}
Iterations: 1000
Learning Rate: 0.25
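The generic names used in these lists (Depth, Leaves, Hessian, Column Sampling, Row Sampling, Learning Rate) correspond to differently named parameters in each library. The mapping below is a sketch based on the parameter names documented by xgboost and LightGBM; the exact settings used in the benchmark are not given on this page, so the `grow_policy`, `tree_method` and `bagging_freq` values are assumptions, and the function names are hypothetical.

```python
def to_xgboost_params(cfg):
    """Map a generic configuration to xgboost parameter names."""
    unlimited = cfg["depth"] is None
    return {
        "eta": cfg["learning_rate"],
        "max_depth": 0 if unlimited else cfg["depth"],   # 0 means no depth limit
        "max_leaves": cfg["leaves"],
        "min_child_weight": cfg["hessian"],              # minimum sum of hessians in a child
        "subsample": cfg["row_sampling"],
        "colsample_bytree": cfg["column_sampling"],
        # Assumption: leaf-wise growth with the histogram method when depth is unlimited.
        "grow_policy": "lossguide" if unlimited else "depthwise",
        "tree_method": "hist",
    }


def to_lightgbm_params(cfg):
    """Map a generic configuration to LightGBM parameter names."""
    sampling = cfg["row_sampling"] < 1.0
    return {
        "learning_rate": cfg["learning_rate"],
        "max_depth": -1 if cfg["depth"] is None else cfg["depth"],  # -1 means no limit
        "num_leaves": cfg["leaves"],
        "min_sum_hessian_in_leaf": cfg["hessian"],
        "bagging_fraction": cfg["row_sampling"],
        # Row sampling only takes effect in LightGBM when bagging_freq > 0.
        "bagging_freq": 1 if sampling else 0,
        "feature_fraction": cfg["column_sampling"],
    }
```

The number of iterations is not part of either dictionary because both libraries take it as a separate training argument rather than as a tree parameter.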

Metric Performance Analysis

Use the Metric Performance Analysis to compare AUC results. Note that LightGBM results are not reproducible when training is stochastic (that is, when sampling is enabled).
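For readers who want to check AUC values by hand, the metric can be computed from ranks alone (the Mann-Whitney formulation). The sketch below is a plain-Python illustration, not the code used to produce the benchmark numbers; the function name `auc` is hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve from ranks, with average ranks for tied scores.

    labels: iterable of 0/1 values; scores: predicted scores of the same length.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

Because only the ordering of the scores matters, the result does not depend on any classification threshold.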

Check interactively on Tableau Public:

https://public.tableau.com/views/gbt_benchmarks/AUC-Data?:showVizHome=no

Provided dynamic and interactive filters:

Type
Dataset
Algorithm
Parameter
Threads
CPU

Gradient Boosted Trees Mega Benchmark: AUC - Data

Master Data Analysis

Use the Master Data Analysis to view all data at the same time.

Check interactively on Tableau Public:

https://public.tableau.com/views/gbt_benchmarks/Master-Data?:showVizHome=no

Provided dynamic and interactive filters:

Type
Dataset
Algorithm
Parameter
Threads
CPU

Gradient Boosted Trees Mega Benchmark: Master - Data

Disaggregated Data Analysis

Use the Disaggregated Data Analysis if you want fully detailed results.

Check interactively on Tableau Public:

https://public.tableau.com/views/GradientBoostedTreesMegaBenchmarkdisaggregated/DisaggregatedData?:showVizHome=no

Provided dynamic and interactive filters:

Type
Dataset
Algorithm
Parameter
Threads
CPU
Iteration

Gradient Boosted Trees Mega Benchmark (disaggregated)

Leaves Analysis

Bosch

Leaves Hyperparameters:

Depth: ∞
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: {2K, 1.5K, 1K, 1K, 500, 400, 400, 400, 400, 400}
Learning Rate: 0.02

Higgs

Leaves Hyperparameters:

Depth: ∞
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: 500
Learning Rate: 0.25

Use the Leaves Analysis to view only the Leaves data.

Check interactively on Tableau Public:

https://public.tableau.com/views/gbt_benchmarks/Leaves-Data?:showVizHome=no

Provided dynamic and interactive filters:

Dataset
Algorithm
Parameter
Threads
CPU

Gradient Boosted Trees Mega Benchmark: Leaves - Data

Depth Analysis

Bosch

Depth Hyperparameters:

Depth: {3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: {2K, 1.5K, 1K, 1K, 500, 400, 400, 400, 400, 400}
Learning Rate: 0.02

Higgs

Depth Hyperparameters:

Depth: {3, 4, 5, 6, 7, 8, 9, 10, 11, 12}
Leaves: {7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095}
Hessian: 1
Column Sampling: 100%
Row Sampling: 100%
Iterations: 500
Learning Rate: 0.25

Use the Depth Analysis to view only the Depth data.

Check interactively on Tableau Public:

https://public.tableau.com/views/gbt_benchmarks/Depth-Data?:showVizHome=no

Provided dynamic and interactive filters:

Dataset
Algorithm
Parameter
Threads
CPU

Gradient Boosted Trees Mega Benchmark: Depth - Data

Pruning Analysis

Bosch

Pruning Hyperparameters:

Depth: 10
Leaves: 1023
Hessian: {1, 5, 25, 125}
Column Sampling: 100%
Row Sampling: 100%
Iterations: 400
Learning Rate: 0.02

Higgs

Pruning Hyperparameters:

Depth: 10
Leaves: 1023
Hessian: {1, 5, 25, 125}
Column Sampling: 100%
Row Sampling: 100%
Iterations: 500
Learning Rate: 0.25

Use the Pruning Analysis to view only the Pruning data.

Check interactively on Tableau Public:

https://public.tableau.com/views/gbt_benchmarks/Pruning-Data?:showVizHome=no

Provided dynamic and interactive filters:

Dataset
Algorithm
Parameter
Threads
CPU

Gradient Boosted Trees Mega Benchmark: Pruning - Data

Sampling Analysis

Bosch

Sampling Hyperparameters:

Depth: 6
Leaves: 63
Hessian: 1
Column Sampling: 100%
Row Sampling: {100%, 80%, 60%, 40%}
Iterations: 1000
Learning Rate: 0.04

Higgs

Sampling Hyperparameters:

Depth: 6
Leaves: 63
Hessian: 1
Column Sampling: 100%
Row Sampling: {100%, 80%, 60%, 40%}
Iterations: 1000
Learning Rate: 0.25

Use the Sampling Analysis to view only the Sampling data.

Check interactively on Tableau Public:

https://public.tableau.com/views/gbt_benchmarks/Sampling-Data?:showVizHome=no

Provided dynamic and interactive filters:

Dataset
Algorithm
Parameter
Threads
CPU

Gradient Boosted Trees Mega Benchmark: Sampling - Data
