Windows vs WSL Benchmarks
May 2018 Edition
WSL what the hell?
The Windows Subsystem for Linux (WSL) is a (partially) native Linux environment running right inside Windows:
- It behaves like virtualization, even though it is not true virtualization.
- Forking is available! (when creating parallel R processes, the main memory is shared at the time of forking; objects created inside the parallel processes are not shared and must be gathered manually)
- It incurs a small performance cost because the Windows kernel intercepts the Linux system calls and translates them; in addition, not every Linux feature is available (there is no real Linux kernel, for instance)
- You can install R and Python inside without any issues
- You can even install RStudio Server, but it must be restarted after every reboot. This is very simple: open a Bash shell, then run:
sudo rstudio-server restart
- You can also run Jupyter Notebook / Jupyter Lab, no questions asked
- WSL obeys the Windows Firewall: you can use the Advanced settings graphical interface to set up the protection of your Linux processes
- WSL cannot yet pass through the GPU: there is no GPU access in WSL!
You can find the installation steps for a full R and Python setup at the following link: https://github.com/Laurae2/R_Installation
Try to run the following in R to test whether forking is available:
library(parallel)
# Create a large vector (~0.75 GB) in the master process
set.seed(11111)
y <- runif(n = 100000000)
format(object.size(y), units = "Gb")
# Fork one worker per logical core (this is the step that fails on native Windows)
cl <- makeForkCluster(detectCores())
# Forked workers see the master's objects (here y) without any explicit export
parSapply(cl, X = seq_len(detectCores()), function(x) {c(exists("X"), exists("y"))})
parSapply(cl, X = seq_len(detectCores()), function(x) {length(y)})
parSapply(cl, X = seq_len(detectCores()), function(x) {head(y)})
stopCluster(cl)
You should get results for the last printed element (a matrix with as many columns as you have cores).
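For comparison, on native Windows R forking is unavailable and makeForkCluster fails; the fallback is a PSOCK cluster, where every object has to be exported (copied) to the workers explicitly. A minimal sketch, assuming the same y as above:
library(parallel)
# PSOCK workers start with an empty environment, so y must be copied to each one
cl <- makePSOCKcluster(detectCores())
clusterExport(cl, "y")
parSapply(cl, X = seq_len(detectCores()), function(x) {length(y)})
stopCluster(cl)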
Datasets Used
Higgs
- 11,000,000 training observations (all)
- 406 features (multiplicative interactions of all pairwise combinations of the 28 original features; one possible construction is sketched after this list)
- Metric: time taken to finish 10 boosting iterations
- 4,466,000,000 elements
- 665,073,059 zero values (14.89% sparsity)
- 35,728,026,424 bytes (33.3GB)
- Peak RAM requirement: 82GB
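A minimal sketch of how such a 406-column design matrix can be built, assuming higgs is a numeric matrix holding the 28 original HIGGS features; the exact construction used for the benchmark may differ (for instance, whether the 28 originals are kept as-is or their squares are included, both of which give 406 columns):
# 378 distinct pairwise products plus the 28 original features = 406 columns
pairs <- combn(ncol(higgs), 2)
interactions <- higgs[, pairs[1, ]] * higgs[, pairs[2, ]]
colnames(interactions) <- paste0("f", pairs[1, ], "_x_f", pairs[2, ])
X <- cbind(higgs, interactions)
dim(X)  # 11,000,000 x 406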
Servers Used (Hardware / Software)
1-64 thread runs
- CPU: Dual Xeon Gold 6130 (3.7/2.8 GHz, 32c/64t)
- RAM: 384GB 2666MHz
- OS: Windows 10 Enterprise + Windows Subsystem for Linux, Ubuntu 16.04
- Virtualization: None
- R 3.5, compiled
- Windows: Visual Studio 2017
- WSL: gcc 5.4
Gradient Boosted Trees Algorithms Used
xgboost
- Versions used: commit 8f6aadd
- Flags used:
  - gcc: -O3 -mtune=native
LightGBM
- Versions used: commit 3f54429
- Flags used:
  - gcc: -O3 -mtune=native
Installation of Gradient Boosted Trees Algorithms
xgboost
Installing xgboost directly from R:
# xgbdl downloads, compiles, and installs the xgboost R package at the given commit
devtools::install_github("Laurae2/xgbdl")
xgbdl::xgb.dl(compiler = "gcc", commit = "8f6aadd", use_avx = TRUE, use_gpu = FALSE)
LightGBM
Installing LightGBM directly from R:
# lgbdl does the same for LightGBM
devtools::install_github("Laurae2/lgbdl")
lgbdl::lgb.dl(commit = "3f54429", compiler = "gcc")
Hyperparameters Used (Full list)
xgboost
Hyperparameters used, with timings averaged over 5 runs (approximately 48h of compute):
- Depth: 8
- Leaves: 255
- Hessian: 1
- Minimum Loss to split: 0
- Column Sampling: 100%
- Row Sampling: 100%
- Iterations: 10
- Learning Rate: 0.25
- Boosting method: gbdt, fast histogram
- Bins: 255
- Loss function: binary:logistic
Note: the timing takes into account the binning construction time, which is approximately 50% to 70% of the xgboost timing.
It takes 13 minutes with 1 thread, 2 minutes with 64 threads.
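As a rough illustration, the hyperparameters above map to an xgb.train() call along these lines. This is a minimal sketch, assuming dtrain is an xgb.DMatrix built from the Higgs data and n_threads is the thread count being benchmarked; the actual benchmark script may differ:
library(xgboost)
params <- list(objective = "binary:logistic",
               tree_method = "hist",     # fast histogram boosting
               max_depth = 8,
               max_leaves = 255,         # honored when grow_policy = "lossguide"
               min_child_weight = 1,     # minimum hessian
               gamma = 0,                # minimum loss to split
               colsample_bytree = 1,
               subsample = 1,
               eta = 0.25,
               max_bin = 255,
               nthread = n_threads)
set.seed(11111)
model <- xgb.train(params = params, data = dtrain, nrounds = 10)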
LightGBM
Hyperparameters used, with timings averaged over 5 runs (approximately 14h of compute):
- Depth: 8
- Leaves: 255
- Hessian: 1
- Minimum Loss to split: 0
- Column Sampling: 100%
- Row Sampling: 100%
- Iterations: 10
- Learning Rate: 0.25
- Boosting method: gbdt
- Bins: 255
- Loss function: binary
Note: the timing does not take into account the binning construction time.
It takes 16 minutes using 1 thread, 23 seconds using 64 threads.
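For symmetry, a minimal sketch of the equivalent lgb.train() call, again assuming a prebuilt dtrain (here an lgb.Dataset) and an n_threads value; the actual benchmark script may differ:
library(lightgbm)
# max_bin = 255 is a dataset parameter, set when dtrain (the lgb.Dataset) is built
params <- list(objective = "binary",
               boosting = "gbdt",
               max_depth = 8,
               num_leaves = 255,
               min_sum_hessian_in_leaf = 1,   # minimum hessian
               min_gain_to_split = 0,         # minimum loss to split
               feature_fraction = 1,
               bagging_fraction = 1,
               learning_rate = 0.25,
               num_threads = n_threads)
set.seed(11111)
model <- lgb.train(params = params, data = dtrain, nrounds = 10)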
Performance Analysis (Windows vs WSL)
Use the Performance Analysis if you want to compare timing data.
Check interactively on Tableau Public:
Dynamic and interactive filters provided:
- Threads