Benchmark
After having your package tested, cleaned, and ready for release, the last step is to produce some statistical data: benchmarks. To do that, we will use the benchmark facilities inside the testing standard package.
General Rules of Thumb
- Avoid global variables in the benchmark script. They directly influence the entire package, including the main source code.
- Anything private in the benchmark script should start with the name "benchmark", similar to the test script, as in benchmarkYourNameHere.
- Anything public in the benchmark script should be the benchmark suites themselves, as in func BenchmarkFunctionA(b *testing.B).
- Always save the output and use it. This avoids the compiler optimization that automatically eliminates a function call whose output is never consumed.
Writing Benchmark Script
Benchmark code is essentially test code. Ideally, you would write the benchmark script inside the test script.
However, from experience, the test script is already bloated on its own. Hence, I would rather create a separate benchmark script next to it, reducing the scrolling nightmare whenever I need to alter the benchmark code.
Filename
The script's filename should end with the _test.go suffix, similar to a test script. My practice is usually to append _benchmark_test.go instead, so that it produces the following cleaner file structure:
- sourceCode.go - the source code
- sourceCode_test.go - the test script for the source code
- sourceCode_benchmark_test.go - the benchmark script for the source code
Writing Benchmark
Now, inside the sourceCode_benchmark_test.go script, you will write the benchmark function. A basic format is as follows:
func BenchmarkYourFunction(b *testing.B) {
	... // prepare
	for i := 0; i < b.N; i++ {
		... // your function here
	}
	... // output dumping
}
- Start with the Benchmark prefix in the function name.
- Then pass in the benchmark parameter (b *testing.B).
- Inside the function, you must place your function inside the benchmark loop and capture the output for later dumping.
- Do not forget the loop: beginners tend to omit it, yet it is what quantifies the performance.
A good example is:
func BenchmarkAlgoSearchShort(b *testing.B) {
	var r bool
	for i := 0; i < b.N; i++ {
		r = fragmentSearch(&patShort, &txtShort, sensitivity)
	}
	res = r
}
A beginner mistake would be:
func BenchmarkAlgoSearchShort(b *testing.B) {
	r := fragmentSearch(&patShort, &txtShort, sensitivity)
	res = r
}
One Benchmark Per Case Study
Now, you must write a benchmark for each case study. That also means that for each different output, you need to duplicate the functions. Here is an example that wraps the benchmark loop in a private function benchmarkAlgos and then calls it consistently across each case study:
package main

import (
	"testing"
)

const (
	benchmarkSensitivity = uint(3)
	benchmarkPatShort    = "ABC"
	benchmarkTxtShort    = "AABAETHAENABCDBERHAEAFZCBHBAAA"
	benchmarkPatMedium   = "ABCTHMTRGDHNMTR<&^%#$%J$$GAEHAEZDFBETJAEAFBEAHAEHREJMNRNETNEANEARNXGFMX<RSYDAETM<R$W%"
	benchmarkTxtMedium   = "AABAETHAENABCDBERHAEAFZCBHBAAAYUOTNTNTYPGFDFBNTMTPYNDTMNTMTPNGTMNTYMTPMTYATNRM&%#L%#@YU%IHERNAERNMEATJEQANJAETMKEATJEARJNETJAERNEANEANAENEANTEAETRNEN RNWRNRWMTNMTPYMDTMTYFHGMTMDFMVNFVNMTYT YTJTYJTYNTMTDGFNDGRTU%^%^#$%#$34#$TNTHMFYUKDNGDDGNRTTRJRTJRTJ"
	benchmarkPatLong     = "ABCTHMTRGDHNMTR<&^%#$%J$$GAEHAEZDFBETJAEAFBEAHAEHREJMNRNETNEANEARNXGFMX<RSYDAETM<R$W%"
	benchmarkTxtLong     = "AABAETHAENABCDBERHAEAFZCBHBAAAYUOTNTNTYPGFDFBNTMTPYNDTMNTMTPNGTMNTYMTPMTYATNRM&%#L%#@YU%IHERNAERNMEATJEQANJAETMKEATJEARJNETJAERNEANEANAENEANTEAETRNEN RNWRNRWMTNMTPYMDTMTYFHGMTMDFMVNFVNMTYT YTJTYJTYNTMTDGFNDGRTU%^%^#$%#$34#$TNTHMFYUKDNGDDGNRTTRJRTJRTJ"
)

func benchmarkAlgos(selector int, pat []byte, txt []byte, b *testing.B) bool {
	r := false
	for i := 0; i < b.N; i++ {
		switch selector {
		case 0:
			r = fragmentSearch(&pat, &txt, benchmarkSensitivity)
		case 1:
			r = fragmentSearchX2(&pat, &txt)
		case 2:
			r = fragmentSearch2(&pat, &txt, benchmarkSensitivity)
		}
	}
	return r
}

func BenchmarkAlgoSearchShort(b *testing.B) {
	_ = benchmarkAlgos(0,
		[]byte(benchmarkPatShort),
		[]byte(benchmarkTxtShort),
		b)
}

func BenchmarkAlgoSearchMedium(b *testing.B) {
	_ = benchmarkAlgos(0,
		[]byte(benchmarkPatMedium),
		[]byte(benchmarkTxtMedium),
		b)
}

func BenchmarkAlgoSearchLong(b *testing.B) {
	_ = benchmarkAlgos(0,
		[]byte(benchmarkPatLong),
		[]byte(benchmarkTxtLong),
		b)
}

func BenchmarkAlgoX2SearchShort(b *testing.B) {
	_ = benchmarkAlgos(1,
		[]byte(benchmarkPatShort),
		[]byte(benchmarkTxtShort),
		b)
}

func BenchmarkAlgoX2SearchMedium(b *testing.B) {
	_ = benchmarkAlgos(1,
		[]byte(benchmarkPatMedium),
		[]byte(benchmarkTxtMedium),
		b)
}

func BenchmarkAlgoX2SearchLong(b *testing.B) {
	_ = benchmarkAlgos(1,
		[]byte(benchmarkPatLong),
		[]byte(benchmarkTxtLong),
		b)
}

func BenchmarkAlgo2SearchShort(b *testing.B) {
	_ = benchmarkAlgos(2,
		[]byte(benchmarkPatShort),
		[]byte(benchmarkTxtShort),
		b)
}

func BenchmarkAlgo2SearchMedium(b *testing.B) {
	_ = benchmarkAlgos(2,
		[]byte(benchmarkPatMedium),
		[]byte(benchmarkTxtMedium),
		b)
}

func BenchmarkAlgo2SearchLong(b *testing.B) {
	_ = benchmarkAlgos(2,
		[]byte(benchmarkPatLong),
		[]byte(benchmarkTxtLong),
		b)
}
NOTE:
- To guard against compiler optimization, you need to save the output not only inside the function but also all the way up to the package level. Example from Dave Cheney:
var result int

func BenchmarkFibComplete(b *testing.B) {
	var r int
	for n := 0; n < b.N; n++ {
		// always record the result of Fib to prevent
		// the compiler eliminating the function call.
		r = Fib(10)
	}
	// always store the result to a package level variable
	// so the compiler cannot eliminate the Benchmark itself.
	result = r
}
Run Benchmark
Now that we have the benchmark script, we can execute the benchmark command. It takes many forms depending on the type of profile you are interested in.
Basic CPU and Memory Profiling
The simplest profiling run measures CPU time and memory allocations. We also need to explicitly tell go test to skip all the tests. Therefore, the command is as follows:
$ go test -run=none -bench=. -benchmem
This produces the following statistics:
goos: linux
goarch: amd64
pkg: gosandbox
BenchmarkAlgoSearchShort-8 100000000 12.8 ns/op 0 B/op 0 allocs/op
BenchmarkAlgoSearchMedium-8 50000000 32.4 ns/op 0 B/op 0 allocs/op
BenchmarkAlgoSearchLong-8 50000000 32.6 ns/op 0 B/op 0 allocs/op
BenchmarkAlgoX2SearchShort-8 5000000 252 ns/op 176 B/op 3 allocs/op
BenchmarkAlgoX2SearchMedium-8 20000 91206 ns/op 352240 B/op 16 allocs/op
BenchmarkAlgoX2SearchLong-8 20000 91688 ns/op 352241 B/op 16 allocs/op
BenchmarkAlgo2SearchShort-8 10000000 152 ns/op 0 B/op 0 allocs/op
BenchmarkAlgo2SearchMedium-8 10000000 219 ns/op 0 B/op 0 allocs/op
BenchmarkAlgo2SearchLong-8 10000000 222 ns/op 0 B/op 0 allocs/op
PASS
This is useful for a quick comparison but not insightful for optimization.
Advanced Profiling
Now that we have the basics, it is time to go into advanced profiling. Go allows you to profile not only CPU and memory usage but also function calls, blocking, mutex contention, and so on (refer to the documentation). Since we already enabled memory allocation statistics, we can proceed to enable everything, for the sake of the learning experience. Here we go:
$ go test -run=none \
-bench=<Benchmark function name> \
-benchmem \
-benchtime 1s \
-blockprofile /tmp/gobenchmark_block.out \
-cpuprofile /tmp/gobenchmark_cpu.out \
-memprofile /tmp/gobenchmark_mem.out \
-mutexprofile /tmp/gobenchmark_mutex.out
- -benchmem: enable memory allocation statistics.
- -benchtime: set the benchmark duration. s is seconds, m is minutes, h is hours, and so on.
- -blockprofile: profile the goroutine blocking activities. It takes a filepath for writing out the data.
- -cpuprofile: profile the CPU processing activities. It takes a filepath for writing out the data.
- -memprofile: profile the memory usage. It takes a filepath for writing out the data.
- -mutexprofile: profile mutex activities. It takes a filepath for writing out the data.
With all the output files ready, it is time to process the data. The easiest way is to feed them to the go tool pprof command and render the profiles graphically. Before that, you need to install the graphviz software. On Debian Stretch, that is:
$ sudo apt install graphviz -y
Here is an example that processes all the benchmark output data files into usable visuals:
$ go tool pprof -output /tmp/gobenchmark_cpu.svg -svg /tmp/gobenchmark_cpu.out
$ go tool pprof -output /tmp/gobenchmark_mem.svg -svg /tmp/gobenchmark_mem.out
$ go tool pprof -output /tmp/gobenchmark_block.svg -svg /tmp/gobenchmark_block.out
$ go tool pprof -output /tmp/gobenchmark_mutex.svg -svg /tmp/gobenchmark_mutex.out
Now that we have all the profile images ready, it's time to open them with the default viewer. On Debian, that is:
$ xdg-open /tmp/gobenchmark_cpu.svg &> /dev/null
$ xdg-open /tmp/gobenchmark_mem.svg &> /dev/null
$ xdg-open /tmp/gobenchmark_block.svg &> /dev/null
$ xdg-open /tmp/gobenchmark_mutex.svg &> /dev/null
You should see a call-graph diagram rendered from each profile.
Calculating Delta
There are additional tools that can help you calculate the delta between benchmark results. One such tool is benchcmp, which you can install using go get:
$ go get golang.org/x/tools/cmd/benchcmp
This program calculates the delta between the old results and the new results. For example, feeding in this command:
$ 2>&1 benchcmp /tmp/gobenchmark_old.log /tmp/gobenchmark.log
we get:
benchmark old ns/op new ns/op delta
BenchmarkAlgoSearchShort-8 12.9 12.9 +0.00%
BenchmarkAlgoSearchMedium-8 33.0 32.5 -1.52%
BenchmarkAlgoSearchLong-8 34.1 32.5 -4.69%
BenchmarkAlgoX2SearchShort-8 270 255 -5.56%
BenchmarkAlgoX2SearchMedium-8 104180 100286 -3.74%
BenchmarkAlgoX2SearchLong-8 111012 104952 -5.46%
BenchmarkAlgo2SearchShort-8 154 152 -1.30%
BenchmarkAlgo2SearchMedium-8 215 215 +0.00%
BenchmarkAlgo2SearchLong-8 216 215 -0.46%
benchmark old allocs new allocs delta
BenchmarkAlgoSearchShort-8 0 0 +0.00%
BenchmarkAlgoSearchMedium-8 0 0 +0.00%
BenchmarkAlgoSearchLong-8 0 0 +0.00%
BenchmarkAlgoX2SearchShort-8 3 3 +0.00%
BenchmarkAlgoX2SearchMedium-8 16 16 +0.00%
BenchmarkAlgoX2SearchLong-8 16 16 +0.00%
BenchmarkAlgo2SearchShort-8 0 0 +0.00%
BenchmarkAlgo2SearchMedium-8 0 0 +0.00%
BenchmarkAlgo2SearchLong-8 0 0 +0.00%
benchmark old bytes new bytes delta
BenchmarkAlgoSearchShort-8 0 0 +0.00%
BenchmarkAlgoSearchMedium-8 0 0 +0.00%
BenchmarkAlgoSearchLong-8 0 0 +0.00%
BenchmarkAlgoX2SearchShort-8 176 176 +0.00%
BenchmarkAlgoX2SearchMedium-8 352243 352243 +0.00%
BenchmarkAlgoX2SearchLong-8 352243 352243 +0.00%
BenchmarkAlgo2SearchShort-8 0 0 +0.00%
BenchmarkAlgo2SearchMedium-8 0 0 +0.00%
BenchmarkAlgo2SearchLong-8 0 0 +0.00%
The only problem is that you need to keep the old statistics around each time you perform a new benchmark run.
Bundling Together
Now that we have all the desired commands for advanced profiling, it's time to bundle them together. I usually keep the following function in my ~/.bashrc:
gobenchmark() {
	open="false"
	arg="${1:-.}"
	timeout="${2:-"1s"}"
	if [ "$1" == "-r" ]; then
		open="true"
		arg="${2:-.}"
		timeout="${3:-"1s"}"
	fi

	if [ -f "/tmp/gobenchmark.log" ]; then
		mv "/tmp/gobenchmark.log" "/tmp/gobenchmark_old.log"
	fi

	go test -run=none \
		-bench="$arg" \
		-benchmem \
		-benchtime "$timeout" \
		-blockprofile /tmp/gobenchmark_block.out \
		-cpuprofile /tmp/gobenchmark_cpu.out \
		-memprofile /tmp/gobenchmark_mem.out \
		-mutexprofile /tmp/gobenchmark_mutex.out \
		| tee /tmp/gobenchmark.log
	# "$?" would report tee's status here, so inspect go test's
	# exit status through PIPESTATUS instead.
	if [ "${PIPESTATUS[0]}" != 0 ]; then
		return 1
	fi

	2>&1 go tool pprof \
		-output /tmp/gobenchmark_cpu.svg \
		-svg /tmp/gobenchmark_cpu.out \
		> /dev/null
	2>&1 go tool pprof \
		-output /tmp/gobenchmark_mem.svg \
		-svg /tmp/gobenchmark_mem.out \
		> /dev/null
	2>&1 go tool pprof \
		-output /tmp/gobenchmark_block.svg \
		-svg /tmp/gobenchmark_block.out \
		> /dev/null
	2>&1 go tool pprof \
		-output /tmp/gobenchmark_mutex.svg \
		-svg /tmp/gobenchmark_mutex.out \
		> /dev/null

	# "type benchcmp" prints an error message even when the tool
	# is missing, so test its exit status rather than its output.
	if type benchcmp &> /dev/null; then
		2>&1 benchcmp \
			/tmp/gobenchmark_old.log \
			/tmp/gobenchmark.log \
			> /tmp/gobenchmark_delta.log
		if [ "$open" == "true" ]; then
			xdg-open /tmp/gobenchmark_delta.log &> /dev/null
		fi
	fi

	if [ "$open" == "true" ]; then
		xdg-open /tmp/gobenchmark_cpu.svg &> /dev/null
		xdg-open /tmp/gobenchmark_mem.svg &> /dev/null
		xdg-open /tmp/gobenchmark_block.svg &> /dev/null
		xdg-open /tmp/gobenchmark_mutex.svg &> /dev/null
		xdg-open /tmp/gobenchmark.log &> /dev/null
	fi
}
export -f gobenchmark
Then, whenever I want to run a benchmark without opening the programs that display the data, I can just run one of these:
$ gobenchmark
$ gobenchmark .
$ gobenchmark ./...
$ gobenchmark "${HOME}/Document/myproject/..."
$ gobenchmark . 20m
$ gobenchmark ./... 20m
$ gobenchmark "${HOME}/Document/myproject/..." 20m
Otherwise, if I want to open the programs that display the data, I run one of these:
$ gobenchmark -r
$ gobenchmark -r .
$ gobenchmark -r ./...
$ gobenchmark -r "${HOME}/Document/myproject/..."
$ gobenchmark -r . 20m
$ gobenchmark -r ./... 20m
$ gobenchmark -r "${HOME}/Document/myproject/..." 20m
That's all about benchmarking in Go.