The last thing to do, after having your package tested, cleaned, and ready for release, is to produce some statistical data about it: a benchmark. To do that, we will use the benchmark facilities inside the testing standard package.
A benchmark is a function of the form func BenchmarkFunctionA(b *testing.B) { ... }, where the name carries the Benchmark prefix (as in BenchmarkYourNameHere). Benchmark code is essentially test code. Ideally, you should write the benchmark script inside the test script.
However, from experience, the test script alone is already bloated. Hence, I would rather create a separate benchmark script next to it, to reduce the scrolling nightmare whenever I need to alter benchmark code.
The script's filename must end with the _test.go suffix, just like a test script. My practice is to append _benchmark_test.go instead, which produces the following cleaner file structure:
sourceCode.go - the source code
sourceCode_test.go - the test script for the source code
sourceCode_benchmark_test.go - the benchmark script for the source code

Now, inside the sourceCode_benchmark_test.go script, you write the benchmark function. The basic format is as follows:
func BenchmarkYourFunction(b *testing.B) {
	... // prepare
	for i := 0; i < b.N; i++ {
		... // your function here
	}
	... // output dumping
}

The two requirements are the Benchmark prefix in the function name and the (b *testing.B) input parameter. A good example is:
func BenchmarkAlgoSearchShort(b *testing.B) {
	var r bool
	for i := 0; i < b.N; i++ {
		r = fragmentSearch(&patShort, &txtShort, sensitivity)
	}
	res = r
}

A beginner mistake would be:
func BenchmarkAlgoSearchShort(b *testing.B) {
	r := fragmentSearch(&patShort, &txtShort, sensitivity)
	res = r
}

This version never loops over b.N, so the function under test runs only once regardless of how many iterations the benchmark framework requests, and the resulting timing is meaningless.

Now, you must write a benchmark for each case study, which also means duplicating the benchmark functions for each different input (an alternative using sub-benchmarks is sketched after the full script below). Here is an example that wraps the benchmark loop in a private function, benchmarkAlgos, and calls it consistently across the case studies:
package main

import (
	"testing"
)

const (
	benchmarkSensitivity = uint(3)
	benchmarkPatShort    = "ABC"
	benchmarkTxtShort    = "AABAETHAENABCDBERHAEAFZCBHBAAA"
	benchmarkPatMedium   = "ABCTHMTRGDHNMTR<&^%#$%J$$GAEHAEZDFBETJAEAFBEAHAEHREJMNRNETNEANEARNXGFMX<RSYDAETM<R$W%"
	benchmarkTxtMedium   = "AABAETHAENABCDBERHAEAFZCBHBAAAYUOTNTNTYPGFDFBNTMTPYNDTMNTMTPNGTMNTYMTPMTYATNRM&%#L%#@YU%IHERNAERNMEATJEQANJAETMKEATJEARJNETJAERNEANEANAENEANTEAETRNEN RNWRNRWMTNMTPYMDTMTYFHGMTMDFMVNFVNMTYT YTJTYJTYNTMTDGFNDGRTU%^%^#$%#$34#$TNTHMFYUKDNGDDGNRTTRJRTJRTJ"
	benchmarkPatLong     = "ABCTHMTRGDHNMTR<&^%#$%J$$GAEHAEZDFBETJAEAFBEAHAEHREJMNRNETNEANEARNXGFMX<RSYDAETM<R$W%"
	benchmarkTxtLong     = "AABAETHAENABCDBERHAEAFZCBHBAAAYUOTNTNTYPGFDFBNTMTPYNDTMNTMTPNGTMNTYMTPMTYATNRM&%#L%#@YU%IHERNAERNMEATJEQANJAETMKEATJEARJNETJAERNEANEANAENEANTEAETRNEN RNWRNRWMTNMTPYMDTMTYFHGMTMDFMVNFVNMTYT YTJTYJTYNTMTDGFNDGRTU%^%^#$%#$34#$TNTHMFYUKDNGDDGNRTTRJRTJRTJ"
)

func benchmarkAlgos(selector int, pat []byte, txt []byte, b *testing.B) bool {
	r := false
	for i := 0; i < b.N; i++ {
		switch selector {
		case 0:
			r = fragmentSearch(&pat, &txt, benchmarkSensitivity)
		case 1:
			r = fragmentSearchX2(&pat, &txt)
		case 2:
			r = fragmentSearch2(&pat, &txt, benchmarkSensitivity)
		}
	}
	return r
}

func BenchmarkAlgoSearchShort(b *testing.B) {
	_ = benchmarkAlgos(0, []byte(benchmarkPatShort), []byte(benchmarkTxtShort), b)
}

func BenchmarkAlgoSearchMedium(b *testing.B) {
	_ = benchmarkAlgos(0, []byte(benchmarkPatMedium), []byte(benchmarkTxtMedium), b)
}

func BenchmarkAlgoSearchLong(b *testing.B) {
	_ = benchmarkAlgos(0, []byte(benchmarkPatLong), []byte(benchmarkTxtLong), b)
}

func BenchmarkAlgoX2SearchShort(b *testing.B) {
	_ = benchmarkAlgos(1, []byte(benchmarkPatShort), []byte(benchmarkTxtShort), b)
}

func BenchmarkAlgoX2SearchMedium(b *testing.B) {
	_ = benchmarkAlgos(1, []byte(benchmarkPatMedium), []byte(benchmarkTxtMedium), b)
}

func BenchmarkAlgoX2SearchLong(b *testing.B) {
	_ = benchmarkAlgos(1, []byte(benchmarkPatLong), []byte(benchmarkTxtLong), b)
}

func BenchmarkAlgo2SearchShort(b *testing.B) {
	_ = benchmarkAlgos(2, []byte(benchmarkPatShort), []byte(benchmarkTxtShort), b)
}

func BenchmarkAlgo2SearchMedium(b *testing.B) {
	_ = benchmarkAlgos(2, []byte(benchmarkPatMedium), []byte(benchmarkTxtMedium), b)
}

func BenchmarkAlgo2SearchLong(b *testing.B) {
	_ = benchmarkAlgos(2, []byte(benchmarkPatLong), []byte(benchmarkTxtLong), b)
}
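As an aside, if the duplication bothers you, the testing package also supports sub-benchmarks via b.Run (available since Go 1.7). Here is a minimal sketch of the same idea for the first algorithm only; the combined function name BenchmarkAlgoSearch is illustrative, and the package-level sink variable res is assumed to be declared as in the earlier examples:

func BenchmarkAlgoSearch(b *testing.B) {
	// each case study becomes a named sub-benchmark instead of a copied function
	cases := []struct {
		name     string
		pat, txt []byte
	}{
		{"Short", []byte(benchmarkPatShort), []byte(benchmarkTxtShort)},
		{"Medium", []byte(benchmarkPatMedium), []byte(benchmarkTxtMedium)},
		{"Long", []byte(benchmarkPatLong), []byte(benchmarkTxtLong)},
	}
	for _, c := range cases {
		c := c // capture the loop variable for the closure
		b.Run(c.name, func(b *testing.B) {
			var r bool
			for i := 0; i < b.N; i++ {
				r = fragmentSearch(&c.pat, &c.txt, benchmarkSensitivity)
			}
			res = r // keep the result so the compiler cannot eliminate the call
		})
	}
}

Each sub-benchmark is reported separately as BenchmarkAlgoSearch/Short, BenchmarkAlgoSearch/Medium, and so on. Either way, the NOTE below still applies.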
NOTE: always record the result of your function inside the loop and store it into a package-level variable at the end; otherwise the compiler may eliminate the function call, or even the benchmark itself. For example:

var result int

func BenchmarkFibComplete(b *testing.B) {
	var r int
	for n := 0; n < b.N; n++ {
		// always record the result of Fib to prevent
		// the compiler eliminating the function call.
		r = Fib(10)
	}
	// always store the result to a package level variable
	// so the compiler cannot eliminate the Benchmark itself.
	result = r
}

Now that we have the benchmark script, we execute the benchmark command. It comes in many forms, depending on the type of profile you are interested in.
The simplest form of profiling is to measure CPU time and memory allocations. We also need to explicitly tell go test to skip all the tests. Therefore, the command is as such:
$ go test -run=none -bench=. -benchmem

This produces the following statistics:
goos: linux
goarch: amd64
pkg: gosandbox
BenchmarkAlgoSearchShort-8       100000000      12.8 ns/op         0 B/op     0 allocs/op
BenchmarkAlgoSearchMedium-8       50000000      32.4 ns/op         0 B/op     0 allocs/op
BenchmarkAlgoSearchLong-8         50000000      32.6 ns/op         0 B/op     0 allocs/op
BenchmarkAlgoX2SearchShort-8       5000000       252 ns/op       176 B/op     3 allocs/op
BenchmarkAlgoX2SearchMedium-8        20000     91206 ns/op    352240 B/op    16 allocs/op
BenchmarkAlgoX2SearchLong-8          20000     91688 ns/op    352241 B/op    16 allocs/op
BenchmarkAlgo2SearchShort-8       10000000       152 ns/op         0 B/op     0 allocs/op
BenchmarkAlgo2SearchMedium-8      10000000       219 ns/op         0 B/op     0 allocs/op
BenchmarkAlgo2SearchLong-8        10000000       222 ns/op         0 B/op     0 allocs/op
PASS

This is useful for a quick comparison, but not insightful enough for optimization work.
Now that we have the basics, it's time to go into advanced profiling. Go allows you to profile not only CPU and memory usage but also goroutine blocking, mutex contention, and more (refer to the documentation). Since we already enabled memory allocation statistics, we can proceed to enable everything, for the sake of the learning experience. Here we go:
$ go test -run=none \
	-bench=<Benchmark function name> \
	-benchmem \
	-benchtime 1s \
	-blockprofile /tmp/gobenchmark_block.out \
	-cpuprofile /tmp/gobenchmark_cpu.out \
	-memprofile /tmp/gobenchmark_mem.out \
	-mutexprofile /tmp/gobenchmark_mutex.out

-benchmem : enable memory allocation statistics.
-benchtime : set the benchmark duration; s is seconds, m is minutes, h is hours, and so on.
-blockprofile : profile goroutine blocking activity. It takes a filepath for writing out the data.
-cpuprofile : profile CPU processing activity. It takes a filepath for writing out the data.
-memprofile : profile memory usage. It takes a filepath for writing out the data.
-mutexprofile : profile mutex activity. It takes a filepath for writing out the data.
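One thing worth noting: -bench accepts a regular expression, not just an exact function name, so you can select a whole family of benchmarks at once. For example, this runs only the three BenchmarkAlgoSearch* variants from the script above (profiling flags omitted for brevity):

$ go test -run=none -bench=AlgoSearch -benchmem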
With all the output files ready, it is time to process the data. The easiest way is to feed it to the go tool pprof command and render the profiles graphically. Before that, you need to install the graphviz software. On Debian Stretch, that is:

$ sudo apt install graphviz -y

Here is an example of processing all the benchmark output data files into usable visuals:
$ go tool pprof -output /tmp/gobenchmark_cpu.svg -svg /tmp/gobenchmark_cpu.out
$ go tool pprof -output /tmp/gobenchmark_mem.svg -svg /tmp/gobenchmark_mem.out
$ go tool pprof -output /tmp/gobenchmark_block.svg -svg /tmp/gobenchmark_block.out
$ go tool pprof -output /tmp/gobenchmark_mutex.svg -svg /tmp/gobenchmark_mutex.out
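As a side note, if you prefer exploring a profile without generating images, go tool pprof also has an interactive mode. A quick sketch (on older Go versions you may need to pass the compiled test binary before the profile file; fragmentSearch is just one of the functions from the earlier script):

$ go tool pprof /tmp/gobenchmark_cpu.out
(pprof) top10
(pprof) list fragmentSearch
(pprof) quit

top10 prints the ten hottest functions, while list shows per-line costs for any function matching the given regular expression.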
Now that we have all the profile images ready, it's time to open them using the default program. On Debian, it is:

$ xdg-open /tmp/gobenchmark_cpu.svg &> /dev/null
$ xdg-open /tmp/gobenchmark_mem.svg &> /dev/null
$ xdg-open /tmp/gobenchmark_block.svg &> /dev/null
$ xdg-open /tmp/gobenchmark_mutex.svg &> /dev/null

You should see the launched diagram, similar to the following:
There is an additional tool that can calculate the delta between benchmark results. This tool is known as benchcmp, and you can install it using go get:
$ go get golang.org/x/tools/cmd/benchcmp

This program calculates the delta between the old results and the new results for each statistic. For example, feeding it this command:
$ 2>&1 benchcmp /tmp/gobenchmark_old.log /tmp/gobenchmark.log

we get:
benchmark                        old ns/op     new ns/op     delta
BenchmarkAlgoSearchShort-8       12.9          12.9          +0.00%
BenchmarkAlgoSearchMedium-8      33.0          32.5          -1.52%
BenchmarkAlgoSearchLong-8        34.1          32.5          -4.69%
BenchmarkAlgoX2SearchShort-8     270           255           -5.56%
BenchmarkAlgoX2SearchMedium-8    104180        100286        -3.74%
BenchmarkAlgoX2SearchLong-8      111012        104952        -5.46%
BenchmarkAlgo2SearchShort-8      154           152           -1.30%
BenchmarkAlgo2SearchMedium-8     215           215           +0.00%
BenchmarkAlgo2SearchLong-8       216           215           -0.46%

benchmark                        old allocs    new allocs    delta
BenchmarkAlgoSearchShort-8       0             0             +0.00%
BenchmarkAlgoSearchMedium-8      0             0             +0.00%
BenchmarkAlgoSearchLong-8        0             0             +0.00%
BenchmarkAlgoX2SearchShort-8     3             3             +0.00%
BenchmarkAlgoX2SearchMedium-8    16            16            +0.00%
BenchmarkAlgoX2SearchLong-8      16            16            +0.00%
BenchmarkAlgo2SearchShort-8      0             0             +0.00%
BenchmarkAlgo2SearchMedium-8     0             0             +0.00%
BenchmarkAlgo2SearchLong-8       0             0             +0.00%

benchmark                        old bytes     new bytes     delta
BenchmarkAlgoSearchShort-8       0             0             +0.00%
BenchmarkAlgoSearchMedium-8      0             0             +0.00%
BenchmarkAlgoSearchLong-8        0             0             +0.00%
BenchmarkAlgoX2SearchShort-8     176           176           +0.00%
BenchmarkAlgoX2SearchMedium-8    352243        352243        +0.00%
BenchmarkAlgoX2SearchLong-8      352243        352243        +0.00%
BenchmarkAlgo2SearchShort-8      0             0             +0.00%
BenchmarkAlgo2SearchMedium-8     0             0             +0.00%
BenchmarkAlgo2SearchLong-8       0             0             +0.00%

The only problem is that you need to keep the old statistics around each time you perform a benchmark.
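In practice, that means saving the previous log before every run. A minimal manual workflow with the same scratch paths as above would look like this:

$ go test -run=none -bench=. -benchmem | tee /tmp/gobenchmark_old.log
$ # ...optimize the code, then re-run...
$ go test -run=none -bench=. -benchmem | tee /tmp/gobenchmark.log
$ 2>&1 benchcmp /tmp/gobenchmark_old.log /tmp/gobenchmark.log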
Now that we have all the desired commands for advanced profiling, it's time to bundle them together. I usually keep the following function in my ~/.bashrc:
gobenchmark() {
	open="false"
	arg="${1:-.}"
	timeout="${2:-"1s"}"

	if [ "$1" == "-r" ]; then
		open="true"
		arg="${2:-.}"
		timeout="${3:-"1s"}"
	fi

	# keep the previous run's log so benchcmp has something to compare against
	if [ -f "/tmp/gobenchmark.log" ]; then
		mv "/tmp/gobenchmark.log" "/tmp/gobenchmark_old.log"
	fi

	go test -run=none \
		-bench="$arg" \
		-benchmem \
		-benchtime "$timeout" \
		-blockprofile /tmp/gobenchmark_block.out \
		-cpuprofile /tmp/gobenchmark_cpu.out \
		-memprofile /tmp/gobenchmark_mem.out \
		-mutexprofile /tmp/gobenchmark_mutex.out \
		| tee /tmp/gobenchmark.log
	if [ $? != 0 ]; then
		return 1
	fi

	2>&1 go tool pprof \
		-output /tmp/gobenchmark_cpu.svg \
		-svg /tmp/gobenchmark_cpu.out \
		> /dev/null
	2>&1 go tool pprof \
		-output /tmp/gobenchmark_mem.svg \
		-svg /tmp/gobenchmark_mem.out \
		> /dev/null
	2>&1 go tool pprof \
		-output /tmp/gobenchmark_block.svg \
		-svg /tmp/gobenchmark_block.out \
		> /dev/null
	2>&1 go tool pprof \
		-output /tmp/gobenchmark_mutex.svg \
		-svg /tmp/gobenchmark_mutex.out \
		> /dev/null

	# only run benchcmp when it is actually installed
	if type benchcmp > /dev/null 2>&1; then
		2>&1 benchcmp \
			/tmp/gobenchmark_old.log \
			/tmp/gobenchmark.log \
			> /tmp/gobenchmark_delta.log
		if [ "$open" == "true" ]; then
			xdg-open /tmp/gobenchmark_delta.log &> /dev/null
		fi
	fi

	if [ "$open" == "true" ]; then
		xdg-open /tmp/gobenchmark_cpu.svg &> /dev/null
		xdg-open /tmp/gobenchmark_mem.svg &> /dev/null
		xdg-open /tmp/gobenchmark_block.svg &> /dev/null
		xdg-open /tmp/gobenchmark_mutex.svg &> /dev/null
		xdg-open /tmp/gobenchmark.log &> /dev/null
	fi
}
export -f gobenchmark

Then, the next time I want to run a benchmark without opening any program to display the data, I can just run:
$ gobenchmark
$ gobenchmark .
$ gobenchmark ./...
$ gobenchmark "${HOME}/Document/myproject/..."
$ gobenchmark . 20m
$ gobenchmark ./... 20m
$ gobenchmark "${HOME}/Document/myproject/..." 20m

Otherwise, if I want to open the programs to display the data, I can just run:
$ gobenchmark -r
$ gobenchmark -r .
$ gobenchmark -r ./...
$ gobenchmark -r "${HOME}/Document/myproject/..."
$ gobenchmark -r . 20m
$ gobenchmark -r ./... 20m
$ gobenchmark -r "${HOME}/Document/myproject/..." 20m

That's all about benchmarking in Go.