Entropy Coder / Entropy Coding Benchmark

TurboBench : Single core in memory entropy coder benchmark

Libraries compiled with 64 bits gcc 8.3

Uniform distribution: enwik9 Skewed distribution: enwik9 bwt generated w/ libdivsufsort

(MB=1.000.000 bold/bold o1 = Pareto frontier on ratio %)

Size Ratio % C.MB/s D.MB/s Compressor enwik9 CPU:Skylake 3.4 GHz 2019-07

620061620 62.0 53.64 47.81 TurboRC bit o0
620593232 62.1 164.75 50.90 TurboRC nibble o0 NEW
621445788 62.1 78.56 108.75 TurboANXN (nibble) o0
622002650 62.2 17.94 14.34 AOMedia AV1 nibble o0
630669115 63.1 10.61 1097.37 TurboHF 0 optimal
631936952 63.2 9.96 1457.74 TurboANX 0 optimal
634474177 63.4 33.52 20.98 subotin ac
635255076 63.5 605.91 1523.95 TurboANX
635943929 63.6 174.26 847.66 fpc 0
637466340 63.7 839.44 1182.66 TurboHF 12
637963432 63.8 205.64 223.18 zlibh 13k
639048689 63.9 379.92 501.70 fse 32
639740506 64.0 273.05 676.69 rans_static16 32
640334759 64.0 942.23 1306.65 TurboHF 32
641336375 64.1 601.32 928.53 fpc 32
641412595 64.1 604.16 831.45 fsehuf 32
642305248 64.2 459.58 1346.12 rans_static32 32

Size Ratio % C.MB/s D.MB/s Compressor enwik9bwt CPU: Skylake 3.4 GHz 2019-07

183901004 18.4 77.51 74.20 TurboRC bit o0
194543776 19.5 181.95 87.45 TurboRC nibble o0 NEW
194784498 19.5 88.84 103.48 TurboANXN (nibble) o0
196471059 19.6 22.57 18.58 AOMedia AV1 nibble o0
215874369 21.6 15.98 1267.40 TurboANX,0 optimal
243044814 24.3 696.63 1785.67 TurboANX
246878292 24.7 9.43 1206.86 TurboHF 0
249576054 25.0 163.24 920.69 fpc 0
262677970 26.3 940.12 1360.01 TurboHF 8
263945442 26.4 64.14 38.52 subotin ac
265110032 26.5 402.88 537.83 fse 32
265538954 26.6 345.60 700.67 rans_static16 32
268176778 26.8 469.73 1465.28 rans_static32 32
284721687 28.5 255.95 275.84 zlibh 13k
289217878 28.9 1028.89 1442.79 TurboHF 32
289879586 29.0 655.22 1017.13 fsehuf 32
289892417 29.0 667.01 1022.59 fpc 32

Size Ratio % C.MB/s D.MB/s Compressor enwik9 CPU: i7-2600k 4.4 GHz 2017-05

440163308 44.0 52.78 39.98 TurboAC bit o1 sliding context
477069134 47.7 145.88 191.55 TurboANXX o1
541944518 54.2 131.62 273.52 rans_static16o1
620044520 62.0 22.18 16.77 vecrc_sh
620061616 62.0 59.00 49.40 TurboRC bit o0
620097748 62.0 21.92 24.87 fpaq0p_sh
620278358 62.0 19.50 14.99 fpaqc_mm
625933204 62.6 47.26 31.58 TurboAC byte
634103875 63.4 11.87 1184.60 TurboANX,9 JIT switch scalar/sse/avx2
634474173 63.4 38.78 21.67 Subotin
635429360 63.5 412.97 1219.70 TurboANX
637630650 63.8 654.12 955.22 TurboHF
637813190 63.8 259.52 275.07 zlibh 13k
639033488 63.9 184.66 54.54 FastAC
639048689 63.9 345.25 490.27 fse 32k
639740506 64.0 338.37 597.83 rans_static16 32k
640295271 64.0 711.43 1035.40 TurboHF 30K
640700579 64.1 208.21 170.72 fsc 32k
641305739 64.1 246.75 358.98 rANS_ryg
641412595 64.1 633.27 876.62 fsehuf 32k
642288251 64.3 140.46 634.95 rANS_ryg_sse4.1
643834665 64.4 129.31 100.62 FastHF
644090542 64.4 285.64 147.30 tornado huff
644883587 64.5 189.77 135.40 arith_static jamesB
648726912 64.9 84.87 48.89 polar tree
1000000000 100.0 7760.82 7832.90 memcpy

Size Ratio % C.MB/s D.MB/s Compressor enwik9bwt CPU: i7-2600k 4.4 GHz 2017-05

175943420 17.6 64.15 53.15 TurboRC bit o1 (sliding context)
183565924 18.4 37.41 49.21 fpaq0p_sh
183670473 18.4 22.38 14.95 fpaqc_mm
183896215 18.4 23.42 20.59 vecrc_sh
183901000 18.4 80.15 70.28 TurboAC bit o0
197653870 19.8 224.34 271.76 TurboANXXo1
215874337 21.6 15.59 1080.36 TurboANX,9
215877030 21.6 194.96 404.15 rans_static16o1
221038728 22.1 60.81 40.03 TurboAC byte
243033685 24.3 404.83 1625.87 TurboANX
247416080 24.7 260.13 412.21 rANS_Ryg
263031843 26.3 772.25 1100.41 TurboHF 8k
263945438 26.4 64.89 40.29 subotin
265110032 26.5 370.60 488.65 fse 32k
265538954 26.6 404.19 615.34 rans_static16
266129293 26.6 220.00 137.00 fsc
268113114 26.8 783.67 1108.92 TurboHF
284721687 28.5 307.39 336.12 zlibh
285724002 28.6 233.46 179.11 arith_static jamesB
289879586 29.0 689.13 1079.40 fsehuf 32k
297969227 29.8 247.55 72.38 FastAC
305848788 30.6 112.98 76.68 polar tree
335145308 33.5 170.13 123.45 FastHF
337038897 33.7 324.57 156.13 tornado huff
463797334 46.2 162.80 645.88 rANS_ryg_sse4.1 (SIMD)
1000000008 100.0 7943.18 7850.77 memcpy

Uniform distribution:enwik8

Size Ratio % C.MB/s D.MB/s Compressor RASPBERRY PI3 1.2GHz ARM gcc 6.3 64 bits 2018-04

61245756 61.2 9.46 9.53 TurboRC bit
61335538 61.3 6.44 13.75 TurboANXN nibble o0
61401168 61.4 3.09 2.84 AOMedia AV1 nibble o0
61571700 61.6 9.68 7.32 TurboAC_byte
62677385 62.7 8.38 5.07 subotin
62753950 62.8 37.28 196.18 TurboANX 12
62782152 62.8 21.16 176.25 fpc 0
63188022 63.2 23.91 13.64 FastAC
63193639 63.2 41.94 67.29 zlibh
63202025 63.2 73.33 94.10 fse 32k
63287917 63.3 58.02 92.96 rans_static16
63376580 63.4 190.21 301.99 TurboHF 32
63415358 63.4 98.27 267.38 fpc 32
63420890 63.4 97.88 189.14 fsehuf 32
63648861 63.6 23.71 16.89 FastHF
100000004 100.0 1016.00 1016.00 memcpy

Entropy Coder / Entropy Coding Benchmark

Only the most efficient, fast or popular entropy coders are tested. 2 practical uniform and skewed distributions like that found in lz77, bwt, image and video compression.

- ANS : Asymmetric Numeral Systems All the following ANS entropy coders are block based : each block is encoded separately using the statistics from this block only

- TurboANX in LzTurbo v1.2 (JIT scalar/sse/avx2) - World's fastest and most efficient entropy coder. Automatic JIT switching between Scalar/SSE/AVX2. All modes 32/64 bits scalar, SSE and AVX2 are binary compatible.
TurboANX optimal = Shortest path driven optimal partitioning (Variable length blocks).

- TurboANXX order o1 context - order o1 context entropy coder
- rANS interleaved v14.02 - Range Asymmetric Nummeral System
- FSC Finite State Coder v15.04
- Table Asymmetric Numeral Systems
- FSE Finite State Entropy v18.04
- Table Asymmetric Numeral Systems
- rANS Static v18.04
- Range Asymmetric Numeral Systems

References:
- Asymmetric Numeral Systems
- The use of Asymmetric Numeral Systems as an accurate replacement for Huffman Coding
- List of Asymmetric Numeral Systems implementations
- Wikipedia: Asymmetric Numeral Systems

- rANS : Adaptive Range Asymmetric Numeral Systems
- TurboANXN - Full adaptive multisymbol scalar/SSE/AVX2 rANS

References:

- Models for adaptive arithmetic coding

- ABS : Asymmetric Binary Systems - fpaqc_mm (Asymmetric Binary System) adaptive bitwise entropy coder

- Huffman Coding
- TurboHF from LzTurbo v1.1 - Block based Huffman Coding. Decoding one Billion Symbols per second. TurboHF optimal = variable length block partitioning
- FSEHUF Huffman Coding v18.05 - Block based Huffman Coding
- FastHF - quasi-adaptive Huffman Coding
- FPC - Fast Prefix Coder - Block based Huffman Coding.
- Polar Tree - Non-Huffman binary tree
- Tornado v0.6 - semi-adaptive Huffman Coder
- zlibh - Huffman Coding

*quasi-adaptive = semi-adaptive model with "periodic rescale" or "deferred summation".

The benchmark shows that block based coders are more efficient

References:

- It's Been 1,000,000 Years Since Huffman + Video - In-place calculation of minimum-redundancy codes + Software - A Memory-efficient Huffman Decoding Algorithm + Sildes

- Arithmetic Coding / Range Coder
- TurboRC bit in LzTurbo v1.2 - bitwise Range Coder
- TurboRC nibble - Multi-symbol nibble-wise Range Coder
- vecrc_sh: vectorized range coder"
- bitwise Range Coder
- fpaq0p_sh: simple binary rangecoder - bitwise Range Coder
- TurboAC byte in LzTurbo - bytewise Range Coder
- Subotin - bytewise Range Coder
- FastAC - quasi-adaptive (or semi-adaptive) Range Coder
- entenc/entdec - AOMedia AV1 Multi-symbol (non-binary) adaptive entropy encoder

References:

- Arithmetic Coding revealed A guided tour from theory to praxis incl. c++ code
- Introduction to Arithmetic Coding - Theory and Practice (FastAC)
- Context Adaptive Binary Arithmetic Coding
- Arithmetic Coding
- Data Compression With Arithmetic Coding
- Entrop Coding/Non-binary Arithmetic Coding - Daala_EC Entropy Coder in AV1

Benchmark Forum

Last update: 25 AUG 2019