The ZFP compression algorithms are based on the ZFP project and library. They are intended to reduce the amount of data transferred between the IVAAP back-end and client software, thus improving the UX. Two ZFP compression algorithms are available: lossless ZFP and lossy ZFP. The former is primarily intended for situations where data distortion is not acceptable, for example client-side seismic data processing or distortion-sensitive visualization. While it provides bit-perfect data compression, its compression ratio is usually relatively small. The lossy ZFP compression, on the other hand, can provide a much better compression ratio at the cost of a small and usually acceptable data distortion.
Regardless of which variety of ZFP compression was applied to the seismic data, the resulting raw bytes are transferred to the client as a single binary chunk. To ensure that the client can decompress the data, the chunk is prepended with a small JSON object containing all the necessary information. This includes the name and required version of the corresponding ZFP decompression algorithm, the offset to the beginning of the compressed data relative to the end of the JSON, and the dimensions of the original 2D array of seismic samples. For lossy ZFP it can also include the maximum sample amplitude and the error tolerance.
Here is an example of the JSON header produced when the server compressed the data with the lossy, non-chunked ZFP compression algorithm:
{"height":2501,"width":256,"transf":[{"name":"invSampleTypes","version":"1.0"},{"name":"invZfp","version":1.0,"accuracyPercents":61.0,"maxAmplitude":1.60938131E12}],"begin":0}
Here is an example of the JSON header produced when the server compressed the data with the lossless ZFP compression algorithm:
{"height":2501,"width":256,"transf":[{"name":"invSampleTypes","version":"1.0"},{"name":"invZfp","version":1.0,"accuracyPercents":0.0,"maxAmplitude":1.60938131E12}],"begin":0}
As you can see, the only difference in the JSON is the value of the error tolerance (accuracyPercents): it equals zero for the lossless ZFP algorithm.
And finally, here is an example of the JSON header corresponding to the chunked ZFP compression algorithm:
{"height":2501,"width":256,"transf":[{"name":"invSampleTypes","version":"1.0"},{"name":"invZFPChunked","version":1.0,"accuracyPercents":61.0,"maxAmplitude":1.60938131E12,"chunkSize":64}],"begin":0}
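Before decompressing anything, the client has to find the first byte of the compressed payload. Here is a minimal C sketch of that step, assuming the message is the JSON header immediately followed by the binary data and that no string value in the header contains a curly brace (true for the headers above); the function name is illustrative, not part of the actual IVAAP client.

#include <stddef.h>

/* Find the first byte of the compressed payload in the received message.
 * "begin" is the offset taken from the parsed JSON header; it is relative
 * to the end of the JSON object. */
static const unsigned char* payload_start(const unsigned char* msg, size_t begin)
{
    size_t i = 0, depth = 0;
    do {
        if (msg[i] == '{') depth++;
        else if (msg[i] == '}') depth--;
        i++;
    } while (depth > 0);        /* i now points just past the JSON's closing brace */
    return msg + i + begin;
}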
The difference between the two formats is how the data is compressed on the server side. In the non-chunked format, all the samples of all the requested traces are interpreted as a single two-dimensional array of samples. The compression algorithm therefore uses a single call to the ZFP library compression routine to compress this array at once.
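As a rough illustration, a single-call compression of the whole array with the ZFP C library could look like the sketch below. The buffer handling and the way the error tolerance is obtained are assumptions for illustration; this is not the actual IVAAP server code.

#include <stdlib.h>
#include "zfp.h"

/* Compress a width x height array of float samples in one zfp_compress() call.
 * A tolerance of 0 selects the lossless (reversible) mode, anything greater
 * selects the lossy fixed-accuracy mode. Returns the compressed size in bytes. */
static size_t compress_samples(float* samples, size_t width, size_t height,
                               double tolerance, void** out)
{
    zfp_field* field = zfp_field_2d(samples, zfp_type_float, width, height);
    zfp_stream* zfp = zfp_stream_open(NULL);

    if (tolerance > 0.0)
        zfp_stream_set_accuracy(zfp, tolerance);  /* lossy, fixed-accuracy mode */
    else
        zfp_stream_set_reversible(zfp);           /* lossless (reversible) mode */

    size_t capacity = zfp_stream_maximum_size(zfp, field);
    *out = malloc(capacity);

    bitstream* stream = stream_open(*out, capacity);
    zfp_stream_set_bit_stream(zfp, stream);
    zfp_stream_rewind(zfp);

    size_t compressed_bytes = zfp_compress(zfp, field);  /* one call for the whole array */

    zfp_field_free(field);
    zfp_stream_close(zfp);
    stream_close(stream);
    return compressed_bytes;
}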
The chunked algorithm, on the other hand, splits the 2D array of seismic samples into smaller rectangular subarrays (chunks) of a fixed size. This results in less data distortion when the sample amplitude varies between traces or between different areas of the same trace.
The original 2D sample array is split into chunks column-wise: the first chunk of size N consists of the first N samples of the first N traces, the second chunk consists of the first N samples of traces N...2N-1, and so on. If there are not enough samples to fill a chunk, it is compressed without zero padding, i.e. only the actual samples are compressed.
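The splitting order can be sketched as a pair of nested loops; the names below are illustrative, and the per-chunk compression itself is left to a callback.

#include <stddef.h>

/* Enumerate chunks in the order described above: within each band of up to
 * chunk_size sample rows, step across the traces chunk_size at a time.
 * Edge chunks keep their actual (smaller) dimensions; no zero padding. */
static void for_each_chunk(size_t traces, size_t samples, size_t chunk_size,
                           void (*emit)(size_t first_trace, size_t first_sample,
                                        size_t chunk_traces, size_t chunk_samples))
{
    for (size_t s = 0; s < samples; s += chunk_size) {
        size_t chunk_samples = (s + chunk_size <= samples) ? chunk_size : samples - s;
        for (size_t t = 0; t < traces; t += chunk_size) {
            size_t chunk_traces = (t + chunk_size <= traces) ? chunk_size : traces - t;
            emit(t, s, chunk_traces, chunk_samples);  /* compress this sub-array */
        }
    }
}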
For both non-chunked ZFP compression algorithms (lossless and lossy), the resulting array of bytes can be thought of as a single chunk of binary data. It is meant to be decompressed all at once, using a single call to the ZFP library decompression routine. The only difference between the two algorithms is the ZFP decompression mode being used: lossy for lossy, lossless for lossless.
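On the client side, decompression of the non-chunked payload mirrors the compression call. Here is a hedged C sketch using the ZFP library; the way the tolerance is reconstructed from accuracyPercents and maxAmplitude is omitted, and the function name is illustrative.

#include "zfp.h"

/* Decompress a non-chunked ZFP payload into a width x height float array in
 * a single zfp_decompress() call. The stream must be configured in the same
 * mode that was used for compression: lossy (fixed accuracy) or lossless. */
static int decompress_samples(void* compressed, size_t compressed_bytes,
                              float* samples, size_t width, size_t height,
                              double tolerance)
{
    zfp_field* field = zfp_field_2d(samples, zfp_type_float, width, height);
    zfp_stream* zfp = zfp_stream_open(NULL);

    if (tolerance > 0.0)
        zfp_stream_set_accuracy(zfp, tolerance);  /* lossy decompression mode */
    else
        zfp_stream_set_reversible(zfp);           /* lossless decompression mode */

    bitstream* stream = stream_open(compressed, compressed_bytes);
    zfp_stream_set_bit_stream(zfp, stream);
    zfp_stream_rewind(zfp);

    size_t ok = zfp_decompress(zfp, field);       /* one call for the whole payload */

    zfp_field_free(field);
    zfp_stream_close(zfp);
    stream_close(stream);
    return ok != 0;
}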
The array of bytes comprising the chunked ZFP compression output, on the other hand, consists of small chunks of various sizes. Every such chunk begins with a binary header containing the information necessary to decompress it: the magic number, the chunked algorithm version, the error tolerance value, the maximum amplitude of the original samples comprising the chunk, a checksum, and the compressed byte size. This is enough information to both decompress the chunk correctly and find the first byte of the next chunk.
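For illustration only, such a per-chunk header could be modeled by a struct like the one below; the field widths, order, and endianness are assumptions and may differ from the actual IVAAP wire format.

#include <stdint.h>

/* Assumed layout of the per-chunk binary header (illustrative only). */
typedef struct {
    uint32_t magic;            /* magic number identifying a chunk */
    uint32_t version;          /* chunked algorithm version */
    double   tolerance;        /* error tolerance used for this chunk */
    double   max_amplitude;    /* maximum amplitude of the chunk's original samples */
    uint32_t checksum;         /* integrity check of the compressed payload */
    uint32_t compressed_bytes; /* size of the compressed payload that follows */
} ZfpChunkHeader;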
The decompressed chunks are then supposed to be transferred into a preallocated sample buffer of the size specified in the JSON header. The decompressed samples are put into the resulting buffer in the same order as the original array was split into chunks before compression (see above).
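Copying one decompressed chunk back into the full buffer can be sketched as follows, assuming the full buffer is stored row by row with width samples per sample row, matching the splitting order shown earlier; the names are illustrative.

#include <string.h>

/* Copy one decompressed chunk into the preallocated width x height buffer,
 * at the position the chunk occupied before compression. */
static void place_chunk(float* full, size_t width,
                        const float* chunk, size_t first_trace, size_t first_sample,
                        size_t chunk_traces, size_t chunk_samples)
{
    for (size_t s = 0; s < chunk_samples; ++s)
        memcpy(full + (first_sample + s) * width + first_trace,
               chunk + s * chunk_traces,
               chunk_traces * sizeof(float));
}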
Here are some examples of the ZFP compression ratio in a typical IVAAP seismic data visualization scenario.
Consider a request for 256 traces with 1993 samples per trace. When converted to 32-bit floats, these samples occupy about 1.95 MB (256 × 1993 × 4 bytes).
The following compressed sizes could be achieved:
Lossless ZFP: 1.7 MB
Lossy ZFP with visually acceptable data distortion: 161 KB