DCT/DFT-based methods (and other frequency-domain transforms) are very commonly used in image and video compression; practically all standards such as JPEG, H.264 and H.265 use some form of frequency-domain transform. Fractal image compression, however, relies on a completely different mathematical mechanism to achieve compression.
Fractal image compression is based on self-similarity within the image. Here is a great tutorial on how it works. The blog has a nice explanation and also links to Python code implementing fractal image compression.
The idea sounded new and interesting to me, so I decided to try it out and implement it in C++ here. The original Python code that it is based on is here.
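At its core, the encoder partitions the image into small range blocks and, for each one, searches for a larger domain block elsewhere in the image that, after shrinking and an affine adjustment of contrast and brightness, reproduces it well. The sketch below illustrates that search for grayscale images; it is a minimal illustration under assumed data structures (an `Image` struct, 8x8 range blocks, 16x16 domain blocks), not the repo's actual API, and the search over block rotations and flips described in the blog is omitted for brevity.

```cpp
// Minimal sketch of the encoder's core search (illustrative only).
// Each BxB range block is matched against every 2Bx2B domain block:
// the domain block is downsampled to BxB, a least-squares contrast/
// brightness pair is fitted, and the best mapping is stored.
#include <cstddef>
#include <limits>
#include <vector>

struct Image {                      // grayscale, row-major, values in [0, 255]
    int w, h;
    std::vector<double> px;
    double at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

struct Mapping {                    // one affine transform per range block
    int rx, ry;                     // range block position
    int dx, dy;                     // best matching domain block position
    double contrast, brightness;
};

// Average 2x2 neighbourhoods to shrink a 2Bx2B domain block down to BxB.
static std::vector<double> downsample(const Image& img, int dx, int dy, int B) {
    std::vector<double> out(static_cast<std::size_t>(B) * B);
    for (int y = 0; y < B; ++y)
        for (int x = 0; x < B; ++x)
            out[static_cast<std::size_t>(y) * B + x] =
                (img.at(dx + 2 * x, dy + 2 * y) + img.at(dx + 2 * x + 1, dy + 2 * y) +
                 img.at(dx + 2 * x, dy + 2 * y + 1) + img.at(dx + 2 * x + 1, dy + 2 * y + 1)) / 4.0;
    return out;
}

std::vector<Mapping> encode(const Image& img, int B = 8) {
    std::vector<Mapping> maps;
    for (int ry = 0; ry + B <= img.h; ry += B)
        for (int rx = 0; rx + B <= img.w; rx += B) {
            Mapping best{rx, ry, 0, 0, 0.0, 0.0};
            double bestErr = std::numeric_limits<double>::max();
            for (int dy = 0; dy + 2 * B <= img.h; dy += B)
                for (int dx = 0; dx + 2 * B <= img.w; dx += B) {
                    std::vector<double> d = downsample(img, dx, dy, B);
                    // Least-squares fit of range ~ contrast * domain + brightness.
                    double sd = 0, sr = 0, sdd = 0, sdr = 0;
                    const double n = static_cast<double>(B) * B;
                    for (int y = 0; y < B; ++y)
                        for (int x = 0; x < B; ++x) {
                            const double dv = d[static_cast<std::size_t>(y) * B + x];
                            const double rv = img.at(rx + x, ry + y);
                            sd += dv; sr += rv; sdd += dv * dv; sdr += dv * rv;
                        }
                    const double den = n * sdd - sd * sd;
                    const double s = den != 0 ? (n * sdr - sd * sr) / den : 0.0;
                    const double o = (sr - s * sd) / n;
                    double err = 0;
                    for (int y = 0; y < B; ++y)
                        for (int x = 0; x < B; ++x) {
                            const double e = s * d[static_cast<std::size_t>(y) * B + x] + o -
                                             img.at(rx + x, ry + y);
                            err += e * e;
                        }
                    if (err < bestErr) { bestErr = err; best = {rx, ry, dx, dy, s, o}; }
                }
            maps.push_back(best);
        }
    return maps;
}
```

The exhaustive domain search is what makes encoding so expensive: the cost grows with the number of range blocks times the number of domain blocks, which is why the 256x256 image takes so much longer than the 64x64 one.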
The reference Python implementation (based on the blog mentioned above) encodes a 64x64 image in 7.6 seconds and a 256x256 image in 1817.55 seconds. In my C++ implementation the corresponding times are 0.89 seconds and 193.53 seconds, which is a significant speedup.
The encoding of each block is independent of the other blocks, which calls for a multithreaded approach. By farming the encoding of different blocks out to different worker threads (sketched below), the 64x64 image is encoded in 0.31 seconds and the 256x256 image in 60.12 seconds.
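Since no two range blocks share any state, the blocks can simply be partitioned across a fixed pool of threads with no locking. The sketch below shows one way to do this; it is not necessarily the scheme used in the repo, it reuses the `Image` and `Mapping` types from the earlier sketch, and `encodeRangeBlock` is a hypothetical helper that runs the per-block search shown above.

```cpp
// Sketch of a multithreaded encoder: range blocks are assigned to worker
// threads in an interleaved fashion; each thread writes to disjoint
// elements of `maps`, so no synchronisation is needed.
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-block encoder: the inner search from the sketch above.
Mapping encodeRangeBlock(const Image& img, int rx, int ry, int B);

void encodeParallel(const Image& img, int B, unsigned numThreads,
                    std::vector<Mapping>& maps) {
    const int blocksX = img.w / B, blocksY = img.h / B;
    maps.resize(static_cast<std::size_t>(blocksX) * blocksY);
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t)
        workers.emplace_back([&, t] {
            // Thread t handles blocks t, t + numThreads, t + 2*numThreads, ...
            for (std::size_t b = t; b < maps.size(); b += numThreads) {
                const int rx = static_cast<int>(b % blocksX) * B;
                const int ry = static_cast<int>(b / blocksX) * B;
                maps[b] = encodeRangeBlock(img, rx, ry, B);
            }
        });
    for (auto& w : workers) w.join();
}
```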
Note in the graphs below that the encode time does not scale linearly with the number of threads; beyond roughly 8 threads, adding more threads does not reduce the computation time further.
The table on the left shows the Python, naive C++ and multithreaded C++ encoding times (and the relative speedups).
Number of threads vs encode time (seconds) for 64x64 image
Number of threads vs encode time (seconds) for 256x256 image
For decoding, one can start with any random image and iteratively apply the transformations calculated in the encoding phase. With each iteration the image converges towards its fixed point, which should be close to the original image being compressed.
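The decode loop is much simpler than the encode search. A minimal sketch, again reusing the illustrative `Image`, `Mapping` and `downsample` definitions from the encoder sketch rather than the repo's actual code:

```cpp
// Sketch of the decoder: start from an arbitrary image and repeatedly
// apply every stored mapping; after a few iterations the image settles
// near its fixed point, an approximation of the original.
#include <cstddef>
#include <vector>

Image decode(const std::vector<Mapping>& maps, int w, int h, int B, int iterations) {
    Image cur{w, h, std::vector<double>(static_cast<std::size_t>(w) * h, 128.0)}; // any start works
    for (int it = 0; it < iterations; ++it) {
        Image next{w, h, std::vector<double>(static_cast<std::size_t>(w) * h, 0.0)};
        for (const Mapping& m : maps) {
            // Shrink the mapped domain block and paste it, scaled by the
            // contrast and offset by the brightness, into its range block.
            std::vector<double> d = downsample(cur, m.dx, m.dy, B);
            for (int y = 0; y < B; ++y)
                for (int x = 0; x < B; ++x)
                    next.px[static_cast<std::size_t>(m.ry + y) * w + (m.rx + x)] =
                        m.contrast * d[static_cast<std::size_t>(y) * B + x] + m.brightness;
        }
        cur = next;
    }
    return cur;
}
```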
Clearly, with increasing decode iterations we expect to see the PSNR increase, but we pay the price in increased computation time. The graphs below show that trend.
Also shown are the decode iterations, which show the image slowly converging to its fixed point. From the PSNR graph and the reconstructed images we can see diminishing returns from running more decode iterations; 8 iterations seem enough for a good reconstruction.
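PSNR here is the standard peak signal-to-noise ratio for 8-bit images, PSNR = 10·log10(255² / MSE). A small helper to compute it between the original and the reconstruction (the vector-of-bytes representation is just for illustration):

```cpp
// PSNR for 8-bit grayscale images of equal size:
// 10 * log10(255^2 / MSE), where MSE is the mean squared error.
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

double psnr(const std::vector<std::uint8_t>& original,
            const std::vector<std::uint8_t>& reconstructed) {
    double mse = 0.0;
    for (std::size_t i = 0; i < original.size(); ++i) {
        const double diff = double(original[i]) - double(reconstructed[i]);
        mse += diff * diff;
    }
    mse /= static_cast<double>(original.size());
    return 10.0 * std::log10(255.0 * 255.0 / mse);  // infinite if the images are identical
}
```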
PSNR vs decode iterations for 64x64 image
Time (seconds) vs decode iterations for 64x64 image
Decode iterations 1 to 6
Original and reconstructed image (after 16 iterations)
The images above show the first 6 iterations of the decoding phase. The images on the left show the original and the final decoded image (after 16 iterations)
The above experiments were done for grayscale images only; adding support for colour images could be a next step. Also, while the encoder was sped up, the decoder was not optimized, and there might be some scope for speedup there. Finally, one could compare against DCT-based techniques on the metrics of time and PSNR.