This week I am documenting the results of the homework.
Activity 1: Image restoration
In the first part, we are given a blurry image of a flower. The blur looks isotropic, so an averaging point spread function (PSF) seemed a natural choice.
Original image
Blurred image
Simple inversion of the PSF (i.e., assuming no noise) gives very poor results, as expected:
Simple inversion
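Why plain inversion fails can be reproduced in a small experiment. The homework used MATLAB; the sketch below is a NumPy stand-in on a 1-D toy signal (sizes, noise level, and PSF length are made up for illustration): blur with an averaging PSF, add mild noise, then divide by the PSF's frequency response.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1-D toy stand-in for the image: blur a sharp signal with a 9-tap
# averaging PSF, add mild noise, then invert the PSF in the frequency domain.
n = 256
x = np.zeros(n)
x[100:140] = 1.0                    # "sharp" signal
psf = np.zeros(n)
psf[:9] = 1.0 / 9                   # averaging PSF, zero-padded
H = np.fft.fft(psf)

y = np.real(np.fft.ifft(np.fft.fft(x) * H))   # blurred
y += 0.05 * rng.standard_normal(n)            # mild additive noise

# Naive inversion divides by H, which is tiny at some frequencies,
# so the noise at exactly those frequencies is amplified enormously.
x_naive = np.real(np.fft.ifft(np.fft.fft(y) / H))

err_blurred = np.linalg.norm(y - x)
err_naive = np.linalg.norm(x_naive - x)
```

The "restored" signal ends up much farther from the truth than the blurry input, which matches the ruined image above.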
After exploring some of the advanced deconvolution functions available in MATLAB, I got the following results:
Wiener deconvolution
Lucy-Richardson deconvolution
The Lucy-Richardson deconvolution, an iterative method, gives a sharper image but more ringing artifacts. If the ringing could be removed from the Lucy-Richardson result, it would be the better restoration.
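The iteration behind Lucy-Richardson is a simple multiplicative update. MATLAB's deconvlucy was used for the actual results; the NumPy sketch below runs the same update on a 1-D toy signal (noiseless here, and the sizes are illustrative) to show the idea: repeatedly compare the blurred estimate against the observation and correct by the correlated ratio.

```python
import numpy as np

# Richardson-Lucy update:  est <- est * corr(psf, y / conv(psf, est))
n = 256
x = np.zeros(n)
x[100:140] = 1.0
psf = np.zeros(n)
psf[:9] = 1.0 / 9                  # same 9-tap averaging blur as before
H = np.fft.fft(psf)

def circ_conv(a, F):
    """Circular convolution of a with the filter whose spectrum is F."""
    return np.real(np.fft.ifft(np.fft.fft(a) * F))

y = circ_conv(x, H)                # blurred observation (noiseless toy case)

est = np.full(n, 0.5)              # any positive initial guess
for _ in range(50):
    ratio = y / np.maximum(circ_conv(est, H), 1e-12)
    est = est * circ_conv(ratio, np.conj(H))   # conj(H) = correlate with the PSF

err_blurred = np.linalg.norm(y - x)
err_rl = np.linalg.norm(est - x)
```

Because the update is multiplicative, the estimate stays nonnegative, and the iterations progressively sharpen the edges, which is also where the ringing comes from on real images.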
Most of the ringing occurs at the image boundaries, so to remove it I copied patches around the border from the original blurry image into the restored image. This gave a subjectively better-looking result.
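The border-patching step is straightforward to express in code. A minimal NumPy sketch (the frame width is a tunable guess, not the value used in the homework):

```python
import numpy as np

def patch_boundary(restored, blurred, width=8):
    """Replace a `width`-pixel frame of the restored image with pixels
    from the blurry original, hiding the boundary ringing."""
    out = restored.copy()
    out[:width, :] = blurred[:width, :]
    out[-width:, :] = blurred[-width:, :]
    out[:, :width] = blurred[:, :width]
    out[:, -width:] = blurred[:, -width:]
    return out
```

The interior keeps the sharpened pixels; only the frame reverts to the (slightly blurry but ringing-free) original.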
Final restored image
For the caterpillar image there seems to be motion blur in roughly the 75-degree direction. After creating a motion filter in that direction and applying Wiener filtering, we get the result below. For comparison, I have also shown the result of using an isotropic (averaging) PSF.
Average PSF
Motion filter PSF
The two results look quite similar, though; both are a little sharper than the original. The black bristles near the caterpillar's head come out sharper with the motion-filter PSF than with the averaging PSF.
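The directional PSF itself can be built by rasterizing a short line segment at the blur angle. MATLAB's fspecial('motion', ...) was used for the real experiment; this NumPy sketch is a rough stand-in (length, angle, and kernel size are illustrative parameters):

```python
import numpy as np

def motion_psf(length=9, angle_deg=75, size=15):
    """Rough stand-in for fspecial('motion', ...): a normalized line of
    roughly `length` pixels through the kernel center at `angle_deg`."""
    psf = np.zeros((size, size))
    c = size // 2
    theta = np.deg2rad(angle_deg)
    # Sample the segment densely and mark the nearest pixel at each step.
    for t in np.linspace(-(length - 1) / 2, (length - 1) / 2, 4 * length):
        r = int(round(c - t * np.sin(theta)))   # rows grow downward
        col = int(round(c + t * np.cos(theta)))
        psf[r, col] = 1.0
    return psf / psf.sum()
```

The resulting kernel sums to 1 and concentrates its energy along the 75-degree direction, so Wiener filtering with it preferentially undoes blur along that axis.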
I also tried blind deconvolution, in which the blurred image is supplied along with only an initial PSF estimate, and a maximum-likelihood estimator is used to estimate both the original image and the point spread function. The results are shown below. At least for the flower image, blind deconvolution performs comparably to the Lucy-Richardson algorithm.
flower
caterpillar
Activity 2: Image compression
In this activity a simple lossy encoder and decoder are implemented. Here is a block diagram:
Block Diagram
First we convert the RGB image into YUV and then downsample the U and V channels. Then we treat each channel separately, but equivalently.
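The color conversion and chroma downsampling can be sketched in a few lines. This is a NumPy sketch using the standard YUV formulas (the homework's MATLAB code may use slightly different coefficients), with 2x2 averaging for the downsample:

```python
import numpy as np

def rgb_to_yuv(img):
    """img: float array of shape (H, W, 3). Standard luma/chroma split."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def downsample2(ch):
    """2x2 averaging, halving both dimensions (H and W assumed even)."""
    h, w = ch.shape
    return ch.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
```

On a gray input the U and V channels come out zero, which is exactly why they carry little energy for natural images and tolerate downsampling.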
I have implemented variable-block-size encoding (sizes 32x32, 16x16 and 8x8 are allowed), and a single image may mix block sizes. This lets smooth regions be encoded with large blocks while detailed regions use smaller ones. To decide the block size, I compute the variance within each 32x32 block: if the variance is very low (a smooth region), a 32x32 block is used; if it is moderate, 16x16; otherwise 8x8. Note that variable block sizes add overhead, since the block-size information must also be transmitted.
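The variance-based decision can be sketched as follows. The thresholds here are made-up placeholders, not the values used in the homework, and the channel dimensions are assumed to be multiples of 32:

```python
import numpy as np

def block_sizes(channel, t_low=50.0, t_high=500.0):
    """Assign a block size (32, 16, or 8) to each 32x32 tile by its variance.
    Thresholds t_low/t_high are illustrative, not the write-up's values."""
    h, w = channel.shape
    sizes = np.zeros((h // 32, w // 32), dtype=int)
    for i in range(0, h, 32):
        for j in range(0, w, 32):
            var = channel[i:i + 32, j:j + 32].var()
            if var < t_low:
                s = 32          # smooth region: one large block
            elif var < t_high:
                s = 16          # moderate detail
            else:
                s = 8           # busy region: small blocks
            sizes[i // 32, j // 32] = s
    return sizes
```

A decoder needs this size map too, which is the transmission overhead mentioned above.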
Then DCT and quantization are performed. For this I use JPEG's 8x8 quantization table; for the 16x16 and 32x32 blocks I simply upsample the 8x8 table. A quality factor scales the quantization table, as in JPEG.
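The table handling can be sketched concretely. Below is the standard JPEG luminance table, the usual JPEG quality-scaling rule (the write-up's exact scaling may differ), and nearest-neighbour upsampling to the larger block sizes via a Kronecker product:

```python
import numpy as np

# Standard JPEG luminance quantization table (8x8 base table).
Q8 = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

def scale_table(table, quality):
    """JPEG-style quality scaling; entries are clipped to at least 1."""
    s = 5000 / quality if quality < 50 else 200 - 2 * quality
    return np.maximum(np.floor((table * s + 50) / 100), 1).astype(int)

def upsample_table(table, factor):
    """Nearest-neighbour upsampling of the 8x8 table to 16x16 or 32x32."""
    return np.kron(table, np.ones((factor, factor), dtype=int))
```

With this rule, quality 50 leaves the table unchanged, and lower quality factors produce coarser quantization steps.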
Then the DC components of all the blocks are collected and differentially encoded. For example, if the DC coefficients are 1 4 3 2 5, they are differentially encoded as 1 3 -1 -1 3. The AC coefficients are collected in a zigzag scan and run-length encoded. Finally, a probability distribution is estimated from the statistics of the symbols generated by the image, and lossless Huffman coding is applied.
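The two entropy-reducing transforms before Huffman coding are small enough to sketch directly (the zigzag scan itself is omitted here; this just shows the DPCM step from the example above and a basic run-length pass):

```python
def diff_encode(dc):
    """Keep the first DC value, then successive differences:
    [1, 4, 3, 2, 5] -> [1, 3, -1, -1, 3]."""
    dc = list(dc)
    return [dc[0]] + [b - a for a, b in zip(dc, dc[1:])]

def run_length(ac):
    """(value, run) pairs; the long zero runs in quantized AC
    coefficients are what make this step pay off."""
    out = []
    for v in ac:
        if out and out[-1][0] == v:
            out[-1][1] += 1
        else:
            out.append([v, 1])
    return [tuple(p) for p in out]
```

Both steps are lossless; they only reshape the data so that a few symbols dominate, which is what Huffman coding then exploits.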
Here are the 2 original images:
Moon image
Baboon image
For the moon image, here are some images showing intermediate stages:
Variance map
Block sizes (based on variance)
DC only approximation
decoded image
The results at a PSNR level of 35 are as follows:
Here is the probability distribution of the symbols. Clearly, a few symbols occur with high probability; exploiting this concentration of probability, Huffman coding compresses well.
probability of symbols
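How a skewed distribution translates into short codes can be shown with a classic Huffman construction. The sketch below (the frequencies are a toy example, not the image's actual statistics) computes only the code lengths, which is all the bit count depends on:

```python
import heapq

def huffman_lengths(freqs):
    """Huffman code length per symbol for a frequency dict `freqs`."""
    # Heap entries carry a unique counter so ties never compare the symbol lists.
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    nxt = len(heap)
    while len(heap) > 1:
        f1, _, s1 = heapq.heappop(heap)
        f2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:        # merging deepens every contained symbol by 1
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, nxt, s1 + s2))
        nxt += 1
    return lengths

freqs = {'a': 90, 'b': 5, 'c': 3, 'd': 2}   # toy skewed distribution
lengths = huffman_lengths(freqs)
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / sum(freqs.values())
```

With 90% of the mass on one symbol, the average code length drops well below the 2 bits a fixed-length code for 4 symbols would need.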
For the baboon, we have 3 planes: Y, U and V. The images below show the 3 channels and their block sizes. Note that the U and V channels have half the width and height of the Y channel due to the downsampling. Also note that larger block sizes are used in smoother regions and smaller ones in detailed regions.
Y
U
V
Keeping the quality factor at 15 (the same as for the moon image), we get the following results after combining the statistics from all 3 channels.
The PSNR came out to 22.19 dB under these settings.
Here are the decoded image and the probability distribution of the symbols:
decoded image
probability distribution
Finally, a short discussion of where the compression comes from:
1. Converting from RGB to YUV and downsampling the U and V channels reduces the data in 2 of the 3 channels by a factor of 4, so overall the data is halved. (This does not apply to grayscale images, which have only 1 channel.) The original baboon image has resolution 512x512; with 3 channels that is 512x512x3 = 786432 bytes. After this reduction it is 512x512 + 256x256 + 256x256 = 393216 bytes.
2. From these 393216 YUV bytes, DCT, quantization, and then differential and run-length coding reduce the data to 57189 symbols, a reduction of about 6.88x.
3. Finally, Huffman coding produced 163399 bits, or 163399/8 ≈ 20425 bytes. So we see a further compression by a factor of about (#symbols / #bytes in Huffman code) = 57189/20425 ≈ 2.8.
Note that multiplying these reduction factors (2 x 6.88 x 2.8) gives 38.5, which matches the overall compression ratio calculated in the table earlier.
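The bookkeeping above can be checked in a few lines (it follows the notes in treating one symbol as one byte, which is loose but consistent with how the factors were computed):

```python
# Reproduce the stage-by-stage reduction factors from the notes above.
rgb_bytes = 512 * 512 * 3               # 786432 bytes of raw RGB
yuv_bytes = 512 * 512 + 2 * 256 * 256   # 393216 after chroma downsampling
symbols = 57189                         # after DCT + quantization + DPCM/RLE
huff_bytes = 163399 / 8                 # ~20425 bytes after Huffman coding

f1 = rgb_bytes / yuv_bytes              # color conversion + downsampling
f2 = yuv_bytes / symbols                # transform + quantization stages
f3 = symbols / huff_bytes               # entropy coding
overall = rgb_bytes / huff_bytes        # end-to-end compression ratio
```

The per-stage factors multiply exactly to the end-to-end ratio, since each stage's output is the next stage's input.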
Thus the major compression occurs in the quantization stage.
Here are two plots showing the relation between quality factor, bits needed, and PSNR. As expected, the number of bits increases with quality, and PSNR increases with bit rate, though with diminishing returns.
Q vs bits required
bits required vs PSNR