11mar: Run length codec

This week I will develop and compare 2 lossless codecs for binary images. These codecs convert a binary image to a symbol stream of integers (but not to actual bitstreams) and back.

Activity 1: Run length codec for binary images

We have encountered run length coding for binary image. It performs better for binary images because in such images, usually there are long runs of whites or blacks that is succinctly represented with a single number. In this section I will describe a simple codec (both the encoder and the decoder) that encoded a binary image into a sequence of positive numbers.

Description of Simple Run length Codec

We consider that the image is composed of 0 (black) and 1 (white). We assume that each row starts with a 1 and then we encode the whole row with a string of numbers. Each number represents the run length of the current pixel. The next number in the sequence would represent the run length of the flipped version of the current bit.

Lets see an example. I will mark the encoded symbol stream with underline. On encoding 1 1 0 0 0 1 0 0 0 0 1 1, we get 2 3 1 4 2, because we have 2 ones, followed by 3 zeroes etc. Similarly 0 0 1 1 0 1 1, is encoded as 0 2 2 1 2, because we assume we start with one, but since there were no ones in the beginning, that run length is 0. That is followed by 2 zeroes, 2 ones, 1 zero and so on.

Now we introduce the symbol 0 as representing a new line. That is when encoding, if we reach the end of a row, a 0 is added to the symbol stream. Similarly on decoding side, if the decoder encounters a 0_, then it starts a new row in the image. Note since we are using 0 as a symbol for new line, it causes ambiguity as 0 could also mean a run length of length 0 (which is possible when the row begins with a 0, as shown in the second example above). Therefore if a runlength is of length n, I encode it using the symbol n+1, so that the symbol 0 uniquely represents a newline. Thus the symbol 1 would represent a run length of length 0. In the decoder side, when a non zero symbol is encountered, the decoder understands that it should subtract 1 from it to get the actual runlength. Note that the possible symbols thus formed can be positive numbers only (including 0) and the scheme is lossless.

Lets consider the following image:

1 1 0 0 1 1 1 0 1

0 0 0 0 1 1 0 0 0

This is encoded as 3 3 4 2 2 0 1 5 3 4 0. Note that a runlength of length n is coded by the symbol n+1.

Code

Lets see the code for the encoder.

%0 used for newline and a run of n positions is encoded as n+1

function [symbolstream, symbols, bitsneeded] = encode_runlength(img)

sz = size(img);

symbolstream = [];

bitsneeded = 0;

symbols = [0 0]; %unique symbols present and their count

for i = 1:sz(1)

currrun = 0; currpixel = 1; %current run length and current value of pixel

for j = 1:sz(2)

if (img(i,j) ~= currpixel) %current bit has flipped

currpixel = 1 - currpixel; %flip the current bit

%add the symbol to the encoder stream

symbolstream = [symbolstream currrun + 1];

%number of bits needed to represent this symbol in simple binary

bitsneeded = bitsneeded + ceil(log2(currrun + 1));

symbols = addsymbol(symbols, currrun);

currrun = 1;

else

currrun = currrun + 1; %increment the run if the bit hasn't flipped

end

symbols = addsymbol(symbols, currrun+1);

symbolstream = [symbolstream currrun + 1 0]; %add newline symbol

bitsneeded = bitsneeded + 1;

symbols(1,2) = symbols(1,2) + 1; %the 1st symbol is 0 (new line)

end

Now lets see the function for the decoder.

function decodedimg = decode_runlength(symbolstream)

decodedimg = [];

currrow = [];

currpixel = 1;

for i = 1:length(symbolstream)

if (symbolstream(i) == 0) %newline encountered

decodedimg = [decodedimg; currrow]; %append row to image

currrow = [];

currpixel = 1; %since a row is beginning we assume the first pixel is 1

else

%add n-1 number of 'currpixel', where n is the symbol

currrow = [currrow repmat(currpixel, 1, symbolstream(i)-1)];

currpixel = 1 - currpixel;

end

Now let us see some results. I read in an image that is possible to binarize and find the 50 percentile gray level value and binarize the image on that threshold. Below are the original images, binarised image and decoded image (that is I first encode the image and then decode it using the codec described above).

Activity 2: Run length codec using prediction for binary images

Now I describe a second approach where I predict the current row based on the previous row and then encode the difference.

Description of Predictive Runlength Codec

At the encoder, the first row of the image is coded using the previous coding scheme. Now consider all rows n > 1. Then we consider the difference between the current row and the previous row. If at position j the difference is zero then it means that this pixel at the j^th position is same as the j^th position of the previous row. If there is a non-zero difference (1 or -1)

then we add the start of the flip and the length of the flip to the symbol stream. Lets work out an example to make this clear.

Say we wish to encode the following:

1 1 1 0 0 0 0 0 1 1

1 1 0 0 0 1 1 1 1 1

The first line is encoded as 4 6 3 0 (according to the last scheme). Now to encode the 2nd row, lets see the difference of the 2nd and the 1st row. The difference is: 0 0 -1 0 0 1 1 1 0 0. Thus we see a non zero values at 3, 6, 7 and 8 (that is the 2nd row and the 1st row differ at these points). We encode this as 3 1 6 3. What it means is take the 1st row. Flip the 3rd bit and continue flipping for length one. Then flip the 6th bit and continue flipping for 3 places (ie 6th, 7th and 8th bits are flipped).

Now note that the starting point of the flip can be anything from 0 to the width of the picture, that is it can take a wide range of numbers. To get over that we can code the flip start incrementally. Lets take an example. Say a particular row is encoded as 3 1 7 2. We could use the following stream: 3 1 4 2. It means that at position 3 there is a flip of length 1. Then at position (3+4 = 7), there is a flip of length 2. So instead of adding the symbol 7 in the bitstream, we add the symbol 4. Since the last flip position was 3, we add 4 to 3 and get 7. Note that since the symbol 0 is used to denote newline, and we could have flips at position 0, so we encode the positions by adding 1 to them. So instead of 3 1 4 2, we have 4 1 5 2.

Thus this stream:

1 1 1 0 0 0 0 0 1 1

1 1 0 0 0 0 1 1 1 1

1 2 3 4 5 6 7 8 9 10

gets encoded as: 4 6 3 0 4 1 5 2 0. The first row (4 6 3 0) is same as earlier encoder. The next symbols (4 1 5 2 0) tell us that at position 4-1 = 3, for 1 runlength the bits differ from 1st row. Then at position 3+5-1 for a run length of 2, the bits in the second row differ from bits in the first row. The differing bits and their positions are shown in red.

Code

The encoder is shown below

function [symbolstream, symbols, bitsneeded] = encode_runlength_mod(img)

sz = size(img);

symbolstream = [];

bitsneeded = 0;

symbols = [0 0]; %unique symbols present and their count

%process first row separately using normal runlength encoder

[symbolstream symbols bitsneeded] = encode_runlength(img(1,:));

prevrow = img(1,:);

startflip = 0;

prevflip = 0;

for i = 2:sz(1) %start the modified encoding from the second row, as first row is encoded using normal runlength encoding

diffrow = img(i,:) - prevrow; %difference of current row and previous row

currpixel = img(1,1);

startflip = 0;

prevflip = 0;

for j = 1:sz(2)

if (diffrow(j) ~= 0)

if (startflip == 0)

startflip = j;

end

else

if (startflip ~= 0)

symbolstream = [symbolstream startflip-prevflip+1 j-startflip]; %insert the pair (start of flip, length of flip)

bitsneeded = bitsneeded + ceil(log2(startflip-prevflip+1)) + ceil(log2(j-startflip));

symbols = addsymbol(symbols, startflip-prevflip+1); symbols = addsymbol(symbols, j-startflip);

prevflip = startflip;

startflip = 0;

end

prevrow = img(i, :);

symbolstream = [symbolstream 0];

bitsneeded = bitsneeded + 1;

symbols(1,2) = symbols(1,2) + 1; %the 1st symbol stored is 0 (new line)

end

The decoder is as follows:

function decodedimg = decode_runlength_mod(symbolstream)

decodedimg = [];

currrow = [];

currpixel = 1;

temp = find(symbolstream == 0);

decodedimg = decode_runlength(symbolstream(1:temp(1))); %1st row is decoded using normal runlength decoder

remaining = symbolstream(temp+1:end);

prevrow = decodedimg(1,:);

beginnew = 1;

currrow = decodedimg(1,:);

i = 1;

prevflip = 0;

while(1)

if remaining(i) == 0 %new row

decodedimg = [decodedimg; currrow];

prevrow = currrow;

beginnew = 1;

i = i+1;

startflip = 0;

prevflip = 0;

else

if beginnew==1

beginnew = 0;

currrow = prevrow;

end

startflip = remaining(i) + prevflip - 1; %start of flip is old flip point + current symbol -1

flipdist = remaining(i+1); %runlength of flip

for k = startflip:startflip + flipdist - 1 %flip the required bits in this loop

currrow(k) = 1 - currrow(k);

end

prevflip = startflip;

i = i+2; %read 2 symbols (flip start, length of flip)

end

if (i > length(remaining)) break; end %break if all symbols have been used up

end

We get exactly same results (that is perfectly lossless encoding and decoding), as expected. Since the results are same as shown in the table of images above, I am not making a new table here.

Activity 3: Comparision of the two codecs

Now that we have seen the 2 codecs (Simple (codec1) and Predictive (codec2)), lets compare them. Consider an image of width W, then both codecs generate a symbol stream containing values from the set {0, 1, 2, ..., W+1}. Both codecs use the horizontal redundancies present in the image

Before discussing the advantages of the Predictive codec, lets keep in mind the following:

1. Its better if we have a smaller number of symbols, as we can use small number of bits for each symbol then.

2. Its better if some symbols have very high probability,, as then short bit stream code words can be assigned to them and a lot of compression is achieved.

Now lets look atthe predictive codecs advantages:

1. It uses the vertical redundancies also. Since each row looks very much like the row above it, the error between them is very less, hence we can save a lot if we only encode the difference.

2. Also since the difference is probably not much between consecutive rows we have short runs of flips. Thus smaller symbols become more probable, making it possible to encode them with small number of bits. Thus this scheme concentrates the probability towards the smaller symbols.

To compare the two codecs, I collected the following statistics:

1. the number of unique symbols used. (the less the better)

2. Max value of symbol. (the less the better)

3. Histogram (relatively fewer symbols should have more occurrences)

4. No of bits if constant length binary converted bitstream is generated. Here each symbol uses ceil(log2(max)) number of bits, where max is the maximum value among all the symbols used. This is because to encode max, we need atleast that many bits and since it is a fixed length scheme, all symbols are encoded using that many bits. Clearly its not very efficient. (the less the better)

5. No of bits if each symbol is encoded in binary using minimal number of bits. So to encode the symbol 5 we need 3 bits and for 9 we need 4 bits. So to encode the symbol stream 5 9 5, we need 10 bits. Note this kind of binary conversion using variable code length is ambiguous and is not a practical thing. I just use this as a metric to compare the codecs. (the less the better)

The table below compares the numbers for the 3 images shown above. In each cell the left hand number is for codec 1 and the right hand number is for codec 2. The number in green is indicates better performance

Image

Image size (# pixels)

50336

50421

50625

Max value of symbol

# unique symbols

Bits used for constant length binary symbols

Bits used for variable length binary symbols

Comments

Note that the constant length encoding actually gives more bits than the original image, but the variable bit length can give much less. Codec 2 outperforms codec 1 in all departments

Since the max value of symbol for codec 2 is much less, it uses smaller number of bits for all the symbols and thus has better performance for constant length.

Since the max value of a symbol is same, constant length bitstream is almost same for both. But codec 2 is better when variable length is considered

260

160

123300

119552

24497

25974

344

197

84015

49984

23107

13481

226

30184

31032

12178

9577

If we compare the two codecs, we see that the predictive codec mostly outperforms the simple codec, mostly because it gives lesser number of unique symbols and the max value of its symbols are less too. Lets now look at the histograms of the symbols generated by both codecs for the three test images.

Histogram of symbols for image 1 using codec 1

Histogram of symbols for image 1 using codec 2

Histogram of symbols for image 2 using codec 1

Histogram of symbols for image 2 using codec 2

Histogram of symbols for image 3 using codec 1

Histogram of symbols for image 3 using codec 2

In the first image we see that the range of symbols is same for both, so nothing much can be concluded here. But in image 2, codec uses smaller number of symbols and in image 3, codec 2 a few symbols have a very large concentration (around 1500) but in codec 1, the maximum used symbol is only around 450.

Hopefully next time I will explore converting the symbol stream into actual bitstreams, perform more rigorous analysis and observe which codec does better.

Page updated

Google Sites

Report abuse