This week I will demonstrate a lossless intra prediction coding scheme and a functional programming approach to Huffman coding using Scala.
Activity 1: Intra Predective Image Coding
Here I implement a simpler form of the Block based Intra Prediction scheme used in H.264 I frames. The image is divided into blocks of size 4x4. A block is encoded one at a time. When a block is being encoded it uses blocks on top of it and to its left for prediction. It doesn't use blocks to its right or below it as the decoder decodes the blocks in raster scan order, hence those blocks won't be decoded before the current block is decoded. The following image shows the current block in orange, the already encoded blocks in green and the blocks stiill to be encoded are in red.
Block predictive scheme
Now H.264 codec uses the following modes for intra prediction. Note that all the modes use only the pixels on the top or left of the current block.
H.264 Intra prediction modes
I implement only 4 modes of the 9 modes shown here.
Modes implemented (Vertical, Horizontal, Mean, Diagonal)
The code is shown below. Blocks of size 4x4 are processed in raster order. For each block, all 4 modes are checked and the modes that gives the smallest prediction error is used for that particular block.
function intrapred(img, sz)
blocksize = 4; %we will consider blocks of size 4x4
predictedimg = zeros(sz);
modeused = zeros(sz);
for i = 1:4:sz(1)
for j = 1:4:sz(2)
if (i == 1 && j == 1) %topmost-leftmost block so no prediction from neighbours
predictedimg = mean(img(:)) * ones(blocksize, blocksize);
else
patch = img(i:i+3, j:j+3);
[top, left] = getneighbours(img, i, j); %get top 4 and left 4 neighbouring pixels of current block
%find the mode that gives min error
minerr = inf; minmode= -1; minpatch = [];
for k = 0:3 %scan through all the 4 modes
%some boundary checks
if (i == 1)
if (k == 0 || k == 3) %in the first row, vertical and diagonal modes are not possible
continue;
end
end
if (j == 1)
if (k == 1 || k ==3) %in the first column, vertical and diagonal modes are not possible
continue;
end
end
%get the predicted patch for this mode
predictedpatch = predictpatch(top, left, k, blocksize);
%find the error and store that mode's prediction that gives least error
err = sum((double(predictedpatch(:))-double(patch(:))).^2);
if err < minerr
minerr = err;
minmode = k;
minpatch = predictedpatch;
end
end
predictedimg(i:i+3, j:j+3) = minpatch;
modeused(i:i+3, j:j+3) = minmode * ones(4,4);
end
end
end
%display the predicted image, mode used and the error
modeused = normalize(modeused);
figure; imshow(modeused);
predictedimg = normalize(predictedimg);
figure; imshow(predictedimg);
predictedimg = uint8(floor(255*predictedimg));
figure; imshow(predictedimg - img);
end
%depending on mode output a predicted block
function predictedpatch = predictpatch(top, left, mode, blocksize)
predictedpatch = ones(blocksize, blocksize);
if (mode == 0) %vertical mode
for m = 1:blocksize %fill each column of the predictor
if (length(top) == blocksize+1)
predictedpatch(:, m) = top(m+1);
else
predictedpatch(:, m) = top(m);
end
end
end
if (mode == 1) %horizontal mode
for m = 1:blocksize %fill each row of the predictor
if (length(left) == blocksize+1)
predictedpatch(m, :) = left(m+1);
else
predictedpatch(m, :) = left(m);
end
end
end
if mode == 2 %mean predictor
predictors = [];
if (length(top) ~= 0)
predictors = [predictors top];
end
if (length(left) ~= 0)
predictors = [predictors left'];
end
predictedpatch = (predictedpatch * mean(predictors));
end
if (mode == 3) %diagonal mode
for m = 1:blocksize
for n = 1:blocksize
switch (m-n)
case 0
predictedpatch(m,n) = top(1);
case -1
predictedpatch(m,n) = top(2);
case -2
predictedpatch(m,n) = top(3);
case -3
predictedpatch(m,n) = top(4);
case 1
predictedpatch(m,n) = left(2);
case 2
predictedpatch(m,n) = left(3);
case 3
predictedpatch(m,n) = left(4);
end
end
end
end
end
%function to get the top and left neighbours for prediction of current block
function [top, left] = getneighbours(img, i, j)
topflag = 1; leftflag = 1;
%take care of border case
if (i - 1 <= 0 )
top = []; topflag = 0;
end
if(j - 1 <= 0)
left = []; leftflag = 0;
end
%if its not a border case, find top and left
if (topflag == 1)
if (leftflag == 1)
top = img(i-1, j-1:j+3);
else
top = img(i-1, j:j+3);
end
end
if (leftflag == 1)
if (topflag == 1)
left = img(i-1:i+3, j-1);
else
left = img(i:i+3, j-1);
end
end
end
Below I show the original image, the predicted image, the error and the mode used for different resolutions. The 4 modes are denoted by 4 colours in the mode image (black for vertical, dark grey for horizontal, light grey for mean mode and white for diagonal mode.
Note the jagged edges of the predicted image. This is because of the blocky nature of the prediction. Also note that as resolution increases the predicted image gets better. That's because the size of the block remains 4x4 and in large resolutions 4x4 blocks are mostly smooth, so prediction is better.
Original (512x512)
Predicted
Error
Modes used
Original (256x256)
Predicted
Error
Modes used
Original (128x128)
Predicted
Error
Modes used
Activity 2: Functional implementation of Huffman Coding
In this section I describe an implementation of Huffman coding using functional programming concepts that I did around a year ago as part of this. The language used is Scala. Since functional programming is used, most things are written in a recursive manner
In the Scala project I am attaching in the files section, there are many functions in the Huffman object and some examples how to use them in the Main object. You can find the source code in this directory of the project: patmat\src\main\scala\patmat. Let me describe a few important functions here, which give a flavour of the recursive techniques that can be implemented very succinctly in a functional language like Scala.
The tree is represented using the class CodeTree. It can have two case classes (that is 2 possible forms), leaves (dangling nodes) and forks (or internal nodes)
abstract class CodeTree
case class Fork(left: CodeTree, right: CodeTree, chars: List[Char], weight: Int) extends CodeTree
case class Leaf(char: Char, weight: Int) extends CodeTree
Here is the decoding function
def decode(tree: CodeTree, bits: List[Bit]): List[Char] =
{
//helper function to decode one character only
def decodeOneChar(tree: CodeTree, bits: List[Bit]): (Char, List[Bit]) =
{
tree match
{
case Fork(l,r,chlist,w) => if (bits.head==0) decodeOneChar(l, bits.tail) else decodeOneChar(r, bits.tail) //if 0 choose left fork, else choose right
case Leaf(c,w) => (c,bits) //once we reach a leaf, return the characters and the rest of the bitstream that remained unused
}
}
//main body of decode function
if (bits.isEmpty) List() //end of bitstream reached
else
{
val tupl = decodeOneChar(tree, bits) //a tuple containing the character decoded and remaining bits is returned
List(tupl._1)++decode(tree, tupl._2) //return the list and add to it a recursive call to decode using the remaining bits
}
}
This is the encoding function
def encode(tree: CodeTree)(text: List[Char]): List[Bit] =
{
def whichchild(tree: CodeTree, ch:Char): Bit = //assume that if left doesn't contain, right must contain ch
{
tree match
{
case Fork(l,r,chlist,w) => if (chlist.contains(ch)) 0 else 1
case Leaf(c,w) => if (c==ch) 0 else 1
}
}
def encodeonechar(tree: CodeTree, ch: Char): List[Bit]=
{
tree match
{
case Fork(l,r,chlist,w) =>
{
val bitt = whichchild(l,ch); //check which child (left or right) contains the character, then go down that path recursively
bitt :: encodeonechar(if (bitt==0) l else r, ch)
}
case Leaf(c,w) => {List()}
}
}
//main body of encode function starts here
if (text.isEmpty) List()
else (encodeonechar(tree, text.head)) ++ encode(tree)(text.tail) //encode the first character (text.head) and then append to the list the result of the recursion on encode using rest of the characters (text.tail) as argument
}
Other than the classic encoding scheme described above (which requires traversing the tree everytime to encode each characters) we could build a look-up table for each character. Then for each character in the string we wish to encode, we look-up the bit pattern that represents it and add it to the encoded bitstream.
To do that we first convert the CodeTree into a CodeTable, which is a list of tuples defined like this: type CodeTable = List[(Char, List[Bit])]
def convert(tree: CodeTree): CodeTable =
{
tree match
{
case Fork(Leaf(c,w),Leaf(c1,w1),chlist,w2) => {List((c,List(0)),(c1,List(1)))} //fork with two leaves (penultimate fork)
case Fork(Leaf(c,w),r,chlist,w1) => { mergeCodeTables(List((c,List())), convert(r))} //fork with left child as leaf
case Fork(l,Leaf(c,w),chlist,w1) => { mergeCodeTables(convert(l),List((c,List())))} //fork with right child as leaf
case Fork(l,r,chlist,w) => {mergeCodeTables(convert(l),convert(r))} //fork with both children as forks
case Leaf(c,w) => {List() }
}
}
//merges two code tables
def mergeCodeTables(a: CodeTable, b: CodeTable): CodeTable =
{
def helper(ct:CodeTable,b:Bit):CodeTable=
{
if (ct.isEmpty) List()
else
(ct.head._1, b::ct.head._2) :: helper(ct.tail,b) //add the bit to the first symbol in the list and recurse on the rest of the list
}
//main body of mergeCodeTables
helper(a,0) ++ helper(b,1) //add an extra zero to the left table and a 1 to the right table
}
Finally we can call the function quickEncode(). This function uses the functions defined above to covert a CodeTree into a CodeTable and then quickly encodes a string using look-up.
def quickEncode(tree: CodeTree)(text: List[Char]): List[Bit] =
{
val tabletree = convert(tree) //convert the CodeTree to a CodeTable for easy look-up
def helper(tt: CodeTable, text: List[Char]):List[Bit]=
{
if (text.isEmpty) List()
else
codeBits(tt)(text.head) ++ helper(tt, text.tail) //lookup the bit pattern for the firs caracter in the list and recurse on rest of the elements
}
helper(tabletree,text) //using the table, encode the text
}