Huffman coding is a lossless compression technique.
It is used to reduce the size of data or messages.
Data is compressed before it is sent over a network so that transmission costs less.
Problem: Suppose we have a message that is 20 characters long.
Message: BCCABBDDAECCBBAEDDCC
To transmit this message as plain text, we send the ASCII code of each character.
Each ASCII character is stored in 8 bits.
For 'A', the ASCII code is 65 = 01000001
For 'B', the ASCII code is 66 = 01000010
For 'C', the ASCII code is 67 = 01000011
For 'D', the ASCII code is 68 = 01000100
For 'E', the ASCII code is 69 = 01000101
If we send the message using the 8-bit ASCII code of each character, the size is 8 * 20 = 160 bits.
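The ASCII arithmetic above can be checked with a short Python sketch. The variable names are illustrative and are not part of the lecture.

```python
message = "BCCABBDDAECCBBAEDDCC"

# Show the ASCII value and its 8-bit binary form for each distinct character.
for ch in sorted(set(message)):
    print(ch, ord(ch), format(ord(ch), "08b"))

# Every character costs 8 bits when the message is sent as plain ASCII.
ascii_bits = 8 * len(message)
print(ascii_bits)  # 160
```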
So, instead of the 8-bit ASCII code, we can define our own shorter code.
The message contains only 5 distinct characters, so 3 bits per character are enough.
The code length depends on the number of distinct characters in the message.
With 3 bits we get 2^3 = 8 binary combinations, which covers 5 characters (2 bits give only 2^2 = 4 combinations, which is too few).
The size of the message = 20 * 3 = 60 bits
We also have to send the code table (character + code) with the data so the receiver can decode the message.
The size of the table = bits for the characters + bits for the codes = 5 * 8 + 5 * 3 = 40 + 15 = 55 bits
Total size of the message = 60 bits + 55 bits = 115 bits
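The fixed-length calculation can be reproduced with a small Python sketch. Assigning the codes in alphabetical order is an assumption made here for illustration. The lecture does not say which character gets which 3-bit code, and the choice does not affect the sizes.

```python
message = "BCCABBDDAECCBBAEDDCC"
symbols = sorted(set(message))

# Smallest number of bits that gives every distinct character its own code.
code_len = max(1, (len(symbols) - 1).bit_length())

# Hypothetical assignment: codes handed out in alphabetical order.
codes = {ch: format(i, f"0{code_len}b") for i, ch in enumerate(symbols)}

# Encoded message: one fixed-length code per character.
message_bits = len(message) * code_len
# Table: 8 bits for each character plus code_len bits for its code.
table_bits = len(symbols) * 8 + len(symbols) * code_len
total_bits = message_bits + table_bits

print(codes)
print(message_bits, table_bits, total_bits)
```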
Can we reduce the size even further? Yes, with "Huffman Coding", which gives shorter codes to the characters that occur more often.
Message: BCCABBDDAECCBBAEDDCC
We apply the optimal merge technique here. If you have not gone through it yet, see Lecture-08: Optimal Merge Pattern.
Total cost = 40 bits (5 characters * 8 bits in the table) + 12 bits (codes in the table) + 45 bits (encoded message) = 97 bits
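The Huffman result can be checked with a minimal Python sketch that repeatedly merges the two least frequent nodes using a heap. The function name huffman_codes and the tie-breaking order are assumptions made here, so the exact bit patterns may differ from the tree drawn in the lecture. The code lengths, and therefore the totals, stay the same for this message.

```python
import heapq
from collections import Counter

def huffman_codes(message):
    """Build {character: bit string} by merging the two least frequent nodes."""
    freq = Counter(message)
    # Heap entries: (frequency, unique tie-breaker, {char: code so far}).
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct character still needs one bit
        return {ch: "0" for ch in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Prefix 0 for the left subtree and 1 for the right subtree.
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

message = "BCCABBDDAECCBBAEDDCC"
codes = huffman_codes(message)

# Encoded message: each character costs the length of its own code.
message_bits = sum(len(codes[ch]) for ch in message)
# Table: 8 bits per character plus the bits of each code.
table_bits = 8 * len(codes) + sum(len(code) for code in codes.values())

print(codes)
print(message_bits, table_bits, message_bits + table_bits)  # 45 52 97
```

The characters B, C and D, which occur most often, get 2-bit codes, while A and E get 3-bit codes.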