There are numerous algorithms for compressing files to smaller sizes. Some are lossy, discarding information and reducing quality, whereas others are not. We selected Huffman coding because it is a lossless form of compression: none of the original information is lost, it is merely represented more compactly.
David Albert Huffman invented Huffman coding in 1952.
In his article 'A Method for the Construction of Minimum-Redundancy Codes,' he describes this approach. Each symbol is given a code of varying length: shorter codes are assigned to symbols that appear more frequently, whereas longer codes are assigned to symbols that appear less frequently.
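The idea of giving frequent symbols shorter codes can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this article: the function name `build_huffman_codes` and the choice to label the first (lighter) subtree with 1 and the second with 0 are assumptions made for the example.

```python
import heapq
from collections import Counter
from itertools import count

def build_huffman_codes(data):
    """Return a dict mapping each symbol in `data` to its bit string."""
    freq = Counter(data)
    if len(freq) == 1:
        # Edge case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    tiebreak = count()  # unique counter so equal weights never compare dicts
    heap = [(weight, next(tiebreak), {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least frequent subtrees.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {sym: "1" + code for sym, code in left.items()}
        merged.update({sym: "0" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

codes = build_huffman_codes("aaaabbc")
# 'a' is the most frequent symbol, so it receives the shortest codeword.
print({sym: len(code) for sym, code in codes.items()})
```

For the input "aaaabbc", the symbol a occurs four times and ends up with a 1-bit codeword, while b and c each get 2 bits.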
The variable-length codes assigned to input symbols are prefix codes, which means that the code assigned to one symbol is never a prefix of the code assigned to any other symbol. This ensures that there is no ambiguity between symbols when decoding. A binary tree can be used to create these prefix codes, for example:
When we move to the left child of a node in this tree, we record a 1, and when we move to the right child, we record a 0. We can then assign a node to a symbol; its codeword is the sequence of bits on the path from the root to that node. The children of that node can no longer be used, which is what keeps the code prefix-free. Every symbol must have a node assigned to it. We can assign nodes to symbols in a variety of ways, so we need to figure out which assignment is the most efficient. The rate R, the expected number of bits needed to represent a symbol, can be used to compare the efficiency of the different assignments in the binary tree. It is given by:

$$R = \sum_i p_i n_i$$
Here p_i denotes the probability of symbol i, and n_i the number of bits in the codeword assigned to it. The entropy H(X), the average amount of information carried by each symbol and therefore a lower bound on the rate of any lossless code, can also be used to judge efficiency. It is given by:

$$H(X) = -\sum_i p_i \log_2 p_i$$
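As a rough illustration of these two quantities, the Python sketch below computes the rate as the sum of p_i * n_i and the entropy as the negative sum of p_i * log2(p_i). The probabilities and codeword lengths are made-up example values, and expressing efficiency as the ratio H(X)/R is one common convention that is assumed here rather than taken from the text.

```python
import math

def average_rate(probs, lengths):
    """Expected number of bits per symbol: sum of p_i * n_i."""
    return sum(p * n for p, n in zip(probs, lengths))

def entropy(probs):
    """Shannon entropy in bits per symbol: -sum of p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical three-symbol source with codewords of 1, 2 and 2 bits.
probs = [0.5, 0.25, 0.25]
lengths = [1, 2, 2]

R = average_rate(probs, lengths)
H = entropy(probs)
efficiency = H / R  # assumed definition: closer to 1 means less redundancy
print(R, H, efficiency)
```

In this example the probabilities are all powers of one half, so the rate equals the entropy and the code carries no redundancy. For other distributions the rate of a prefix code is at least the entropy.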
After giving each symbol a code, we can compress the file using these codes. However, the table containing the original symbols and the codes assigned to them must be available when decoding the compressed file. The decoder first reads the table and can then decode the compressed data.
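A small sketch of table-based encoding and decoding might look as follows. The code table, the function names, and the use of a string of '0' and '1' characters in place of packed bits are all assumptions chosen to keep the example short.

```python
def is_prefix_free(table):
    """True if no codeword is a prefix of (or equal to) another codeword."""
    codes = sorted(table.values())
    # After sorting, a prefix is always directly followed by a string it prefixes.
    return all(not b.startswith(a) for a, b in zip(codes, codes[1:]))

def encode(symbols, table):
    """Concatenate the codeword of every symbol."""
    return "".join(table[s] for s in symbols)

def decode(bits, table):
    """Read bits one at a time and emit a symbol whenever a codeword matches."""
    inverse = {code: sym for sym, code in table.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            out.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete codeword")
    return out

# Hypothetical code table that would be stored alongside the compressed data.
table = {"a": "0", "b": "10", "c": "11"}
assert is_prefix_free(table)

bits = encode("abca", table)
print(bits)                 # 010110
print(decode(bits, table))  # ['a', 'b', 'c', 'a']
```

Because the table is prefix-free, the decoder can emit a symbol as soon as the buffered bits match a codeword, without looking ahead.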
Because we want to compress images, we need to know how they are stored. An image is made up of many tiny pixels, each of which is represented by a binary value. The number of bits used for each pixel is referred to as the bit depth. Each additional bit doubles the number of colours available. A 1-bit image, for example, can only have two colours: black and white. Here white gets a 0 while black gets a 1. Images are stored as scan lines: the lines are encoded from top to bottom, and the pixels within each line from left to right.
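The storage model described here can be sketched in Python. The helper names and the tiny 2x3 example image are hypothetical and serve only to show the ordering; they are not the image used in the rest of this article.

```python
def colours_for_depth(bits_per_pixel):
    """Each extra bit doubles the number of representable colours."""
    return 2 ** bits_per_pixel

def to_scan_order(image):
    """Flatten rows top to bottom, and pixels left to right within each row."""
    return [pixel for row in image for pixel in row]

# Hypothetical 1-bit image with 2 rows and 3 columns (0 = white, 1 = black).
image = [
    [0, 1, 1],
    [1, 0, 0],
]

print(colours_for_depth(1))   # 2
print(colours_for_depth(3))   # 8
print(to_scan_order(image))   # [0, 1, 1, 1, 0, 0]
```

The flattened sequence is the symbol stream that a coder such as Huffman coding would then see.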
We'll explain the algorithm using a 5x5 image with a bit depth of 3 bits and the following pixel values: