Reed–Solomon (RS) codes are cyclic error-correcting codes invented by Irving S. Reed and Gustave Solomon. RS code provides a systematic way of building codes that can detect multiple symbol errors. It is probably one of the most widely adopted error-correction codes, that has been used in so many areas that affects every person's everyday life. Its application ranging from Compact Disc to TV broadcast. So understanding Reed-Solomon code would be beneficial to any engineering professionals. Here I put Prof. Reed and Dr. Solomon's picture in this page, even though I think the pictures deserve to appear in a full treatise web-site of coding theory history. I found my inspirations from these pictures, so I still include them in here, and salute to the two great mathematicians! Here, in this page, I will just write down my own notes in reading about Reed-Solomon code and hope this can help me and also help others to better understand it. Moreover, I will also put in my notes in my implementation using JavaScript.
Galois field is the basis of error correction coding. In this chapter, the basic of Galois field is introduced in a practical manner, with special interesting to how the multiplication and division are conducted. In this way, for someone with interest of implementing an error correction code from scratch, the readings in this chapter will suffice to get him or her start coding with confidence.
When in grade school, we learn how to do addition, subtraction, multiplication and division on integers. We also learn some laws of these operations, such as the Associative Law, stating that the result of (a+b)+c is equal to the result of a+(b+c). What we are trying to achieve in error correction coding is based on these same arithmetic and laws. However, we face a problem when implementing them with computer programming that is, integers are limitless. Computers can only efficiently represent an integer with an upper boundary. Thus, Galois field is used by computer scientists for rescue. Given the set of all m-bit unsigned integers, 2m elements in total, Galois field defines the addition, subtraction, multiplication and division operations on any two elements such that the results of the operations are all within the set. It can further be mathematically proven that all the arithmetic laws on normal integers are applicable in the same manner with the Galois field arithmetic.
Galois Field Elements
A Galois field is based on a set of 2m elements, and we assume that this set has a primitive element, denoted as α, so that all the elements can be represented as the powers of α:
0, α0, α1 , α2, …, αN-1
where N=2m-1. This field is then denoted as GF(2m).
In addition to this power of α form, elements of a GF(2m) can also be represented as a polynomial. For example, GF(16), where m=4, has 16 elements, which are from 0000 to 1111. Element a3a2a1a0 can be represented as the polynomial:
a3x3+a2 x2+a1 x1+a0 x0
Galois Field Addition and Subtraction
The result of adding two field elements is always equal to conducting bit-wise XOR operation the bits of the elements. Using GF(16) as example, when we add 1010 and 1101, we can conduct the addition as 1010 XOR 1101=0111. In polynomial representation, addition can be conducted as:
+)
a3x3+a2 x2+a1 x1+a0 x0
b3x3+b2 x2+b1 x1+b0 x0
c3x3+c2 x2+c1 x1+c0 x0
where ci = ai+bi . Still using two elements 1010 and 1101 as example, the addition of the two is represented as:
+)
x3+ x1
x3+ x2+ 1
x2+ x1+ 1
Thus, we have 1010 + 1101 = 0111.
If we are familiar with the XOR operation, or if we look more closely at the way addition is conducted, we can notice that the result of adding two Galois filed elements is the same as the subtraction of them. For the coefficients produced by subtracting two polynomials, ci = ai - bi , we can see that ci is always equal to the result of ai XOR bi .
Thus, we also have 1010 – 1101 = 0111.
Galois Field Generator Polynomial
An important part of defining a Galois field is its generator polynomial, normally denoted as p(x). It is also known as primitive polynomial. This is a polynomial of degree m which is irreducible, that is, a polynomial without any factors. Even though we have not defined the Galois field multiplication yet, we can still use the conventional polynomial multiplication and Galois field addition to understand the factors and irreducible polynomials. Let us look at this example:
(x2+x+1)(x+1) = (x3+ x2+x)+ (x2+x+1) = x3+ x2+ x2+x+x+1 = x3+1
Thus, we know polynomial x3+1 is reducible, since it has two factors x2+x+1 and x+1. On the other hand, polynomial p(x)=x4+x+1 does not have any factors, therefore, p(x) can be the generator polynomial of a Galois field.
One fantastic property of generator polynomial is the way it can be used to generate all the non-zero elements of the field, using the power of α form introduced in the beginning of this section. This is based on the fact that the primitive element α is a root of the generator polynomial. That is, we always have:
p(α) = 0.
Using p(x)=x4+x+1 as an example, we have:
α 4+ α + 1 = 0,
or, more importantly, we have:
α 4 = α + 1.
This means we are always able to substitute α 4 by α + 1 whenever possible. Continuing this line of facts, we can derive the equivalent of α 5 as:
α 5= α . α 4= α(α + 1)= α2+ α.
We can continue with all the powers of α and find the equivalent of each one with a polynomial of α of degree m-1. The following table lists all the non-zero elements of GF(16) and their equivalents.
Galois Field Multiplication and Division
Using the conventional polynomial multiplication, multiplying two polynomials of degree m-1 should generally yield a polynomial of degree 2m-2. For Galois field multiplication, the product of two elements should also fall in the field. Thus, we then need to divide the 2m-2 degree product polynomial by the generator polynomial, p(x), and taking the remainder as result of Galois field multiplication.
We can still use the polynomial arithmetic to get the result of Galois field multiplication. In a first look, it seems intimidating for a programmatic implementation. In fact, with careful programming, stunning simple program can be written to implement all the steps described in here. However, this document will not delve into the details of all that, and the program is very difficult to read of any one other than the author himself. With the interest of efficient and straight implementation, we look at the alternative technique for Galois multiplication. This technique is based on logarithms, and one further advantage, the division can be equally convenient to conduct.
If two elements to be multiplied are represented in the power of α form, the product can be obtained by adding the power indices, and if the sum of indices is greater than N=2m-1, we can subtract the sum by N. For example, to get the product of 1010 and 1101, we can first find from the previous table that:
1010 = α 9 and 1101 = α 13.
So,
1010 × 1101 = α 9× α 13 = α (9+13) mod 15= α22 mod 15 = α7= 1011
Similarly, the Galois field division can be obtained using this logarithm technique with equal convenience. For two elements a and b, the result of a divided by b can be obtained by subtracting the power index of b from that of a, and if the subtraction yields negative number, we can add N=2m-1 to it to adjust the result into the field’s power index range. For example, to get the result of 1010 divided by 1101, we can refer to the same power index table shown previously:
1010 / 1101 = α 9 / α 13 = α (9-13) mod 15= α(-4) mod 15 = α11= 1110
Lastly, since zero cannot be represented by any power of α, the implementation based on this logarithm technique needs to detect the presence of zero as an operand and force the result accordingly.
Even though the mathematical basis of Reed-Solomon coding is complicated, it is not out of hand of an inquisitive programmer to have a reasonable understanding of it. In this section, we will focus on the encoding part, and it soon will be shown that the arithmetical steps used in the encoding are straight-forward to carry out.
From the coding theory’s point of view, the original message is taken as a stream of bits. However, a class of coding methods divides the original message into separate blocks of data. They are called the block codes, and Reed-Solomon code is a block code. If each block is 8 bits, we then view the original message as a string of symbols, and each symbol is a computer byte.
A Reed-Solomon code of parameters (n,k) means that there are in total n symbols in a block, among which the first k symbols are the information symbols of original message, and attached by (n – k) parity symbols. Many literatures also call parity symbols as error correction symbols. If each symbol is m bits, we always need to have
n ≤ 2 m - 1
Since there are (n – k) parity symbols, the number of errors that can be identified and corrected, t, will satisfy the following:
t = floor ((n – k) /2).
We definitely can view the values of the message and parity symbols as the elements of a Galois Field GF(2m). An (n,k) Reed-Solomon code is constructed by forming a generator polynomial g(x) consisting of n-k=2t factors. The root of the generator polynomial should be a series of 2t consecutive elements in the field, thus, we have,
g(x) = (x+ α b) (x+ α b+1)….(x+ α b+2t-1)
In the QR-code specification, it is chosen for b=0 for simplicity of implementation, and also GF(256) is the symbol’s field. Next, let us go through (15,11) Reed-Solomon code in detail as a working example.
For (15,11) code, there are 4 parity symbols, and t is 2, which means the code can correct up to 2 errors appearing anywhere in a block of 15 symbols. The generator polynomial should have 4 factors. and it can be calculated as:
Please note that, the coefficients of the polynomial are all elements of GF(16), and the multiplication and addition are defined in the same way as shown in Table 1. Let’s just show how in the last step the coefficient of x2 in the result of (x3+7x2+14x+8)(x+8) is calculated as:
7×8+14×1 = α 10× α 3+ α 11 = α 13+ α 11 = 1101 + 1110 = 0011 = 3
In QRcode, Reed-Solomon code is used to add protecting extra symbols to the original message. GF(256) is the range of all the symbols, as well as the coefficients of the generating polynomials.
In an (n,k) Reed-Solomon code, every block has k information symbols and n-k parity symbols. We can use a polynomial M(x) of degree k-1 to encode the k information symbols, so that:
M(x) = Mk-1xk-1+ … +M1x+M0
Here, Mk-1 is the first symbol of the message. To calculate the parity symbols, the message polynomial is first multiplied by xn-k and the result divided by the generator polynomial, g(x). Division by produces a quotient q(x) and a reminder r(x), where r(x) is of degree n-k-1. Thus:
Then, the coefficients of will be the parity symbols to be appended after the information symbols. Thus, the encoded block of n symbols is encoded as the polynomial T(x) as the following:
T(x)
=
=
M(x)×xn-k+r(x)
Mk-1xn-1+ … +M1xn-k+1+M0xn-k+rn-k-1xn-k-1+…+r0
This completes the encoding process of an block of symbols for (n,k) Reed-Solomon code. Let’s look at why the encoded polynomial T(x) can be used to detect errors. Once detected, T(x) can further be used to locate the positions and correct the errors, thus, the original message is derived against the errors.
From the previous division equation, multiplying g(x) to both side, we will have:
and rearranging:
We can see that the left side is actually the T(x). Given that g(x) is defined to have factors (x+ α i) for all i from 0 to n-k-1, we can have this important property of T(x): T(x) is divisible by all α i for i from 0 to n-k-1. Thus, when receiver is checking T(x) and found this not true, it is clear that one or more errors have occurred.
In order to form the basis of the program implementation of encoding, we walk through a simplified example. Here, the symbols are elements of GF(16), and the original message symbols is {1,2,3,…,11}, and we are encoding for (15,11) code. For message polynomial being:
M(x) = x10+2x9+3x8+4x7+5x6+6x5+7x4+8x3+9x2+10x+11,
we can rearrange M(x) into the following iterative form:
M(x) = ((((((((((1∙x)+2)x+3)x+4)x+5)x+6)x+7)+8)x+9)x+10)x+11.
To calculate the parity symbols, we need to divide M(x)∙x15-11 by the g(x) for (15,11) code. In the previous section, we have chosen the g(x) to be:
g(x) = x4+ 15x3+3x2+x+12.
In order to calculate the r(x), we need to re-arrange M(x)∙x4 into the following form:
M(x)∙x4 = ((((((((((x4∙x)+2x4)x+3x4)x+4x4)x+5x4)x+6x4)x+7x4)+8x4)x+9x4)x+10x4)x+11x4
We can first calculate the remainder of x4 divided by g(x):
x4 = (x4+ 15x3+3x2+x+12) + r10(x)
thus,
r10(x)
=
=
(x4) + (x4+ 15x3+3x2+x+12)
15x3+ 3x2+x1+12
Next, to calculate the remainder of (x4∙x)+2x4 divided by g(x), which is denoted as r9(x), it is equivalent to calculate the remainder of r10(x) divided by g(x). Thus, we have:
r9(x) = (r10(x)x+2x4) mod g(x).
Therefore,
Continuing to the next step, the remainder, r8(x), of ((x5+2x4)x+3x4)x+4x4 divided by g(x) is equal to:
r8(x) = (r9(x)x+4x4) mod g(x).
Therefore, we can continue the next steps to further define the series of remainders, r7(x), …, r1(x) , r0(x), where r0(x) is the remainder we are looking for, which is M(x)∙x4 divided by g(x). This technique of calculating the remainder is iterative and each step only takes the remainder of the previous step, multiplied by x and added in the next information symbol and then calculate the current remainder. This pipelined operation is suitable for a programmatic implementation, with JavaScript code is shown in the next section.