Distributed virtual HDD based on digital fountains (pdf 4.4 Mb)
in-itinere decoding for Raptor codes: matrix evolution (avi 1.9 Mb)
LDPC Digital Fountains
Parity bits are created by xoring randomly selected data bits. These codes are rateless because parity bits can be generated continuously, on a per-need basis.
As an example, a simple error correction algorithm will be illustrated: hard-decision majority voting. Step 1: every parity check is evaluated. Step 2: every data bit connected to a majority of unsatisfied checks is flipped.
The loop is repeated until all checks are satisfied (no flipping in step 2) or until a maximum number of iterations is reached.
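The loop above can be sketched as follows; this is a minimal illustration, assuming the code is given as a list of parity checks, each a set of data-bit indices (all names are illustrative):

```python
def bit_flip_decode(bits, checks, max_iters=50):
    """Hard-decision majority voting: iteratively flip bits that
    violate a majority of their parity checks."""
    bits = list(bits)
    for _ in range(max_iters):
        # Step 1: evaluate every parity check (satisfied iff XOR == 0).
        unsatisfied = [c for c in checks if sum(bits[i] for i in c) % 2 != 0]
        if not unsatisfied:
            return bits  # all checks satisfied: done
        # Step 2: flip every bit with more unsatisfied than satisfied checks.
        flipped = False
        for v in range(len(bits)):
            bad = sum(1 for c in unsatisfied if v in c)
            total = sum(1 for c in checks if v in c)
            if total and bad * 2 > total:
                bits[v] ^= 1
                flipped = True
        if not flipped:
            break  # no flipping in step 2: give up
    return bits
```
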
A more sophisticated algorithm uses soft values, e.g. floating-point values ranging from -1 (certainly zero) through 0 (maximum uncertainty) to +1 (certainly one).
As an example, a simple erasure recovery algorithm will be described: message passing. Each parity bit forms a group with its connected data bits. When exactly one data bit of a group is missing, it can be recovered by xoring the parity bit with the surviving data bits of the group.
Once a data bit is recovered, it can happen that some other parity bit now misses only one (different) data bit. In this case the recovery process continues until all data bits are recovered or until nothing more can be done (the decoder gets stuck).
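The message-passing recovery loop can be sketched as follows; this is a minimal illustration, assuming each parity bit stores the XOR of its group and erased data bits are marked `None` (all names are illustrative):

```python
def peel_decode(data, groups, parities):
    """Message-passing erasure recovery.
    groups[j] is the set of data indices xored into parities[j]."""
    progress = True
    while progress and any(b is None for b in data):
        progress = False
        for group, p in zip(groups, parities):
            missing = [i for i in group if data[i] is None]
            if len(missing) == 1:  # exactly one unknown: recover it
                i = missing[0]
                acc = p
                for k in group:
                    if k != i:
                        acc ^= data[k]
                data[i] = acc
                progress = True  # one recovery may unlock other groups
    return data  # fully recovered, or stuck with Nones left
```
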
A more sophisticated algorithm solves the corresponding system of equations using Gauss elimination. Erased data bits are the unknowns; each parity bit provides a known term. As many (linearly independent) known terms as unknowns are needed to solve the system. This decoder never gets stuck, but the number of computations can be very large (cubic complexity).
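A minimal sketch of this approach: each parity bit yields one linear equation over GF(2) in the erased data bits (known data bits folded into the right-hand side), and the system is solved by Gauss-Jordan elimination (names and the equation setup are illustrative):

```python
def gauss_gf2(rows, rhs):
    """Solve A x = b over GF(2). rows are lists of 0/1 coefficients,
    rhs a list of 0/1 known terms. Returns x, or None when the
    equations are not linearly independent (rank too low)."""
    n = len(rows[0])          # number of unknowns (erased bits)
    rows = [r[:] for r in rows]
    rhs = rhs[:]
    for col in range(n):
        # find an equation with a 1 in this column to use as pivot
        piv = next((i for i in range(col, len(rows)) if rows[i][col]), None)
        if piv is None:
            return None       # not enough independent known terms
        rows[col], rows[piv] = rows[piv], rows[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        # eliminate the column from every other equation (diagonalization)
        for i in range(len(rows)):
            if i != col and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[col])]
                rhs[i] ^= rhs[col]
    return rhs[:n]            # diagonal system: solution is the rhs
```
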
Several variations have been developed:
Random fountains: the data bits to be xored are chosen randomly; every data bit is selected with 50% probability. If there are N data bits, the number of selected bits (the degree) follows a binomial distribution.
LT fountains: the number of selected bits (degree) is random but follows a precise law known as the soliton distribution. Encoding is done in two steps: first the degree is chosen (robust soliton distribution), then the data bits are chosen (uniform distribution). Decoding can be done with Gauss elimination; however, LT codes are designed to be decoded by message passing: at every iteration there is at least one degree-one parity bit, and it is very improbable for the decoder to get stuck.
Raptor fountains: a rateless weakened LT code is concatenated with a fixed-rate LDPC code. In the weakened LT code the number of selected bits (degree) is bounded (e.g. max degree 10); because of this, some data bits (e.g. 5%) may never be selected to generate parity bits. Therefore an LDPC pre-code is used to recover those unselected data bits.
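The variants above differ mainly in how one encoded symbol is built. A minimal sketch of a random-fountain encoder and an LT encoder follows, assuming the standard robust soliton distribution with illustrative parameters `c` and `delta` (all names are illustrative):

```python
import math
import random

def random_fountain_symbol(data):
    """Random fountain: include every data symbol with probability 1/2,
    so the degree is Binomial(N, 0.5)."""
    selected = [i for i in range(len(data)) if random.random() < 0.5]
    value = 0
    for i in selected:
        value ^= data[i]
    return selected, value

def robust_soliton(k, c=0.1, delta=0.5):
    """Robust soliton probabilities mu[0..k] (mu[0] is always 0)."""
    S = c * math.log(k / delta) * math.sqrt(k)
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    tau = [0.0] * (k + 1)
    pivot = int(round(k / S))
    for d in range(1, min(pivot, k + 1)):
        tau[d] = S / (k * d)
    if 1 <= pivot <= k:
        tau[pivot] = S * math.log(S / delta) / k
    Z = sum(rho) + sum(tau)
    return [(r + t) / Z for r, t in zip(rho, tau)]

def lt_encode_symbol(data, mu):
    """LT fountain, two steps: draw the degree from the robust soliton
    distribution, then choose that many data symbols uniformly."""
    d = random.choices(range(len(mu)), weights=mu)[0]
    idx = random.sample(range(len(data)), d)
    value = 0
    for i in idx:
        value ^= data[i]
    return idx, value
```

A weakened LT encoder would simply cap the drawn degree (e.g. at 10) and rely on the LDPC pre-code for the data symbols that are never selected.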
128 data symbols plus 29 LDPC parity constraints are encoded into 157 encoded symbols. The evolution of the decoding matrix can be seen in this video (avi 1.9 Mb). Roughly speaking: every white dot corresponds to one XOR operation to be done to decode. Every video frame corresponds to the processing of one encoded symbol (so there are many XOR operations per frame).
frame 41 out of 157
Top row: LDPC parity constraints are used first, then weakened LT encoded symbols are inserted into the matrix. Bottom row: the use of parity constraints is delayed until the matrix is full enough; this helps keep the matrix sparse. The sparser the better: fewer XOR operations, and more compact representations using bit lists instead of bitmaps.
frame 110 out of 157
Left column: two-step Gauss elimination (triangularization then diagonalization); there is a very large final peak of XOR operations. Center column: one-step Gauss elimination (immediate diagonalization); more XOR operations are done, but they are evenly spread in time. Right column: the one-step method is attempted, but if the matrix turns out to be too dense the operations are undone and the algorithm falls back to the two-step method.
The evolution of the decoding matrix can be seen in these videos. Roughly speaking: every white dot corresponds to one XOR operation to be done to decode. Every video frame corresponds to one row-wise XOR operation (the entire row is xored at once). The matrix is partitioned into four macro-rows which are processed in parallel.
Random fountain. 128 data symbols are randomly encoded into 128 encoded symbols; on average, 50% of the data symbols are used to compute each encoded symbol. Video (avi 8.2 Mb).
frame 779, frame 2097, frame 2198 of 3601 frames
Left: first phase, insertion, pipelined because rows are dense. Center: second phase, local diagonalization of macro-rows, fully parallel. Right: third phase, global diagonalization, partially parallel.
Raptor fountain. 128 data symbols plus 29 LDPC parity constraints are encoded into 157 encoded symbols; the LDPC parity constraints (high density of ones) are fed in after the LT constraints (low density of ones) to maintain a sparse matrix and to reduce the total number of XOR operations to be done. Video (avi 4.3 Mb).
frame 779, frame 1098, frame 1190 of 1745 frames
Left: first phase, insertion, almost parallel because rows are very sparse. Center: second phase, local diagonalization of macro-rows, fully parallel. Right: third phase, global diagonalization, partially parallel.
Created: 29th April 2008. Updated: 20th October 2008.