Download

Distributed virtual HDD based on digital fountains (pdf 4.4 Mb)

in-itinere decoding for Raptor codes: matrix evolution (avi 1.9 Mb)

parallel decoding for random (avi 8.2 Mb) and Raptor (avi 4.4 Mb)

Partners

 

Polytechnic of Milan (PoliMI)

 

 

Polytechnic of Turin (PoliTO)

 

University of Bergamo (UniBG) 


LDPC Digital Fountains

Random parity codes

Parity bits are created by xoring randomly selected data bits. This codes are rateless because parity bits can be continuously generated on a per-need basis.

Error correction

As an example a simple error correction algorithm will be illustrated: hard-decision majority voting.

  1. Every parity bit send an estimate of the correct value to each connected data bit; for a given bit the correct value is the xor of the parity and all others connected bits.
  2. Every data bit receive the estimate from each connected parity bit (and from the channel, the previous value) and takes an hard-decision using majority voting (the next value).

The loop is repeat until all checks are satisfied (no flipping in step 2) or until a maximum number of iterations is reached.

A more sophisticated algorithm uses soft values. E.g. floating point values ranging from -1 (certainly zero), 0 (maximum uncertainty), to +1 (certainly one).

Erasure recovery

As an example a simple erasure recovery algorithm will be described: message passing. Each parity bit form a group with its connected data bits.

  • if the parity bit is missing in the group, it can be recovered by xoring connected data bits (this recovery may not be needed).
  • if one data bit is missing in the group, it can be recovered by xoring the parity and remaining connected data bits (if more than one data bit is missing, no recovery can be done).

When one data bit is recovered, it can happen that for some other parity bit only one (different) data bit is missing. In this case the recovery process can continue until all data bits are recovered or until nothing more can be done (the decoder gets stuck).

A more sophisticated algorithm solve the corresponding equation system using Gauss elimination. Erased data bits are the unknowns. Each parity represents a known term. As many (linearly independent) known terms as unknowns are needed to solve the system. The decoder never gets stuck but the number of computations can be very large (cubic complexity).

Several variations have been developed:

  • message passing with fall-back to gauss elimination (LT/Raptor overhead reduced from 30% downto 1-2%, random fountains decoded with minimal overhead)
  • targeted recovery of a specific data symbol (encoded symbols are taken in a specific optimal order, complete decoding not needed)
  • in-itinere decoding (encoded symbols are processed one at a time and computations are evenly spread in time - no final peak)
  • robust decoding (fake encoded symbols are identified by using CRC and/or cryptographic hash-tree of data symbols)
  • parallel decoding (computations are pipelinized and parallelized, decoding tasks are optimally scheduled and executed on several processing units)

Random fountains, LT/Raptor codes

Random fountains: data bits to be xored are chosen randomly. Every data bits is selected with 50% probability. If there are N data bits, the number of selected bits (degree) has a binomial distribution.

LT fountains: the number of selected bits (degree) is random but follows a precise law known as soliton distribution. Encoding is done in two steps: first the degree is chosen (robust soliton distribution), second data bits are chosen (uniform distribution). Decoding can be done with Gauss elimination, however LT codes are designed to be decoded by message passing: at every iteration there is at least one degree one parity bit, and it is very improbable for the decoder to get stuck.

Raptor fountains: a rateless weakened LT code is concatenated with a fixed rate LDPC code. Weakened LT code: the number of selected bits (degree) is bounded (e.g. max degree is 10); because of this it may happen that some data bit (e.g. 5%) is never selected to generate parity bits. Therefore an LDPC pre-code is used to recover those unselected data bits.

In-itinere decoding for Raptor codes

128 data symbols plus 29 LDPC parity constraints are encoded into 157 encoded symbols. The evolution of the decoding matrix can be seen in this video avi 1.9 Mb. Roughly speaking: every white dot correspond to one XOR operations to be done to decode. Every video frame corresponds to the processing of one encoded symbol (then there are many xor ops per frame).

frame 41 out of 157

Top row: LDPC parity constraints are used first, then weakened LT encoded symbols are inserted into the matrix. Bottom row: the use of parity constraints is delayed until the matrix is full enough, this helps keeping the matrix sparse. The sparser the better: less XOR operations and more compact representations using bit lists instead of bitmaps.

 frame 110 out of 157

Left column: two steps Gauss elimination (triangularization then diagonalization), there is a very large final peak of XOR operations. Center column: one step Gauss elimination (immediate diagonalization), more XOR operations are done but they are evenly spread in time. Right column: one step is attempted however operations are undone and the algorithm fall back to two steps if the matrix is denser.

Parallel decoding for Random/Raptor

The evolution of the decoding matrix can be seen in these videos. Roughly speaking: every white dot correspond to one XOR operations to be done to decode. Every video frame corresponds to one row-wise xor operation (xor of the entire row at the same time). The matrix is partitioned in four macro-rows which are processed in parallel.

  • First phase: insertion of encoded symbols and their corresponding row in the decoding matrix. Incoming rows are dispatched to their macro-row based on the position of the first raised bit. When the row is already occupied the incoming row is xored with existing row, the result is re-dispatched again. This phase is parallel when incoming rows are sparse, otherwise this phase is like a pipeline with incoming rows flowing from one macro-row to the next. 
  • Second phase, local diagonalization of macro-rows, fully parallel. Note that the last macro-row (4th) is diagonal and can be used for message passing (see third phase).
  • Third phase, global diagonalization, partially parallel. The last macro-row is used to clear raised bits in other macro-rows in parallel. The next to last row is now diagonal and can be used for message passing. The process is repeated until the matrix is diagonal (4th macro-row for 3, 2 and 1; then 3rd for 2 and 1, then 2nd for 1).

Random fountain. 128 data symbols are randomly encoded into 128 encoded symbols; 50% of data symbols are used on average to compute each encoded symbol. Video avi 8.2 Mb.

frame 779, frame 2097, frame 2198 of 3601 frames


Left: first phase, insertion, pipeline because rows are dense. Center: second phase, local diagonalization of macro-rows, fully parallel. Right: third phase, global diagonalization, partially parallel.

Raptor fountain. 128 data symbols plus 29 LDPC parity constraints are encoded into 157 encoded symbols; LDPC parity constraints (high density of ones) are fed after LT constraints (low density of ones) to mantain the matrix sparse and to reduce the total number of xor operations to be done. Video avi 4.3 Mb.

frame 779, frame  1098, frame 1190 of 1745 frames

Left: first phase, insertion, almost parallel because rows are very sparse. Center: second phase, local diagonalization of macro-rows, fully parallel. Right: third phase, global diagonalization, partially parallel.

Foreseen applications

  • Memory block devices such as Flash RAMs: parity blocks are used to protect data blocks. When a given data block fails recovery takes place.
  • Storage block devices such as hard-disk drives: blocks of a virtual reliable device are distributed over several unreliable devices; this is much more efficient than data replication and much mure flexible than parity schemes (such as RAID which can recover from only one loss and requires devices of the same size). (pdf 4.4 Mb)
  • File block transfer protocols such as in next-generation mobile networks (3GPP).
  • File sharing protocols such as BitTorrent in peer2peer networks; asking a specific data block to several peers is not efficient as several useless copies are received if more than one peer respond, and if a given data block is missing it must be requested again (delay, end-game problem); asking a random parity block to several peers is much more efficient, no useless copies are received, there are no specific missing blocks to be requested (no delay, no end-game problem).

Other Resources

David MacKay review of digital fountain codes papers, presentation

Digital Fountain  (DF), LT/Raptor codes


Developers

 

 Antonella Citrano (PoliTO)


 

Michele Bologna (UniBG) 


Andrea Rota (UniBG)


Ruben Lino VIlla (PoliMI)


Raffaele Riva (PoliMI)