The goal of this project was to transmit and receive a GIF video across a classroom in fewer than 75 seconds using handheld radios and Raspberry Pis. Existing implementations of compression and communication were prohibited. The video was provided on the day of the presentation, and the received video was graded on its Peak Signal-to-Noise Ratio (PSNR), with higher scores being better.
We implemented multiple stages of compression to maximize the data we were able to packetize and transmit. The first compression stage was our own version of JPEG compression. We first downsampled each frame of the GIF spatially by a predetermined factor. Each RGB pixel was then converted to a luminance-chrominance color space (Y, Cr, Cb). After this step, we zero-padded each color component of each frame so that its dimensions were multiples of 8, then computed the two-dimensional Discrete Cosine Transform (DCT2). These DCT coefficients were divided by a JPEG quantization table, a lossy compression step that greatly increased the number of zeros.
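The stages above can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than our actual code: the function names are hypothetical, the downsampling is assumed to be plain decimation, the color conversion uses the JFIF coefficients, the DCT is applied per 8x8 block, quantized values are assumed to be rounded to the nearest integer, and the table is the standard JPEG luminance table.

```python
import math

N = 8  # block size used by JPEG

# Standard JPEG luminance quantization table (assumed here for illustration).
Q_LUMA = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb) with JFIF coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    return y, cr, cb

def downsample(channel, factor):
    """Keep every `factor`-th sample in both directions (plain decimation)."""
    return [row[::factor] for row in channel[::factor]]

def zero_pad(channel, n=N):
    """Zero-pad a 2-D list so that both dimensions are multiples of n."""
    h, w = len(channel), len(channel[0])
    ph, pw = -h % n, -w % n
    padded = [list(row) + [0] * pw for row in channel]
    padded += [[0] * (w + pw) for _ in range(ph)]
    return padded

# Orthonormal DCT-II basis: C[k][i] = a(k) * cos(pi * (2i + 1) * k / (2N)).
C = [[math.sqrt((1 if k == 0 else 2) / N)
      * math.cos(math.pi * (2 * i + 1) * k / (2 * N))
      for i in range(N)] for k in range(N)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dct2(block):
    """2-D DCT of one 8x8 block, computed as C * block * C^T."""
    c_transposed = [list(col) for col in zip(*C)]
    return matmul(matmul(C, block), c_transposed)

def quantize(coeffs, table=Q_LUMA):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(coeffs[i][j] / table[i][j]) for j in range(N)]
            for i in range(N)]
```

For a flat block, only the DC coefficient survives quantization, which is the effect the zero count relies on: smooth image regions produce blocks that are almost entirely zeros.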
The increased zero count enabled further compression via run-length encoding. After converting the quantized DCT coefficients to 8-bit binary numbers, we encoded each run of consecutive zeros as a "zero sandwich": a zero byte, the length of the run, and another zero byte. For example, the sequence 1, 0, 0, 0, 0, 0 was encoded as 00000001 00000000 00000101 00000000. This losslessly compressed the image data significantly.
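A minimal sketch of this run-length scheme, written to reproduce the example above. The function names are hypothetical, and two details are assumptions not stated in the text: every run of zeros (even a single zero) is wrapped in a sandwich, and runs longer than 255 are split into several sandwiches so that the count fits in one byte.

```python
def rle_encode(values):
    """Encode byte values, replacing each run of zeros with 0, run_length, 0."""
    out = bytearray()
    i = 0
    while i < len(values):
        if values[i] == 0:
            run = 0
            # Count the zeros in this run, capped at 255 so it fits in one byte.
            while i < len(values) and values[i] == 0 and run < 255:
                run += 1
                i += 1
            out += bytes([0, run, 0])
        else:
            out.append(values[i])
            i += 1
    return bytes(out)

def rle_decode(data):
    """Invert rle_encode: expand each 0, run_length, 0 triple into zeros."""
    out = []
    i = 0
    while i < len(data):
        if data[i] == 0:
            out.extend([0] * data[i + 1])
            i += 3  # skip the opening zero, the count, and the closing zero
        else:
            out.append(data[i])
            i += 1
    return out

encoded = rle_encode([1, 0, 0, 0, 0, 0])
print(" ".join(f"{b:08b}" for b in encoded))
# prints: 00000001 00000000 00000101 00000000
```

Under these assumptions an isolated zero grows from one byte to three, so the scheme only pays off when zeros arrive in long runs, as they do after quantization.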
Prior to transmission, we inserted flags into our binary data to mark where the data for each color component of each frame began and ended. We selected a flag that we knew would never arise in our encoding scheme: 11111111 00000000 00000000 00000000 11111111 (0xFF000000FF). Immediately following the first flag was a packet that stored the GIF's dimensions and frame rate, which were critical for reconstruction. Each subsequent flag was followed by a packet that indicated the color component (Y, Cr, or Cb) as well as the frame index. In this way, partial reconstruction remained possible after the loss of any flag except the first.
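The flag-delimited layout can be illustrated as follows. Only the flag value comes from the description above; the byte layout of the header (two 16-bit dimensions and a one-byte frame rate), the one-byte component codes, the one-byte frame index, and the function names are all hypothetical choices made for this sketch.

```python
import struct

FLAG = bytes.fromhex("FF000000FF")

# Hypothetical one-byte codes for the three color components.
COMPONENTS = {1: "Y", 2: "Cr", 3: "Cb"}

def build_stream(width, height, fps, sections):
    """Concatenate a header and (component, frame_index, payload) sections."""
    # Assumed header layout: width and height as big-endian uint16, fps as uint8.
    stream = FLAG + struct.pack(">HHB", width, height, fps)
    for component, frame_index, payload in sections:
        stream += FLAG + bytes([component, frame_index]) + payload
    return stream

def parse_stream(stream):
    """Split on the flag and recover the header plus every tagged section."""
    parts = stream.split(FLAG)
    # parts[0] is whatever precedes the first flag; parts[1] is the header.
    width, height, fps = struct.unpack(">HHB", parts[1][:5])
    sections = {}
    for part in parts[2:]:
        if len(part) < 2:
            continue  # too short to hold a component code and frame index
        key = (COMPONENTS.get(part[0], "?"), part[1])
        sections[key] = part[2:]
    return (width, height, fps), sections
```

Because every section carries its own component and frame tag, a receiver that misses one flag loses only the neighboring section rather than its place in the whole stream. The header is the exception, since nothing else records the dimensions.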
The modulation scheme that we implemented was Audio Frequency Shift Keying 1200 (AFSK1200), which encodes data at a rate of 1200 b/s using two tones: 1200 Hz for '1's (mark) and 2200 Hz for '0's (space). AFSK is generated in the following way, which keeps the phase continuous so that the high-frequency components caused by phase discontinuities are mitigated (see right).
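One common way to obtain a continuous phase is to integrate the instantaneous frequency, that is, to generate cos(φ[n]) where φ advances by 2πf/fs per sample and is never reset at a bit boundary. The sketch below does this with a running phase accumulator. The function name, the 48 kHz sample rate, and the tone assignment (the Bell 202 convention of 1200 Hz for '1' and 2200 Hz for '0') are assumptions and are exposed as parameters.

```python
import math

def afsk1200(bits, fs=48000, baud=1200, f_mark=1200.0, f_space=2200.0):
    """Return continuous-phase AFSK samples for a sequence of 0/1 bits."""
    samples_per_bit = fs // baud  # 40 samples per bit at 48 kHz
    phase = 0.0
    samples = []
    for bit in bits:
        freq = f_mark if bit else f_space
        step = 2 * math.pi * freq / fs  # phase advance per sample
        for _ in range(samples_per_bit):
            samples.append(math.cos(phase))
            # The phase carries over from bit to bit, so the waveform has
            # no jump when the frequency switches.
            phase += step
    return samples

signal = afsk1200([1, 0, 1, 1, 0])
```

Restarting each tone at zero phase for every bit would instead produce jumps in the waveform at bit boundaries, which spread energy outside the audio passband of the radio.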
AX.25 is an amateur radio data link layer protocol used most commonly for APRS (Automatic Packet Reporting System). AX.25 packets take the following form.
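As a rough illustration, the layout of an unnumbered information (UI) frame, the frame type APRS uses, is: flag (0x7E), destination and source addresses, control (0x03), protocol ID (0xF0), information field, a 16-bit frame check sequence, and a closing flag. The sketch below assembles such a frame at the byte level. It is not our packetizer: the function names and callsigns are hypothetical, digipeater addresses and command/response bits are left out, and the bit stuffing and NRZI encoding that AX.25 applies at the bit level are omitted.

```python
FLAG = 0x7E

def crc16_x25(data):
    """CRC-16/X-25, the frame check sequence used by AX.25."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Reflected form of the polynomial 0x1021.
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF

def encode_address(callsign, ssid=0, last=False):
    """Encode a callsign as 6 left-shifted ASCII characters plus an SSID byte."""
    padded = callsign.upper().ljust(6)[:6]
    field = bytes(ord(c) << 1 for c in padded)
    # The lowest bit of the SSID byte marks the final address in the header.
    ssid_byte = 0x60 | ((ssid & 0x0F) << 1) | (1 if last else 0)
    return field + bytes([ssid_byte])

def build_ui_frame(dest, src, info):
    """Assemble flag, addresses, control, PID, info, FCS, flag."""
    body = (encode_address(dest)
            + encode_address(src, last=True)
            + bytes([0x03, 0xF0])  # UI control field, no layer 3 protocol
            + info)
    fcs = crc16_x25(body).to_bytes(2, "little")  # low byte is sent first
    return bytes([FLAG]) + body + fcs + bytes([FLAG])

frame = build_ui_frame("APRS", "N0CALL", b"hello")
```

In a real transmitter the body and FCS would be bit-stuffed (a 0 inserted after five consecutive 1s) so that the 0x7E flag pattern cannot appear inside the frame.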
Here is the final presentation. Slides 11-13 show some of the GIFs that we were able to transmit and their respective PSNR values.