Project

Competition Results

Original image

256 x 144 pixels, 24 bits/pixel, 13 frames

1st place: Leyla Kabuli & Anmol Parande

Whole-frame prediction based on H.264

PSNR: 22.24 dB

2nd place (tie): Luke Dai & George Hutchinson

Time decimation

PSNR: 23.4 dB

2nd place (tie): Kaitlyn Chan & Jenny Wang

Additional quantization and DCT-domain time differencing

PSNR: 21.31 dB

Video Communication

The project is an open competition in which the winning teams will get to keep their radios, interfaces, and the Pi.

Task:

Your task is to combine Part I (image compression) with Part II (data communication) so that you can send the best-quality video possible in no more than 60 APRS packets (~10 KB).

Specs

  • You are allowed 60 APRS packets, which include the starting and ending packets. This leaves 58 packets of pure data, which is about 10 KB (a rough packetizing sketch follows this list).

  • On the day of submission, we will post a video (a stack of TIFF images). The video could be of arbitrary size. You will need to compress it and transmit it to our server. One of the header packets you send will contain your git repository name. We will use the code from that repository to decode the video and provide you with a quality score.

  • The receiver should produce a video of the exact same dimensions as the one transmitted (not necessarily the same quality, though).

  • We will provide example videos on this page for you to practice on.
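
As a rough illustration of that budget (not the required Part II/III framing), here is one way to split a compressed bitstream into at most 58 data chunks. The function name and interface are placeholders; the real per-packet payload limit is whatever your own packetizer imposes:

```python
import math

def split_into_packets(payload: bytes, max_packets: int = 58) -> list:
    """Split a compressed bitstream into at most `max_packets` equal-size chunks.

    Placeholder sketch: the actual packet payload limit and framing come from
    your Part II/III code, so verify each chunk fits in one APRS packet.
    """
    chunk_size = max(1, math.ceil(len(payload) / max_packets))
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
```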


To get an 80% grade:

  • Use downsampling followed by JPEG123 from Part I to compress each frame to the right size. (You will need both!)

  • Use Parts II+III to packetize the data (including the information needed to reconstruct the original video dimensions) and send APRS packets to DireWolf.

  • Receive the packets, extract and decompress the data, and interpolate back to video frames with the same dimensions as the original (a rough end-to-end sketch follows this list).
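
A minimal sketch of this 80%-grade pipeline, assuming hypothetical helpers `jpeg123_encode`/`jpeg123_decode` from your Part I code; the header format and interfaces here are illustrative only:

```python
import numpy as np
from scipy import ndimage

# from my_jpeg123 import jpeg123_encode, jpeg123_decode   # hypothetical: your Part I codec

def compress(frames, ds=2, quality=20):
    """frames: (frame#, Y, X, RGB) uint8 video. Returns a single bitstream."""
    small = frames[:, ::ds, ::ds, :]                              # spatial decimation
    header = np.array(frames.shape, dtype=np.uint16).tobytes()    # original dimensions (8 bytes)
    chunks = [jpeg123_encode(f, quality) for f in small]          # per-frame JPEG123 (hypothetical)
    sizes = np.array([len(c) for c in chunks], dtype=np.uint32).tobytes()
    return header + sizes + b"".join(chunks)

def decompress(bitstream):
    T, Y, X, C = np.frombuffer(bitstream[:8], dtype=np.uint16)
    sizes = np.frombuffer(bitstream[8:8 + 4 * T], dtype=np.uint32)
    data, frames, pos = bitstream[8 + 4 * T:], [], 0
    for n in sizes:
        frames.append(jpeg123_decode(data[pos:pos + n]))          # per-frame decode (hypothetical)
        pos += n
    small = np.stack(frames)
    zoom = (1, Y / small.shape[1], X / small.shape[2], 1)
    return ndimage.zoom(small, zoom, order=1).astype(np.uint8)    # interpolate back to original size
```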

Final submission

  • Final submission is due 05/12/19 at 11:55 pm.

    • You will submit your code and the compression results for the test animated images.

    • Project checkoffs will be performed May 13 and 14.

Final checkoff:

  • Prepare 3 slides:

    • Slide 1: Description of your method of compression.

    • Slide 2: Result of PSNR of test animated images, and total number of packets.

    • Slide 3: Result of PSNR and number of packets for competition animated image.


How to reconstruct your file on the server:

We have modified the server to accept a GitHub repository name within the EE123-students organization in the comment field of your START packet. It will then clone the repository as https://github.com/EE123-students/your-repository-name and use the contents of the repo to reconstruct your animated images. Then it will copy the received binary data, reconstructed file, and a log of stdout and stderr during the reconstruction to the public directory so that you can debug any problems.

You may use one of your existing final project repos from Parts I-III, or create a new one for this purpose here: https://classroom.github.com/g/Lx-aYNXZ

Your reconstruction code must be in a file called reconstruction.py in the top level of your repo. You can see an example reconstruction implementation in the template repository or by accepting the GitHub assignment above.
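
Purely as an illustration (the authoritative interface is whatever the template repository defines, so copy that rather than this), a reconstruction.py might be shaped roughly like the sketch below, assuming the server hands it the received binary file and an output path; the imported modules are hypothetical stand-ins for your own code and the course helpers:

```python
# reconstruction.py -- illustrative sketch only; the real entry point, argument
# order, and output format are defined by the course template repository.
import sys

from my_codec import decompress              # hypothetical: your decoder (Part I + header parsing)
from project_helpers import save_png_stack   # hypothetical stand-in for the course helper

def reconstruct(received_bin_path, output_path):
    with open(received_bin_path, "rb") as f:
        bitstream = f.read()
    frames = decompress(bitstream)            # (frame#, Y, X, RGB) array at the original dimensions
    save_png_stack(frames, output_path, fps=10)

if __name__ == "__main__":
    reconstruct(sys.argv[1], sys.argv[2])
```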

The comment field of your start packet should be "github:your-repo-name" to inform the server that it should be using your custom reconstruction function.

TL;DR

  • Use the github assignment or copy the template repository for your reconstruction code

  • Change the comment field of your START packet to "github:your-repo-name"

  • Your reconstructed image and a log of anything you printed will show up on the file server

How can I get more than 80%?


What you could work on (partial list…)


  • Improve Compression:

    • Explore the tradeoffs between spatial downsampling, temporal downsampling, and the compression quality factor to improve visual quality and/or PSNR. You will need to interpolate back after decompression.

    • Replace DCT compression with Wavelets (JPEG 2000 like)🌶

    • Explore 3D wavelet (space and time).🌶🌶

    • Implement video-based compression (MPEG-like), which leverages differences between frames. This can gain 0.5-4 dB depending on the video. Do the differencing in YCbCr, not RGB, and explore uniform quantization tables, since difference images contain high-frequency information (a minimal differencing sketch follows this list). 🌶

    • Implement video compression that uses motion vectors before differencing. (You will have to store the motion vectors for the macroblocks in your file header!) 🌶🌶

    • Non-conventional compression, e.g., PCA or dictionary-based methods. 🌶🌶


  • Post processing

    • Deblocking post processing filter to reduce artifacts.
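
As a minimal sketch of the frame-differencing idea mentioned above (not a full MPEG-style codec), assuming a (frame#, Y, X, RGB) uint8 video and the standard BT.601 RGB-to-YCbCr conversion:

```python
import numpy as np

# RGB -> YCbCr (ITU-R BT.601, full range); rows give the Y, Cb, Cr coefficients.
RGB2YCBCR = np.array([[ 0.299,   0.587,   0.114 ],
                      [-0.1687, -0.3313,  0.5   ],
                      [ 0.5,    -0.4187, -0.0813]])

def rgb_to_ycbcr(frames):
    """frames: (frame#, Y, X, RGB) uint8 -> float YCbCr with chroma offset +128."""
    ycc = frames.astype(np.float64) @ RGB2YCBCR.T
    ycc[..., 1:] += 128.0
    return ycc

def temporal_difference(frames):
    """Keep the first frame, then store frame-to-frame differences.

    For slowly changing video the differences are mostly near zero, so they
    compress well; quantize them with a flat (uniform) table since what
    remains is largely high-frequency detail.
    """
    ycc = rgb_to_ycbcr(frames)
    return np.concatenate([ycc[:1], np.diff(ycc, axis=0)], axis=0)

def undo_temporal_difference(diffs):
    return np.cumsum(diffs, axis=0)   # cumulative sum recovers the YCbCr frames
```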



Evaluation:

  • For each of the videos below, you should have the result of a transmission and the PSNR that you obtained. This should be ready at the time of the checkoff. In addition, you will submit the resulting videos on Gradescope.

The video quality evaluation will be done by:

  • Qualitative assessment 1-10 by the staff of the class.

  • Peak Signal-to-Noise Ratio (PSNR), which is defined as:

$$ \mathrm{PSNR} = 10\log_{10}\!\left(\frac{\mathrm{MAX}^2}{\mathrm{MSE}}\right) = 20\log_{10}\!\left(\frac{\mathrm{MAX}}{\sqrt{\mathrm{MSE}}}\right) $$

where MAX is the maximum possible image pixel value (255 for 8 bits) and MSE is the mean-square error, which can be expressed as:

$$ \mathrm{MSE} = \frac{1}{3MNT}\sum_{t=1}^{T}\sum_{c=1}^{3}\sum_{m=1}^{M}\sum_{n=1}^{N}\left[I(m,n,c,t) - K(m,n,c,t)\right]^2 $$

This assumes RGB videos are compared, where each color channel has M-by-N pixels, the total number of frames is T, and I and K denote the original and reconstructed videos.
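
For concreteness, a minimal numpy sketch of this PSNR computation for two equal-shape uint8 RGB videos (the helper file linked under Resources also provides one; this version is only illustrative):

```python
import numpy as np

def psnr(original: np.ndarray, reconstructed: np.ndarray, max_val: float = 255.0) -> float:
    """PSNR in dB between two (T, Y, X, 3) uint8 videos of identical shape."""
    err = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(err ** 2)                     # averages over frames, rows, columns, channels
    return 10.0 * np.log10(max_val ** 2 / mse)
```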

Resources:

  • Here's a link to an explanation of how JPEG works: Link

  • ICIP tutorial on JPEG2000 Link

  • EE225B lecture notes on JPEG2000 Link

  • Slides on video compression Link, and Link

  • Slides on video standards Link

  • Here's how to read an image in python:
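
The original snippet is not reproduced here, but one common approach (any equivalent library such as PIL or matplotlib works) uses imageio with a hypothetical filename:

```python
import imageio

# Hypothetical filename; for an RGB PNG this returns a (Y, X, 3) uint8 numpy array.
im = imageio.imread("frame0.png")
print(im.shape, im.dtype)
```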

  • Here are some helper functions that we wrote to help you do the project (Link); their usage is summarized below:

- Import the helper functions:

- Load PNG stack as 4D numpy array (frame#, Y, X, RGB)

- Save Tiff images into a PNG stack with 10 frame/sec frame rate

- Display Tiff stack as video with display window size of 400 and frame rate of 10 frame/sec

- Compute peak signal-to-noise ratio (PSNR):
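
The exact names and signatures are in the linked helper file; purely as an illustrative sketch mirroring the list above, with hypothetical names:

```python
# All names below are hypothetical stand-ins; check the linked helper file for
# the real function names and signatures before using them.
from project_helpers import load_png_stack, save_png_stack, play_video, compute_psnr

frames = load_png_stack("rolling_cal.png")       # (frame#, Y, X, RGB) numpy array
save_png_stack(frames, "copy.png", fps=10)       # write the stack back out at 10 frames/sec
play_video(frames, width=400, fps=10)            # display as video in the notebook
# score = compute_psnr(original_frames, reconstructed_frames)   # PSNR of your reconstruction
```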

Animated PNG Example Videos: Right click and save to download

We use animated PNGs as opposed to animated GIF, since GIF is limited to a palette of only 256 colors.

Rolling Cal video

(framerate=12)

Andy Samberg thumbs up

(framerate=12)

Dog shaking head

(framerate=12)

Simpson breaking down

(framerate=20)

Simple shape

(framerate=10)

River flowing

(framerate=10)

Ice growing

(framerate=20)

Milky way

(framerate=10)

Example of Motion JPEG123 compression

For your reference: these are the same videos, but each frame was downsampled and then JPEG123-compressed at a rate such that the total size after compression is about 10 KB. In many cases, JPEG123 alone is either still too large or of extremely poor quality. These examples did not exploit any temporal redundancy!

Compressed Rolling Cal video

(PSNR=17.93 dB)

Compressed Andy Samberg thumbs up

(PSNR=28.6 dB)

Compressed Dog shaking head

(PSNR=31.63 dB)

Compressed Simpson breaking down

(PSNR=24.24 dB)

Compressed Simple shape

(PSNR=21.82 dB)

Compressed River flowing

(PSNR=20.24 dB)

Compressed Ice growing

(PSNR=20.58 dB)

Compressed Milky way

(PSNR=18.6 dB)