Quadtree image compression.
MultiMedia Coding (UniMI)
The course started in 2007. Updates are very frequent while teaching (March-June).
The course aims to give basic notions of digital processing of multimedia signals (pictures, video and audio).
1 Introduction (pdf 1.6Mb)
Course outline. From analog to digital signal representation, compression issues, transmission issues.
Examples for transmission: Multiple Description, Adaptive Playout.
Examples for compression of video: JPEG, MPEG-2, H.264, SVC; and for audio: MP3, vocoder.
2 Sampling and Quantization (pdf 6.3Mb)
Sampling theorem, filters to reject interference and avoid aliasing. Reconstruction by cardinal sine (sinc), zero-order hold, filters to reject harmonics.
Quantizers with uniform step with/without dead-band, dithering, delta-sigma and noise-shaping. Vector quantization. Non-uniform / non-linear quantization, gamma correction.
Spatial sampling of Red/Green/Blue, color spaces (YCbCr), sub-sampling of chroma relative to luma (4:2:2 and 4:2:0 formats). Aspect ratio: 4:3, 16:9 (wide-screen). Temporal sampling to get smooth motion (24 Hz), to display (50-60 Hz), to avoid large-area flicker (100-120 Hz). Spatial/temporal sampling, interlaced video. Conversions: progressive-interlaced, 50-100 Hz, 4:3-16:9.
Analog TV. Quadrature amplitude modulation of color, luma/chroma separation. Frequency modulation of audio; stereo - dual channel audio. Spectrogram.
Quality, objective assessment (PSNR, peak signal-to-noise ratio). Examples of noisy, blurred and blocky images. Enhancement filters: lowpass filters, median filters.
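As an illustration of the objective quality measure listed above, here is a minimal sketch of PSNR for grayscale images. It assumes images are stored as lists of rows of integer samples and that the peak value is 255 (8-bit samples); the function name `psnr` is chosen only for this example.

```python
import math

def psnr(reference, degraded, peak=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    if len(reference) != len(degraded) or any(
        len(r) != len(d) for r, d in zip(reference, degraded)
    ):
        raise ValueError("images must have the same size")
    n = sum(len(r) for r in reference)
    # Mean squared error over all samples.
    sq_err = sum(
        (a - b) ** 2
        for r, d in zip(reference, degraded)
        for a, b in zip(r, d)
    )
    mse = sq_err / n
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

An error of one grey level on every sample gives an MSE of 1, hence a PSNR of 20*log10(255), about 48.13 dB.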
3 Filters, Up/Down/Re-sampling (pdf 4.4Mb)
Frequency domain processing. Examples of 2D spectrum of images.
Filter. Frequency response, phase and group delay. Symmetric coefficients (linear phase) and symmetric frequency response (half-band). Quadrature mirror filters. Examples: moving averages and comb filters with zero / unit coefficients. Concatenated filters. Sharpening.
Filter implementation. FIR/IIR parts. Direct form I/II, transposed form. Second-order sections. Filter design: sampled cardinal sine (sinc) as lowpass prototype. Modulation of coefficients.
Interpolation by zero insertion and filtering. Filtering and decimation. Examples for 1:2, 1:3, 2:1 and 3:1 up/downsamplers. Sample rate converters (SRC): synchronous; asynchronous with stored coefficients (polyphase, two branch polyphase with linear refinement) and computed coefficients (Farrow, modified Farrow).
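The 1:2 and 2:1 cases above can be sketched as follows. This is a simplified illustration, not the course's own material: the short kernels `[0.5, 1.0, 0.5]` (linear interpolation) and `[0.25, 0.5, 0.25]` are assumptions chosen for brevity, and samples outside the signal are treated as zero.

```python
def upsample(x, factor):
    """Insert factor-1 zeros after every sample."""
    out = []
    for v in x:
        out.append(v)
        out.extend([0] * (factor - 1))
    return out

def fir(x, h):
    """Same-length FIR filtering, centred on the middle tap (odd-length h)."""
    half = len(h) // 2
    y = []
    for n in range(len(x)):
        acc = 0
        for k, hk in enumerate(h):
            i = n + half - k
            if 0 <= i < len(x):  # samples outside the signal count as zero
                acc += hk * x[i]
        y.append(acc)
    return y

def interpolate_by_2(x):
    """1:2 upsampling: zero insertion followed by a lowpass (linear) filter."""
    return fir(upsample(x, 2), [0.5, 1.0, 0.5])

def decimate_by_2(x):
    """2:1 downsampling: lowpass filtering followed by dropping every other sample."""
    return fir(x, [0.25, 0.5, 0.25])[::2]
```

For example, `interpolate_by_2([0, 2, 4])` fills in the midpoints 1 and 3; the last output sample is affected by the zero padding at the border.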
Examples for interpolation/decimation applied to images and video. Conversion among QCIF, CIF and 4CIF formats. Conversion among interlaced 4:2:0 and 4:2:2 (field dependent processing).
Filterbanks: 2D Discrete Cosine Transform. Application to compression of images, DPCM compression (applied to DC coefficient of 2D DCT), quadtree compression. MP3 filter-bank. Application to music compression (masking effects).
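A minimal sketch of quadtree compression follows. It assumes a square grayscale image whose side is a power of two, and a simple split rule (split a block when the range of its samples exceeds a threshold); the rule, the child order (NW, NE, SW, SE) and all function names are assumptions made for this example.

```python
def quadtree(img, x, y, size, threshold):
    """Encode a square block: a leaf is its mean value, a node is a list of 4 children."""
    block = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size == 1 or max(block) - min(block) <= threshold:
        return round(sum(block) / len(block))  # homogeneous block: one value
    half = size // 2
    return [
        quadtree(img, x, y, half, threshold),                # north-west
        quadtree(img, x + half, y, half, threshold),         # north-east
        quadtree(img, x, y + half, half, threshold),         # south-west
        quadtree(img, x + half, y + half, half, threshold),  # south-east
    ]

def reconstruct(node, x, y, size, out):
    """Paint the decoded tree into the 2D list `out`."""
    if isinstance(node, list):
        half = size // 2
        reconstruct(node[0], x, y, half, out)
        reconstruct(node[1], x + half, y, half, out)
        reconstruct(node[2], x, y + half, half, out)
        reconstruct(node[3], x + half, y + half, half, out)
    else:
        for r in range(y, y + size):
            for c in range(x, x + size):
                out[r][c] = node

def count_leaves(node):
    """Number of values that must be stored for the tree."""
    if isinstance(node, list):
        return sum(count_leaves(child) for child in node)
    return 1

image = [
    [10, 10, 50, 52],
    [10, 10, 50, 52],
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
tree = quadtree(image, 0, 0, 4, threshold=4)
```

With a threshold of 4 the 16 samples are represented by 4 leaves; with a threshold of 0 the textured quadrant is split further and the reconstruction is exact.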
Filters study using MATLAB. Length estimation, short versus long filters. Test images. Examples of filtered images.
4 Entropy coding (pdf 0.7Mb)
Entropy, entropy of DPCM, entropy of groups of symbols. Entropy of variable length codes (VLC), instantaneous decodability.
Huffman codes, Huffman codes for groups of symbols, non-binary Huffman codes. Other VLC codes: unary codes, Golomb codes, Rice codes. Fixed length code for variable number of symbols: Tunstall codes. Arithmetic coding (with some implementation detail).
Dictionary techniques: Lempel-Ziv (LZ77, LZ78), Lempel-Ziv-Welch (LZW). Other techniques: move-to-front; invertible block sorting: Burrows-Wheeler transform (BWT).
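To make the link between entropy and Huffman codes concrete, here is a small sketch using only the Python standard library. It estimates probabilities from symbol counts in the message itself; the function names and the tie-breaking order are choices made for this example.

```python
import heapq
import math
from collections import Counter

def entropy(symbols):
    """Empirical entropy in bits per symbol."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def huffman_code(symbols):
    """Build a binary Huffman code; returns a dict symbol -> bit string."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, partial code table).
    heap = [(c, i, {s: ""}) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w0, _, t0 = heapq.heappop(heap)  # two least probable subtrees
        w1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in t0.items()}
        merged.update({s: "1" + w for s, w in t1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]

def decode(bits, code):
    """Decode symbol by symbol; possible because no codeword prefixes another."""
    inverse = {w: s for s, w in code.items()}
    out, word = [], ""
    for b in bits:
        word += b
        if word in inverse:
            out.append(inverse[word])
            word = ""
    return "".join(out)

message = "aaaabbcd"
code = huffman_code(message)
encoded = "".join(code[s] for s in message)
```

For this message the symbol probabilities are powers of 1/2, so the average code length (14 bits for 8 symbols) equals the entropy of 1.75 bits per symbol.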
5 Video coding: JPEG and MPEG-2 (pdf 2.6 Mb)
JPEG image coding: block discrete cosine transform (DCT), quantization, DPCM, zig-zag scan, run-level coding, variable length coding.
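The zig-zag scan and run-level steps can be sketched as below. This is a simplified illustration: the actual JPEG format limits run lengths, codes the DC coefficient separately by DPCM and maps the pairs to variable length codes, all of which are left out here. The `"EOB"` marker and the function names are placeholders for this example.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        # All positions on the anti-diagonal row + col == s, top to bottom.
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are scanned upwards, odd diagonals downwards.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def run_level(coeffs):
    """Turn a coefficient sequence into (zero run, level) pairs plus an end marker."""
    pairs = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the end-of-block marker
    return pairs

# A 4x4 block of quantized coefficients, used instead of 8x8 to keep the example short.
block = [
    [12, 5, 0, 0],
    [-3, 0, 0, 0],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]
scanned = [block[r][c] for r, c in zigzag_order(4)]
ac_pairs = run_level(scanned[1:])  # skip the DC coefficient
```

Here the 15 AC coefficients reduce to three pairs and the end marker, because the scan groups the zeros of the high-frequency positions into long runs.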
MPEG-1 and MPEG-2 video coding: profiles and levels; hierarchy from groups of pictures (GOP), to slices, macroblocks and blocks of pixels; DCT transform and quantization; temporal prediction: motion estimation, motion compensation (ME/MC). Data partitioning, SNR scalability, spatial scalability.
5 Video coding: H.264 (pdf 1 Mb)
H.264 video coding: intra spatial prediction; inter temporal prediction with variable block size, multiple reference frames, generalized B pictures, reference B pictures, weighted prediction. Integer pseudo-DCT, Hadamard transform. Non-linear extended-range quantization. Deblocking loop filter. Context adaptive variable length coding (CAVLC) and binary arithmetic coding (CABAC). Profiles.
5 Video coding: SVC (pdf 1.6 Mb)
Scalable Video Coding (SVC): temporal scalability by using hierarchy of temporal prediction, spatial scalability by down/up sampling, SNR scalability by re-quantization. Adaptive GOP structure. Extended spatial scalability. Fine grain SNR scalability. Motion compensated temporal filtering (MCTF), temporal/spatial wavelet transform. Access units, packetization and layer dependency.
5 Video coding: MVC (pdf 5.3 Mb)
Multiview Video Coding (MVC): 3DTV, free viewpoint television; multi-view/3D video capture, 3D video display; 3D picture/video format, depth map extraction, rendering and synthesis.
6 Audio coding (pdf 2 Mb)
Human auditory system (HAS), masking effects in time and frequency, de-masking. Filterbanks. MPEG-1 layer I, II and III (mp3), MPEG-2 advanced audio coding (AAC).
Created: 2nd April 2007. Updated: 31st July 2008.