Synthesis-based Texture Video Coding with Side Information

Research Description

The overall system of the synthesis-based texture coding algorithm is illustrated in Fig. 1. Instead of decoding the texture data directly, the decoder uses a module called the texture synthesizer (TS) to generate visually similar texture. For this scheme to work, the encoder includes a module called the texture analyzer (TA), which examines the input video, separates texture from non-texture regions, extracts parameters from the texture regions, and encodes the non-texture regions. When the coded bit-stream reaches the decoder, the decoder decodes the non-texture regions as usual and uses the TS to fill in the texture regions.

Fig. 1. The overall system of video coding with TA and TS.
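
The TA/TS split described above can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not the actual analyzer or synthesizer: the standard-deviation threshold, the (mean, std) parameter set, the Gaussian-noise synthesizer, and the names encode, decode, and std_threshold are all hypothetical, and the non-texture path simply copies pixels as a stand-in for a conventional codec.

```python
import random
from statistics import mean, pstdev

def encode(blocks, std_threshold=10.0):
    """Toy texture analyzer (TA) plus encoder (assumed classification rule)."""
    stream = []
    for block in blocks:
        sigma = pstdev(block)
        if sigma > std_threshold:
            # Texture region: transmit only a few parameters, not the pixels.
            stream.append(("texture", (mean(block), sigma, len(block))))
        else:
            # Non-texture region: stand-in for conventional coding.
            stream.append(("non_texture", list(block)))
    return stream

def decode(stream, seed=0):
    """Decoder with a toy texture synthesizer (TS)."""
    rng = random.Random(seed)
    out = []
    for kind, payload in stream:
        if kind == "texture":
            mu, sigma, n = payload
            # Synthesize samples with matching statistics instead of decoding pixels.
            out.append([rng.gauss(mu, sigma) for _ in range(n)])
        else:
            out.append(list(payload))
    return out

flat = [100, 101, 100, 99]    # smooth block, treated as non-texture
grass = [20, 180, 60, 220]    # high-variance block, treated as texture
stream = encode([flat, grass])
decoded = decode(stream)
```

Here the smooth block is reproduced exactly, while the textured block is replaced by new samples that share only its statistics, which is the trade-off the scheme relies on.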

Proposed Approach

The proposed algorithm introduces a synthesis-based texture coding technique that uses low-quality video as side information to control the output texture for video compression. Compared with current pure synthesis algorithms, the proposed algorithm is generic in the sense that the behavior and quality of the output texture can be adjusted through the amount of side information, which the user determines. We develop an area-adaptive side information assignment technique to improve coding efficiency under a given bit budget. We also provide a texture decomposition algorithm that maximizes synthesis performance by separating the non-synthesizable illumination component from the input video. Simulations demonstrate the performance of the proposed technique.
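
One way to picture how low-quality side information can steer a synthesizer is a patch-selection cost with a side-information term. The sketch below is an assumption made for illustration, not the formulation used in this work: the cost function, the 1-D patches, and the names ssd, pick_patch, overlap, and side_info are hypothetical. The weight is called alpha only because the Results section refers to alpha=0 as the case without side information.

```python
def ssd(a, b):
    """Sum of squared differences over the common length of a and b."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pick_patch(seed_patches, overlap, side_info, alpha):
    """Pick the seed patch that best continues the already synthesized
    overlap while staying close to the co-located low-quality patch."""
    def cost(patch):
        # zip() truncates, so the first term only compares the overlap samples.
        return ssd(patch, overlap) + alpha * ssd(patch, side_info)
    return min(seed_patches, key=cost)

seed_patches = [[10, 10, 50, 50], [10, 10, 90, 90]]
overlap = [10, 10]               # samples already synthesized
side_info = [12, 12, 85, 85]     # coarse version of the target patch

no_side = pick_patch(seed_patches, overlap, side_info, alpha=0.0)
with_side = pick_patch(seed_patches, overlap, side_info, alpha=1.0)
```

With alpha=0 both candidates match the overlap equally well and the first one is returned, regardless of the target; with alpha=1 the candidate closer to the side information is chosen, which is how a coarse guide can preserve global structure.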

Results

Although the proposed synthesis algorithm is targeted at video, it can be applied to images as well. In this section, we show both image and video texture synthesis results. To obtain the side information, we use the JM reference software (ver. 11) maintained by the Joint Video Team (JVT), operating with an IPPP GOP structure and 4x4-transform-only coding for test convenience.

Figs. 2 and 3 show image and video textures synthesized from the decoded seed (QP=20) with different amounts of side information for various textured images and videos. (The video clips can be downloaded by clicking each figure in Fig. 3.) In the toilet sequence, only the center region of the sequence is processed as the seed or target texture. Results without side information are given in (c), using the patch growing method, and in (d), using the example-based optimization method with alpha=0. Results from the patch growing method often show spatial or temporal discontinuities, even with the seam-hiding technique, because it is difficult to find a suitable block or cube size. The noticeable seams in the straw image and the duck-take-off sequence are good examples. The example-based optimization method can overcome this disadvantage through its multi-resolution/multi-scale approach. However, results from both methods contain repeated texture patterns with a global structure quite different from that of the original texture, which is undesirable in many cases. This is evident when comparing the target and synthesized textures in the toilet sequence, where the synthesized toilet holes move with a different shape.

Fig. 2. Texture image synthesis results for I. the block image and II. the straw image: (a) seed image decoded with QP=20, (b) target image to be synthesized, (c) texture synthesized by the patch growing method, (d) texture synthesized without side information, (f) texture synthesized with the decoded (QP=50) side information given in (e), (h) texture synthesized with the decoded (QP=40) side information given in (g).

Fig. 3. Texture video synthesis results for I. the Toilet sequence, II. the Duck-take-off sequence, III. the Coast-guard sequence, and IV. the Shuttle-start sequence: (a) seed sequence decoded with QP=20, (b) target sequence to be synthesized, (c) texture synthesized by the patch growing method, (d) texture synthesized without side information, (f) texture synthesized with the decoded (QP=40) side information given in (e), (h) texture synthesized with the decoded (QP=30) side information given in (g).

Illumination-variant texture decomposition results are shown in Fig. 4, where the block image and the toilet sequence are spatially and temporally illumination-variant, respectively. The input texture shown in (a) is first decomposed into non-texture and texture components, as shown in (b) and (c), respectively. Without decomposition, the synthesis algorithm cannot regenerate the input texture well, as shown in (d). After decomposition, the proposed synthesis-based texture coding algorithm is applied to the texture component, giving (f), with the help of its side information given in (e). The final result (g) is obtained by summing the synthesized texture and the decoded non-texture component. To compare the synthesis-based and conventional coding methods, we encode and decode the input texture at the same bit-rate as the synthesis-based method, as shown in (h). The texture coded by the conventional method shows many spatial or temporal discontinuities, such as blocking or flickering artifacts, while the proposed method generates perceptually similar and visually pleasing results despite the pixel-wise differences.
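
The additive structure of this decomposition (the final result is the sum of a non-texture component and a texture component) can be sketched on a 1-D signal. The moving-average filter used below to estimate the illumination is only an assumed stand-in, since the actual decomposition method is not described here; the names moving_average, decompose, recombine, and radius are hypothetical.

```python
def moving_average(signal, radius):
    """Smooth a 1-D signal with a box window clipped at the borders."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out

def decompose(signal, radius=2):
    """Split a signal into a smooth non-texture (NT) part and a texture (T) residual."""
    non_texture = moving_average(signal, radius)  # assumed illumination estimate
    texture = [s - nt for s, nt in zip(signal, non_texture)]
    return non_texture, texture

def recombine(non_texture, texture):
    """Final result: sum of the NT component and the T component."""
    return [nt + t for nt, t in zip(non_texture, texture)]

# A fine texture (alternating +/-5) riding on a brightness ramp.
signal = [10 * i + (5 if i % 2 else -5) for i in range(8)]
nt, t = decompose(signal)
restored = recombine(nt, t)
```

In the coding scheme, the NT part would be coded conventionally and the T part would be replaced by a synthesized texture before the sum; here both are kept unchanged, so the sum simply reproduces the input.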

Fig. 4. Illumination-variant texture synthesis results for (I) the block image and (II) the toilet sequence: (a) the original illumination-variant texture, (b) the decomposed non-texture (NT) component, (c) the decomposed texture (T) component, (d) texture synthesized without decomposition, (f) texture synthesized with the decoded side information (QP=40) given in (e), (g) final result obtained by summing the decoded NT (QP=20) and the synthesized T given in (f), (h) texture decoded at the same bit-rate as (g).

Question

Any questions? Please contact btoh77@gmail.com