Author
Huihui Bai
Bio: Huihui Bai is an academic researcher from Beijing Jiaotong University. The author has contributed to research on topics including video encoding and motion estimation, has an h-index of 6, and has co-authored 8 publications receiving 256 citations.
Papers
TL;DR: A new control-point representation that favors differential coding is proposed for efficient compression of affine parameters. By exploiting the spatial correlation between adjacent coding blocks, motion vectors at control points can be predicted and thus efficiently coded, leading to overall improved performance.
Abstract: The affine motion model is able to capture rotation, zooming, and the deformation of moving objects, thereby providing better motion-compensated prediction. However, it is not widely used due to the difficulty of both estimating and efficiently coding its motion parameters. To alleviate this problem, a new control-point representation that favors differential coding is proposed for efficient compression of affine parameters. By exploiting the spatial correlation between adjacent coding blocks, motion vectors at control points can be predicted and thus efficiently coded, leading to overall improved performance. To evaluate the proposed method, four new affine prediction modes are designed and embedded into the High Efficiency Video Coding test model HM1.0. The encoder adaptively chooses whether to use the new affine modes in an operational rate-distortion optimization. Bitrate savings of up to 33.82% in low-delay and 23.90% in random-access test conditions are obtained for low-complexity encoder settings. For high-efficiency settings, bitrate savings of up to 14.26% and 4.89% are observed under these two test conditions.
90 citations
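The entry above represents a block's affine motion by motion vectors at a small number of control points and codes those vectors differentially against spatial predictions. As a rough illustration of that idea, the following Python sketch derives a per-pixel motion field from three corner control-point motion vectors (a generic 6-parameter affine model) and forms the motion-vector differences that would actually be transmitted; the exact control-point layout, predictor derivation, and entropy coding in the paper are not reproduced here.

```python
import numpy as np

def affine_mv_field(v0, v1, v2, w, h):
    """Per-pixel motion vectors for a w x h block from three control-point
    MVs: v0 at the top-left, v1 at the top-right and v2 at the bottom-left
    corner (generic 6-parameter affine model)."""
    v0, v1, v2 = (np.asarray(v, dtype=float) for v in (v0, v1, v2))
    ys, xs = np.mgrid[0:h, 0:w]
    mvx = v0[0] + (v1[0] - v0[0]) * xs / w + (v2[0] - v0[0]) * ys / h
    mvy = v0[1] + (v1[1] - v0[1]) * xs / w + (v2[1] - v0[1]) * ys / h
    return np.stack([mvx, mvy], axis=-1)

def mv_differences(control_mvs, predicted_mvs):
    """Differential coding step: only the differences between the actual
    control-point MVs and their spatially predicted values are transmitted."""
    return [np.asarray(c, dtype=float) - np.asarray(p, dtype=float)
            for c, p in zip(control_mvs, predicted_mvs)]

# Toy usage: a 16x16 block whose motion includes slight rotation/zoom.
field = affine_mv_field((1.0, 0.5), (1.5, 0.7), (0.8, 1.1), 16, 16)
print(field.shape)  # (16, 16, 2)
```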
TL;DR: An effective MD image coding scheme is introduced based on MD lattice vector quantization (MDLVQ) for wavelet-transformed images, with better performance than other tested MD image codecs, including one based on optimized MD scalar quantization.
Abstract: Multiple description (MD) coding is a promising alternative for robust transmission of information over non-prioritized and unpredictable networks. In this paper, an effective MD image coding scheme is introduced based on MD lattice vector quantization (MDLVQ) for wavelet-transformed images. In view of the characteristics of wavelet coefficients in different frequency subbands, MDLVQ is applied in an optimized way, including an appropriate construction of wavelet coefficient vectors and the optimization of MDLVQ encoding parameters such as the choice of sublattice index values and the quantization accuracy for different subbands. More importantly, optimized side decoding is employed to predict lost information based on inter-vector correlation, together with an alternative transmission scheme for further reducing side distortion. Experimental results validate the effectiveness of the proposed scheme, which outperforms other tested MD image codecs, including one based on optimized MD scalar quantization.
67 citations
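The scheme above relies on multiple description quantization: a central decoder that receives both descriptions reconstructs finely, while either side decoder alone still produces a usable, coarser reconstruction. The toy sketch below illustrates that principle with a simple two-description staggered scalar quantizer; it is only a stand-in for the paper's optimized lattice vector quantization of wavelet coefficient vectors and its side-decoding prediction.

```python
import numpy as np

def md_encode(x, step=1.0):
    """Split one sample into two description indices (staggered quantizers)."""
    q = int(np.round(x / step))   # fine central quantization index
    i = q // 2                    # description 1
    j = q - i                     # description 2, so that i + j == q
    return i, j

def md_decode(i=None, j=None, step=1.0):
    """Central decoding when both indices arrive, side decoding otherwise."""
    if i is not None and j is not None:
        return (i + j) * step              # central: recovers the fine index
    if i is not None:
        return (2 * i + 0.5) * step        # side 1: q must lie in {2i, 2i+1}
    if j is not None:
        return (2 * j - 0.5) * step        # side 2: q must lie in {2j-1, 2j}
    raise ValueError("no description received")

x = 3.7
i, j = md_encode(x)
print(md_decode(i, j), md_decode(i=i), md_decode(j=j))  # 4.0 4.5 3.5
```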
01 Nov 2012
TL;DR: The affine motion model is employed in SKIP and DIRECT modes to produce a better prediction; since no additional motion estimation is needed, the proposed method is also quite practical.
Abstract: Higher-order motion models were introduced in video coding a couple of decades ago, but have not been widely used due to both the difficulty of parameter estimation and their requirement for more side information. Recently, researchers have reconsidered them. In this paper, the affine motion model is employed in SKIP and DIRECT modes to produce a better prediction. In affine SKIP/DIRECT, candidate predictors of the motion parameters are derived from the motion of neighboring coded blocks, with the best predictor determined by a rate-distortion tradeoff. Extensive experiments have shown the efficiency of these new affine modes. Since no additional motion estimation is needed, the proposed method is also quite practical.
38 citations
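Affine SKIP/DIRECT as described above avoids motion estimation entirely: candidate affine predictors are inherited from neighboring coded blocks and the encoder simply picks the one with the best rate-distortion tradeoff. A minimal selection loop of that kind is sketched below; the distortion measure (SSE), the Lagrangian rate model, and the function names are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def sse(original, prediction):
    """Sum of squared errors between the original block and a prediction."""
    diff = original.astype(np.float64) - prediction.astype(np.float64)
    return float(np.sum(diff * diff))

def choose_affine_skip_candidate(original, candidate_predictions,
                                 candidate_bits, lam=10.0):
    """Pick the candidate minimizing the Lagrangian cost J = D + lambda * R.
    Each candidate prediction is assumed to be the motion-compensated block
    obtained from the affine parameters of a neighboring coded block, so only
    the index of the chosen candidate would need to be signaled."""
    best_idx, best_cost = None, float("inf")
    for idx, (pred, bits) in enumerate(zip(candidate_predictions,
                                           candidate_bits)):
        cost = sse(original, pred) + lam * bits
        if cost < best_cost:
            best_idx, best_cost = idx, cost
    return best_idx, best_cost
```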
TL;DR: It is found that both GVF and GGVF snakes essentially yield the same performance in capturing LTIs of odd widths, and generally neither can converge to even-width LTIs; a novel external force termed component-normalized GGVF (CN-GGVF) is proposed to eliminate the problem.
Abstract: Snakes, or active contours, have been widely used in image processing applications. An external force for snakes called gradient vector flow (GVF) attempts to address the traditional snake problems of initialization sensitivity and poor convergence to concavities, while generalized GVF (GGVF) aims to improve GVF snake convergence to long and thin indentations (LTIs). In this paper, we show that both GVF and GGVF snakes essentially yield the same performance in capturing LTIs of odd widths, and that generally neither can converge to even-width LTIs. Based on a thorough investigation of the GVF and GGVF fields within the LTI during their iterative processes, we identify the crux of the convergence problem and accordingly propose a novel external force, termed component-normalized GGVF (CN-GGVF), to eliminate the problem. CN-GGVF is obtained by normalizing each component of the initial GGVF vectors with respect to its own magnitude. Experimental results and comparisons against GGVF snakes show that the proposed CN-GGVF snakes can capture LTIs regardless of odd or even widths with remarkably faster convergence, while preserving other desirable properties of GGVF snakes and requiring lower computational complexity in vector normalization.
37 citations
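The CN-GGVF force above is obtained by normalizing each component of the GGVF field by its own magnitude. The sketch below computes a GGVF field by iterating the standard diffusion equation and then applies that component-wise normalization; the parameter values, the wrap-around boundary handling, and the fixed iteration count are simplifications, not the paper's settings.

```python
import numpy as np

def ggvf_field(edge_map, K=0.05, iterations=200, dt=0.2):
    """Iterate the GGVF diffusion
        v_t = g(|grad f|) * lap(v) - h(|grad f|) * (v - grad f),
    with g(r) = exp(-r / K) and h(r) = 1 - g(r). Wrap-around boundaries and
    the parameter choices here are illustrative simplifications."""
    fy, fx = np.gradient(edge_map.astype(np.float64))
    g = np.exp(-np.hypot(fx, fy) / K)
    h = 1.0 - g
    u, v = fx.copy(), fy.copy()
    for _ in range(iterations):
        lap_u = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                 np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)
        lap_v = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
                 np.roll(v, 1, 1) + np.roll(v, -1, 1) - 4.0 * v)
        u += dt * (g * lap_u - h * (u - fx))
        v += dt * (g * lap_v - h * (v - fy))
    return u, v

def component_normalize(u, v, eps=1e-12):
    """CN-GGVF step: normalize each component by its own magnitude, so every
    nonzero component becomes +1 or -1 while its sign is preserved."""
    return u / (np.abs(u) + eps), v / (np.abs(v) + eps)
```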
TL;DR: Experimental results show that the proposed scheme outperforms relevant existing schemes at the same bit rates, in terms of perceptual evaluation and subjective viewing.
Abstract: In this paper, a novel multiple description video coding scheme is proposed based on the characteristics of the human visual system (HVS). Due to the underlying spatial-temporal masking properties, human eyes cannot sense any changes below the just noticeable difference (JND) threshold. Therefore, at the encoder, only the visual information that cannot be predicted well within the JND tolerance needs to be encoded as redundant information, which leads to more effective redundancy allocation according to the HVS characteristics. Compared with relevant existing schemes, experimental results show better performance of the proposed scheme at the same bit rates, in terms of perceptual evaluation and subjective viewing.
33 citations
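The central mechanism above is that redundancy is spent only where the prediction error exceeds the just noticeable difference. The sketch below shows that gating step using a simple luminance-adaptation threshold as a JND proxy; this threshold formula and the helper names are illustrative assumptions, and the paper's actual spatio-temporal JND model and surrounding multiple description video codec are not reproduced.

```python
import numpy as np

def luminance_jnd(background_luma):
    """A simple luminance-adaptation visibility threshold (a common
    approximation, not the paper's spatio-temporal JND model): thresholds
    are higher in very dark and very bright regions."""
    bg = background_luma.astype(np.float64)
    return np.where(bg <= 127,
                    17.0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0,
                    3.0 / 128.0 * (bg - 127.0) + 3.0)

def jnd_gated_residual(original, prediction, jnd):
    """Keep only the perceptually visible residual: differences below the JND
    threshold are zeroed and need not be coded as redundant information."""
    residual = original.astype(np.float64) - prediction.astype(np.float64)
    return np.where(np.abs(residual) > jnd, residual, 0.0)
```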
Cited by
TL;DR: A simplified affine motion model-based coding framework is studied to overcome the limitations of the translational motion model while maintaining low computational complexity.
Abstract: In this paper, we study a simplified affine motion model-based coding framework to overcome the limitations of the translational motion model while maintaining low computational complexity. The proposed framework makes three key contributions. First, we propose to reduce the number of affine motion parameters from 6 to 4. The proposed four-parameter affine motion model can not only handle most of the complex motions in natural videos but also save the bits for two parameters. Second, to efficiently encode the affine motion parameters, we propose two motion prediction modes, i.e., an advanced affine motion vector prediction scheme combined with a gradient-based fast affine motion estimation algorithm, and an affine model merge scheme, where the latter attempts to reuse the affine motion parameters (instead of the motion vectors) of neighboring blocks. Third, we propose two fast affine motion compensation algorithms. One is one-step sub-pixel interpolation, which reduces the computations of each interpolation. The other is interpolation-precision-based adaptive block size motion compensation, which performs motion compensation at the block level rather than the pixel level to reduce the number of interpolations. Our proposed techniques have been implemented on top of the state-of-the-art High Efficiency Video Coding standard, and the experimental results show that the proposed techniques altogether achieve, on average, 11.1% and 19.3% bit savings for random-access and low-delay configurations, respectively, on typical video sequences that have rich rotation or zooming motions. Meanwhile, the computational complexity increases of both the encoder and the decoder are within an acceptable range.
84 citations
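The four-parameter model above captures translation, rotation, and zooming using the motion vectors of only two control points. A minimal sketch of the corresponding motion-vector field is given below; the top-left/top-right control-point convention and the parameterization follow the common simplified affine formulation and are assumptions about the paper's exact notation.

```python
import numpy as np

def four_param_affine_field(v0, v1, w, h):
    """Motion vectors over a w x h block from two control-point MVs, v0 at the
    top-left and v1 at the top-right corner, using
        mvx(x, y) = a*x - b*y + c,   mvy(x, y) = b*x + a*y + d,
    where (a, b) capture rotation/zoom and (c, d) the translation."""
    a = (v1[0] - v0[0]) / float(w)
    b = (v1[1] - v0[1]) / float(w)
    c, d = float(v0[0]), float(v0[1])
    ys, xs = np.mgrid[0:h, 0:w]
    mvx = a * xs - b * ys + c
    mvy = b * xs + a * ys + d
    return np.stack([mvx, mvy], axis=-1)
```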
Patent
05 Jun 2012
TL;DR: A unified candidate block set for both adaptive motion vector prediction (AMVP) mode and merge mode for use in inter-prediction is proposed, where the same candidate block set is used regardless of which motion vector prediction mode (e.g., merge mode or AMVP mode) is used.
Abstract: A unified candidate block set for both adaptive motion vector prediction (AMVP) mode and merge mode for use in inter-prediction is proposed. In general, the same candidate block set is used regardless of which motion vector prediction mode (e.g., merge mode or AMVP mode) is used. In other examples of this disclosure, one candidate block in a set of candidate blocks is designated as an additional candidate block, which is used if one of the other candidate blocks is unavailable. The disclosure also proposes a checking pattern in which the left candidate block is checked before the below-left candidate block, and the above candidate block is checked before the above-right candidate block.
81 citations
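The disclosure above combines three ideas: a single spatial candidate set shared by merge and AMVP, a designated additional candidate used only when another candidate is unavailable, and a fixed checking order (left before below-left, above before above-right). The sketch below builds such a unified list; the block names, overall ordering, and availability test are illustrative assumptions, not the claim language.

```python
from typing import Dict, List, Optional, Tuple

MV = Tuple[float, float]

# Fixed checking pattern: left before below-left, above before above-right.
CHECK_ORDER = ["left", "below_left", "above", "above_right"]
ADDITIONAL = "above_left"  # designated additional candidate (assumed name)

def build_candidate_list(neighbors: Dict[str, Optional[MV]],
                         max_candidates: int = 4) -> List[MV]:
    """Build one candidate list shared by merge and AMVP modes. A neighbor
    mapped to None is unavailable (e.g., intra-coded or outside the picture);
    the additional candidate is used only if a regular candidate is missing."""
    chosen: List[MV] = []
    for name in CHECK_ORDER:
        mv = neighbors.get(name)
        if mv is not None:
            chosen.append(mv)
    if len(chosen) < len(CHECK_ORDER) and neighbors.get(ADDITIONAL) is not None:
        chosen.append(neighbors[ADDITIONAL])
    return chosen[:max_candidates]

# Toy usage: the below-left block is unavailable, so the additional candidate
# (above-left) is appended to the list.
print(build_candidate_list({"left": (1.0, 0.0), "below_left": None,
                            "above": (0.5, 0.5), "above_right": (0.5, 0.25),
                            "above_left": (0.75, 0.25)}))
```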
TL;DR: The compressive sensing (CS) principles are studied, and an alternative coding paradigm with a large number of descriptions is proposed based upon CS for high packet-loss transmission. Experimental results show that the proposed CS-based codec is much more robust against lossy channels, while achieving higher rate-distortion performance.
Abstract: Multiple description coding (MDC) is one of the widely used mechanisms to combat packet loss in non-feedback systems. However, the number of descriptions in existing MDC schemes is very small (typically 2); as the number of descriptions increases, the coding complexity increases drastically and many decoders would be required. In this paper, the compressive sensing (CS) principles are studied, and an alternative coding paradigm with a large number of descriptions is proposed based upon CS for high packet-loss transmission. A two-dimensional discrete wavelet transform (DWT) is applied for sparse representation. Unlike typical wavelet coders (e.g., JPEG 2000), the DWT coefficients here are not encoded directly, but re-sampled towards equal importance of information instead. At the decoder side, by fully exploiting the intra-scale and inter-scale correlation of the multiscale DWT, two different CS recovery algorithms are developed for the low-frequency subband and the high-frequency subbands, respectively. The recovery quality depends only on the number of received CS measurements, not on which measurements are received. Experimental results show that the proposed CS-based codec is much more robust against lossy channels, while achieving higher rate-distortion (R-D) performance compared with conventional wavelet-based MDC methods and relevant existing CS-based coding schemes.
80 citations
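In the scheme above, every CS measurement is a random projection of the wavelet coefficients, so all measurements carry roughly equal information and reconstruction quality depends only on how many arrive. The toy sketch below measures a sparse vector with a random Gaussian matrix and recovers it with orthogonal matching pursuit from whichever measurements were received; the paper's subband-specific recovery algorithms and correlation models are not reproduced.

```python
import numpy as np

def omp(Phi, y, sparsity):
    """Orthogonal matching pursuit: greedily recover a `sparsity`-sparse x
    from measurements y = Phi @ x."""
    _, n = Phi.shape
    residual = y.astype(np.float64).copy()
    support = []
    for _ in range(sparsity):
        idx = int(np.argmax(np.abs(Phi.T @ residual)))
        if idx not in support:
            support.append(idx)
        coeffs, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coeffs
    x_hat = np.zeros(n)
    x_hat[support] = coeffs
    return x_hat

rng = np.random.default_rng(0)
n, k = 256, 8                       # signal length and sparsity
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

# Every row of Phi is a random projection of the whole signal, so each
# measurement carries roughly equal information ("equal importance").
Phi = rng.standard_normal((96, n)) / np.sqrt(96)
received = Phi[:64]                 # suppose only 64 measurements arrive
x_rec = omp(received, received @ x, k)
print(np.linalg.norm(x - x_rec) / np.linalg.norm(x))  # small if enough arrive
```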