Restricted affine motion compensation and estimation in video coding with particle filtering and importance sampling: a multi-resolution approach
TL;DR: Extensive experiments show reduced residual energy and better Peak Signal-to-Noise Ratio (PSNR) compared to H.264/HEVC, especially in regions of complex motion such as zooming and rotation.
Abstract: In this paper, we propose a multi-resolution affine block-based tracker for motion estimation and compensation, compatible with existing video coding standards such as H.264 and HEVC. We propose three modifications to the traditional motion compensation techniques in these standards. First, we replace traditional search methods with an efficient particle filtering-based method that incorporates information from both spatial and temporal continuity. Second, we use a higher-order linear model in place of the traditional translational motion model to efficiently represent complex motions such as rotation and zoom. Third, we propose a multi-resolution framework that enables efficient parameter estimation. Extensive experiments show reduced residual energy and better Peak Signal-to-Noise Ratio (PSNR) compared to H.264/HEVC, especially in regions of complex motion such as zooming and rotation.
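The abstract describes the estimation loop only at a high level, so below is a minimal sketch of how a particle filter with importance sampling could estimate a restricted affine model (scale, rotation, translation) for one block. It is an illustration under assumed choices, not the authors' implementation: the 4-parameter model, the SAD-based likelihood, the noise scales, and all function names are assumptions.

```python
# Sketch: particle-filter-based restricted affine block motion estimation.
# Hypothetical names and parameters throughout; not the paper's exact method.
import numpy as np

def warp_block(ref, x0, y0, size, s, theta, dx, dy):
    """Sample a size x size block from `ref`, warped about the block centre
    by a restricted affine model (nearest-neighbour sampling for brevity)."""
    c = (size - 1) / 2.0
    ys, xs = np.mgrid[0:size, 0:size]
    u, v = xs - c, ys - c
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    src_x = x0 + c + dx + s * (cos_t * u - sin_t * v)
    src_y = y0 + c + dy + s * (sin_t * u + cos_t * v)
    src_x = np.clip(np.rint(src_x), 0, ref.shape[1] - 1).astype(int)
    src_y = np.clip(np.rint(src_y), 0, ref.shape[0] - 1).astype(int)
    return ref[src_y, src_x]

def pf_affine_estimate(cur_block, ref, x0, y0, n_particles=100, n_iters=3,
                       noise=(0.02, 0.02, 1.0, 1.0), rng=None):
    """Estimate (s, theta, dx, dy) for one block by importance sampling:
    diffuse particles, weight by exp(-SAD), resample, average."""
    rng = rng if rng is not None else np.random.default_rng(0)
    size = cur_block.shape[0]
    cur = cur_block.astype(np.int32)  # avoid uint8 wrap-around in the SAD
    # Start all particles at the identity motion: s=1, theta=0, dx=dy=0.
    particles = np.tile([1.0, 0.0, 0.0, 0.0], (n_particles, 1))
    for _ in range(n_iters):
        particles += rng.normal(0.0, noise, particles.shape)  # diffusion step
        sad = np.array([np.abs(warp_block(ref, x0, y0, size, *p).astype(np.int32)
                               - cur).sum() for p in particles])
        w = np.exp(-(sad - sad.min()) / (sad.std() + 1e-9))   # importance weights
        w /= w.sum()
        particles = particles[rng.choice(n_particles, n_particles, p=w)]  # resample
    return particles.mean(axis=0)  # posterior mean after resampling
```

Resampling concentrates particles on low-residual warps, which is what lets the search cover rotation and zoom hypotheses without an exhaustive parameter grid.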
Citations
TL;DR: Adaptive mode selection is introduced by comparing, via fuzzy holoentropy, the pixel values of the current block to be coded with those of a motion-compensated reference block; it reduces computation time without affecting the visual quality of frames.
Abstract: Due to the advancement of multimedia and its requirement of communication over the network, video compression has received much attention among researchers. One popular video coding scheme is scalable video coding, based on the H.264/AVC standard. A major drawback of H.264 is that it performs an exhaustive search over the inter-layer prediction to obtain the best rate-distortion performance. To reduce the computational overhead of this exhaustive search in the mode prediction process, this paper presents a new technique for inter-prediction mode selection based on fuzzy holoentropy. The proposed scheme utilizes pixel values and the probabilistic distribution of pixel symbols to decide the mode. Adaptive mode selection is introduced by comparing, via fuzzy holoentropy, the pixel values of the current block to be coded with those of a motion-compensated reference block. The adaptively selected mode reduces computation time without affecting the visual quality of frames. The proposed scheme is evaluated on five videos; the analysis shows overall high performance, with values of 41.367 dB PSNR and 0.992 SSIM.
7 citations
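Since the abstract describes the decision rule only in words, here is a hedged illustration of how an entropy-driven mode pre-selection of this kind can work. The paper's fuzzy holoentropy definition is not reproduced here, so plain Shannon entropy of the residual pixel-symbol distribution stands in for it; the threshold and mode labels are assumptions, not the authors' values.

```python
# Illustrative entropy-based mode pre-selection (stand-in for fuzzy holoentropy).
import numpy as np

def residual_entropy(cur_block, mc_ref_block):
    """Shannon entropy (bits) of the residual pixel-symbol distribution."""
    residual = cur_block.astype(np.int16) - mc_ref_block.astype(np.int16)
    _, counts = np.unique(residual, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def select_mode(cur_block, mc_ref_block, threshold=2.0):
    """Cheap pre-decision: a low-entropy residual suggests the motion-
    compensated prediction already fits, so skip the exhaustive mode search."""
    if residual_entropy(cur_block, mc_ref_block) < threshold:
        return "INTER_SKIP"        # accept the predicted mode directly
    return "FULL_MODE_SEARCH"      # fall back to exhaustive RD search
```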
References
TL;DR: An overview of the technical features of H.264/AVC is provided, profiles and applications for the standard are described, and the history of the standardization process is outlined.
Abstract: H.264/AVC is the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goals of the H.264/AVC standardization effort have been enhanced compression performance and provision of a "network-friendly" video representation addressing "conversational" (video telephony) and "nonconversational" (storage, broadcast, or streaming) applications. H.264/AVC has achieved a significant improvement in rate-distortion efficiency relative to existing standards. This article provides an overview of the technical features of H.264/AVC, describes profiles and applications for the standard, and outlines the history of the standardization process.
8,646 citations
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a low-pass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.
6,975 citations
"Restricted affine motion compensati..." refers methods in this paper
...In our experiments, we chose 2 levels of the Gaussian pyramid [36] for 16 × 16 and 3, for 32 × 32 and 64 × 64 blocks....
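The encoding loop described in this abstract maps directly to code. Below is a minimal numpy sketch of the pyramid construction: REDUCE (low-pass and subsample), EXPAND (zero-insert and low-pass), and the difference images that carry the low-variance residual. The 5-tap binomial kernel is a common choice and an assumption here, as are the function names; the quantization step is omitted.

```python
# Sketch: Laplacian pyramid encode/decode (binomial kernel assumed).
import numpy as np

KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # 5-tap binomial low-pass

def _blur(img):
    """Separable low-pass filtering with edge replication."""
    pad = np.pad(img, ((0, 0), (2, 2)), mode="edge")
    img = sum(k * pad[:, i:i + img.shape[1]] for i, k in enumerate(KERNEL))
    pad = np.pad(img, ((2, 2), (0, 0)), mode="edge")
    return sum(k * pad[i:i + img.shape[0], :] for i, k in enumerate(KERNEL))

def _reduce(img):
    """REDUCE: low-pass filter, then subsample by 2 in each dimension."""
    return _blur(img)[::2, ::2]

def _expand(img, shape):
    """EXPAND: zero-insert up to `shape`, then low-pass (gain 4 restores DC)."""
    up = np.zeros(shape)
    up[::2, ::2] = img
    return 4.0 * _blur(up)

def build_laplacian_pyramid(img, levels):
    """Return [L0, ..., L_{levels-1}, low-pass residue]."""
    img = img.astype(np.float64)
    pyramid = []
    for _ in range(levels):
        low = _reduce(img)
        pyramid.append(img - _expand(low, img.shape))  # difference image
        img = low
    pyramid.append(img)  # coarsest low-pass image, kept at reduced density
    return pyramid

def reconstruct(pyramid):
    """Exact inverse: expand the residue and add back each difference image."""
    img = pyramid[-1]
    for diff in reversed(pyramid[:-1]):
        img = _expand(img, diff.shape) + diff
    return img
```

The main paper's choice of 2 pyramid levels for 16 × 16 blocks and 3 for 32 × 32 and 64 × 64 blocks would correspond to `levels=2` or `levels=3` in this sketch.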
TL;DR: The Condensation algorithm uses “factored sampling”, previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set.
Abstract: The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses "factored sampling", previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set. Condensation uses learned dynamical models, together with visual observations, to propagate the random set over time. The result is highly robust tracking of agile motion. Notwithstanding the use of stochastic methods, the algorithm runs in near real-time.
5,804 citations
"Restricted affine motion compensati..." refers background in this paper
...A particle filter/CONDENSATION filter can work with any distribution represented as a collection of N particles [30]....
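As a companion to the abstract's description, here is a generic one-step sketch of the Condensation update built around factored sampling: select particles in proportion to their weights, propagate them through a dynamical model, and reweight by an observation likelihood. The Gaussian random-walk dynamics and the `observe` callback are placeholders, not the algorithm's learned models.

```python
# One Condensation time-step over an arbitrary particle set (placeholder dynamics).
import numpy as np

def condensation_step(particles, weights, observe, dynamics_noise=0.1, rng=None):
    """particles: (N, d) states; weights: (N,) normalised; observe(state) -> likelihood."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(particles)
    # 1. SELECT: factored sampling -- draw states with probability ~ weight.
    particles = particles[rng.choice(n, n, p=weights)]
    # 2. PREDICT: learned dynamics stand-in -- here a Gaussian random walk.
    particles = particles + rng.normal(0.0, dynamics_noise, particles.shape)
    # 3. MEASURE: reweight each hypothesis by its observation likelihood.
    weights = np.array([observe(s) for s in particles])
    return particles, weights / weights.sum()
```

Because the particle set can represent any distribution, not just a Gaussian, this update keeps multiple motion hypotheses alive at once, which is the property the main paper relies on in its block tracker.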
TL;DR: A unified approach to the coder control of video coding standards such as MPEG-2, H.263, MPEG-4, and the draft video coding standard H.264/AVC (advanced video coding) is presented.
Abstract: A unified approach to the coder control of video coding standards such as MPEG-2, H.263, MPEG-4, and the draft video coding standard H.264/AVC (advanced video coding) is presented. The performance of the various standards is compared by means of PSNR and subjective testing results. The results indicate that H.264/AVC compliant encoders typically achieve essentially the same reproduction quality as encoders that are compliant with the previous standards while typically requiring 60% or less of the bit rate.
3,312 citations