Proceedings ArticleDOI

Restricted affine motion compensation in video coding using particle filtering

TL;DR: This work proposes a novel particle filter-based motion compensation strategy for video coding that uses a higher-order linear model in place of the traditional translational model used in standards such as H.264.
Abstract: We propose a novel particle filter-based motion compensation strategy for video coding. We use a higher-order linear model in place of the traditional translational model used in standards such as H.264. The measurement/observation process in the particle filter is computationally efficient compared with traditional search methods. We use a multi-resolution framework for efficient parameter estimation. Our experiments show reduced residual energy and better PSNR than traditional video coding methods, especially in regions of complex motion such as zooming and rotation.
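As a concrete illustration of the strategy the abstract describes, here is a minimal sketch (not the authors' implementation) of particle-filter motion estimation for one block under a 6-parameter affine model: particles are parameter vectors diffused with Gaussian noise and weighted by residual energy (SAD). The noise scales, the SAD-based likelihood, and all function names are illustrative assumptions.

```python
import numpy as np

def warp_block(ref, params, ys, xs):
    # Sample `ref` at affine-warped block coordinates (nearest neighbour).
    a, b, tx, c, d, ty = params
    u = np.clip(np.rint(a * xs + b * ys + tx).astype(int), 0, ref.shape[1] - 1)
    v = np.clip(np.rint(c * xs + d * ys + ty).astype(int), 0, ref.shape[0] - 1)
    return ref[v, u].astype(float)

def pf_affine(ref, cur, y0, x0, bs=16, n=200, iters=10, sigma=0.02, lam=0.01):
    # Estimate affine parameters [a, b, tx, c, d, ty] for the block whose
    # top-left corner is (y0, x0) in the current frame `cur`.
    ys, xs = np.mgrid[y0:y0 + bs, x0:x0 + bs]
    target = cur[y0:y0 + bs, x0:x0 + bs].astype(float)
    particles = np.tile([1.0, 0.0, 0.0, 0.0, 1.0, 0.0], (n, 1))  # identity
    scale = np.array([sigma, sigma, 1.0, sigma, sigma, 1.0])     # diffusion
    for _ in range(iters):
        particles = particles + np.random.randn(n, 6) * scale
        sad = np.array([np.abs(warp_block(ref, p, ys, xs) - target).sum()
                        for p in particles])
        w = np.exp(-lam * (sad - sad.min()))  # residual-energy likelihood
        w /= w.sum()
        particles = particles[np.random.choice(n, n, p=w)]  # resample
    return particles.mean(axis=0)  # point estimate of the motion parameters
```

Evaluating each particle costs one block warp and one SAD, which is what makes the observation step cheap relative to an exhaustive affine parameter search.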
Citations
DOI


04 Jun 2012
TL;DR: SAMBA is an automatic approach that computes a pleasing choreography using a novel combination of a distance-weighted least-squares registration between a previous and a subsequent frame and a modified SAM interpolation.
Abstract: Given the start positions of a group of dancers, a choreographer specifies their end positions and says: "Run!" Each dancer has the choice of his/her motion. These choices influence the perceived beauty (or grace) of the overall choreography. We report experiments with an automatic approach, SAMBA, that computes a pleasing choreography. Rossignac and Vinacua focused on affine motions, which, in the plane, correspond to choreographies for three independent dancers. They proposed the inverse of the Average Relative Acceleration (ARA) as a measure of grace and their Steady Affine Morph (SAM) as the most graceful interpolating motion. Here, we extend their approach to larger groups. We start with a discretized (uniformly time-sampled) choreography, where each dancer moves with constant speed. Each SAMBA iteration steadies the choreography by tweaking the positions of dancers at all intermediate frames towards corresponding predicted positions. The prediction for the position of a dancer at a given frame is computed using a novel combination of a distance-weighted least-squares registration between a previous and a subsequent frame and a modified SAM interpolation. SAMBA is fully automatic, converges in a fraction of a second, and produces pleasing and interesting motions.
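A rough sketch of one SAMBA-style steadying sweep follows, under stated assumptions: the Gaussian distance-weighting kernel, the update step size, and the midpoint prediction (a stand-in for the paper's modified SAM interpolation) are all simplifications, not the published algorithm.

```python
import numpy as np

def weighted_affine(P, Q, w):
    # Weighted least-squares affine map M (2x3) with M @ [p, 1] ~= q.
    A = np.hstack([P, np.ones((len(P), 1))])
    W = np.sqrt(w)[:, None]
    M, *_ = np.linalg.lstsq(W * A, W * Q, rcond=None)
    return M.T

def steady_pass(frames, sigma=1.0, step=0.5):
    # One steadying sweep. `frames` is a (T, n, 2) array of dancer positions.
    out = frames.copy().astype(float)
    for t in range(1, len(frames) - 1):
        prev, nxt = out[t - 1], out[t + 1]
        for i in range(out.shape[1]):
            # Weight the registration by distance to dancer i at frame t.
            d2 = ((out[t] - out[t, i]) ** 2).sum(axis=1)
            w = np.exp(-d2 / (2 * sigma ** 2))
            M = weighted_affine(prev, nxt, w)
            # Midpoint prediction: average of the dancer's previous position
            # and its image under the fitted map (the paper instead uses a
            # modified Steady Affine Morph interpolation).
            pred = 0.5 * (prev[i] + M @ np.append(prev[i], 1.0))
            out[t, i] = (1 - step) * out[t, i] + step * pred
    return out
```

Iterating `steady_pass` until the positions stop changing mirrors the paper's description of SAMBA converging in a fraction of a second.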

1 citation

Journal ArticleDOI

[...]

TL;DR: Results of extensive experimentation show reduced residual energy and better Peak Signal-to-Noise Ratio (PSNR) compared to H.264 and HEVC, especially in regions of complex motion such as zooming and rotation.
Abstract: In this paper, we propose a multi-resolution affine block-based tracker for motion estimation and compensation, compatible with existing video coding standards such as H.264 and HEVC. We propose three modifications to the traditional motion compensation techniques in these standards. First, we replace traditional search methods with an efficient particle filtering-based method that incorporates information from both spatial and temporal continuity. Second, we use a higher-order linear model in place of the traditional translational motion model to efficiently represent complex motions such as rotation and zoom. Third, we propose a multi-resolution framework that enables efficient parameter estimation. Results of extensive experimentation show reduced residual energy and better Peak Signal-to-Noise Ratio (PSNR) compared to H.264 and HEVC, especially in regions of complex motion such as zooming and rotation.
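The third modification, the multi-resolution framework, can be sketched as a coarse-to-fine loop: parameters estimated at the coarsest pyramid level seed the next finer level, with the translational terms doubled at each step. The box-filter downsampling and the `estimate` callback (e.g., a per-level particle filter initialized around `init`) are simplifying assumptions, not the paper's exact procedure.

```python
import numpy as np

def downsample(img):
    # Halve resolution with a 2x2 box filter (a stand-in for a proper
    # Gaussian-pyramid REDUCE step).
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    x = img[:h, :w].astype(float)
    return 0.25 * (x[0::2, 0::2] + x[1::2, 0::2] + x[0::2, 1::2] + x[1::2, 1::2])

def coarse_to_fine(ref, cur, levels, estimate):
    # Refine affine parameters [a, b, tx, c, d, ty] from coarse to fine.
    # `estimate(ref, cur, init)` is any per-level parameter estimator.
    pyramid = [(ref, cur)]
    for _ in range(levels - 1):
        r, c = pyramid[-1]
        pyramid.append((downsample(r), downsample(c)))
    params = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])  # identity transform
    for lvl, (r, c) in enumerate(reversed(pyramid)):   # coarsest level first
        if lvl > 0:
            params[2] *= 2.0  # translations double at each finer level;
            params[5] *= 2.0  # the linear terms a, b, c, d are unchanged
        params = estimate(r, c, params)
    return params
```

Because most of the parameter search happens on small coarse images, each finer level only needs a local refinement, which is where the efficiency claim comes from.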

1 citation


Cites methods from "Restricted affine motion compensation in video coding using particle filtering"


References
Journal ArticleDOI


TL;DR: The Condensation algorithm uses “factored sampling”, previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set.
Abstract: The problem of tracking curves in dense visual clutter is challenging. Kalman filtering is inadequate because it is based on Gaussian densities which, being unimodal, cannot represent simultaneous alternative hypotheses. The Condensation algorithm uses "factored sampling", previously applied to the interpretation of static images, in which the probability distribution of possible interpretations is represented by a randomly generated set. Condensation uses learned dynamical models, together with visual observations, to propagate the random set over time. The result is highly robust tracking of agile motion. Notwithstanding the use of stochastic methods, the algorithm runs in near real-time.
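One Condensation time-step amounts to resample, propagate, reweight. Below is a minimal sketch, assuming user-supplied `dynamics` (a stochastic draw from the learned dynamical model) and `likelihood` (the visual observation density); the function names are illustrative, not from the paper.

```python
import numpy as np

def condensation_step(samples, weights, dynamics, likelihood):
    # Factored sampling: pick states in proportion to their current weights.
    n = len(samples)
    idx = np.random.choice(n, size=n, p=weights)
    # Propagate each chosen state through the (stochastic) dynamical model.
    new_samples = np.array([dynamics(samples[i]) for i in idx])
    # Reweight by the observation likelihood and renormalize.
    new_weights = np.array([likelihood(s) for s in new_samples])
    new_weights = new_weights / new_weights.sum()
    return new_samples, new_weights
```

Because the posterior is carried as a weighted sample set rather than a single Gaussian, the filter can hold several competing hypotheses at once, which is exactly what the abstract says Kalman filtering cannot do.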

5,749 citations


"Restricted affine motion compensati..." refers background in this paper


Book


19 Dec 2003
TL;DR: This book introduces the MPEG-4 Visual and H.264 video coding standards, covering video formats and quality, core video coding concepts, and the standards' design, performance, and applications.
Abstract: Contents: About the Author. Foreword. Preface. Glossary. 1. Introduction. 2. Video Formats and Quality. 3. Video Coding Concepts. 4. The MPEG-4 and H.264 Standards. 5. MPEG-4 Visual. 6. H.264/MPEG-4 Part 10. 7. Design and Performance. 8. Applications and Directions. Bibliography. Index.

2,490 citations

Book ChapterDOI

[...]

28 May 2002
TL;DR: This work introduces a new Monte Carlo tracking technique based on the same principle of color histogram distance but set within a probabilistic framework, with three extensions: multi-part color modeling to capture a rough spatial layout ignored by global histograms, incorporation of a background color model when relevant, and extension to multiple objects.
Abstract: Color-based trackers recently proposed in [3,4,5] have been proved robust and versatile for a modest computational cost. They are especially appealing for tracking tasks where the spatial structure of the tracked objects exhibits such a dramatic variability that trackers based on a space-dependent appearance reference would break down very fast. Trackers in [3,4,5] rely on the deterministic search of a window whose color content matches a reference histogram color model. Relying on the same principle of color histogram distance, but within a probabilistic framework, we introduce a new Monte Carlo tracking technique. The use of a particle filter allows us to better handle color clutter in the background, as well as complete occlusion of the tracked entities over a few frames. This probabilistic approach is very flexible and can be extended in a number of useful ways. In particular, we introduce the following ingredients: multi-part color modeling to capture a rough spatial layout ignored by global histograms, incorporation of a background color model when relevant, and extension to multiple objects.
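The core ingredient is the histogram-distance likelihood that weights each particle. Here is a minimal sketch using a joint RGB histogram and the Bhattacharyya coefficient; the bin count and the scale `lam` are assumptions, and the paper's HSV color space, multi-part modeling, and background model are omitted.

```python
import numpy as np

def color_hist(patch, bins=8):
    # Normalized joint RGB histogram of an (H, W, 3) uint8 patch.
    h, _ = np.histogramdd(patch.reshape(-1, 3).astype(float),
                          bins=(bins,) * 3, range=((0, 256),) * 3)
    return h.ravel() / h.sum()

def color_likelihood(patch, ref_hist, lam=20.0):
    # Particle weight from the Bhattacharyya distance between the candidate
    # patch's histogram and the reference color model.
    bc = np.sum(np.sqrt(color_hist(patch) * ref_hist))  # Bhattacharyya coeff.
    return np.exp(-lam * (1.0 - bc))  # (1 - bc) is the squared distance
```

Because the histogram discards spatial layout, the likelihood stays stable under the deformations and partial occlusions that the abstract says defeat space-dependent appearance models.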

1,530 citations



01 Nov 1984
TL;DR: Describes a variety of pyramid methods developed for image data compression, enhancement, analysis, and graphics; the pyramid representation is versatile, convenient, and efficient to use.
Abstract: The data structure used to represent image information can be critical to the successful completion of an image processing task. One structure that has attracted considerable attention is the image pyramid. This consists of a set of lowpass or bandpass copies of an image, each representing pattern information of a different scale. Here we describe a variety of pyramid methods that we have developed for image data compression, enhancement, analysis and graphics. It is becoming increasingly clear that the format used to represent image data can be as critical in image processing as the algorithms applied to the data. A digital image is initially encoded as an array of pixel intensities, but this raw format is not suited to most tasks. Alternatively, an image may be represented by its Fourier transform, with operations applied to the transform coefficients rather than to the original pixel values. This is appropriate for some data compression and image enhancement tasks, but inappropriate for others. The transform representation is particularly unsuited for machine vision and computer graphics, where the spatial location of pattern elements is critical. Recently there has been a great deal of interest in representations that retain spatial localization as well as localization in the spatial-frequency domain. This is achieved by decomposing the image into a set of spatial frequency bandpass component images. Individual samples of a component image represent image pattern information that is appropriately localized, while the bandpassed image as a whole represents information about a particular fineness of detail or scale. There is evidence that the human visual system uses such a representation, and multiresolution schemes are becoming increasingly popular in machine vision and in image processing in general. The importance of analyzing images at many scales arises from the nature of images themselves. Scenes in the world contain objects of many sizes, and these objects contain features of many sizes. Moreover, objects can be at various distances from the viewer. As a result, any analysis procedure that is applied only at a single scale may miss information at other scales. The solution is to carry out analyses at all scales simultaneously. Convolution is the basic operation of most image analysis systems, and convolution with large weighting functions is a notoriously expensive computation. In a multiresolution system one wishes to perform convolutions with kernels of many sizes, ranging from very small to very large, and the computational problems appear forbidding. Therefore one of the main problems in working with multiresolution representations is to develop fast and efficient techniques. Members of the Advanced Image Processing Research Group have been actively involved in the development of multiresolution techniques for some time. Most of the work revolves around a representation known as a "pyramid," which is versatile, convenient, and efficient to use. We have applied pyramid-based methods to some fundamental problems in image analysis, data compression, and image manipulation.
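The REDUCE step at the heart of the pyramid can be sketched in a few lines: separable lowpass filtering with the binomial 5-tap generating kernel [1, 4, 6, 4, 1] / 16 (the a = 0.375 case), followed by 2x subsampling. The reflect padding and the level count are arbitrary choices for this sketch.

```python
import numpy as np

K = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # 5-tap generating kernel

def blur(img):
    # Separable 5x5 lowpass with reflect padding.
    p = np.pad(img.astype(float), 2, mode='reflect')
    h, w = img.shape
    tmp = sum(K[k] * p[k:k + h, :] for k in range(5))     # vertical pass
    return sum(K[k] * tmp[:, k:k + w] for k in range(5))  # horizontal pass

def gaussian_pyramid(img, levels=4):
    # Each level is a lowpassed, 2x-subsampled copy of the one below it.
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        pyr.append(blur(pyr[-1])[::2, ::2])  # REDUCE
    return pyr
```

Because each level has a quarter of the pixels of the one below, the whole pyramid costs only about a third more than the original image, which is what makes multiscale convolution affordable.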

1,029 citations


"Restricted affine motion compensati..." refers methods in this paper
