
Showing papers on "Inter frame published in 2003"


Proceedings ArticleDOI
07 May 2003
TL;DR: This paper presents a new image processing method to remove unwanted vibrations and reconstruct a video sequence void of sudden camera movements based on a probabilistic estimation framework, and shows a significant improvement in stabilization quality.
Abstract: The removal of unwanted, parasitic vibrations in a video sequence induced by camera motion is an essential part of video acquisition in industrial, military and consumer applications. In this paper, we present a new image processing method to remove such vibrations and reconstruct a video sequence void of sudden camera movements. Our approach to separating unwanted vibrations from intentional camera motion is based on a probabilistic estimation framework. We treat estimated parameters of interframe camera motion as noisy observations of the intentional camera motion parameters. We construct a physics-based state-space model of these interframe motion parameters and use recursive Kalman filtering to perform stabilized camera position estimation. A six-parameter affine model is used to describe the interframe transformation, allowing quite accurate description of typical scene changes due to camera motion. The model parameters are estimated using a p-norm-based multi-resolution approach. This approach is robust to model mismatch and to object motion within the scene (which are treated as outliers). We use mosaicking in order to reconstruct undefined areas that result from motion compensation applied to each video frame. Registration between distant frames is performed efficiently by cascading interframe affine transformation parameters. We compare our method's performance with that of a commercial product on real-life video sequences, and show a significant improvement in stabilization quality for our method.
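The state-space smoothing idea above can be sketched in a few lines — a minimal illustration assuming a constant-velocity camera model and illustrative noise parameters (`q` and `r` are tuning knobs, not the paper's values), with 1-D translation standing in for the full six-parameter affine model:

```python
import numpy as np

def kalman_smooth(observed_pos, q=1e-3, r=0.5):
    """Estimate intentional camera position from jittery per-frame estimates."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])  # constant-velocity state transition
    H = np.array([[1.0, 0.0]])              # we observe position only
    Q = q * np.eye(2)                       # process noise (tuning knob)
    R = np.array([[r]])                     # observation noise (tuning knob)
    x, P = np.zeros(2), np.eye(2)
    out = []
    for z in observed_pos:
        # Predict.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the new noisy observation.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + (K @ (np.array([z]) - H @ x)).ravel()
        P = (np.eye(2) - K @ H) @ P
        out.append(x[0])
    return np.array(out)

# Intentional pan plus random jitter: the smoothed path tracks the pan,
# and the residual (observation minus smoothed path) is the vibration
# to be compensated by warping each frame.
t = np.arange(100)
intentional = 0.2 * t
observed = intentional + np.random.default_rng(0).normal(0.0, 1.0, t.size)
smoothed = kalman_smooth(observed)
```

Compensating each frame by the residual `observed - smoothed` removes the jitter while preserving the deliberate camera motion.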

232 citations


Proceedings ArticleDOI
24 Nov 2003
TL;DR: This work reports results on a Wyner-Ziv coding scheme for motion video that uses intraframe encoding but interframe decoding, and uses previously reconstructed frames to generate side information for interframe decoding of the Wyner-Ziv frames.
Abstract: In current interframe video compression systems, the encoder performs predictive coding to exploit the similarities of successive frames. The Wyner-Ziv theorem on source coding with side information available only at the decoder suggests that an asymmetric video codec, where individual frames are encoded separately, but decoded conditionally (given temporally adjacent frames) achieves similar efficiency. We report results on a Wyner-Ziv coding scheme for motion video that uses intraframe encoding, but interframe decoding. In the proposed system, key frames are compressed by a conventional intraframe codec and in-between frames are encoded using a Wyner-Ziv intraframe coder. The decoder uses previously reconstructed frames to generate side information for interframe decoding of the Wyner-Ziv frames.
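The intraframe-encoding/interframe-decoding principle can be illustrated with a toy coset (syndrome) code — a generic Wyner-Ziv-style illustration, not this scheme's actual codec: the encoder sends only each value's coset index, and the decoder resolves the ambiguity using the co-located value from the previously reconstructed frame.

```python
def wz_encode(x, step=8):
    # "Syndrome": only the coset index of x is transmitted (3 bits for step=8),
    # with no access to the previous frame at the encoder.
    return x % step

def wz_decode(syndrome, side_info, step=8):
    # Decoder picks the coset member closest to the side information
    # (the co-located pixel of the previous reconstructed frame).
    base = side_info - (side_info - syndrome) % step
    candidates = (base, base + step)
    return min(candidates, key=lambda c: abs(c - side_info))

x = 130                     # true pixel value in the Wyner-Ziv frame
y = 127                     # side information from the previous frame
s = wz_encode(x)            # only the value 2 (= 130 mod 8) is sent
decoded = wz_decode(s, y)   # recovers 130, since |x - y| < step/2
```

Decoding is correct whenever the temporal prediction error `|x - y|` is smaller than half the coset spacing, which is why stronger side information permits coarser (cheaper) syndromes.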

179 citations


Proceedings ArticleDOI
14 Sep 2003
TL;DR: PRISM's architectural goals are to inherit the low encoding complexity and robustness of motion-JPEG style intra-frame video codecs while approaching the high compression efficiency of full-motion interframe video Codecs.
Abstract: In this work, we present PRISM (power-efficient, robust, high-compression, syndrome-based multimedia coding), a video coding paradigm based on the principles of coding with side information (which, unlike the classical Wyner-Ziv coding scenario (Wyner et al., 1976), is characterized by an ambiguous state of nature characterizing the side information (Ishwar et al., 2003)). PRISM's architectural goals are to inherit the low encoding complexity and robustness of motion-JPEG style intra-frame video codecs while approaching the high compression efficiency of full-motion interframe video codecs. The PRISM paradigm roughly swaps the encoder-decoder complexity with respect to conventional video coding architectures through the novel concept of moving the motion compensation task from the encoder to the decoder. These traits make PRISM well-matched to uplink-rich media applications involving wireless video and security cameras, multimedia-equipped phones, PDAs, etc.

133 citations


Patent
Simon Winder
13 Jun 2003
TL;DR: In this article, the authors describe techniques and tools for video frame interpolation and motion analysis, which may be implemented separately or in combination in software and/or hardware devices for various applications; for example, a media playback device uses them in real time to increase the frame rate of streamed video for playback.
Abstract: Techniques and tools for video frame interpolation and motion analysis are described. The techniques and tools may be implemented separately or in combination in software and/or hardware devices for various applications. For example, a media playback device uses frame interpolation and motion analysis in real time to increase the frame rate of streamed video for playback. The device uses feature/region selection in global motion estimation, local motion estimation to correct the global motion estimation at an intermediate timestamp for a synthesized frame, and vector switching in the local motion estimation.

126 citations


Patent
13 Jun 2003
TL;DR: In this article, the authors describe techniques and tools for quality control in frame interpolation and motion analysis, which may be implemented separately or in combination in software and/or hardware devices for various applications.
Abstract: Techniques and tools for quality control in frame interpolation and motion analysis are described. The techniques and tools may be implemented separately or in combination in software and/or hardware devices for various applications. For example, a media playback device uses quality control in frame interpolation with motion analysis to increase the frame rate of streamed video for playback. The device selectively uses frame synthesis to increase frame rate and quality under normal circumstances, but avoids using frame synthesis when it would not provide suitable quality improvement. The device uses selective ghosting reduction and selective feathering to reduce artifacts in synthesized frames.

107 citations


Patent
Pohsiang Hsu, Bruce Lin, Thomas W. Holcomb, Kunal Mukerjee, Sridhar Srinivasan
18 Jul 2003
TL;DR: In this article, techniques and tools for encoding and decoding video images (e.g., interlaced frames) are described, including DC/AC prediction techniques and motion vector prediction techniques.
Abstract: Techniques and tools for encoding and decoding video images (e.g., interlaced frames) are described. For example, a video encoder or decoder processes 4:1:1 format macroblocks comprising four 8×8 luminance blocks and four 4×8 chrominance blocks. In another aspect, fields in field-coded macroblocks are coded independently of one another (e.g., by sending encoded blocks in field order). Other aspects include DC/AC prediction techniques and motion vector prediction techniques for interlaced frames.

101 citations


Patent
31 Mar 2003
TL;DR: In this paper, an apparatus and method for encoding video frames is presented, in which the video frames are divided into blocks and encoded using motion detection, motion estimation, and adaptive compression to obtain the desired compression for a particular bit rate.
Abstract: An apparatus and method for encoding video frames is provided. The video frames are divided into blocks for encoding. Encoding of the video blocks utilizes motion detection, motion estimation and adaptive compression, to obtain the desired compression for a particular bit rate. Adaptive compression includes intra compression (without regard to other frames) and inter compression (with regard to other frames). Intra compression, inter compression with motion detection, and inter compression with motion estimation are performed on a block by block basis, as needed. Segmentation is provided to compare encoding of a block with encoding of its sub-blocks, and to select the best block size for encoding.

99 citations


Patent
Koto Shinichiro, Masuda Tadaaki
19 Nov 2003
TL;DR: In this article, a video scrambling apparatus has a video encoder and a scrambler; the scrambler selects a frame that is not used as a reference frame for interframe prediction from the input video signal, and scrambles it by pixel replacement in units of slices within a predetermined vertical range.
Abstract: A video scramble apparatus has a scrambler for scrambling an input video signal, and a video encoder for performing interframe predictive coding of the video signal after scrambling, and the scrambler selects a frame, which is not used as a reference frame for interframe prediction in the video encoder, from the input video signal, and scrambles the frame by pixel replacing in units of slices within a predetermined vertical range or pixel replacing in units of n consecutive macroblocks within a predetermined horizontal range.

99 citations


Journal ArticleDOI
TL;DR: This work presents a frame interpolation algorithm for FRC based on a pyramid structure, in which the motion compensation process is performed independently at each resolution level; a technique similar to control grid interpolation (CGI) is employed to process hole regions.
Abstract: Frame rate up-conversion (FRC) is one of the main issues that have arisen in recent years with the emergence of new television and multimedia systems; it is required for conversion between any two display formats with different frame rates. We present a frame interpolation algorithm for FRC. It is based on a pyramid structure and the motion compensation process is performed independently at each resolution level. A technique similar to control grid interpolation (CGI) is employed to process hole regions generated at the top level of the pyramid. Bidirectional motion estimation (ME) and prediction mode selection are utilized at intermediate levels. Finally, at the bottom level, motion vector refinement and overlapped block motion compensation (OBMC) are used. In experiments, the frame rate of the progressive video sequence is up-converted by a factor of two and the performance of the proposed algorithm is compared with that of the conventional frame interpolation method. Experiments with several test sequences show the effectiveness of the proposed algorithm.

90 citations


Proceedings ArticleDOI
01 Jan 2003
TL;DR: A novel analytical approach is proposed to evaluate the throughput and delay performance of IFS-based priority mechanisms for CSMA-CA systems and the 802.11 enhanced distributed coordination function.
Abstract: A number of service differentiation mechanisms have been proposed for, in general, CSMA-CA systems, and, in particular, the 802.11 enhanced distributed coordination function. An effective way to provide prioritized service support is to use different inter frame spaces (IFS) for stations belonging to different priority classes. This paper proposes a novel analytical approach to evaluate throughput and delay performance of IFS based priority mechanisms.
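The IFS mechanism analyzed here can be illustrated with a toy slotted-contention simulation (the slot counts, backoff window, and two-station setup are illustrative assumptions, not 802.11 EDCF parameters): a station may start contending only after the channel has been idle for its inter frame space, so a shorter IFS yields higher priority.

```python
import random

def simulate(ifs_by_station, backoff_window=8, rounds=2000, seed=1):
    """Count channel-access wins per station under IFS-based priority."""
    wins = {s: 0 for s in ifs_by_station}
    rng = random.Random(seed)
    for _ in range(rounds):
        # Effective access time = IFS (in idle slots) + uniform random backoff.
        access = {s: ifs + rng.randrange(backoff_window)
                  for s, ifs in ifs_by_station.items()}
        # The station whose timer expires first seizes the channel.
        wins[min(access, key=access.get)] += 1
    return wins

wins = simulate({"high_priority": 2, "low_priority": 5})
```

Over many rounds the short-IFS station captures most transmission opportunities — the throughput and delay asymmetry that the paper's analytical model quantifies.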

86 citations


Proceedings ArticleDOI
06 Jul 2003
TL;DR: This work presents a method to speed up the matching process for multiple reference frames in MPEG-4 AVC/JVT/H.264 by analyzing the available information after intra prediction and motion estimation from the previous frame to determine whether it is necessary to search more frames.
Abstract: In the new video coding standard, MPEG-4 AVC/JVT/H.264, motion estimation is allowed to use multiple reference frames. The reference software adopts a full search scheme, and the increased computation is in proportion to the number of searched reference frames. However, the reduction of prediction residues is highly dependent on the nature of the sequences, not on the number of searched frames. In this paper, we present a method to speed up the matching process for multiple reference frames. For each macroblock, we analyze the available information after intra prediction and motion estimation from the previous frame to determine whether it is necessary to search more frames. The information we use includes the selected mode, inter prediction residues, intra prediction residues, and motion vectors. Simulation results show that the proposed algorithm can save up to 90% of unnecessary frame searches while keeping the average miss rate of optimal frames below 4%.
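The early-termination idea can be sketched as a per-macroblock decision rule (the `slack` threshold and the bare two-cost comparison are illustrative assumptions; the paper combines mode, residue, and motion-vector information):

```python
def refs_to_search(inter_cost_prev, intra_cost, slack=1.05, max_refs=5):
    """Decide how many reference frames to search for one macroblock.

    inter_cost_prev: motion-compensated residual cost using frame t-1 only.
    intra_cost:      residual cost of the best intra prediction mode.
    """
    # If prediction from the nearest frame is already clearly better than
    # intra prediction, older reference frames are unlikely to help, so
    # the remaining reference-frame searches can be skipped.
    if inter_cost_prev * slack < intra_cost:
        return 1            # search only frame t-1
    return max_refs         # fall back to searching all reference frames
```

A rule of this shape is what lets most candidate frames go unsearched while rarely missing the truly optimal reference.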

Patent
10 Feb 2003
TL;DR: In this paper, a self-adaptive feedback scheme was proposed to compensate for the distortion component from prior frame compression in subsequent difference frame compression, which can improve the quality of static regions in the recovered images.
Abstract: The quality of digital images recovered from compressed data in an inter-frame redundancy-removing scheme is enhanced using a self-adaptive feedback scheme in an image compression/decompression system so as to provide for the compensation of the distortion component from prior frame compression in subsequent difference frame compression. Each transmitted frame is stored after a full compress/decompress cycle, and difference data (which includes the inverse of the distortion component from compression of the transmitted frame) representing the difference between the stored frame and the incoming new frame is transmitted. Consequently, the quality of static regions in the recovered images may be improved with each subsequent iteration by taking the distortion component in the prior frame into consideration along with the inter-frame motion information. The feedback loop thus forms a self-adaptive iterative cycle.
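The feedback loop can be sketched with a uniform quantizer standing in for the real compress/decompress cycle (an assumption for brevity): because each new frame is differenced against the encoder's own reconstruction of the previous frame, the prior frame's distortion is fed back and cancelled rather than accumulated.

```python
import numpy as np

def quantize(x, step=16.0):
    # Crude stand-in for a lossy compress/decompress cycle.
    return np.round(x / step) * step

def closed_loop_encode(frames, step=16.0):
    """Return the decoder-side reconstruction of each frame."""
    recon = np.zeros_like(frames[0], dtype=float)
    recons = []
    for f in frames:
        diff = f - recon                  # difference vs. the *decoded* frame
        recon = recon + quantize(diff, step)
        recons.append(recon.copy())
    return recons
```

With this closed loop, the per-frame reconstruction error stays bounded by half the quantizer step for any sequence length, whereas differencing against the original (uncompressed) previous frame would let quantization errors accumulate frame after frame.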

Patent
19 Nov 2003
TL;DR: In this article, an interpolation frame generation device is presented that generates a frame interpolating image frames obtained by decoding a coded image signal that is coded by motion compensation.
Abstract: An interpolation frame generation device that generates an interpolation frame that interpolates image frames that are obtained by decoding a coded image signal that is coded by motion compensation, includes a motion vector deriving unit and an interpolation frame generating unit. The motion vector deriving unit acquires a motion compensation vector of a coded block that forms the coded image signal. The interpolation frame generating unit generates the interpolation frame in accordance with the motion vector of the image block that forms an image frame by using the motion compensation vector of the coded block as the motion vector of the image block.

Patent
07 Nov 2003
TL;DR: In this paper, the boundary of the target region of a previous frame is projected onto the current frame so that a search area in the current frame can be established; for every pixel in the search area, a search window is established in the previous frame so as to find a matched pixel within the search window.
Abstract: A method and device for tracking a region-of-interest in a sequence of image frames, wherein the boundary of the target region of a previous frame is projected onto the current frame so that a search area in the current frame can be established. For every pixel in the search area in the current frame, a search window is established in the previous frame so as to find a matched pixel within the search window. If the matched pixel is within the ROI of the previous frame, then the corresponding pixel in the current frame is preliminarily considered as a pixel within the ROI of the current frame. This backward matching is carried out using a low pass subband in the wavelet domain. The preliminary ROI in the current frame is then refined using edge detection in a high frequency subband.

Journal ArticleDOI
TL;DR: Simulations show that the proposed method can select key frames according to the dynamics of a video sequence and abstract the video with different levels of scalability.

Patent
24 Oct 2003
TL;DR: In this paper, an initial sub-set of modes is considered and an estimation of the motion for each block in the sub-set is made to establish a best motion vector; a distortion measure is also made for each sub-set.
Abstract: An encoder (10) achieves improved encoding efficiency by initially limiting consideration of the potential modes (block sizes) to a prescribed sub-set and by performing mode estimation jointly with mode decision-making. An initial sub-set of modes is considered and an estimation of the motion for each block in the sub-set is made to establish a best motion vector. A distortion measure is also made for each sub-set. From the distortion measure, a determination is made whether or not to estimate the motion for other block sizes. If not, then an encoding mode is chosen in accordance with the estimated motion. In this way, motion estimation on all possible block sizes need not be undertaken.

Patent
Nao Mishima, Goh Itoh
09 Sep 2003
TL;DR: In this article, a first motion vector is estimated by using a first frame and a second frame that follows the first frame, and a support frame is generated from at least either the first or the second frame by using the first motion vector.
Abstract: A first motion vector is estimated by using a first frame and a second frame that follows the first frame. A support frame is generated from at least either the first or the second frame by using the first motion vector. The support frame is divided into a plurality of small blocks. Motion vector candidates are estimated by using the first and second frames, in relation to each of the small blocks. A small block on the first frame, a small block on the second frame and the small blocks on the support frame are examined. The small block on the first frame and the small block on the second frame correspond to each of the motion vector candidates. A second motion vector is selected from the motion vector candidates, and points to the small block on the first frame and the small block on the second frame which have the highest correlation with each of the small blocks on the support frame. An interpolated frame is generated from at least either the first or the second frame by using the second motion vector.

Patent
25 Jul 2003
TL;DR: In this paper, a storage device saving the MVs and prediction modes of a partial or an entire frame is used as the reference for the motion estimation of the neighboring frame.
Abstract: This invention provides an efficient method and apparatus for motion estimation in video compression. A storage device saving the MVs and prediction modes of a partial or an entire frame is used as the reference for the motion estimation of the neighboring frame. The majority MV of the current frame and at least one neighboring frame is used as the MV, or as the initial search point, for the current frame or the neighboring frames. Only if the movement of a block in the previous frame differs from the FMV do the block or its neighboring blocks need to go through motion estimation. Predetermined threshold values are specified to decide the need for a finer pixel resolution, the sub-sampling ratio, and early termination or early selection of the current macroblock. The sub-sampling ratio, and the decision between finer or coarser pixel resolution, are determined by the values of the MV or the MAD.

Journal ArticleDOI
TL;DR: This paper presents a sequential key frame selection method that selects the pre-determined number of initial key frames and time-intervals by iteration, which reduces the distortion step by step.
Abstract: Video representation through key frames has been addressed frequently as an efficient way of preserving the whole temporal information of sequence with a considerably smaller amount of data. Such compact video representation is suitable for the purpose of video browsing in limited storage or transmission bandwidth environments. In this case, the controllability of the total key frame number (i.e. key frame rate) depending on the storage or bandwidth capacity is an important requirement for the key frame selection method. In this paper, we present a sequential key frame selection method when the number of key frames is given as a constraint. It first selects the pre-determined number of initial key frames and time-intervals. Then, it adjusts the positions of key frames and time-intervals by iteration, which reduces the distortion step by step. Experimental results demonstrate the improved performance of our algorithm over the existing approaches.
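The iterate-and-adjust scheme can be sketched as a Lloyd-style refinement over a 1-D frame-feature trajectory (the scalar feature, medoid update, and fixed iteration count are illustrative assumptions, not the paper's exact distortion measure):

```python
import numpy as np

def select_key_frames(features, k, iters=10):
    """Pick k key-frame indices that reduce total distortion step by step."""
    n = len(features)
    # Initial key frames: evenly spaced in time (pre-determined positions).
    keys = list(np.linspace(0, n - 1, k).astype(int))
    for _ in range(iters):
        # Assign every frame to its nearest key frame (in feature space).
        d = np.abs(features[:, None] - features[None, keys])
        assign = np.argmin(d, axis=1)
        # Move each key frame to the medoid of its segment, which cannot
        # increase the total distortion.
        new_keys = []
        for j in range(k):
            idx = np.where(assign == j)[0]
            if len(idx) == 0:
                new_keys.append(keys[j])
                continue
            seg = features[idx]
            medoid = idx[np.argmin(np.abs(seg[:, None] - seg[None, :]).sum(0))]
            new_keys.append(int(medoid))
        keys = new_keys
    return sorted(keys)
```

Because the key-frame count `k` is fixed up front, the abstraction rate can be matched directly to the available storage or bandwidth, as the paper requires.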

Patent
12 Dec 2003
TL;DR: In this article, a video stream containing encoded frame-based video information includes a first frame and a second frame, and a re-mapping strategy for video enhancement of the decoded first frame is determined using a region-based analysis.
Abstract: A video stream containing encoded frame-based video information includes a first frame and a second frame. The encoding of the second frame depends on the encoding of the first frame. The encoding includes motion vectors indicating differences in positions between regions of the second frame and corresponding regions of the first frame, the motion vectors defining the correspondence between regions of the second frame and regions of the first frame. The first frame is decoded and a re-mapping strategy for video enhancement of the decoded first frame is determined using a region-based analysis. Regions of the decoded first frame are re-mapped according to the determined video enhancement re-mapping strategy for the first frame so as to enhance the first frame. The motion vectors for the second frame are recovered from the video stream and the second frame is decoded.

Proceedings ArticleDOI
24 Nov 2003
TL;DR: A double stimulus subjective evaluation was performed to determine preferred frame rates at a fixed bit rate for low bit rate video, with several notable content-based exceptions.
Abstract: A double stimulus subjective evaluation was performed to determine preferred frame rates at a fixed bit rate for low bit rate video. Stimuli consisted of eight reference color video sequences of size 352×240 pixels. These were compressed at rates of 100, 200 and 300 kbps for low, medium, and high motion sequences, respectively, using three encoders and frame rates of 10, 15 and 30 frames per second. Twenty-two viewers ranked their frame rate preferences using an adjectival categorical scale. Their preferences were analyzed across sequence content, motion type, and encoder. Viewers preferred a frame rate of 15 frames per second across all categories, with several notable content-based exceptions.

Patent
Rajeeb Hazra, Arlene Kasai
27 May 2003
TL;DR: In this paper, a method comprising selecting a number of blocks of a frame pair and synthesizing an interpolated frame based on those selected blocks is presented; the synthesis may be aborted upon determining that the interpolated frame has an unacceptable quality.
Abstract: A method comprising selecting a number of blocks of a frame pair and synthesizing an interpolated frame based on those selected blocks of the frame pair. Additionally, the synthesis of the interpolated frame may be aborted upon determining that the interpolated frame has an unacceptable quality.
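The abort-on-poor-quality behavior can be sketched in 1-D (the block size, mean-absolute-difference test, and threshold are illustrative assumptions, not the patent's quality measure):

```python
import numpy as np

def interpolate_or_abort(f0, f1, block=8, max_mad=20.0):
    """Synthesize the midpoint frame of a pair, or abort on poor quality."""
    out = np.empty_like(f0, dtype=float)
    for start in range(0, len(f0), block):
        a = f0[start:start + block]
        b = f1[start:start + block]
        # If the selected blocks disagree too much, synthesis is unlikely
        # to yield acceptable quality: abort rather than emit artifacts.
        if np.abs(a - b).mean() > max_mad:
            return None
        out[start:start + block] = (a + b) / 2.0  # simple blend per block
    return out
```

A caller receiving `None` would fall back to frame repetition, matching the idea that synthesis is only used when it actually improves quality.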

Proceedings ArticleDOI
24 Nov 2003
TL;DR: The algorithm has been tested on MPEG-2 video, providing very satisfactory results, and outperforming by several dBs in PSNR the concealment technique based on repetition of the last received frame.
Abstract: A known problem in video streaming is that loss of a packet usually results into loss of a whole video frame. In this paper we propose an error concealment algorithm specifically designed to handle this sort of losses. The technique exploits information in a few past frames (namely the motion vectors) in order to estimate the forward motion vectors of the last received frame. This information is used to project the last frame onto an estimate of the missing frame. The algorithm has been tested on MPEG-2 video, providing very satisfactory results, and outperforming by several dBs in PSNR the concealment technique based on repetition of the last received frame.
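The projection step can be sketched in 1-D (an illustration with a single row of pixels and per-block integer motion; the paper estimates forward motion vectors from several past frames before projecting):

```python
import numpy as np

def conceal_lost_frame(last_frame, block_mvs, block=8):
    """Project the last received frame along its forward motion vectors.

    last_frame: 1-D pixel row; block_mvs: per-block displacement in samples.
    """
    out = np.copy(last_frame)        # fallback: plain frame repetition
    for b, mv in enumerate(block_mvs):
        src = last_frame[b * block:(b + 1) * block]
        dst0 = b * block + mv        # projected block position in the lost frame
        lo, hi = max(dst0, 0), min(dst0 + block, len(out))
        out[lo:hi] = src[lo - dst0:hi - dst0]
    return out
```

When motion is well estimated, the projected frame follows the scene's movement instead of freezing it, which is why this outperforms simple repetition of the last received frame.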

Proceedings ArticleDOI
06 Jul 2003
TL;DR: A new framework for adaptive temporal filtering in wavelet interframe codecs, called unconstrained motion compensated temporal filtering (UMCTF), which provides higher coding efficiency, improved visual quality and flexibility of temporal and spatial scalability, and lower decoding delay than conventional MCTF schemes.
Abstract: This paper presents a new framework for adaptive temporal filtering in wavelet interframe codecs, called unconstrained motion compensated temporal filtering (UMCTF). This framework allows flexible and efficient temporal filtering by combining the best features of motion compensation, used in predictive coding, with the advantages of interframe scalable wavelet video coding schemes. UMCTF provides higher coding efficiency, improved visual quality, greater flexibility of temporal and spatial scalability, and lower decoding delay than conventional MCTF schemes. Furthermore, UMCTF can also be employed in alternative open-loop scalable coding frameworks using DCT for the texture coding.

Patent
19 Sep 2003
TL;DR: In this article, a method for incrementally coding and signalling motion information for a video compression system involving a motion adaptive transform and embedded coding of transformed video samples comprises the steps of: (a) producing an embedded bit-stream representing each motion field in coarse-to-fine fashion; and (b) interleaving incremental contributions from the embedded motion fields with incremental contributions from the transformed video samples.
Abstract: A method for incrementally coding and signalling motion information for a video compression system involving a motion adaptive transform and embedded coding of transformed video samples comprises the steps of: (a) producing an embedded bit-stream, representing each motion field in coarse to fine fashion; and (b) interleaving incremental contributions from said embedded motion fields with incremental contributions from said transformed video samples. A further embodiment of a method for estimating and signalling motion information for a motion adaptive transform based on temporal lifting steps comprises the steps of: (a) estimating and signalling motion parameters describing a first mapping from a source frame onto a target frame within one of the lifting steps; and (b) inferring a second mapping between either said source frame or said target frame, and another frame, based on the estimated and signalled motion parameters associated with said first mapping.
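A temporal lifting step of the kind this patent builds on, in its simplest (Haar, motion-free) form — the patent's transform additionally applies an estimated motion mapping from source frame to target frame inside each lifting step:

```python
import numpy as np

def haar_lift(f0, f1):
    """One temporal lifting step: predict then update."""
    h = f1 - f0          # predict step: temporal high band (frame difference)
    l = f0 + h / 2.0     # update step: temporal low band, equals (f0 + f1) / 2
    return l, h

def haar_unlift(l, h):
    """Invert the lifting step exactly (perfect reconstruction)."""
    f0 = l - h / 2.0
    f1 = h + f0
    return f0, f1
```

Because each lifting step is invertible regardless of what prediction is used, a motion mapping can be inserted into the predict step without breaking perfect reconstruction — the property that makes motion-adaptive lifting transforms attractive.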

Proceedings ArticleDOI
01 Jan 2003
TL;DR: A novel center-biased frame selection method is proposed to speed up the multi-frame motion estimation process in H.264, and can save about 77% of computations consistently while keeping picture quality similar to that of full search.
Abstract: The new upcoming video coding standard, H.264, allows motion estimation to be performed on multiple reference frames. This new feature improves the prediction accuracy of inter-coded blocks significantly, but it is extremely computationally intensive. Its reference software adopts a full search scheme. The complexity of multi-frame motion estimation increases linearly with the number of used reference frames. However, the distortion gain given by each reference frame varies with the motion content of the video sequence, and it is not efficient to search through all the candidate frames. In this paper, a novel center-biased frame selection method is proposed to speed up the multi-frame motion estimation process in H.264. We apply a center-biased frame selection path to identify the ultimate reference frame from all the candidates. Simulation results show that our proposed method can save about 77% of computations consistently while keeping similar picture quality as compared to full search.

Patent
12 Jun 2003
TL;DR: In this paper, a method of distinguishing between a foreground object part and a substantially static background part of each video frame within a video sequence is proposed, which comprises the steps of dividing each video frame into a number of video blocks, each of which comprises one or more pixels.
Abstract: A method of distinguishing between a foreground object part and a substantially static background part of each videoframe within a video sequence. The method comprises the steps of: dividing each video frame into a number of video blocks each of which comprises one or more pixels; generating a mask frame in respect of each video frame, each mask frame having a mask block corresponding to each video block in each respective video frame; and either setting each mask block to either an object value, indicating that the corresponding video block in the corresponding video frame includes one or more pixels depicting a foreground object part, or to another value.

Journal ArticleDOI
TL;DR: This work is the first to exploit the temporal coherence in the connectivity data between frames and presents a detailed encoding scheme for 3-D dynamic data that has a far superior performance when compared with existing techniques for both vertex compression and connectivity compression of 3- D dynamic datasets.
Abstract: Generation and transmission of complex animation sequences can benefit substantially from the availability of tools for handling large amounts of data associated with dynamic three-dimensional (3-D) models. Previous works in 3-D dynamic compression consider only the simplest situation where the connectivity changes do not occur with time. In this paper, we present an approach for compressing 3-D dynamic models in which both the vertex data and the connectivity data can change with time. Using our framework, 3-D animation sequences generated using commercial graphics tools or dynamic range data captured using range scanners can be compressed significantly. We use 3-D registration to identify the changes in the vertex data and the connectivity of the 3-D geometry between successive frames. Next, the interframe motion is encoded using affine motion parameters and the differential pulse coded modulation (DPCM) predictor. Our work is the first to exploit the temporal coherence in the connectivity data between frames and presents a detailed encoding scheme for 3-D dynamic data. We also discuss the issue of inserting I-frames in the compressed data for better performance. We show that our algorithm has a far superior performance when compared with existing techniques for both vertex compression and connectivity compression of 3-D dynamic datasets.
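The DPCM predictor for vertex data can be sketched as follows (the quantizer step is an assumed parameter; the paper additionally applies affine interframe motion parameters before residual coding):

```python
import numpy as np

def dpcm_encode(frames, step=0.01):
    """Quantize per-frame vertex residuals against the decoder's prediction."""
    pred = np.zeros_like(frames[0], dtype=float)
    residuals = []
    for v in frames:
        r = np.round((v - pred) / step).astype(int)  # quantized residual
        residuals.append(r)
        pred = pred + r * step                       # mirror the decoder state
    return residuals

def dpcm_decode(residuals, step=0.01):
    """Rebuild vertex positions by accumulating dequantized residuals."""
    pred = np.zeros(np.shape(residuals[0]))
    out = []
    for r in residuals:
        pred = pred + np.asarray(r) * step
        out.append(pred.copy())
    return out
```

Because the encoder predicts from its own reconstruction, the per-frame vertex error stays bounded by half the quantizer step over arbitrarily long animations; the I-frames the paper discusses additionally limit error propagation after transmission losses.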

Proceedings ArticleDOI
16 Jun 2003
TL;DR: A modification to MCTF is presented, introducing smooth transitions between motion blocks in the high and low frequency subbands, respectively, and an optimized quantization strategy to compensate for variations in synthesis filtering for different block types is presented.
Abstract: In interframe wavelet video coding, wavelet-based motion-compensated temporal filtering (MCTF) is combined with spatial wavelet decomposition, allowing for efficient spatio-temporal decorrelation and temporal, spatial and SNR scalability. Contemporary interframe wavelet video coding concepts employ block-based motion estimation (ME) and compensation (MC) to exploit temporal redundancy between successive frames. Due to occlusion effects and imperfect motion modeling, block-based MCTF may generate temporal high frequency subbands with block-wise varying coefficient statistics, and low frequency subbands with block edges. Both effects may cause declined spatial transform gain and blocking artifacts. As a modification to MCTF, we present spatial highpass transition filtering (SHTF) and spatial lowpass transition filtering (SLTF), introducing smooth transitions between motion blocks in the high and low frequency subbands, respectively. Additionally, we analyze the propagation of quantization noise in MCTF and present an optimized quantization strategy to compensate for variations in synthesis filtering for different block types. Combining these approaches leads to a reduction of blocking artifacts, smoothed temporal PSNR performance, and significantly improved coding efficiency.

Proceedings ArticleDOI
25 Mar 2003
TL;DR: It is shown that using a dual frame buffer, together with intra/inter mode switching improves the compression performance of the coder and the mode switching algorithm is improved with the use of half-pel motion vectors.
Abstract: Video codecs that use motion compensation have achieved performance improvements from the use of intra/inter mode switching decisions within a rate-distortion framework. A separate development has involved the use of multiple frame prediction, in which more than one past reference frame is available for motion estimation. It is shown that using a dual frame buffer, together with intra/inter mode switching improves the compression performance of the coder. Also, the mode switching algorithm is improved with the use of half-pel motion vectors.