
Showing papers on "Inter frame published in 2009"


Proceedings ArticleDOI
18 Mar 2009
TL;DR: Unlike conventional DVC schemes, the DISCOS framework can perform most encoding operations in the analog domain with very low complexity, making it a promising candidate for real-time, practical applications where analog-to-digital conversion is expensive, e.g., in Terahertz imaging.
Abstract: This paper proposes a novel framework called Distributed Compressed Video Sensing (DISCOS) - a solution for Distributed Video Coding (DVC) based on the Compressed Sensing (CS) theory. The DISCOS framework compressively samples each video frame independently at the encoder and recovers video frames jointly at the decoder by exploiting an interframe sparsity model and by performing sparse recovery with side information. Simulation results show that DISCOS significantly outperforms the baseline CS-based scheme of intraframe-coding and intraframe-decoding. Moreover, our DISCOS framework can perform most encoding operations in the analog domain with very low complexity. This makes DISCOS a promising candidate for real-time, practical applications where analog-to-digital conversion is expensive, e.g., in Terahertz imaging.
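The encoder side of this pipeline is simple enough to sketch. Below is a toy, hypothetical illustration of frame-based compressive sampling with a random ±1 measurement matrix; the paper's actual measurement operator and the sparse recovery with side information at the decoder are not reproduced here:

```python
import random

def compressive_measurements(frame, m, seed=0):
    # Take m random linear measurements of a flattened frame (a list of
    # pixel values). A toy stand-in for DISCOS's frame-based sampling
    # step; the l1 recovery with side information is not shown.
    rng = random.Random(seed)
    n = len(frame)
    # Random +/-1 (Bernoulli) measurement matrix, one row per measurement.
    phi = [[rng.choice((-1.0, 1.0)) for _ in range(n)] for _ in range(m)]
    return [sum(p * x for p, x in zip(row, frame)) for row in phi]

frame = [1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0, 0.0]  # sparse toy "frame"
y = compressive_measurements(frame, m=4)
print(len(y))  # 4 measurements instead of 8 samples
```

Because each measurement is a fixed random linear combination of samples, this step could in principle be carried out in the analog domain, which is the source of the low encoder complexity claimed above.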

183 citations


Journal ArticleDOI
01 Dec 2009
TL;DR: This work builds a complete video resizing framework by incorporating motion-aware constraints with an adaptation of the scale-and-stretch optimization recently proposed by Wang and colleagues; a streaming implementation of the framework allows efficient resizing of long video sequences with low memory cost.
Abstract: Temporal coherence is crucial in content-aware video retargeting. To date, this problem has been addressed by constraining temporally adjacent pixels to be transformed coherently. However, due to the motion-oblivious nature of this simple constraint, the retargeted videos often exhibit flickering or waving artifacts, especially when significant camera or object motions are involved. Since the feature correspondence across frames varies spatially with both camera and object motion, motion-aware treatment of features is required for video resizing. This motivated us to align consecutive frames by estimating interframe camera motion and to constrain relative positions in the aligned frames. To preserve object motion, we detect distinct moving areas of objects across multiple frames and constrain each of them to be resized consistently. We build a complete video resizing framework by incorporating our motion-aware constraints with an adaptation of the scale-and-stretch optimization recently proposed by Wang and colleagues. Our streaming implementation of the framework allows efficient resizing of long video sequences with low memory cost. Experiments demonstrate that our method produces spatiotemporally coherent retargeting results even for challenging examples with complex camera and object motion, which are difficult to handle with previous techniques.

114 citations


Patent
James Bankoski1, Yaowu Xu1, Paul Wilkins1
10 Sep 2009
TL;DR: In this article, a method for digital video encoding prediction comprising creating a constructed reference frame using an encoder and compressing a series of source video frames using the constructed reference frame to obtain a bitstream including a compressed digital video signal for a subsequent decoding process is presented.
Abstract: Disclosed herein is a method for digital video encoding prediction comprising creating a constructed reference frame using an encoder and compressing a series of source video frames using the constructed reference frame to obtain a bitstream including a compressed digital video signal for a subsequent decoding process. The constructed reference frame is omitted from the series of digital video frames during the subsequent viewing process.

77 citations


Proceedings ArticleDOI
06 May 2009
TL;DR: A decoder-side motion vector derivation scheme for inter frame video coding is proposed: using a template matching algorithm, motion information is derived at the decoder instead of being explicitly coded into the bitstream.
Abstract: Decoder-side motion vector derivation (DMVD) using template matching has been shown to improve the coding efficiency of H.264/AVC based video coding. Instead of explicitly coding motion vectors into the bitstream, the decoder performs motion estimation in order to derive the motion vector used for motion compensated prediction. In previous works, DMVD was performed using a full template matching search in a limited search range. In this paper, a candidate-based fast search algorithm replaces the full search. While the complexity reduction, especially for the decoder, is quite significant, the coding efficiency remains comparable. While the full search algorithm yields BD-rate savings of 7.4%, averaged over CIF and HD sequences according to the VCEG common conditions for the IPPP high profile, the proposed fast search achieves bitrate reductions of up to 7.5% on average. By further omitting sub-pel refinement, average savings observed for CIF and HD are still up to 7%.
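To illustrate the idea, here is a deliberately simplified, hypothetical 1-D sketch of decoder-side MV derivation by template matching; the real scheme uses an L-shaped 2-D template, sub-pel refinement, and the candidate-based fast search described above:

```python
def sad(a, b):
    # Sum of absolute differences between two sample vectors.
    return sum(abs(x - y) for x, y in zip(a, b))

def derive_mv(ref, recon, cur_x, search_range=3):
    # Toy 1-D decoder-side MV derivation: the "template" is the already
    # reconstructed samples just left of the current position, matched
    # against the reference frame by SAD over a small search range.
    tmpl = recon[cur_x - 3:cur_x]
    best_mv, best_cost = 0, float("inf")
    for mv in range(-search_range, search_range + 1):
        lo = cur_x - 3 + mv
        if lo < 0 or cur_x + mv > len(ref):
            continue  # candidate window falls outside the frame
        cost = sad(tmpl, ref[lo:cur_x + mv])
        if cost < best_cost:
            best_cost, best_mv = cost, mv
    return best_mv

ref   = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0]
recon = [0, 0, 0, 0, 1, 5, 9, 5, 1, 0]  # ref shifted right by 2
print(derive_mv(ref, recon, cur_x=7))   # -2: content moved 2 samples right
```

Since the template consists only of already reconstructed samples, the encoder and decoder can run the identical search, so no MV bits need to be transmitted.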

73 citations


Journal ArticleDOI
TL;DR: An efficient intermode decision algorithm based on motion homogeneity, evaluated on a normalized motion vector (MV) field generated using MVs from motion estimation on 4×4 blocks.
Abstract: The latest video coding standard H.264/AVC significantly outperforms previous standards in terms of coding efficiency. H.264/AVC adopts variable block sizes ranging from 4×4 to 16×16 in inter frame coding, and achieves significant gains in coding efficiency compared to coding a macroblock (MB) using a regular block size. However, this new feature causes extremely high computational complexity when rate-distortion optimization (RDO) is performed using full mode decision. This paper presents an efficient intermode decision algorithm based on motion homogeneity evaluated on a normalized motion vector (MV) field, which is generated using MVs from motion estimation on the 4×4 block size. Three directional motion homogeneity measures derived from the normalized MV field are exploited to determine a subset of candidate intermodes for each MB, so that unnecessary RDO calculations on the other intermodes can be skipped. Experimental results demonstrate that our algorithm can reduce the entire encoding time by about 40% on average, without any noticeable loss of coding efficiency.
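A minimal sketch of the pruning idea, with a single homogeneity measure standing in for the paper's three directional measures; the threshold and mode subsets are illustrative, not the paper's tuned values:

```python
def mv_homogeneity(mvs):
    # Mean absolute deviation of the 4x4-block MVs inside one macroblock.
    # A scalar stand-in for the paper's directional homogeneity measures
    # on the normalized MV field.
    n = len(mvs)
    mx = sum(v[0] for v in mvs) / n
    my = sum(v[1] for v in mvs) / n
    return sum(abs(v[0] - mx) + abs(v[1] - my) for v in mvs) / n

def candidate_intermodes(mvs, thresh=0.5):
    # Homogeneous motion -> test only large partitions in RDO and skip
    # the rest; heterogeneous motion -> include smaller partitions.
    if mv_homogeneity(mvs) < thresh:
        return ["SKIP", "16x16"]
    return ["16x16", "16x8", "8x16", "8x8"]

uniform = [(1, 0)] * 16                 # one MV per 4x4 block of the MB
print(candidate_intermodes(uniform))    # ['SKIP', '16x16']
```

RDO is then run only over the returned subset, which is where the reported encoding-time savings come from.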

64 citations


Proceedings ArticleDOI
19 Jan 2009
TL;DR: Using a template matching algorithm, motion vectors are derived at the decoder side instead of explicitly coding the motion vectors into the bitstream, therefore, higher numbers of hypotheses can be used in the averaging process at no additional coding cost.
Abstract: In this paper, a multihypothesis prediction scheme for inter frame video coding is proposed. Using a template matching algorithm, motion vectors are derived at the decoder side instead of explicitly coding the motion vectors into the bitstream. Therefore, higher numbers of hypotheses can be used in the averaging process at no additional coding cost. The proposed scheme has been implemented in the H.264/AVC reference software. Simulation results show bitrate reductions of 7.7% on average compared to H.264/AVC for the tested video sequences. It is shown that part of the performance gain is due to rounding effects in H.264/AVC sub-pixel interpolation, which can be exploited in the averaging calculation of the proposed multihypothesis prediction. Experiments with an improved interpolation filter for both the reference scheme and the proposed scheme still yield bitrate reductions of 4.7% on average.
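The averaging step itself is straightforward; below is a minimal sketch with uniform weights (the paper's rounding interaction with sub-pel interpolation is not modeled here):

```python
def multihypothesis_prediction(hypotheses):
    # Average several decoder-derived prediction blocks into one. Because
    # the MVs are derived at the decoder by template matching, adding
    # hypotheses costs no extra bits. Uniform weighting is an assumption.
    n = len(hypotheses)
    return [sum(samples) / n for samples in zip(*hypotheses)]

h1 = [10, 20, 30, 40]   # hypothesis blocks found by template matching
h2 = [14, 18, 34, 36]
print(multihypothesis_prediction([h1, h2]))  # [12.0, 19.0, 32.0, 38.0]
```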

61 citations


Journal ArticleDOI
TL;DR: An efficient approach capable of recognizing people in frontal-view video sequences using an intra-frame description of silhouettes, which consists of a set of rectangles that fit into any closed silhouette, together with a dynamic, inter-frame dimension.

59 citations


Journal ArticleDOI
TL;DR: Simulation results show that this scheme provides better reconstruction results than existing compressive sensing video acquisition schemes, such as 2-D or 3-D wavelet methods and the minimum total-variation (TV) method.
Abstract: We present a compressive sensing video acquisition scheme that relies on the sparsity properties of video in the spatial domain. In this scheme, the video sequence is represented by a reference frame, followed by the difference of measurement results between each pair of neighboring frames. The video signal is reconstructed by first reconstructing the frame differences using an l1-minimization algorithm, then adding them sequentially to the reference frame. Simulation results on both simulated and real video sequences show that when the spatial changes between neighboring frames are small, this scheme provides better reconstruction results than existing compressive sensing video acquisition schemes, such as 2-D or 3-D wavelet methods and the minimum total-variation (TV) method. This scheme is suitable for compressive sensing acquisition of video sequences with relatively small spatial changes. A method that estimates the amount of spatial change based on the statistical properties of measurement results is also presented.
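The sequential reconstruction step can be sketched as follows; the l1 recovery of each frame difference is abstracted into a caller-supplied `solve` function, and the identity "measurement" used in the example is purely illustrative:

```python
def reconstruct_sequence(ref_frame, diffs, solve):
    # Rebuild a video from a reference frame plus a stream of recovered
    # frame differences, added sequentially as described above. `solve`
    # stands in for l1-minimization recovery of each difference from its
    # compressive measurements.
    frames = [list(ref_frame)]
    for y in diffs:
        d = solve(y)                                   # recover the difference
        frames.append([a + b for a, b in zip(frames[-1], d)])
    return frames

# With identity "measurements", recovery is trivial; a real system would
# use far fewer measurements and basis-pursuit recovery instead.
video = reconstruct_sequence([1, 2, 3], [[0, 1, 0], [1, 0, 0]],
                             solve=lambda y: y)
print(video)  # [[1, 2, 3], [1, 3, 3], [2, 3, 3]]
```

Note the chain structure: any error in one recovered difference propagates to all later frames, which is why the scheme targets sequences with small spatial changes.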

59 citations


Patent
02 Jul 2009
TL;DR: In this article, a video transcoding system and method employing an improved rate control algorithm is presented, where a plurality of frames in an input video bitstream are received by the system, in which each frame is in a first coding format, and complexity information indicating the complexity of the frame after decoding is obtained.
Abstract: A video transcoding system and method employing an improved rate control algorithm. A plurality of frames in an input video bitstream are received by the system, in which each frame is in a first coding format. Each frame in the input bitstream is decoded, and complexity information indicating the complexity of the frame after decoding is obtained. An estimated number of bits to allocate for the respective frame is calculated. Using a rate estimation model that employs the complexity information for the respective frame, a picture cost for the frame is calculated based on the estimated number of bits allocated to encode the frame, and a parameter of the rate estimation model. A target cost for the respective frame is calculated based at least in part on the picture cost and the complexity information for the frame. A quantization parameter (QP) is calculated that, when used to encode the respective frame in a second coding format, would generate an encoded frame having an actual cost approximately equal to the target cost. The respective frame is encoded using the calculated QP, and the frames encoded in the second coding format are provided in an output video bitstream.
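A hypothetical sketch of the allocation and rate-model inversion steps; the patent's actual rate estimation model and parameter update are not disclosed in the abstract, so a simple linear model R = c·complexity/Qstep is assumed:

```python
def allocate_bits(remaining_bits, complexities):
    # Split the remaining bit budget across frames in proportion to each
    # frame's decoded complexity - a common simple allocation rule, not
    # necessarily the patent's.
    total = float(sum(complexities))
    return [remaining_bits * c / total for c in complexities]

def qstep_for_target(complexity, target_bits, model_param=1.0):
    # Invert the assumed one-parameter rate model
    #   R = model_param * complexity / Qstep
    # to get the quantization step expected to hit the bit target.
    return model_param * complexity / target_bits

bits = allocate_bits(9000, [1.0, 2.0, 0.5])
print(bits)                            # proportional to 1 : 2 : 0.5
print(qstep_for_target(2.0, bits[1]))  # larger budget -> finer quantization
```

The key property of any such model is monotonicity: doubling the bit target halves the quantization step, so the encoder can steer actual cost toward the target cost.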

58 citations


Patent
24 Jun 2009
TL;DR: In this paper, the authors present video encoding and decoding techniques for modified temporal compression based on fragmented references rather than complete reference pictures, which are used as reference pictures for generating predicted frames during a motion compensation process, rather than the entire frame.
Abstract: In general, this disclosure describes techniques for encoding and decoding sequences of video frames using fragmentary reference pictures. The disclosure presents video encoding and decoding techniques for modified temporal compression based on fragmented references rather than complete reference pictures. In a typical sequence of video frames, only a portion (i.e., a tile) of each frame includes moving objects. Moreover, in each frame, the moving objects tend to be confined to specific areas that are common among each frame in the sequence of video frames. As described herein, such common areas of motion are identified. Pictures are then extracted from the identified areas of the video frames. Because these pictures may represent only portions of the frames, this disclosure refers to these pictures as "fragments." It is then these fragments that are used as reference pictures for generating predicted frames during a motion compensation process, rather than the entire frame.

57 citations


Proceedings ArticleDOI
07 Nov 2009
TL;DR: Unlike conventional DVC schemes, the DISCOS framework can perform most encoding operations in the analog domain with very low complexity, making it a promising candidate for real-time, practical applications where analog-to-digital conversion is expensive, e.g., in Terahertz imaging.
Abstract: This paper proposes a novel framework called Distributed Compressed Video Sensing (DISCOS) - a solution for Distributed Video Coding (DVC) based on the recently emerging Compressed Sensing theory. The DISCOS framework compressively samples each video frame independently at the encoder. However, it recovers video frames jointly at the decoder by exploiting an interframe sparsity model and by performing sparse recovery with side information. In particular, along with global frame-based measurements, the DISCOS encoder also acquires local block-based measurements for block prediction at the decoder. Our interframe sparsity model mimics state-of-the-art video codecs: the sparsest representation of a block is a linear combination of a few temporal neighboring blocks that are in previously reconstructed frames or in nearby key frames. This model enables a block to be optimally predicted from its local measurements by l1-minimization. The DISCOS decoder also employs sparse recovery with side information to jointly reconstruct a frame from its global measurements and its local block-based prediction. Simulation results show that the proposed framework outperforms the baseline compressed sensing-based scheme of intraframe-coding and intraframe-decoding by 8–10 dB. Finally, unlike conventional DVC schemes, our DISCOS framework can perform most encoding operations in the analog domain with very low complexity, making it a promising candidate for real-time, practical applications where analog-to-digital conversion is expensive, e.g., in Terahertz imaging.

Patent
08 Jul 2009
TL;DR: In this article, an audio encoder is adapted for encoding frames of a sampled audio signal to obtain encoded frames, wherein a frame comprises a number of time domain audio samples, comprising a predictive coding analysis stage (110) for determining information on coefficients of a synthesis filter and information on a prediction domain frame based on a frame of audio samples.
Abstract: An audio encoder (100) adapted for encoding frames of a sampled audio signal to obtain encoded frames, wherein a frame comprises a number of time domain audio samples, comprising a predictive coding analysis stage (110) for determining information on coefficients of a synthesis filter and information on a prediction domain frame based on a frame of audio samples. The audio encoder (100) further comprises a frequency domain transformer (120) for transforming a frame of audio samples to the frequency domain to obtain a frame spectrum and an encoding domain decider (130). Moreover, the audio encoder (100) comprises a controller (140) for determining information on a switching coefficient when the encoding domain decider decides that encoded data of a current frame is based on the information on the coefficients and the information on the prediction domain frame, while encoded data of a previous frame was encoded based on a previous frame spectrum.

Patent
Cha Zhang1, Dinei Florencio1
25 Jun 2009
TL;DR: In this paper, a virtual viewpoint is used to determine expected contributions of individual portions of the frames to a synthesized image of the scene from the viewpoint position using the frames, and the frames are transmitted in compressed form via a network to a remote device, which is configured to render the scene using the compressed frames.
Abstract: Multi-view video that is being streamed to a remote device in real time may be encoded. Frames of a real-world scene captured by respective video cameras are received for compression. A virtual viewpoint, positioned relative to the video cameras, is used to determine expected contributions of individual portions of the frames to a synthesized image of the scene from the viewpoint position using the frames. For each frame, compression rates for individual blocks of a frame are computed based on the determined contributions of the individual portions of the frame. The frames are compressed by compressing the blocks of the frames according to their respective determined compression rates. The frames are transmitted in compressed form via a network to a remote device, which is configured to render the scene using the compressed frames.

Patent
08 Sep 2009
TL;DR: In this paper, lossless inter frame transcoding (LIFT) is proposed for improving the error resilience of video streaming, where conventional coded blocks are selectively transcoded into new transcoded block.
Abstract: Described herein is a novel transcoding technique called lossless inter frame transcoding (LIFT) for improving the error resilience of video streaming. According to various embodiments, conventional coded blocks are selectively transcoded into new transcoded blocks. At the decoder, a transcoded block can be transcoded back to the conventional coded block when the prediction is available, and can also be robustly decoded independently when the prediction is unavailable. According to another embodiment, an offline transcoding and online composing technique is provided for generating a composite frame using the transcoded and conventional coded blocks and adjusting the ratio of the transcoded blocks, thereby achieving error robustness scalability.

Proceedings ArticleDOI
Ling Shao1, Ling Ji1
25 May 2009
TL;DR: A novel algorithm for key frame extraction based on intra-frame and inter-frame motion histogram analysis is proposed and validated by a large variety of real-life videos.
Abstract: Key frame extraction is an important technique in video summarization, browsing, searching, and understanding. In this paper, a novel algorithm for key frame extraction based on intra-frame and inter-frame motion histogram analysis is proposed. The extracted key frames contain complex motion, are salient with respect to their neighboring frames, and can be used to represent actions and activities in video. The key frames are first initialized by finding peaks in the curve of entropy calculated on the motion histogram of each video frame. The peak entropies are then weighted by inter-frame saliency, computed using histogram intersection, to output the final key frames. The effectiveness of the proposed method is validated on a large variety of real-life videos.
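The scoring idea can be sketched as follows; the exact weighting between motion-histogram entropy and inter-frame saliency is an assumption, not the paper's formula:

```python
import math

def entropy(hist):
    # Shannon entropy of a motion histogram; high entropy ~ complex motion.
    total = float(sum(hist))
    return 0.0 - sum((h / total) * math.log2(h / total) for h in hist if h)

def intersection(h1, h2):
    # Normalized histogram intersection in [0, 1]; 1 means identical motion.
    return sum(min(a, b) for a, b in zip(h1, h2)) / float(sum(h1))

def key_frame_scores(motion_hists):
    # Score each frame by the entropy of its motion histogram, weighted
    # down when it closely resembles its predecessor (intersection as
    # inter-frame saliency). Peaks of this score suggest key frames.
    scores = []
    for i, h in enumerate(motion_hists):
        s = entropy(h)
        if i > 0:
            s *= 1.0 - intersection(h, motion_hists[i - 1])
        scores.append(s)
    return scores

hists = [[8, 0, 0, 0], [2, 2, 2, 2], [2, 2, 2, 2]]
print(key_frame_scores(hists))  # middle frame scores highest: [0.0, 1.5, 0.0]
```

The middle frame wins because it both has complex motion (flat histogram, high entropy) and differs from its neighbor; the third frame repeats the second and is suppressed.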

Proceedings ArticleDOI
28 Dec 2009
TL;DR: Simulation results show that the proposed algorithm has better rate-distortion performance than H.264/AVC with conventional RDO, especially for image sequences with medium motion complexity or at low encoding bit rates.
Abstract: Rate-distortion optimization (RDO), in which the distortion metric plays a vital role, has proved to be an effective tool in hybrid video coding. This paper proposes an improved rate-distortion optimization method based on SSIM (IRDO-SSIM) for the RDO mode selection process, and mainly describes the derivation of a proper multiplier for IRDO-SSIM. Simulation results show that the proposed algorithm has better rate-distortion performance than H.264/AVC using conventional RDO, especially for image sequences with medium motion complexity or at low encoding bit rates.
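A minimal sketch of SSIM-based mode selection; the block-global SSIM and the free parameter `lam` below stand in for the paper's windowed SSIM and its derived multiplier:

```python
def ssim(x, y, c1=6.5025, c2=58.5225):
    # SSIM between two equally sized blocks, computed from global block
    # statistics (no sliding window). c1, c2 are the usual 8-bit constants.
    n = float(len(x))
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

def best_mode(src, candidates, lam):
    # RDO mode selection with SSIM-based distortion, J = (1 - SSIM)
    # + lam * R, instead of the usual SSD-based cost.
    return min(candidates,
               key=lambda c: (1 - ssim(src, c["recon"])) + lam * c["bits"])

src = [10, 20, 30, 40]
modes = [{"name": "inter", "recon": [10, 20, 30, 40], "bits": 10},
         {"name": "skip",  "recon": [0, 0, 0, 0],     "bits": 2}]
print(best_mode(src, modes, lam=0.001)["name"])  # inter
```

With a small multiplier the structurally faithful mode wins despite costing more bits; choosing the multiplier to balance the two terms is exactly the derivation the paper focuses on.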

Proceedings ArticleDOI
25 May 2009
TL;DR: A new video compression approach that aims to fully exploit the temporal redundancy in video frames to improve compression efficiency with minimum processing complexity is presented.
Abstract: Generally, a video signal has high temporal redundancy due to the strong correlation between successive frames. This redundancy has not been fully exploited by current video compression techniques. In this paper, we present a new video compression approach that aims to exploit this temporal redundancy to improve compression efficiency with minimum processing complexity. It consists of a 3D-to-2D transformation of the video frames that allows the temporal redundancy of the video to be exploited using 2D transforms, avoiding the computationally demanding motion compensation step. This transformation turns the spatio-temporal correlation of the video into high spatial correlation: each group of pictures is transformed into a single picture with high spatial correlation. Decorrelating the resulting pictures with the DCT thus yields efficient energy compaction and therefore a high video compression ratio. Many experimental tests have been conducted to demonstrate the method's efficiency, especially at high bit rates and with slow-motion video. The proposed method appears well suited to video surveillance applications and to embedded video compression systems.
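The 3D-to-2D step can be sketched as a simple tiling; the paper's exact arrangement of frames within the composite picture is not specified in the abstract, so plain horizontal tiling is assumed here:

```python
def gop_to_picture(frames):
    # Tile all frames of one GOP side by side into a single 2-D picture,
    # turning temporal correlation between frames into spatial correlation
    # that a 2-D DCT can then compact.
    rows = len(frames[0])
    return [sum((f[r] for f in frames), []) for r in range(rows)]

f0 = [[1, 2], [3, 4]]           # two 2x2 frames of one GOP
f1 = [[1, 2], [3, 5]]           # nearly identical -> smooth composite
print(gop_to_picture([f0, f1])) # [[1, 2, 1, 2], [3, 4, 3, 5]]
```

Because consecutive frames are nearly identical, the tiled picture is locally smooth, which is precisely what a 2-D DCT compresses well without any motion search.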

Patent
Lidong Xu1, Yi-Jen Chiu1, Wenhao Zhang1
25 Sep 2009
TL;DR: In this article, a motion estimation (ME) method based on reconstructed reference pictures in a B frame or in a P frame at a video decoder is proposed to obtain a motion vector (MV) for a current input block.
Abstract: Methods and systems to apply motion estimation (ME) based on reconstructed reference pictures in a B frame or in a P frame at a video decoder. For a P frame, projective ME may be performed to obtain a motion vector (MV) for a current input block. In a B frame, both projective ME and mirror ME may be performed to obtain an MV for the current input block. The ME process can be performed on sub-partitions of the input block, which may reduce the prediction error without increasing the amount of MV information in the bitstream. Decoder-side ME can be applied for the prediction of existing inter frame coding modes, and traditional ME or the decoder-side ME can be adaptively selected to predict a coding mode based on a rate-distortion optimization (RDO) criterion.

Journal ArticleDOI
TL;DR: This work improves the H.264 rate control scheme using two tools, the incremental proportional-integral-differential (PID) algorithm and frame complexity estimation, and decreases the average standard deviation of video quality by 32.29%.

Patent
邸佩云, 胡昌启, 元辉, 马彦卓, 常义林 
13 May 2009
TL;DR: A key frame extraction method for video data streams in a video service system: the motion vector of each frame is obtained, a feature vector set is derived from the motion vectors, and frames at which the direction or amplitude of the motion vectors changes abruptly are extracted as key frames.
Abstract: A key frame extraction method, a video service apparatus, and a video service system are disclosed; the method extracts key frames from a video data stream in the video service system. The method comprises: obtaining the motion vector of each frame in the video data stream and deriving a feature vector set from the motion vectors; determining whether the direction and amplitude of the motion vectors corresponding to the feature vector sets of the two adjacent frames (preceding and following) change; and extracting key frames based on the determination result. In this way, frames whose motion changes abruptly can be extracted effectively.

Patent
16 Feb 2009
TL;DR: In this paper, the authors present a method for image processing, comprising receiving a video frame, coding a first portion of the video frame at a different quality than a second portion, based on an optical property.
Abstract: Systems and methods for image processing, comprising receiving a video frame, coding a first portion of the video frame at a different quality than a second portion of the video frame, based on an optical property, and displaying the video frame.

Patent
02 Sep 2009
TL;DR: In this paper, a real-time athletic estimating method based on a multiple dimensioned unchanged characteristic is proposed, which comprises the steps: (1) a gauss scale space is constructed and a local characteristic point is extracted; (2) a characteristic descriptor of the polar distribution of a rectangular window is constructed; (3) the characteristic point was used for matching and establishing an interframe motion model; and (4) the offset of a current frame output position which corresponds to a window center is calculated.
Abstract: The invention relates to a real-time athletic estimating method based on a multiple dimensioned unchanged characteristic, which comprises the steps: (1) a gauss scale space is constructed and a local characteristic point is extracted; (2) a characteristic descriptor of the polar distribution of a rectangular window is constructed; (3) the characteristic point is used for matching and establishing an interframe motion model; and (4) the offset of a current frame output position which corresponds to a window center is calculated. The athletic estimating method has a size, visual angle and rotation adaptive characteristic, can accurately match images with complicated athletic relation, such as translation, rotation, dimension, a certain visual angle change, and the like and has higher real-time performance. The estimating method has better robustness for common phenomena, such as mistiness, noise, and the like in a video, has higher estimated accuracy for arbitrary ruleless complicated athletic parameters and is combined with a motion compensating method based on motion state identification, thus, the image stabilizing requirement of a video image sequence which can be arbitrarily and randomly shot under complex environment can be realized, and the purposes of real-time output and video stabilization can be achieved.

Patent
20 Feb 2009
TL;DR: In this article, a method for global motion estimation for video stabilization is presented, which enables selecting a video frame from a video stream and computing a single global motion vector for the selected macroblocks and determining occurrence of at least one of: scene change, illumination change or crossing object.
Abstract: Disclosed herein is a method for global motion estimation for video stabilization. The method enables selecting a video frame from a video stream. The method further enables downscaling the video frames by a factor close to 2 in two dimensions, dividing the downscaled video frame into a plurality of macroblocks, and performing motion estimation for the macroblocks to generate a set of local motion vectors. Further, the method enables selecting macroblocks representing global motion from the set of local motion vectors, computing a single global motion vector for the selected macroblocks, determining the occurrence of at least one of a scene change, an illumination change, or a crossing object, and modifying the single global motion vector to compensate for errors induced by such occurrences.
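One common way to collapse local MVs into a single global vector, shown here as a hypothetical sketch (the patent's macroblock selection, downscaling, and scene/illumination-change compensation are omitted), is a component-wise median:

```python
def global_motion_vector(local_mvs):
    # Component-wise median of the selected macroblock MVs as the single
    # global motion vector; the median resists outliers caused by locally
    # moving (crossing) objects.
    xs = sorted(v[0] for v in local_mvs)
    ys = sorted(v[1] for v in local_mvs)
    mid = len(local_mvs) // 2
    return (xs[mid], ys[mid])

mvs = [(2, 1), (2, 1), (2, 1), (2, 1), (9, -7)]  # one crossing-object outlier
print(global_motion_vector(mvs))  # (2, 1)
```

A mean would be dragged toward the outlier; the median recovers the camera motion exactly, which is why robust statistics are typical in stabilization pipelines.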

Proceedings ArticleDOI
07 Nov 2009
TL;DR: A novel idea to implement motion compensation by combining the up-sampled current frame and the high-frequency part of the previous frame through the SAD framework is presented, and experimental results show that the new motion compensation model via frequency classification has an average advantage of 2 dB over the traditional motion compensation model.
Abstract: A typical dynamic reconstruction-based super-resolution video system involves three independent processes: registration, fusion and restoration. Fast video super-resolution systems apply a translational motion compensation model for registration with low computational cost. The traditional motion compensation model assumes that the whole spectrum of pixels is consistent between frames. In reality, the low-frequency component of pixels often varies significantly. We propose a translational motion compensation model via frequency classification for video super-resolution systems. A novel idea to implement motion compensation by combining the up-sampled current frame and the high-frequency part of the previous frame through the SAD framework is presented. Experimental results show that the new motion compensation model via frequency classification has an average advantage of 2 dB over the traditional motion compensation model. The SR quality gains 0.25 dB on average after the fusion process, which minimizes error by making use of the new motion-compensated frame.

Patent
28 Nov 2009
TL;DR: In this paper, an adaptive frame rate modulation system and method thereof are described, where a frame processing unit is employed to receive a first frame and a second frame for dividing the second frame into a plurality of second block frames.
Abstract: An adaptive frame rate modulation system and method thereof are described. A frame processing unit is employed to receive a first frame and a second frame for dividing the second frame into a plurality of second block frames. A frame change detection unit compares the first block frames with the second block frames correspondingly for detecting the change status between the first and second frames. A timing generator classifies the second block frames of the second frame to construct a plurality of frame rates based on the compared results of the first and second block frames. The timing generator further modulates the frame rates of the second frames.

Patent
27 Dec 2009
TL;DR: In this article, a method for data processing, comprising providing a sequence of image frames that is encoded by identifying intra and inter frames in the sequence and applying a variable block size motion compensation (VBSMC) procedure to the inter frames, is presented.
Abstract: A method for data processing, comprising providing a sequence of image frames that is encoded by identifying intra and inter frames in the sequence and applying a variable block size motion compensation (VBSMC) procedure to the inter frames, thereby generating respective parameters representing the inter frames; selectively encrypting the block sizes, using an encryptor, without encrypting all of the parameters representing the inter frames; and outputting encoded data representing the sequence of the image frames and comprising the encrypted block sizes.

Proceedings ArticleDOI
24 May 2009
TL;DR: A novel fast MVC algorithm where a fast decision strategy of prediction direction of MCP and DCP is designed and blocks with slow motion (SMBs) of all pictures in the base view and anchor pictures in enhancement views are identified based on MVs from MCP without additional computations.
Abstract: Multi-view applications provide viewers a whole new viewing experience, and multi-view video coding (MVC) plays a key role in distributing multi-view video contents through networks with limited bandwidth. However, the computational load of an MVC encoder is heavy, making it hard to realize in real-time applications. One reason is that an MVC encoder has to decide the prediction direction, based on rate-distortion optimization, between motion compensation prediction (MCP) and disparity compensation prediction (DCP) for multiple views. Motivated by this, this paper presents a novel fast MVC algorithm in which a fast decision strategy for the prediction direction (MCP or DCP) is designed. Blocks with slow motion (SMBs) in all pictures of the base view and in anchor pictures of enhancement views are identified based on MVs from MCP without additional computation. The identification of SMBs in non-anchor frames of an enhancement view is then inferred from the SMBs of the base view or the other coded enhancement views. Finally, the fast algorithm is achieved by applying MCP to SMBs of non-anchor pictures in enhancement views within the same GGOP. Experimental results conducted with JMVM 6.0 show that the average time reduction is 20%, while the bitrate increase and PSNR loss are less than 0.25% and 0.0045 dB, respectively.

Journal ArticleDOI
TL;DR: This paper aims to achieve the triple goal of consistent quality video, minimizing the total distortion, and meeting the bit budget strictly all at the same time on the interframe dependent coding structure and proposes a trellis-based framework for this goal.
Abstract: Typically, a video rate control algorithm minimizes the average distortion (denoted MINAVE) at the cost of large temporal quality variation, especially for videos with high motion and frequent scene changes. To alleviate the negative effect on subjective video quality, a criterion that allows only a small amount of quality variation among adjacent frames is preferred for practical applications. As pointed out in prior work, although some existing proposals can produce consistent-quality video, they often fail to fully utilize the available bits to minimize the global total distortion. In this paper, we aim to achieve the triple goal of consistent-quality video, minimum total distortion, and strict adherence to the bit budget, all at the same time, under an interframe-dependent coding structure. Two approaches are taken to accomplish this goal. The first is a trellis-based framework. Our first contribution here is to derive an equivalent condition between the distortion-minimization problem and the budget-minimization problem. Second, our trellis state (tree node) is defined in terms of distortion, which facilitates consistent quality control. Third, by adjusting one key parameter in our algorithm, a solution between the MINAVE and constant-quality criteria can be obtained. The second approach combines the Lagrange multiplier method with consistent quality control; the PSNR performance degrades slightly, but the computational complexity is significantly reduced. Simulation results show that both approaches produce a much smaller PSNR variation at a slight average PSNR loss compared with the MPEG JM rate control. Compared with other consistent-quality proposals, only the proposed algorithms strictly meet the target bit budget (no more, no less) and produce the largest average PSNR at a small PSNR variation.
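A toy version of such a trellis framework might look like the sketch below. It is a simplified assumption-laden illustration, not the paper's algorithm: states carry (previous distortion, rate spent so far), the edge cost adds a penalty `lam * |D_t - D_{t-1}|` for quality variation, and only rate ≤ budget is enforced rather than the paper's strict equality.

```python
def trellis_rate_control(frames, budget, lam=1.0):
    """Trellis DP over per-frame (rate, distortion) operating points.
    `frames` is a list of candidate lists, one per frame. State key is
    (previous distortion, cumulative rate); paths exceeding the budget
    are pruned. Returns (total cost, chosen (rate, distortion) path)."""
    states = {(None, 0): (0.0, [])}
    for cands in frames:
        nxt = {}
        for (prev_d, spent), (cost, path) in states.items():
            for (r, d) in cands:
                if spent + r > budget:
                    continue  # prune paths that bust the bit budget
                # penalize quality change between adjacent frames
                penalty = 0.0 if prev_d is None else lam * abs(d - prev_d)
                key = (d, spent + r)
                cand = (cost + d + penalty, path + [(r, d)])
                if key not in nxt or cand[0] < nxt[key][0]:
                    nxt[key] = cand
        states = nxt
    if not states:
        raise ValueError("bit budget infeasible")
    return min(states.values())
```

Raising `lam` pushes the solution toward constant quality; `lam = 0` recovers plain total-distortion minimization, mirroring the key-parameter trade-off the abstract describes.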

Patent
15 Oct 2009
TL;DR: In this paper, a method for motion estimation with a compressed reference frame is proposed: a macroblock of the current image is loaded into the codec, a compressed version of the motion-estimation search-window data from the previous frame is transferred to the codec, and motion estimation calculates the motion vector for the current macroblock by matching the block against an uncompressed version of the previous-frame data in the search window.
Abstract: An apparatus and a method for motion estimation with a compressed frame. The method includes loading a macroblock of a current image into a codec, transferring a compressed version of the motion-estimation search-window data from the previous frame to the codec, and carrying out motion estimation to calculate a motion vector for the current macroblock by matching the block against an uncompressed version of the previous-frame data in the search window.
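The matching step itself is classic block-based motion estimation; a plain full-search SAD sketch is given below. It assumes the search-window data has already been decompressed before matching, and all names are illustrative rather than taken from the patent.

```python
import numpy as np

def full_search_sad(cur_block, ref, top, left, radius):
    """Exhaustive SAD block matching within +/-radius pixels around
    (top, left) in `ref`, the (already decompressed) previous-frame data.
    Returns ((dy, dx), sad) for the best match."""
    h, w = cur_block.shape
    best_mv, best_sad = None, float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue  # candidate block falls outside the frame
            sad = np.abs(ref[y:y + h, x:x + w].astype(int)
                         - cur_block.astype(int)).sum()
            if sad < best_sad:
                best_mv, best_sad = (dy, dx), sad
    return best_mv, best_sad
```

In a real codec the inner SAD loop would be hardware-accelerated or replaced by a fast search pattern, but the interface (block in, motion vector out) is the same.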

Patent
Yi-Jen Chiu, Lidong Xu, Hong Jiang
25 Sep 2009
TL;DR: In this paper, a block-based motion vector is derived at the video decoder by utilizing motion estimation among available pixels relative to blocks in one or more reference frames, where the available pixels could be spatially neighboring blocks in the sequential scan coding order of the current frame, blocks in a previously decoded frame, or blocks in a downsampled frame in a lower pyramid layer when layered coding has been used.
Abstract: Method and apparatus for deriving a motion vector at a video decoder. A block-based motion vector may be produced at the video decoder by utilizing motion estimation among available pixels relative to blocks in one or more reference frames. The available pixels could be, for example, spatially neighboring blocks in the sequential scan coding order of a current frame, blocks in a previously decoded frame, or blocks in a downsampled frame in a lower pyramid when layered coding has been used.
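Decoder-side MV derivation from spatially neighboring decoded pixels is commonly realized as template matching. The sketch below assumes an L-shaped template (the row above and column left of the current block), which is one common instantiation, not necessarily the patent's exact method.

```python
import numpy as np

def template_match_mv(dec_cur, ref, top, left, bs, radius):
    """Derive a motion vector at the decoder by matching the L-shaped
    template of already-decoded pixels (row above and column left of the
    current bs x bs block) against the reference frame. Because the
    template is available at both encoder and decoder, no MV is signaled."""
    t_row = dec_cur[top - 1, left:left + bs]    # decoded pixels above the block
    t_col = dec_cur[top:top + bs, left - 1]     # decoded pixels left of the block
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 1 or x < 1 or y + bs > ref.shape[0] or x + bs > ref.shape[1]:
                continue  # template would fall outside the reference frame
            cost = (np.abs(ref[y - 1, x:x + bs].astype(int)
                           - t_row.astype(int)).sum()
                    + np.abs(ref[y:y + bs, x - 1].astype(int)
                             - t_col.astype(int)).sum())
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv
```

Running the same deterministic search at the encoder keeps both sides in sync, which is what lets the bitstream omit the motion vector.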