
Showing papers on "Residual frame" published in 2004


Proceedings ArticleDOI
24 Oct 2004
TL;DR: This paper improves on their Wyner-Ziv video codec by sending hash codewords of the current frame to aid the decoder in accurately estimating the motion, and implements a low-delay system where only the previous reconstructed frame is used to generate the side information of a current frame.
Abstract: In current interframe video compression systems, the encoder performs predictive coding to exploit the similarities of successive frames. The Wyner-Ziv theorem on source coding with side information available only at the decoder suggests that an asymmetric video codec, where individual frames are encoded separately but decoded conditionally (given temporally adjacent frames), could achieve similar efficiency. In previous work we proposed a Wyner-Ziv coding scheme for motion video that uses intraframe encoding but interframe decoding. In this paper we improve on our Wyner-Ziv video codec by sending hash codewords of the current frame to aid the decoder in accurately estimating the motion. This allows us to implement a low-delay system where only the previous reconstructed frame is used to generate the side information of a current frame. Simulation results show significant gains over conventional DCT-based intraframe coding. The Wyner-Ziv video codec with hash-based motion compensation at the receiver enables low-complexity encoding while achieving high compression efficiency.

200 citations


Patent
30 Apr 2004
TL;DR: A drift-free hybrid method of video stitching is provided, which decodes a plurality of video bitstreams, stores their prediction information, and uses that information in conjunction with previously generated frames to predict pixel blocks in the next frame.
Abstract: A drift-free hybrid method of performing video stitching is provided. The method includes decoding a plurality of video bitstreams and storing prediction information. The decoded bitstreams form video images, spatially composed into a combined image. The image comprises frames of ideal stitched video sequence. The method uses prediction information in conjunction with previously generated frames to predict pixel blocks in the next frame. A stitched predicted block in the next frame is subtracted from a corresponding block in a corresponding frame to create a stitched raw residual block. The raw residual block is forward transformed, quantized, entropy encoded and added to the stitched video bitstream along with the prediction information. Also, the stitched raw residual block is inverse transformed and dequantized to create a stitched decoded residual block. The residual block is added to the predicted block to generate the stitched reconstructed block in the next frame of the sequence.

145 citations
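
The residual path described in the stitching abstract above can be sketched as follows. This is a minimal illustration and not the patented method: a uniform scalar quantizer with a hypothetical step size `STEP` stands in for the transform, quantization, and entropy-coding stages, and blocks are flat lists of pixel values. The function names are assumptions made for this example.

```python
STEP = 4  # hypothetical quantizer step; stands in for transform + quantization


def encode_block(ideal_block, predicted_block):
    """Subtract the prediction from the ideal stitched block and quantize."""
    raw_residual = [i - p for i, p in zip(ideal_block, predicted_block)]
    return [round(r / STEP) for r in raw_residual]


def reconstruct_block(levels, predicted_block):
    """Dequantize the residual and add it back to the prediction."""
    decoded_residual = [q * STEP for q in levels]
    return [p + d for p, d in zip(predicted_block, decoded_residual)]


ideal = [100, 104, 99, 97]       # block from the ideal stitched frame
predicted = [96, 100, 100, 90]   # block predicted from earlier frames
levels = encode_block(ideal, predicted)
reconstructed = reconstruct_block(levels, predicted)
```

In a closed loop of this kind, the encoder forms its next prediction from `reconstructed` rather than from `ideal`, so encoder and decoder stay in step and the quantization error does not accumulate from frame to frame.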


Patent
Shawmin Lei1, Shijun Sun1
15 Oct 2004
TL;DR: A receiving method accepts a bitstream with a current video frame encoded as two interlaced fields in an MPEG-2, MPEG-4, or H.264 standard, decodes the current frame top field and bottom field, and presents the decoded top and bottom fields as a 3D frame image.
Abstract: Systems and methods are provided for receiving and encoding 3D video. The receiving method comprises: accepting a bitstream with a current video frame encoded with two interlaced fields, in an MPEG-2, MPEG-4, or H.264 standard; decoding a current frame top field; decoding a current frame bottom field; and presenting the decoded top and bottom fields as a 3D frame image. In some aspects, the method presents the decoded top and bottom fields as a stereo-view image. In other aspects, the method accepts 2D selection commands in response to a trigger such as receiving a supplemental enhancement information (SEI) message, an analysis of display capabilities, manual selection, or receiver system configuration. Then, only one of the current frame interlaced fields is decoded, and a 2D frame image is presented.

127 citations


Proceedings ArticleDOI
23 May 2004
TL;DR: Simulation results show that the H.264 coder, using the proposed algorithm with very little added computational complexity, effectively alleviates PSNR surges and sharp drops for frames caused by high motion or scene changes.
Abstract: In recent years, rate control has played an increasingly important role in real-time video communication applications using MPEG-4 AVC/H.264. An important step in many existing rate control algorithms that employ the quadratic rate-distortion (R-D) model is to determine the target bits for each P frame. This paper aims to reduce video distortion due to high motion or scene changes by more accurately predicting frame complexity using the statistics of previously encoded frames. We use the mean absolute difference (MAD) ratio as a measure of global frame encoding complexity. The bit budget is allocated to frames according to their MAD ratio, combined with the bits computed based on their buffer status. Simulation results show that the H.264 coder, using our proposed algorithm with very little added computational complexity, effectively alleviates PSNR surges and sharp drops for frames caused by high motion or scene changes.

94 citations
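
The allocation idea in the rate-control abstract above might look like the sketch below. The abstract does not give the formula, so everything here is an assumption: the MAD ratio is taken as the current frame's MAD divided by the average MAD of previously encoded frames, and `weight` is a hypothetical blending factor between the complexity-based and buffer-based targets.

```python
def mad_ratio(current_mad, previous_mads):
    """Current frame MAD relative to the average MAD of earlier frames."""
    average = sum(previous_mads) / len(previous_mads)
    return current_mad / average


def target_bits(current_mad, previous_mads, avg_bits_per_frame,
                buffer_based_bits, weight=0.5):
    """Blend a complexity-scaled target with a buffer-status target.

    weight is a hypothetical tuning parameter in [0, 1].
    """
    complexity_bits = mad_ratio(current_mad, previous_mads) * avg_bits_per_frame
    return weight * complexity_bits + (1.0 - weight) * buffer_based_bits


# A frame twice as complex as the recent average receives a larger budget.
bits = target_bits(6.0, [2.0, 4.0], 1000, 1200)
```

In a real encoder the MAD of the current frame is not known before encoding and would itself have to be predicted from earlier frames; here it is simply passed in.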


Proceedings ArticleDOI
17 May 2004
TL;DR: The effect of motion vector accuracy on the efficiency of motion compensated frame rate up conversion (MC-FRUC) is studied and a processing scheme is proposed to improve the interpolated frame quality at the decoder.
Abstract: In this paper, the effect of motion vector accuracy on the efficiency of motion compensated frame rate up conversion (MC-FRUC) is studied. The motion vector processing problem is formulated and analyzed via motion vector modelling. A processing scheme is proposed to improve the interpolated frame quality at the decoder. The practical application of integrating the motion vector processing algorithm in a standard H.263 decoder for MC-FRUC is also discussed. Experimental results show that a 0.4-0.6 dB gain in the H.263 codec, and a 0.5-2 dB gain in off-line frame rate conversion can be obtained by the proposed algorithm.

76 citations


Proceedings ArticleDOI
23 Mar 2004
TL;DR: The proposed approach uses only two consecutive frames to generate a small set of motion vectors that represent the motion from the previous frame to the current frame, and its simple, efficient decompression makes it suitable for real-time applications.
Abstract: Geometry compression is the compression of the 3D geometric data that provides a computer graphics system with the scene description necessary to render images. Geometric data is quite large and, therefore, needs effective compression methods to decrease the transmission and storage bit requirements. A large amount of research has focused on static geometry compression, but only limited research has addressed animated geometry compression, the compression of temporal sequences of geometry data. This paper proposes an octree-based motion representation method that can be applied to compress animated geometric data. In our approach, 3D animated sequences can be represented with a compression factor of over 100, with slight losses in animation quality. The paper focuses on compressing vertex positions for all the frames. In the proposed approach, only two consecutive frames are used to generate a small set of motion vectors that represent the motion from the previous frame to the current frame. The motion vectors are used to predict the vertex positions for each frame except the first frame. The process generates a hierarchical octree motion representation for each frame. Quantization and an adaptive arithmetic coder are used to achieve further data reduction. The simple and efficient decompression of this approach makes it suitable for real-time applications.

75 citations


Patent
12 May 2004
TL;DR: In this article, a method of compressing video data having at least one frame having at least one block and each block having an array of pixels is provided, which transforms the pixels of each block into coefficients and creates an optimal transmission order of the coefficients.
Abstract: A method of compressing video data having at least one frame having at least one block and each block having an array of pixels is provided. The method transforms the pixels of each block into coefficients and creates an optimal transmission order of the coefficients. The method also optimizes the speed of processing compressed video data by partitioning the data bitstream and coding each partition independently. The method also predicts fractional pixel motion by selecting an interpolation method for each given plurality or block of pixels depending upon at least one metric related to each given block and varies the method from block to block. The method also enhances error recovery for a current frame using a frame prior to the frame immediately before the current frame as the only reference frame for lessening quality loss during data transmission. Enhanced motion vector coding is also provided.

75 citations


Patent
07 May 2004
TL;DR: A method of processing sequential frames of data repeats a set of steps for successive frames. The steps include: acquiring at least a reference frame containing data points and a current frame of data points; identifying a set of anchor points in the reference frame; and assigning to each anchor point in the reference frame a respective motion vector that estimates the location of the anchor point in the current frame.
Abstract: A method of processing sequential frames of data comprises repeating the following steps for successive frames of data: acquiring at least a reference frame containing data points and a current frame of data points; identifying a set of anchor points in the reference frame; assigning to each anchor point in the reference frame a respective motion vector that estimates the location of the anchor point in the current frame; defining polygons formed of anchor points in the reference frame, each polygon containing data points in the reference frame, each polygon and each data point contained within the polygon having a predicted location in the current frame based on the motion vectors assigned to anchor points in the polygon; for one or more polygons in the reference frame, adjusting the number of anchor points in the reference frame based on accuracy of the predicted locations of data points in the current frame; and if the number of anchor points is increased by addition of new anchor points, then assigning motion vectors to the new anchor points that estimate the location of the anchor points in the current frame.

70 citations


Patent
16 Mar 2004
TL;DR: A method of processing video frame data includes receiving a video frame; partially decoding the video frame; fully decoding the video frame to produce macroblocks; determining video data parameters; and encoding the macroblocks based on the determined video data parameters to provide a compressed video frame for subsequent display.
Abstract: A method of processing video frame data includes the steps of: receiving a video frame; partially decoding the video frame; fully decoding the video frame to produce macroblocks; determining video data parameters from the partially decoded video frame or both the partially and fully decoded video frame; and encoding the macroblocks based on the determined video data parameters to provide a compressed video frame for subsequent display.

60 citations


Patent
15 Nov 2004
TL;DR: A video decoder receives an entry point key frame comprising first and second interlaced video fields and decodes a first syntax element comprising information (e.g., frame coding mode) for the entry point key frame at a first syntax level (e.g., frame level) in a bitstream.
Abstract: A video decoder receives an entry point key frame comprising first and second interlaced video fields and decodes a first syntax element comprising information (e.g., frame coding mode) for the entry point key frame at a first syntax level (e.g., frame level) in a bitstream. The first interlaced video field is a predicted field, and the second interlaced video field is an intra-coded field. The information for the entry point key frame can be a frame coding mode (e.g., field interlace) for the entry point key frame. The decoder can decode a second syntax element at the first syntax level comprising second information (e.g., field type for each of the first and second interlaced video fields) for the entry point key frame.

58 citations


Patent
25 Aug 2004
TL;DR: A method for inter-mode prediction in video coding is proposed, which consists of checking a data block of an image for zero motion, computing the frame difference of the data block based on that check, and making an inter-mode prediction selection based on the computed frame difference.
Abstract: A method for inter-mode prediction in video coding, the method comprising checking a data block of an image for zero motion; computing frame difference of the data block based on the checking for zero motion; and making an inter-mode prediction selection based on the computed frame difference.

Patent
27 Feb 2004
TL;DR: Motion vectors for a predicted frame are determined from a phase correlation of corresponding regions of the predicted frame and a reference frame; peaks in the phase correlation are identified, and the locations of the peaks are used as candidate motion vectors.
Abstract: Motion vectors for encoding a predicted frame relative to a reference frame are determined from a phase correlation of corresponding regions of the predicted frame and reference frame. Peaks in the phase correlation are identified, and the locations of the peaks are used as candidate motion vectors. From this limited set of candidate motion vectors, the best motion vectors for predicting blocks within each region can be readily identified.
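
Phase correlation as used in the abstract above can be illustrated in one dimension. The sketch below is an assumption-laden toy, not the patented procedure: it uses a naive DFT from the standard library instead of a 2-D FFT over image regions, and the helper names `dft`, `phase_correlation`, and `candidate_shifts` are invented for this example.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse divides by n)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def phase_correlation(reference, predicted):
    """Return the correlation surface; a peak at index d means a shift of d."""
    f_ref, f_pred = dft(reference), dft(predicted)
    cross = []
    for f, g in zip(f_ref, f_pred):
        c = g * f.conjugate()
        mag = abs(c)
        # Keep only the phase; skip bins with no energy.
        cross.append(c / mag if mag > 1e-12 else 0j)
    return [v.real for v in dft(cross, inverse=True)]


def candidate_shifts(surface, count=2):
    """Pick the strongest peaks and map them to signed circular shifts."""
    n = len(surface)
    order = sorted(range(n), key=lambda i: surface[i], reverse=True)[:count]
    return [i if i <= n // 2 else i - n for i in order]


reference = [0, 0, 1, 3, 1, 0, 0, 0]
predicted = [0, 0, 0, 0, 1, 3, 1, 0]  # reference shifted right by 2 samples
surface = phase_correlation(reference, predicted)
best = candidate_shifts(surface, count=1)
```

Each peak location is only a candidate; a block matcher would still test the candidates against each block in the region and keep the one with the lowest prediction error.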

Patent
17 Feb 2004
TL;DR: A receiver uses map and frame count information, carried in the frame sync segments of the odd and even fields, to find data in the fields of received frames.
Abstract: Frames comprise odd fields and even fields. The frame sync segments of the odd fields contain a current map specifying the location of data in frames, a next map specifying the location of data in a future frame, and a frame count designating the future frame. The frame sync segments of the even fields may contain the same information. Alternatively, the frame sync segments of the odd fields contain the current map and part of the frame count, and the frame sync segments of the corresponding even fields contain the next map and the rest of the frame count. A receiver uses the map and frame count information to find data in the fields of received frames.

Journal ArticleDOI
TL;DR: This paper shows that using a dual-frame buffer together with intra/inter mode switching improves the compression performance of the coder and investigates the effect of feedback in making more informed and effective mode-switching decisions.
Abstract: Video codecs that use motion compensation benefit greatly from the development of algorithms for near-optimal intra/inter mode switching within a rate-distortion framework. A separate development has involved the use of multiple-frame prediction, in which more than one past reference frame is available for motion estimation. In this paper, we show that using a dual-frame buffer (one short-term frame and one long-term frame available for prediction) together with intra/inter mode switching improves the compression performance of the coder. We improve the mode-switching algorithm with the use of half-pel motion vectors. In addition, we investigate the effect of feedback in making more informed and effective mode-switching decisions. Feedback information is used to limit drift errors due to packet losses by synchronizing the long-term frame buffers of both the encoder and the decoder.

Patent
04 Jun 2004
TL;DR: In this paper, a method and system for automated video quality assessment which reduces the adverse effects of sub-field/frame misalignments between the reference and test sequences is presented.
Abstract: A method and system for automated video quality assessment which reduces the adverse effects of sub-field/frame misalignments between the reference and test sequences. More particularly, the invention provides for misalignments down to a sub-field/frame level to be handled by individually matching sub-field/frame elements of a test video field/frame with sub-field/frame elements from a reference video field/frame. The use of a matching element size that is significantly smaller than the video field/frame size enables transient sub-field/frame misalignments to be effectively tracked.

Patent
15 Nov 2004
TL;DR: A decoder receives a field start code for an entry point key frame, which indicates the point at which to begin decoding the second coded interlaced video field, so the decoder can decode the second (intra-coded) field without decoding the first (predicted) field.
Abstract: A decoder receives a field start code for an entry point key frame. The field start code indicates a second coded interlaced video field in the entry point key frame following a first coded interlaced video field in the entry point key frame and indicates a point to begin decoding of the second coded interlaced video field. The first coded interlaced video field is a predicted field, and the second coded interlaced video field is an intra-coded field. The decoder decodes the second field without decoding the first field. The field start code can be followed by a field header. The decoder can receive a frame header for the entry point key frame. The frame header may comprise a syntax element indicating a frame coding mode for the entry point key frame and/or a syntax element indicating field types for the first and second coded interlaced video fields.

Patent
12 Oct 2004
TL;DR: A method of simplifying the encoding of a predetermined number of bits of data into frames adds error coding bits so that the ratio of the frame length times the baud rate of the frame times the bit packing ratio of the data, divided by the total bits of data, is always an integer.
Abstract: A method of simplifying the encoding of a predetermined number of bits of data into frames including adding error coding bits so that a ratio of the frame length times the baud rate of the frame times the bit packing ratio of the data divided by the total bits of data is always an integer. The method may also convolutionally encode the bits of data so that the same equation is also always an integer.

Journal ArticleDOI
TL;DR: An algorithm is presented that takes advantage of soft information provided by a soft decoder to produce an enhanced estimate of the frame boundary and achieves the lower bound for signal-to-noise ratio values exceeding 1 dB.
Abstract: For the additive white Gaussian noise channel, we consider the problem of frame synchronization for coded systems. We present an algorithm that takes advantage of soft information provided by a soft decoder to produce an enhanced estimate of the frame boundary. To reduce complexity, a companion algorithm is introduced that is a hybrid of the optimal uncoded frame synchronizer introduced by Massey and the list synchronizer introduced by Robertson. The high-complexity coded maximum-likelihood frame synchronizer used by Robertson will accordingly be replaced by our algorithm, which operates on decoder-provided soft decisions. The algorithm begins by obtaining a list of high-probability starting positions via the log-likelihood function of the optimal uncoded frame synchronizer. Then, a test δ is used to decide if the decision of the optimal uncoded frame synchronizer is sufficient, or whether list synchronization is required. If the test chooses in favor of using the optimal uncoded synchronizer, the estimate is obtained with relative ease. Otherwise, list synchronization is performed, and statistics provided by the decoder are used to resolve the frame boundary. Monte Carlo simulations demonstrate that the frame-synchronization-error rate (the probability of the synchronizer making an error) achieves the lower bound for signal-to-noise ratio values exceeding 1 dB.

Patent
Peng Lin1, Yeong-Taeg Kim1
17 Jun 2004
TL;DR: A video noise reduction system computes two motion signals, generates a temporal filtered signal by soft switching between a multi-frame temporal average and a recursive average based on the first motion signal, and combines it with a spatial filtered signal based on the second motion signal.
Abstract: A video noise reduction system for a set of video frames that computes a first motion signal using a current frame and multiple consecutive previous frames; computes a second motion signal using the current frame and the processed preceding frame; computes the multi-frame temporal average of the current frame and multiple consecutive previous frames; computes the recursive average of the current frame and the processed preceding frame; generates a temporal filtered signal by soft switching between the multi-frame temporal average and the recursive average based on the first motion signal; applies a spatial filter to the current frame to generate a spatial filtered signal; and combines the temporal filtered signal and the spatial filtered signal based on the second motion signal to generate a final noise reduced video output signal.
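
The data flow in the noise-reduction abstract above can be sketched per pixel. The abstract does not define the motion signals, the averaging weights, or the direction of the soft switches, so those are all assumptions here: motion is a normalised absolute difference clipped to [0, 1] with a hypothetical `scale`, the recursive average uses equal weights, and higher motion favours the recursive average and then the spatial filter.

```python
def clip01(x):
    return max(0.0, min(1.0, x))


def blend(a, b, weight):
    """Soft switch: weight 0 selects a, weight 1 selects b."""
    return (1.0 - weight) * a + weight * b


def denoise_pixel(current, previous, processed_prev, neighbours, scale=16.0):
    """current: pixel in the current frame.
    previous: co-located pixels in multiple consecutive previous frames.
    processed_prev: co-located pixel in the processed preceding frame.
    neighbours: spatial neighbours of the pixel in the current frame.
    """
    # Assumed motion signals: normalised absolute differences.
    motion1 = clip01(max(abs(current - p) for p in previous) / scale)
    motion2 = clip01(abs(current - processed_prev) / scale)

    multi_frame_avg = (current + sum(previous)) / (1 + len(previous))
    recursive_avg = 0.5 * (current + processed_prev)

    # Temporal branch: soft switch driven by the first motion signal.
    temporal = blend(multi_frame_avg, recursive_avg, motion1)
    # Spatial branch: plain mean filter over the pixel and its neighbours.
    spatial = (current + sum(neighbours)) / (1 + len(neighbours))
    # Final output: combine both branches using the second motion signal.
    return blend(temporal, spatial, motion2)


static_pixel = denoise_pixel(50, [50, 50, 50], 50, [50, 50])
moving_pixel = denoise_pixel(100, [0, 0], 0, [100, 100])
```

With these assumed weights a static pixel is left unchanged, while a pixel with strong motion falls back entirely to the spatial filter, which avoids ghosting from earlier frames.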

Patent
Jani Lainema1
06 Jul 2004
TL;DR: In this paper, a method of coding video frames in a telecommunication system, comprising forming a video frame of consecutive stationary frames and storing the frame reconstruction data of at least one frame as a reference frame and the motion data of earlier coded neighboring blocks, is presented.
Abstract: A method of coding video frames in a telecommunication system, comprising: forming a video frame of consecutive stationary frames, storing the frame reconstruction data of at least one frame as a reference frame and the motion data of earlier coded neighboring blocks, defining by means of the motion data of one or more earlier coded neighboring blocks the motion data of the block to be coded, which neighboring block is formed by means of the stored reference frame, defining the frame reconstruction data of the frame to be coded, selecting for use the frame reconstruction data and motion data representing the block to be coded, which provide a pre-defined coding efficiency with a predefined picture quality.

Patent
02 Sep 2004
TL;DR: In this paper, a decoder decodes a bitplane signaled at frame layer for the first interlaced video frame in a video sequence, and an encoder performs corresponding encoding.
Abstract: In one aspect, for a first interlaced video frame in a video sequence, a decoder decodes a bitplane signaled at frame layer for the first interlaced video frame. The bitplane represents field/frame transform types for plural macroblocks of the first interlaced video frame. For a second interlaced video frame in the video sequence, for each of at least one but not all of plural macroblocks of the second interlaced video frame, the decoder processes a per macroblock field/frame transform type bit signaled at macroblock layer. An encoder performs corresponding encoding.

Patent
27 Dec 2004
TL;DR: In this paper, a keyframe is inserted according to access to a scene based on the content of an image, so that usability of a function allowing access to a random image frame is increased.
Abstract: A method of adaptively inserting a key frame according to video content to allow a user to easily access a desired scene. A video encoder includes a coding mode determination unit receiving a temporal residual frame with respect to an original frame, determining whether the original frame has a scene change by comparing the temporal residual frame with a predetermined reference, determining to encode the temporal residual frame when it is determined that the original frame does not have the scene change, and determining to encode the original frame when it is determined that the original frame has the scene change, and a spatial transformer performing spatial transform on either of the temporal residual frame and the original frame according to the determination of the coding mode determination unit and obtaining a transform coefficient. A keyframe is inserted according to access to a scene based on the content of an image, so that usability of a function allowing access to a random image frame is increased.
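
The coding mode decision in the abstract above can be sketched as a threshold test on the temporal residual frame. This is only an illustration under stated assumptions: the residual is a plain frame difference (the patent may use motion-compensated temporal filtering), the "predetermined reference" is modelled as a hypothetical `threshold` on the mean absolute residual, and frames are flat lists of pixel values.

```python
def temporal_residual(original, reference):
    """Pixel-wise difference between the original frame and its reference."""
    return [o - r for o, r in zip(original, reference)]


def choose_frame_to_encode(original, reference, threshold):
    """Return ("key", original) on a scene change, else ("residual", residual)."""
    residual = temporal_residual(original, reference)
    mean_abs = sum(abs(v) for v in residual) / len(residual)
    scene_change = mean_abs > threshold
    if scene_change:
        # Residual is too large to be worth coding: insert a key frame.
        return "key", original
    return "residual", residual


same_scene = choose_frame_to_encode([10, 12, 11, 9], [10, 11, 11, 10], 20)
new_scene = choose_frame_to_encode([200, 10, 180, 5], [10, 200, 5, 180], 20)
```

Whichever frame is selected would then go to the spatial transform stage; key frames placed at scene changes also give a viewer natural random-access points.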

Patent
Dong-Kyu Kim1
19 Nov 2004
TL;DR: In this article, the authors proposed a method of dividing a payload intra-frame for improving throughput of a carrier sensing multiple access/collision avoidance (CSMA/CA) wireless communication network.
Abstract: Provided is a method of dividing a payload intra-frame for improving throughput of a carrier sensing multiple access/collision avoidance (CSMA/CA) wireless communication network. The payload intra-frame dividing method includes a data frame dividing step and a physical layer frame generating step in which a physical layer receives a plurality of data frames from an upper layer within a range of the maximum data frame length the physical layer can transmit and transmits the data frames as a single physical layer data frame. Furthermore, an acknowledge (ACK) frame is provided, which can minimize the deterioration of throughput even when a data frame, which has been divided into a plurality of data frames and transmitted as a single data frame, is required to be re-transmitted because an error is generated in the data frame.

Patent
15 Jan 2004
TL;DR: Header information in a video frame is analyzed to detect channel errors in the video frame, and the frame is then corrected by isolating the detected channel errors to a few macroblocks in the video frame to reduce data loss and improve video quality.
Abstract: Efficient techniques are provided to detect and correct channel errors found in an encoded video signal. In one example embodiment, header information in a video frame is analyzed to detect channel errors in the video frame. The video frame is then corrected for detected channel errors by isolating the detected channel errors to a few macroblocks in the video frame to reduce data loss and improve video quality.

Patent
27 Oct 2004
TL;DR: A method for constructing a frame preamble in an OFDM wireless communication system, and a method for acquiring frame synchronization and searching cells using the preamble, are presented.
Abstract: Disclosed are a method for constructing a frame preamble in an OFDM wireless communication system, and a method for acquiring frame synchronization and searching cells using the preamble. The preamble is arranged at the beginning of a frame and constructed of a pattern repeated an integer number of times, and a CP. The pattern has a length shorter than a single OFDM symbol interval. The length of the repetitive pattern is not limited to an integer number of times a single OFDM symbol interval. Frame synchronization can be acquired by observing cross-correlation of a received signal and reference patterns and detecting the moment when the absolute value of the cross-correlation exceeds a predetermined threshold. Otherwise, frame synchronization can be acquired by observing auto-correlation of the received signal using repeated patterns included in the received signal and detecting the moment when the absolute value of the auto-correlation becomes the maximum value.
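
The auto-correlation variant described in the abstract above can be illustrated with a toy signal. The sketch assumes a preamble made of one short pattern repeated exactly twice, with no cyclic prefix, noise, or channel effects; the function names and the pattern are hypothetical.

```python
def autocorrelation_metric(received, start, pattern_len):
    """Correlate a window with the window one pattern length later."""
    acc = 0j
    for m in range(pattern_len):
        acc += (received[start + m].conjugate()
                * received[start + m + pattern_len])
    return abs(acc)


def find_frame_start(received, pattern_len):
    """Return the offset where the auto-correlation magnitude is largest."""
    last = len(received) - 2 * pattern_len
    return max(range(last + 1),
               key=lambda d: autocorrelation_metric(received, d, pattern_len))


pattern = [1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j]   # hypothetical repeated pattern
silence = [0j] * 5
received = silence + pattern + pattern + silence
frame_start = find_frame_start(received, len(pattern))
```

The cross-correlation alternative in the abstract would instead slide a known reference pattern over the received samples and report the first offset whose correlation magnitude exceeds a threshold.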

Patent
10 Aug 2004
TL;DR: A method and Electro-Optical (EO) system for processing imagery select a first frame of data as a template frame, capture a second frame using the EO system, and correlate a plurality of pixels of the second frame with pixels of the template frame to generate a plurality of shift vectors used to register the second frame.
Abstract: A method and Electro-Optical (EO) system for processing imagery comprising selecting a first frame of data as a template frame; capturing a second frame of data using the EO system, for a plurality of pixels of the second frame, correlating the plurality of pixels of the second frame with pixels of the template frame to generate a plurality of shift vectors, one for each pixel of the plurality of pixels of the second frame, registering the second frame with the template frame by interpolating the second frame using the plurality of shift vectors and re-sampling at least a portion of the second frame to produce a registered frame, re-sampling the template frame, and combining the re-sampled template frame and the registered frame to generate an averaged frame.

Patent
02 Aug 2004
TL;DR: In this paper, a motion estimation method and apparatus for video coding of a multi-view sequence is described, which includes identifying one or more pixels in a first frame (402) of a multiview video sequence, and constraining a search range (406, 408, 410) associated with a second frame (404).
Abstract: A motion estimation method and apparatus for video coding of a multi-view sequence is described. In one embodiment, a motion estimation method includes identifying one or more pixels in a first frame (402) of a multi-view video sequence, and constraining a search range (406, 408, 410) associated with a second frame (404) of the multi-view video sequence based on an indication of a desired correlation between efficient coding and semantic accuracy. The semantic accuracy relies on use of geometric configurations of cameras capturing the multi-view video sequence. The method further includes searching the second frame (404) within the constrained search range for a match of the pixels identified in the first frame (402).

Proceedings ArticleDOI
30 Jun 2004
TL;DR: This work proposes an algorithm that takes into account the correlation/continuity of motion vectors among different reference frames and shows that the algorithm effectively reduces the computations of MRF-ME, and achieves similar coding gain compared to the full-search approach.
Abstract: Multiple reference frame motion compensation is a new feature introduced in H.264/MPEG-4 AVC to improve video coding performance. However, the computational cost of multiple reference frame motion estimation (MRF-ME) is very high. We propose an algorithm that takes into account the correlation/continuity of motion vectors among different reference frames. We also show that the algorithm effectively reduces the computations of MRF-ME, and achieves similar coding gain compared to the full-search approach.

Patent
26 Mar 2004
TL;DR: A rate controller in a transcoder, which receives a stream of compressed frames carried in a bit stream, selectively determines whether to quantize and/or threshold slices of a frame carried in the stream of frames.
Abstract: A rate controller in a transcoder, which receives a stream of compressed frames carried in a bit stream, selectively determines whether to quantize and/or threshold slices of a frame carried in the stream of frames. The rate controller determines the input size of the frame and based at least in part upon at least a desired size, requantizes and/or thresholds the frame such that the output size of the frame is approximately the desired size.

Patent
12 Mar 2004
TL;DR: A system and method are proposed for effectively and efficiently retransmitting data frames, which were inadequately received by a receiver, back to the receiver for combination with the inadequately received data frames to increase gain at the receiver.
Abstract: A system and method for effectively and efficiently retransmitting data frames, which were inadequately received by a receiver, back to the receiver for combination with the inadequately received data frames to increase gain at the receiver. The system and method preferably uses an R-Rake retransmission technique while eliminating the need to transmit a signaling message to a receiver for identifying the data frames to be combined as in the conventional R-Rake technique, and employs a data transmitter and a controller. The data transmitter transmits data in data frame format to be received by a receiver. Upon receiving a retransmission request from the receiver, the controller controls the data transmitter to retransmit a particular data frame to the receiver without transmitting a signaling message. The receiver receives the retransmitted data frame and compares it to other data frames stored in a buffer to determine the likelihood of a match between the transmitted data frame and a buffered data frame. When the likelihood of a match exceeds at least one predetermined threshold, the receiver combines the retransmitted data frame with the matching data frame, and provides the combined data frame to a higher layer in the receiver. However, if the likelihood of a match is below any of the predetermined thresholds, the receiver stores either the combined data frame, or the retransmitted and matching data frame, in the buffer, depending on which threshold the probability of a match is below, and sends another retransmission request to the transmitter to again retransmit the data frame. Accordingly, gain at the receiver can be increased without a substantial increase in signaling overhead.