
Showing papers on "Residual frame published in 2012"


Journal ArticleDOI
TL;DR: A unified model for detecting different types of video shot transitions is presented; a frame estimation scheme using the previous and the next frames is formulated, and frames are classified as no-change, abrupt-change, or gradual-change using a multilayer perceptron network.
Abstract: We have presented a unified model for detecting different types of video shot transitions. Based on the proposed model, we formulate a frame estimation scheme using the previous and the next frames. Unlike other shot boundary detection algorithms, instead of properties of frames, frame transition parameters and frame estimation errors based on global and local features are used for boundary detection and classification. Local features include the scatter matrix of edge strength and the motion matrix. Finally, the frames are classified as no change (within-shot frame), abrupt change, or gradual change using a multilayer perceptron network. The proposed method is relatively less dependent on user-defined thresholds and is free from the sliding window size widely used by various schemes in the literature. Moreover, handling both abrupt and gradual transitions along with non-transition frames under a single framework using model-guided visual features is another unique aspect of the work.

88 citations


Patent
09 Aug 2012
TL;DR: In this paper, a method and apparatus for decoding the depth map of multi-view video data are provided. The method is based on splitting a block of the prediction-encoded and restored multi-view color video frame into partitions according to the pixel values of the block.
Abstract: A method and apparatus for decoding the depth map of multi-view video data are provided. The method includes splitting a block of a restored multi-view color video frame into partitions based on pixel values of the block of the prediction-encoded and restored multi-view color video frame; obtaining a parameter indicating the correlation between block partitions of the multi-view color video frame and block partitions of the depth map frame, using peripheral pixel values of the block partitions of the multi-view color video frame and peripheral pixel values of the corresponding block partitions of the depth map frame, with respect to each of the block partitions of the restored multi-view color video frame; and obtaining prediction values of corresponding block partitions of the depth map frame from the block partitions of the restored multi-view color video frame using the obtained parameter.

63 citations


Journal ArticleDOI
TL;DR: This paper presents a new iterative algorithm—gradient descent of the frame potential—for increasing the degree of tightness of any finite unit norm frame, and shows that this algorithm converges to a unit norm tight frame at a linear rate.
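The frame potential is concrete enough to sketch. The NumPy snippet below (function names, step size, and iteration count are illustrative choices, not taken from the paper) runs projected gradient descent of the frame potential over unit-norm frames; for a unit-norm tight frame of N vectors in R^d the potential attains its minimum N²/d.

```python
import numpy as np

# Projected gradient descent of the frame potential
#   FP(F) = sum_{i,j} |<f_i, f_j>|^2
# over unit-norm frames (the columns of F). For a unit-norm tight
# frame of N vectors in R^d the minimum value is N^2 / d.
# Step size and iteration count are illustrative choices.

def frame_potential(F):
    G = F.T @ F                              # Gram matrix of the frame vectors
    return float(np.sum(G ** 2))

def tighten(F, steps=3000, lr=0.005):
    F = F / np.linalg.norm(F, axis=0)        # start from unit-norm columns
    for _ in range(steps):
        grad = 4 * F @ (F.T @ F)             # dFP/df_k = 4 * sum_j <f_k, f_j> f_j
        F = F - lr * grad
        F = F / np.linalg.norm(F, axis=0)    # project back to unit norm
    return F

d, N = 3, 5
F = tighten(np.random.default_rng(0).standard_normal((d, N)))
print(frame_potential(F))   # should approach N**2 / d = 25/3
```

The radial part of the gradient is removed by the renormalization, so a unit-norm tight frame is a fixed point of this iteration.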

53 citations


Proceedings ArticleDOI
01 Dec 2012
TL;DR: In this paper, a digital video watermarking technique based on identical frame extraction in 3-Level Discrete Wavelet Transform (DWT) is proposed, which has strong robustness against common attacks such as cropping, Gaussian noise adding, salt & pepper noise adding, frame dropping and frame adding.
Abstract: Digital video watermarking was introduced at the end of the last century to provide a means of enforcing video copyright protection. Video watermarking involves embedding secret information in the video. In this paper, we propose a digital video watermarking technique based on identical frame extraction in the 3-Level Discrete Wavelet Transform (DWT). In the proposed method, the host video is first divided into video shots. Then from each video shot one video frame, called the identical frame, is selected for watermark embedding. Each identical frame is decomposed by a 3-level DWT; the higher subband coefficients are then selected and the watermark is adaptively embedded into these coefficients, guaranteeing the perceptual invisibility of the watermark. For watermark detection, the correlation between the watermark signal and the watermarked video is compared with a threshold value obtained from the embedded watermark signal. The experimental results demonstrate that the watermarking method has strong robustness against common attacks such as cropping, Gaussian noise adding, salt & pepper noise adding, frame dropping and frame adding.
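A toy version of the embedding and correlation-based detection can be sketched as follows. A hand-rolled orthonormal Haar transform stands in for the 3-level DWT, and the additive embedding rule, the strength `alpha`, and the choice of subband are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

# One 2-D orthonormal Haar analysis step and its inverse
# (a stand-in for the DWT used in the paper).
def haar2(x):
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return ((a + b + c + d) / 2,
            ((a + b - c - d) / 2,      # horizontal detail
             (a - b + c - d) / 2,      # vertical detail
             (a - b - c + d) / 2))     # diagonal detail

def ihaar2(approx, details):
    h, v, dd = details
    a = (approx + h + v + dd) / 2
    b = (approx + h - v - dd) / 2
    c = (approx - h + v - dd) / 2
    d = (approx - h - v + dd) / 2
    out = np.empty((2 * approx.shape[0], 2 * approx.shape[1]))
    out[0::2, 0::2], out[0::2, 1::2] = a, b
    out[1::2, 0::2], out[1::2, 1::2] = c, d
    return out

def embed(frame, wm, alpha=60.0):
    """Additively embed wm into the level-3 horizontal subband."""
    a1, d1 = haar2(frame)
    a2, d2 = haar2(a1)
    a3, (h3, v3, dd3) = haar2(a2)
    a2r = ihaar2(a3, (h3 + alpha * wm, v3, dd3))
    return ihaar2(ihaar2(a2r, d2), d1)

def detect(frame, wm):
    """Correlate the level-3 horizontal subband with the watermark."""
    a2 = haar2(haar2(frame)[0])[0]
    h3 = haar2(a2)[1][0]
    return float(np.corrcoef(h3.ravel(), wm.ravel())[0, 1])

rng = np.random.default_rng(1)
frame = rng.integers(0, 256, (64, 64)).astype(float)
wm = rng.choice([-1.0, 1.0], size=(8, 8))   # 64 / 2**3 = 8 coefficients per side
marked = embed(frame, wm)
print(detect(marked, wm))   # correlation well above that of an unmarked frame
```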

49 citations


Patent
10 May 2012
TL;DR: In this article, a robotic system includes a camera having an image frame whose position and orientation relative to a fixed frame is determinable through one or more image frame transforms, a tool disposed within a field of view of the camera and having a tool frame whose location and orientations relative to the fixed frame can be determined through one of the image and tool frame transforms.
Abstract: A robotic system includes a camera having an image frame whose position and orientation relative to a fixed frame is determinable through one or more image frame transforms, a tool disposed within a field of view of the camera and having a tool frame whose position and orientation relative to the fixed frame is determinable through one or more tool frame transforms, and at least one processor programmed to identify pose indicating points of the tool from one or more camera captured images, determine an estimated transform for an unknown one of the image and tool frame transforms using the identified pose indicating points and known ones of the image and tool frame transforms, update a master-to-tool transform using the estimated and known ones of the image and tool frame transforms, and command movement of the tool in response to movement of a master using the updated master-to-tool transform.

40 citations


Patent
27 Jun 2012
TL;DR: In this paper, a method, apparatus, system and computer program product for analysing video images of a sports motion are presented, directed in particular to the identification of key motion positions within a video and the automatic extraction of segments of a video containing a sports motion.
Abstract: The invention is directed to a method, apparatus, system and computer program product for analysing video images of a sports motion and, in particular, to identification of key motion positions within a video and automatic extraction of segments of a video containing a sports motion. Video data displaying a sports motion comprises data representative of a number of image frames. For each of a number of image frames, one or more frame difference measures are calculated between the image frame and another image frame. The frame difference measures are analysed to identify a plurality of image frames that each show a key position of the sports motion. The segment of the video containing the sports motion lies between two of the image frames showing the key positions. Frame difference measures may be calculated based on pixel difference measures or optical flow techniques.
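The frame-difference idea can be illustrated with a minimal pixel-difference measure. The function names and the local-maximum rule for key positions below are assumptions for illustration, not the patent's exact measures.

```python
import numpy as np

# Mean absolute pixel difference between consecutive frames, with key
# positions taken at strict local maxima of the difference scores.

def frame_differences(frames):
    """frames: array of shape (T, H, W); returns T-1 difference scores."""
    f = frames.astype(float)
    return np.mean(np.abs(f[1:] - f[:-1]), axis=(1, 2))

def key_positions(scores):
    """Frame indices whose incoming difference is a strict local maximum."""
    return [i + 1 for i in range(1, len(scores) - 1)
            if scores[i] > scores[i - 1] and scores[i] > scores[i + 1]]

frames = np.zeros((6, 4, 4))
frames[3:] = 10.0                       # abrupt change arriving at frame 3
print(key_positions(frame_differences(frames)))   # -> [3]
```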

39 citations


Patent
08 Mar 2012
TL;DR: In this paper, a method of determining the temporal motion vector predictor is presented, which selects as the temporal predictor one motion vector from among the motion vectors in a reference block of a reference frame different from the current frame.
Abstract: A temporal motion vector predictor is includable, together with one or more spatial motion vector predictors, in a set of motion vector predictors for a block to encode of a current frame. A method of determining the temporal motion vector predictor comprises selecting as the temporal predictor one motion vector from among motion vectors in a reference block of a reference frame different from the current frame. The reference block is a block of the reference frame collocated with the block to encode or a block of the reference frame neighboring the collocated block. The selection is based on a diversity criterion for achieving diversity among the predictors of the set. This can reduce the motion vector memory requirements with no or no significant additional coding efficiency penalty. Alternatively, even if the motion vector memory is not reduced in size, coding efficiency improvements can be achieved.
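One way to realize such a diversity criterion (the L1 distance metric and the selection rule here are illustrative assumptions, not the patent's exact method) is to pick the candidate motion vector farthest, in minimum distance, from the spatial predictors already in the set:

```python
# Among candidate motion vectors from the collocated reference block and
# its neighbours, pick the one whose minimum L1 distance to the spatial
# predictors already in the set is largest, keeping the set diverse.

def select_temporal_predictor(candidates, spatial):
    def min_dist(mv):
        return min(abs(mv[0] - s[0]) + abs(mv[1] - s[1]) for s in spatial)
    return max(candidates, key=min_dist)

print(select_temporal_predictor([(0, 1), (5, 5)], [(0, 0), (1, 0)]))  # -> (5, 5)
```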

39 citations


Patent
28 Aug 2012
TL;DR: In this article, a method of controlling a video content system can include: obtaining a current input frame and a preceding input frame from an input video sequence, and a current degraded frame and a preceding degraded frame from a corresponding degraded video sequence; computing a first differences value from the input frames and a second differences value from the degraded frames; and comparing the two values, giving rise to an inter-frame quality score.
Abstract: According to examples of the presently disclosed subject matter, a method of controlling a video content system can include: obtaining a current input frame and a preceding input frame from an input video sequence and obtaining a current degraded frame and a preceding degraded frame from a degraded video sequence corresponding to the input video sequence; computing a first differences value from the current and preceding input frames and a second differences value from the current and preceding degraded frames; comparing the first and second differences values, giving rise to an inter-frame quality score; computing an intra-frame quality score using an intra-frame quality measure applied in the pixel domain of the current degraded frame and the current input frame; and providing a configuration instruction to the video content system based on a quality criterion related to the inter-frame and intra-frame quality scores.
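A minimal sketch of the inter-frame comparison (the similarity formula below is an illustrative choice, not the patent's) compares the temporal difference of the input pair with that of the degraded pair:

```python
import numpy as np

# The symmetric ratio below returns 1.0 when the two temporal
# differences match and tends toward 0 when they diverge.

def inter_frame_score(cur_in, prev_in, cur_deg, prev_deg, eps=1e-6):
    d_in = np.abs(np.asarray(cur_in, float) - prev_in).mean()
    d_deg = np.abs(np.asarray(cur_deg, float) - prev_deg).mean()
    return (2 * d_in * d_deg + eps) / (d_in ** 2 + d_deg ** 2 + eps)
```

With identical input and degraded pairs the score is 1.0; a degraded pair that erases the temporal change scores near 0.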

37 citations


Journal ArticleDOI
TL;DR: Models that predict the visibility of whole B-frame losses are used in a router to predict the visual impact of a frame loss and perform intelligent frame dropping to relieve network congestion; dropping frames based on their visual scores proves superior to random dropping of B frames.
Abstract: We examine the visual effect of whole-frame loss by different decoders. Whole-frame losses are introduced in H.264/AVC compressed videos which are then decoded by two different decoders with different common concealment effects: frame copy and frame interpolation. The videos are seen by human observers who respond to each glitch they spot. We found that about 39% of whole-frame losses of B frames are not observed by any of the subjects, and over 58% of the B frame losses are observed by 20% or fewer of the subjects. Using simple predictive features that can be calculated inside a network node with no access to the original video and no pixel level reconstruction of the frame, we develop models that can predict the visibility of whole B frame losses. The models are then used in a router to predict the visual impact of a frame loss and perform intelligent frame dropping to relieve network congestion. Dropping frames based on their visual scores proves superior to random dropping of B frames.

36 citations


Journal ArticleDOI
Bo Yan1, Jie Zhou1
TL;DR: An efficient frame concealment algorithm is proposed for depth image-based 3-D video transmission, which is able to provide accurate estimation for the motion vectors of the lost frame with the help of the depth information and significantly outperforms other existing frame recovery methods.
Abstract: In depth image-based 3-D video transmission, the compressed video stream is very likely to be corrupted by channel errors. Due to the high compression ratio of H.264/AVC, it is often common that an entire coded picture is packetized into one packet. Thus the loss of a packet may result in the loss of the whole video frame. Currently, most of the frame concealment methods are mainly for 2-D video transmission. In this paper, we have proposed an efficient frame concealment algorithm for depth image-based 3-D video transmission, which is able to provide accurate estimation for the motion vectors of the lost frame with the help of the depth information. Simulation results show that it is highly effective and significantly outperforms other existing frame recovery methods by up to 2.91 dB.

35 citations


01 Jan 2012
TL;DR: A new approach for key frame extraction based on block-based histogram difference and edge matching rate is proposed; histogram-based algorithms provide global information about the video content and are faster without any performance degradation.
Abstract: Shot boundary detection and key frame extraction are fundamental steps for the organization of large video data. Key frame extraction has been recognized as one of the important research issues in video information retrieval. Video shot boundary detection, which segments a video by detecting boundaries between camera shots, is usually the first and an important step for content-based video retrieval and video summarization. This paper discusses the importance of key frame extraction and briefly reviews and evaluates the existing approaches in order to overcome their shortcomings. This paper also proposes a new approach for key frame extraction based on block-based histogram difference and edge matching rate. First, the histogram difference of every frame is calculated; then the edges of the candidate key frames are extracted by the Prewitt operator. Finally, the edges of adjacent frames are matched. If the edge matching rate is above the average edge matching rate, the current frame is deemed a redundant key frame and is discarded. Histogram-based algorithms are very applicable to shot boundary detection; they provide global information about the video content and are faster without any performance degradation.
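The two stages can be sketched roughly as follows; the block and bin counts, the simplified central-difference gradient standing in for the Prewitt operator, and the edge threshold are all illustrative assumptions.

```python
import numpy as np

# Stage 1: block-based histogram difference between two frames.
# Stage 2: edge maps and an edge matching rate used to discard
# redundant candidate key frames.

def block_hist_diff(f1, f2, blocks=4, bins=16):
    bh, bw = f1.shape[0] // blocks, f1.shape[1] // blocks
    total = 0.0
    for i in range(blocks):
        for j in range(blocks):
            b1 = f1[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            b2 = f2[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            h1, _ = np.histogram(b1, bins=bins, range=(0, 256))
            h2, _ = np.histogram(b2, bins=bins, range=(0, 256))
            total += np.abs(h1 - h2).sum()
    return total

def edge_map(f, thresh=50.0):
    """Binary edge map from a crude central-difference gradient."""
    f = f.astype(float)
    gx = (f[2:, :] - f[:-2, :])[:, 1:-1]
    gy = (f[:, 2:] - f[:, :-2])[1:-1, :]
    return np.hypot(gx, gy) > thresh

def edge_match_rate(e1, e2):
    return (e1 & e2).sum() / max(1, (e1 | e2).sum())
```

A candidate frame whose edge matching rate against its neighbour exceeds the average rate would then be discarded as redundant.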

Patent
Ce Wang1, Walid Ali1
21 Dec 2012
TL;DR: In this paper, a motion estimator may generate and output a motion vector that represents a change in position between a current block of the current frame and a matching reference block of a reference frame.
Abstract: Techniques to perform fast motion estimation are described. An apparatus may comprise a motion estimator operative to receive as input a current frame and a reference frame from a digital video sequence. The motion estimator may generate and output a motion vector. The motion vector may represent a change in position between a current block of the current frame and a matching reference block of the reference frame. The motion estimator may utilize an enhanced block matching technique to perform block matching based on stationary and spatially proximate blocks. Other embodiments are described and claimed.
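Plain exhaustive block matching with a sum-of-absolute-differences (SAD) criterion can be sketched as follows; the patent's enhanced stationary and spatially-proximate shortcuts are omitted, and the function name and window size are illustrative.

```python
import numpy as np

# Exhaustive search over a +/- `search` window, minimizing the sum of
# absolute differences (SAD) between the current block and candidate
# reference blocks.

def best_motion_vector(cur_block, ref, top, left, search=4):
    h, w = cur_block.shape
    best_sad, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + h > ref.shape[0] or x + w > ref.shape[1]:
                continue                 # candidate falls outside the frame
            sad = np.abs(cur_block - ref[y:y + h, x:x + w]).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv

ref = np.random.default_rng(2).standard_normal((16, 16))
print(best_motion_vector(ref[5:9, 6:10], ref, top=4, left=4))  # -> (1, 2)
```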

Patent
24 Aug 2012
TL;DR: In this paper, a correspondent relationship between a plurality of contiguous frames FrN and FrN+1 is estimated, and based on it the two frames are interpolated to obtain first and second interpolated frames FrH1 and FrH2.
Abstract: To acquire a high-resolution frame from a plurality of frames sampled from a video image, it is necessary to obtain a high-resolution frame with reduced picture-quality degradation regardless of the motion of a subject included in the frame. To this end, a correspondent relationship is estimated between a plurality of contiguous frames FrN and FrN+1. Based on the correspondent relationship, the frames FrN+1 and FrN are interpolated to obtain first and second interpolated frames FrH1 and FrH2. Based on the correspondent relationship, the coordinates of the frame FrN+1 are transformed, and from a correlation value with the frame FrN a weighting coefficient α(x, y) is obtained that makes the weight of the first interpolated frame FrH1 greater as the correlation becomes greater. With the weighting coefficient, the first and second interpolated frames are weighted and added to acquire a synthesized frame FrG.

Patent
10 Oct 2012
TL;DR: In this paper, the first and second sub-sequences of a video sequence are coded using a first sequence parameter set and a second sequence parameter set, respectively.
Abstract: Techniques are described related to receiving first and second sub-sequences of video, wherein the first sub-sequence includes one or more frames each having a first resolution, and the second sub-sequence includes one or more frames each having a second resolution, receiving a first sequence parameter set and a second sequence parameter set for the coded video sequence, wherein the first sequence parameter set indicates the first resolution of the one or more frames of the first sub-sequence, and the second sequence parameter set indicates the second resolution of the one or more frames of the second sub-sequence, and wherein the first sequence parameter set is different than the second sequence parameter set, and using the first sequence parameter set and the second sequence parameter set to decode the coded video sequence.

Patent
08 Jun 2012
TL;DR: In this paper, the authors propose a method for streaming media data having an original media frame and an original frame index referencing the original media frame, and for determining an optimal session bitrate, wherein the optimal session bitrate is based on the available network bandwidth between a server and a terminal.
Abstract: A method includes receiving streaming media data having an original media frame and an original frame index referencing the original media frame; determining an optimal session bitrate, wherein the optimal session bitrate is based on the available network bandwidth between a server and a terminal; allocating a frame budget for an output media frame by estimating a frame size of the output media frame based on the original frame index and the optimal session bitrate; generating the output media frame by processing the original media frame based on first encoding parameters and, if the allocated frame budget is greater than a frame size of the processed media frame, padding the processed media frame; and providing the output media frame.
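The budget allocation and padding steps can be sketched simply; the proportional formula below is an assumption for illustration, since the patent does not spell out the estimate.

```python
# Scale each original frame's size by the ratio of the optimal session
# bitrate to the original bitrate, then pad the processed frame up to
# the allocated budget.

def frame_budget(orig_frame_size, orig_bitrate, session_bitrate):
    return int(orig_frame_size * session_bitrate / orig_bitrate)

def pad_to_budget(frame_bytes, budget):
    if len(frame_bytes) < budget:
        return frame_bytes + b"\x00" * (budget - len(frame_bytes))
    return frame_bytes

print(frame_budget(1000, 2_000_000, 1_000_000))  # -> 500
```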

Patent
09 Aug 2012
TL;DR: In this article, a multiview video data encoding method and device and decoding method and device are presented, in which a depth map frame is prediction encoded using the encoding result of the corresponding prediction-encoded multiview colour video frame.
Abstract: Disclosed are a multiview video data encoding method and device and decoding method and device. A multiview video data encoding method according to one embodiment of the present invention acquires a multiview colour video frame and a depth map frame which corresponds to the multiview colour video frame, prediction encodes the acquired multiview colour video frame, and, using the encoding result of the prediction encoded multiview colour video frame, prediction encodes the corresponding depth map frame.

Patent
05 Dec 2012
TL;DR: In this paper, a method, computer program product, and system are provided for multi-threaded video encoding. The method includes the steps of generating a set of motion vectors in a hardware video encoder based on a current frame of a video stream and a reference frame of the video stream, dividing the current frame into a number of slices, encoding each slice based on the set of vectors, and combining the encoded slices to generate an encoded bitstream.
Abstract: A method, computer program product, and system are provided for multi-threaded video encoding. The method includes the steps of generating a set of motion vectors in a hardware video encoder based on a current frame of a video stream and a reference frame of the video stream, dividing the current frame into a number of slices, encoding each slice of the current frame based on the set of motion vectors, and combining the encoded slices to generate an encoded bitstream.

Patent
Udar Mittal1, James P. Ashley2
20 Dec 2012
TL;DR: In this article, a first coding method is used to produce a first frame of coded output audio samples by coding a first audio frame in a sequence of frames; this frame is combined with an overlap-add portion formed using the same coding method, and the combination is used to initialize the state of a second coding method.
Abstract: A method (700, 800) and apparatus (100, 200) process audio frames to transition between different codecs. The method can include producing (720), using a first coding method, a first frame of coded output audio samples by coding a first audio frame in a sequence of frames. The method can include forming (730) an overlap-add portion of the first frame using the first coding method. The method can include generating (740) a combination first frame of coded audio samples based on combining the first frame of coded output audio samples with the overlap-add portion of the first frame. The method can include initializing (760) a state of a second coding method based on the combination first frame of coded audio samples. The method can include constructing (770) an output signal based on the initialized state of the second coding method.

Patent
14 Jun 2012
TL;DR: In this paper, a method and system of video decoding incorporating frame compression to reduce frame buffer size are disclosed; the method adjusts parameters of the frame compression according to decoder system information or syntax elements in the video bitstream.
Abstract: Method and system of video decoding incorporating frame compression to reduce frame buffer size are disclosed. The method adjusts parameters of the frame compression according to decoder system information or syntax element in the video bitstream. The decoder system information may be selected from a group consisting of system status, system parameter and a combination of system status and system parameter. The decoder system information may include system bandwidth, frame buffer size, frame buffer status, system power consumption, and system processing load. The syntax element comprises reference frame indicator, initial picture QP (quantization parameter), picture type, and picture size. The adaptive frame compression may be applied to adjust compression ratio. Furthermore, the adaptive frame compression may be applied to a decoder for a scalable video coding system or a multi-layer video coding system.

Patent
26 Apr 2012
TL;DR: In this paper, a method for data-rate control by randomized bit-puncturing in communication systems is presented; based on the number of bits to be punctured from the group or frame generated by the encoder, a set of pointers and randomly generated displacements is used to generate addresses for the bits in the group or frame to be transmitted or punctured.
Abstract: Method and system for data-rate control by randomized bit-puncturing in communication systems. An encoder encodes at least one information bit, thereby generating a group of encoded bits or an encoded frame. The encoder may be any type of encoder, including a turbo encoder, an LDPC (Low Density Parity Check) encoder, a RS (Reed-Solomon) encoder, or other type of encoder. Any sub-portion of an encoded frame generated by such an encoder can be viewed as being a group of encoded bits. If the encoded frame is sub-divided into multiple groups of bits, each group can undergo processing in accordance with the means presented herein to effectuate rate matching. Based on a number of bits to be punctured from the group or frame generated by the encoder, a set of pointers and randomly generated displacements is used to generate addresses for bits in the group or frame to be transmitted or punctured.
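In simplified form, the pointer-and-displacement address generation can be collapsed into drawing distinct pseudo-random puncture positions from a seeded RNG (an illustrative simplification, not the patent's exact addressing scheme):

```python
import random

# A shared seed lets transmitter and receiver agree on which encoded
# bit positions are punctured (dropped) to hit the target rate.

def puncture(bits, n_puncture, seed=0):
    rng = random.Random(seed)
    drop = set(rng.sample(range(len(bits)), n_puncture))
    return [b for i, b in enumerate(bits) if i not in drop]

encoded = [0, 1] * 8                       # 16 encoded bits
print(len(puncture(encoded, 4)))           # -> 12
```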

Patent
12 Sep 2012
TL;DR: In this paper, a computing device may be configured to determine for a pixel representing a feature in the frame, a corresponding pixel representing the feature in a consecutive frame; and determine, for a set of pixels including the pixel, a projective transform that may represent motion of the camera.
Abstract: Methods and systems for rolling shutter removal are described. A computing device may be configured to determine, in a frame of a video, distinguishable features. The frame may include sets of pixels captured asynchronously. The computing device may be configured to determine for a pixel representing a feature in the frame, a corresponding pixel representing the feature in a consecutive frame; and determine, for a set of pixels including the pixel in the frame, a projective transform that may represent motion of the camera. The computing device may be configured to determine, for the set of pixels in the frame, a mixture transform based on a combination of the projective transform and respective projective transforms determined for other sets of pixels. Accordingly, the computing device may be configured to estimate a motion path of the camera to account for distortion associated with the asynchronous capturing of the sets of pixels.
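Applying a projective transform to pixel coordinates and blending per-block transforms can be sketched as follows; blending homographies by a normalized weighted matrix sum is an illustrative simplification of the patent's mixture transform.

```python
import numpy as np

# Apply a 3x3 projective transform to homogeneous pixel coordinates,
# and form a "mixture" of per-block transforms as a weighted sum.

def apply_homography(H, pts):
    """pts: (N, 2) pixel coordinates -> transformed (N, 2)."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def mixture_transform(transforms, weights):
    w = np.asarray(weights, float)
    w = w / w.sum()                        # normalize the blend weights
    return sum(wi * Hi for wi, Hi in zip(w, transforms))

shift = np.array([[1.0, 0, 2], [0, 1, 3], [0, 0, 1]])   # translate by (2, 3)
print(apply_homography(shift, np.array([[0.0, 0], [1, 1]])))
```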

Patent
30 Aug 2012
TL;DR: In this paper, the authors present a processor and storage containing data relating combinations of resolution and frame rate to maximum bitrates, where a plurality of resolution and frame rate combinations that are related to the same maximum bitrate form a maximum bitrate level.
Abstract: Systems and methods for streaming and playing back video having a variety of resolutions, frame rates, and/or sample aspect ratios, where the video streams are encoded at one of a number of maximum bit rate levels, in accordance with embodiments of the invention are disclosed. One embodiment includes a processor, and storage containing data relating combinations of resolution and frame rates to maximum bitrates, where a plurality of resolution and frame rates that are related to the same maximum bitrate form a maximum bitrate level. In addition, an encoding application configures the processor to encode a video stream as a plurality of video streams having different resolutions and frame rates, where the target maximum bitrate used during the encoding is selected based upon the maximum bitrate levels of the resolution and frame rate combinations indicated within the data relating combinations of resolution and frame rates to maximum bitrates.

Patent
Christian L. Duvivier1
24 May 2012
TL;DR: In this paper, a method for determining frame slice sizes for multithreaded decoding under the H.264 codec is presented. The method requires the frame to be encoded using at least two different slice types based on size, where a large-type slice is at least two times larger than a small-type slice and/or the large-type slices comprise 70-90% of the frame.
Abstract: Method for determining frame slice sizes of a frame for multithreaded decoding. The frame is encoded using at least two different slice types based on size, where a large-type slice is at least two times larger than a small-type slice and/or the large-type slices comprise 70-90% of the frame. In some embodiments, the number of large-type slices is equal to the number of threads available for decoding, and these comprise the beginning slices of the frame, to be decoded before the small-type slices. Methods for multithreaded deblocking of the frame under the H.264 codec are provided where first and second threads process first and second sections of the frame in parallel. The first section comprises macroblocks on one side of a diagonal line and the second section comprises the remainder, the diagonal line extending from a first corner of a sub-frame to a second corner of the sub-frame.

Proceedings Article
Xiao Liu1, Mingli Song1, Luming Zhang1, Senlin Wang1, Jiajun Bu1, Chun Chen1, Dacheng Tao 
01 Jan 2012
TL;DR: A joint framework to integrate both shot boundary detection and key frame extraction is proposed, wherein three probabilistic components are taken into account, i.e. the prior of the key frames, the conditional probability of shot boundaries and the conditional probability of each video frame.
Abstract: Representing a video by a set of key frames is useful for efficient video browsing and retrieving, but key frame extraction remains a challenge in the computer vision field. In this paper, we propose a joint framework to integrate both shot boundary detection and key frame extraction, wherein three probabilistic components are taken into account, i.e. the prior of the key frames, the conditional probability of shot boundaries and the conditional probability of each video frame. Thus key frame extraction is treated as a maximum a posteriori (MAP) estimation problem which can be solved by adopting an alternating strategy. Experimental results show that the proposed method preserves the scene-level structure and extracts key frames that are representative and discriminative.

Patent
04 Jun 2012
TL;DR: In this paper, a coder may exchange signaling with a decoder to identify unused areas of frames and prediction modes for the unused areas; an input frame may then be parsed into a used area and an unused area based on the exchanged signaling.
Abstract: Embodiments of the present invention provide techniques for efficiently coding/decoding video data during circumstances where a decoder only requires or utilizes a portion of coded frames. A coder may exchange signaling with a decoder to identify unused areas of frames and prediction modes for the unused areas. An input frame may be parsed into a used area and an unused area based on the exchanged signaling. If motion vectors of the input frame are not limited to the used areas of the reference frames, the unused area of the input frame may be coded using low complexity. If the motion vectors of the input frame are limited to the used areas of the reference frames, the pixel blocks in the unused area of the input frame may not be coded, or the unused area of the input frame may be filled with gray, white, or black pixel blocks.

Patent
22 Jun 2012
TL;DR: An apparatus, method, and computer-readable medium for motion sensor-based video stabilization are described, in which a motion sensor captures motion data of a video sequence and a controller computes the instantaneous motion of the camera for the current frame of the video sequence.
Abstract: An apparatus, method, and computer-readable medium for motion sensor-based video stabilization. A motion sensor may capture motion data of a video sequence. A controller may compute instantaneous motion of the camera for a current frame of the video sequence. The controller may compare the instantaneous motion to a threshold value representing a still condition and reduce a video stabilization strength parameter for the current frame if the instantaneous motion is less than the threshold value. A video stabilization unit may perform video stabilization on the current frame according to the frame's strength parameter.

Patent
16 May 2012
TL;DR: In this paper, a method is presented for receiving a video frame from an encoder associated with a first camera that is coupled to a network, appending the video frame to a data block, associating a parity block with the video frame, evaluating whether the data block is full, and communicating the data block to a second camera in the network.
Abstract: A method is provided in one example embodiment and includes receiving a video frame from an encoder associated with a first camera that is coupled to a network; appending the video frame to a data block; associating a parity block to the video frame; evaluating whether the data block is full; and communicating the data block to a second camera in the network. In other embodiments, the method can include receiving additional video frames from the encoder; appending the additional video frames to a plurality of data blocks; and aligning particular sizes of the plurality of data blocks to a plurality of corresponding disk write sizes.

Patent
17 Sep 2012
TL;DR: In this paper, a video transcoder decodes a compressed video data frame, embeds a network presentation timestamp in the decoded frame, and re-encodes it; a field programmable gate array (FPGA) then compares the network presentation timestamp with a transcoder presentation timestamp to determine a timestamp offset.
Abstract: A method, a video processing system, and an electronic device are disclosed. A video transcoder may decode a compressed video data frame creating a decoded video data frame. The video transcoder may embed a network presentation timestamp in the decoded video data frame. The video transcoder may re-encode the decoded video data frame creating a transcoded video data frame. A field programmable gate array may compare the network presentation timestamp with a transcoder presentation timestamp to determine a timestamp offset.

Patent
30 Mar 2012
TL;DR: In this article, a suitable predictor is found by selecting a pixel block of a current frame and conducting a direction-based search that compares pixel blocks within a search window of a reference frame with the selected pixel block of the current frame to determine whether a match exists.
Abstract: A method, a device and computer readable storage media facilitate providing screen content including a plurality of video frames that are displayed by a computing device. During coding of the screen content, a suitable predictor is found that is used to code pixel blocks from one or more frames. The suitable predictor is found by selecting a pixel block of a current frame, conducting a direction based search by comparing pixel blocks within a search window of a reference frame with the selected pixel block of the current frame to determine whether a match exists, and, in response to a determination that no sufficient match has been found, conducting a feature oriented search by comparing pixel blocks of the reference frame with the selected pixel block of the current frame to find a suitable match based upon a common feature.

Patent
30 Jun 2012
TL;DR: In this paper, a method for encoding video sequences using frames from a higher-rate video sequence is described, where a frame from the second, higher-rate video sequence is selected as a reference frame by comparing the similarity of its content with the content of a selected frame from the first video sequence.
Abstract: Systems and methods for encoding video sequences using frames from a higher rate video sequence in accordance with embodiments of the invention are disclosed. One embodiment of the invention includes encoding frames in a first video sequence by selecting a frame in the first video sequence and selecting a frame in a second video sequence as a reference frame by comparing the similarity of the content of the selected frame from the first video sequence with the content of at least one frame in the second video sequence. The selected frame from the first sequence is then encoded using predictions that include references to the reference frame from the second sequence. Information identifying the reference frame from the second sequence is then associated with the encoded frame from the first sequence to enable decoding of the first sequence using the second sequence.