
Showing papers on "Inter frame published in 2008"


Patent
08 Sep 2008
TL;DR: In this paper, a video codec is described that consists of a memory unit, a multithreading engine, and a plurality of control and task modules organized in a tree structure, each module corresponding to a coding operation; the modules communicate with each other by control messages and shared memory.
Abstract: A video codec having a modular structure for encoding/decoding a digitized sequence of video frames in a multi-core system is described. The video codec comprises a memory unit, a multithreading engine, and a plurality of control and task modules organized in a tree structure, each module corresponding to a coding operation. The modules communicate with each other by control messages and shared memory. The control modules control all coding logic and workflow, and lower-level task modules perform tasks and provide calculations upon receiving messages from the control modules. The multithreading engine maintains the context of each task and assigns at least one core to each task for execution. The method of coding/decoding comprises denoising, core motion estimation, distributed motion estimation, weighted texture prediction and error-resilient decoding.

262 citations


Journal ArticleDOI
TL;DR: A novel, low-complexity motion vector processing algorithm at the decoder is proposed for motion-compensated frame interpolation or frame rate up-conversion and it explicitly considers the reliability of each received motion vector and has the capability of preserving the structure information.
Abstract: In this paper, a novel, low-complexity motion vector processing algorithm at the decoder is proposed for motion-compensated frame interpolation or frame rate up-conversion. We address the problems of broken edges and deformed structures in an interpolated frame by hierarchically refining motion vectors on different block sizes. Our method explicitly considers the reliability of each received motion vector and has the capability of preserving the structure information. This is achieved by analyzing the distribution of residual energies and effectively merging blocks that have unreliable motion vectors. The motion vector reliability information is also used as prior knowledge in motion vector refinement using a constrained vector median filter, to avoid selecting identical unreliable vectors. We also propose using chrominance information in our method. Experimental results show that the proposed scheme produces better visual quality and is also robust, even in video sequences with complex scenes and fast motion.
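To make the constrained vector median step concrete, here is a minimal Python sketch (the function name, the L1 distance and the fallback rule are illustrative assumptions; the paper's filter additionally uses residual-energy analysis and block merging, which are omitted). Among the candidate motion vectors of neighbouring blocks it returns the one with the smallest total distance to the other candidates, restricted to vectors flagged as reliable.

```python
import numpy as np

def constrained_vector_median(candidates, reliable):
    """Pick the candidate MV minimizing the total L1 distance to all other
    candidates, considering only vectors flagged as reliable.
    candidates: (N, 2) array of motion vectors, reliable: (N,) bool mask."""
    candidates = np.asarray(candidates, dtype=float)
    usable = candidates[np.asarray(reliable, dtype=bool)]
    if usable.size == 0:            # no reliable vector: fall back to all candidates
        usable = candidates
    costs = [np.abs(usable - v).sum() for v in usable]
    return usable[int(np.argmin(costs))]

# Example: three consistent reliable neighbours and one outlier marked unreliable
mvs = [(2, 1), (2, 2), (3, 1), (15, -12)]
flags = [True, True, True, False]
print(constrained_vector_median(mvs, flags))   # -> array close to (2, 1)
```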

125 citations


Proceedings ArticleDOI
12 Dec 2008
TL;DR: A candidate based fast search algorithm replaces the full search for decoder-side motion vector derivation using template matching to improve coding efficiency of H.264/AVC based video coding.
Abstract: In this paper, a decoder side motion vector derivation scheme for inter frame video coding is proposed. Using a template matching algorithm, motion information is derived at the decoder instead of explicitly coding the information into the bitstream. Based on Lagrangian rate-distortion optimisation, the encoder locally signals whether motion derivation or forward motion coding is used. While our method exploits multiple reference pictures for improved prediction performance and bitrate reduction, only a small template matching search range is required. Derived motion information is reused to improve the performance of predictive motion vector coding in subsequent blocks. An efficient conditional signalling scheme for motion derivation in Skip blocks is employed. The motion vector derivation method has been implemented as an extension to H.264/AVC. Simulation results show that a bitrate reduction of up to 10.4% over H.264/AVC is achieved by the proposed scheme.
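A rough Python sketch of decoder-side motion derivation by template matching, under simplifying assumptions (single reference frame, full-pel search, SAD cost, block positions away from frame borders); the actual scheme uses multiple reference pictures and rate-distortion-based signalling.

```python
import numpy as np

def derive_mv_template_matching(ref, recon, x, y, block=8, tpl=4, search=4):
    """Derive a motion vector at the decoder by matching the inverse-L template
    (tpl rows above / tpl columns left of the block at (x, y) in the
    reconstructed frame) against the reference frame.  Returns the (dy, dx)
    displacement with the smallest SAD."""
    template = np.concatenate([
        recon[y - tpl:y, x - tpl:x + block].ravel(),   # strip above (incl. corner)
        recon[y:y + block, x - tpl:x].ravel(),         # strip to the left
    ]).astype(np.int32)

    best, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cy, cx = y + dy, x + dx
            cand = np.concatenate([
                ref[cy - tpl:cy, cx - tpl:cx + block].ravel(),
                ref[cy:cy + block, cx - tpl:cx].ravel(),
            ]).astype(np.int32)
            sad = np.abs(template - cand).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv
```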

105 citations


Patent
02 Apr 2008
TL;DR: In this article, the authors propose a method to replace I-frame error recovery with long term reference frames, even in the case where the reference frame management messages are lost to at least one decoder.
Abstract: An apparatus, software encoded in tangible media, and a method at an encoder. The method includes sending compressed video data including a reference frame message to create a long term reference frame to a plurality of decoders at one or more destination points, receiving feedback from the decoders indicative of whether or not the decoders successfully received the reference frame message, and in the case that the received feedback is such that at least one of the decoders did not successfully receive the reference frame message or does not have the indicated recent frame, repeating sending a reference frame message to create the long term reference frame. Using the method can replace I-frame error recovery with long term reference frames, even in the case where the reference frame management messages are lost to at least one decoder.
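A schematic of the encoder-side control loop implied by the abstract, written against a purely hypothetical encoder/decoder API (make_ltr_message, compress, send and poll_feedback are invented names, not part of any real codec library); it only illustrates the resend-until-acknowledged idea.

```python
def manage_long_term_reference(encoder, decoders, frame):
    """Encoder-side loop (sketch): establish a long-term reference frame across
    all decoders, resending the LTR message until every decoder acknowledges it,
    so that I-frame error recovery can be avoided."""
    ltr_msg = encoder.make_ltr_message(frame)             # hypothetical API
    encoder.send(frame_data=encoder.compress(frame), message=ltr_msg)

    while True:
        acks = [d.poll_feedback() for d in decoders]      # hypothetical API
        if all(ack.received_ltr and ack.has_frame for ack in acks):
            break                                         # LTR established everywhere
        encoder.send(message=ltr_msg)                     # repeat only the LTR message
```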

94 citations


Journal ArticleDOI
TL;DR: A complexity reduction algorithm tailored for the H.264/AVC encoder aims to alleviate the computational burden imposed by Lagrangian rate distortion optimization in the inter-mode selection process and demonstrates a reduction in encoding time of at least 40%, regardless of the class of sequence.
Abstract: A complexity reduction algorithm tailored for the H.264/AVC encoder is described. It aims to alleviate the computational burden imposed by Lagrangian rate distortion optimization in the inter-mode selection process. The proposed algorithm is described as a hierarchical structure comprising three levels. Each level targets different types of macroblocks according to the complexity of the search process. Early termination of mode selection is triggered at any of the levels to avoid a full cycle of Lagrangian examination. The algorithm is evaluated using a wide range of test sequences of different classes. The results demonstrate a reduction in encoding time of at least 40%, regardless of the class of sequence. Despite the reduction in computational complexity, picture quality is maintained at all bit rates.

69 citations


Proceedings ArticleDOI
15 Aug 2008
TL;DR: A video key frame extraction method based on spatial-temporal color distribution, in which a reference frame capturing the spatial and temporal distribution of pixels throughout the video shot is constructed and a weighted distance is computed between each frame in the shot and this reference frame.
Abstract: Video key frame extraction is a type of video abstraction, which is one of the key problems in video content indexing and retrieval. Key frame extraction aims at finding a small collection of salient images extracted from a video sequence for visual content summarization. In this paper we propose a video key frame extraction method based on spatial-temporal color distribution. First we construct a temporally maximum occurrence frame which considers the spatial and temporal distribution of the pixels throughout the video shot. Then a weighted distance is computed between frames in the shot and the constructed reference frame. Key frames are extracted at the peaks of the distance curve and can achieve high compression ratio and high fidelity.
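The following Python sketch illustrates the idea on grey-level frames: a per-pixel "maximum occurrence" reference frame is built over the shot, and key frames are taken at local maxima of each frame's distance to it. The quantisation, the unweighted distance and the simple peak rule are assumptions for illustration; the paper uses colour information and a weighted distance.

```python
import numpy as np

def extract_key_frames(frames, levels=16):
    """frames: (T, H, W) array of grey-level frames in [0, 255].
    Build a per-pixel 'maximum occurrence' reference frame over the shot,
    then pick frames at local maxima of their distance to that reference."""
    frames = np.asarray(frames)
    q = (frames * levels // 256).astype(np.int64)        # quantise to `levels` bins
    T, H, W = q.shape

    # Per-pixel histogram over time -> most frequent bin = reference frame
    hist = np.zeros((levels, H, W), dtype=np.int32)
    for t in range(T):
        np.add.at(hist, (q[t], np.arange(H)[:, None], np.arange(W)[None, :]), 1)
    reference = hist.argmax(axis=0)

    # Distance of every frame to the reference (fraction of differing pixels)
    dist = (q != reference).mean(axis=(1, 2))

    # Key frames at local maxima of the distance curve
    peaks = [t for t in range(1, T - 1) if dist[t] >= dist[t - 1] and dist[t] > dist[t + 1]]
    return peaks, reference
```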

52 citations


Journal ArticleDOI
TL;DR: A simple yet effective mechanism to select proper reference frames for H.264 motion estimation by means of a simple test, which enables working with any existing motion search algorithms developed for the traditional single reference frame.
Abstract: This paper proposes a simple yet effective mechanism to select proper reference frames for H.264 motion estimation. Unlike traditional video codecs, H.264 permits more than one reference frame for increased precision in motion estimation. However, motion estimation is complicated by variable block-size motion estimation, which requires significant encoding complexity to identify the best inter-coding. Our smart selection mechanism selects suitable reference frames by means of a simple test, and only the selected frames will be searched further in the variable block size motion estimation. One major advantage of our mechanism is that it enables working with any existing motion search algorithms developed for the traditional single reference frame. Experimental results demonstrate the effectiveness of our proposed algorithm.

50 citations


Patent
14 Jan 2008
TL;DR: In this paper, a global motion estimation for a group of at least one image element in a frame of a video sequence is determined between the frame and a reference frame, and uncovered groups present in an uncovered region of the frame are identified based on the determined global motion.
Abstract: In a motion estimation for a group of at least one image element in a frame of a video sequence, a global motion is determined between the frame and a reference frame. Uncovered groups present in an uncovered region of the frame are identified based on the determined global motion. The global motion is assigned as motion representation for these identified uncovered groups. The assigned motion representation is useful for constructing new frames in the sequence in a frame rate up-conversion.

45 citations


Journal ArticleDOI
TL;DR: A simplified and efficient Block Matching Algorithm for Fast Motion Estimation is proposed that provides a faster search with minimum distortion compared to optimal fast block matching motion estimation algorithms.
Abstract: Block matching motion estimation is one of the most important modules in the design of any video encoder. It consumes more than 85% of video encoding time due to the search for a candidate block in the search window of the reference frame. To minimize the search time of block matching, a simplified and efficient Block Matching Algorithm for Fast Motion Estimation is proposed. It has two steps: prediction and refinement. The prediction step exploits the temporal correlation among successive frames and the direction of the previously processed frame to predict the motion vector of the candidate block. Different combinations of search points are considered in the refinement step, which further minimizes the search time. Experiments were conducted on various SIF and CIF video sequences, and the performance of the algorithm was compared with existing fast block matching motion estimation algorithms used in recent video coding standards. The experimental results show that the algorithm provides a faster search with minimum distortion compared to optimal fast block matching motion estimation algorithms.
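A minimal Python sketch of the two-step predict/refine idea, assuming the predicted motion vector is already available (for instance from the co-located block of the previous frame) and that all accesses stay inside the frame; the exact search-point combinations of the paper are not reproduced.

```python
import numpy as np

def sad(cur, ref, x, y, mv, bs):
    """Sum of absolute differences between the current block at (x, y) and the
    reference block displaced by mv = (dy, dx)."""
    dy, dx = mv
    return np.abs(cur[y:y + bs, x:x + bs].astype(np.int32)
                  - ref[y + dy:y + dy + bs, x + dx:x + dx + bs].astype(np.int32)).sum()

def predict_and_refine(cur, ref, x, y, predicted_mv, bs=16):
    """Two-step fast block matching (sketch): start from the motion vector
    predicted from the co-located block of the previous frame, then refine it
    with a small pattern of candidate offsets around it."""
    best_mv = predicted_mv
    best_cost = sad(cur, ref, x, y, best_mv, bs)
    for off in [(0, 1), (0, -1), (1, 0), (-1, 0), (1, 1), (-1, -1), (1, -1), (-1, 1)]:
        cand = (best_mv[0] + off[0], best_mv[1] + off[1])
        cost = sad(cur, ref, x, y, cand, bs)
        if cost < best_cost:
            best_cost, best_mv = cost, cand
    return best_mv
```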

45 citations


Journal ArticleDOI
TL;DR: A novel macroblock (MB) mode decision algorithm for P-frame prediction based on machine learning techniques to be used as part of a very low complexity MPEG-2 to H.264 video transcoder and results show that the proposed approach achieves the best results.
Abstract: The H.264 standard achieves much higher coding efficiency than the MPEG-2 standard, due to its improved inter- and intra-prediction modes at the expense of higher computational complexity. Transcoding MPEG-2 video to H.264 is important to enable gradual migration to H.264. However, given the significant differences between the MPEG-2 and the H.264 coding algorithms, transcoding is a much more complex task and new approaches to transcoding are necessary. The main problems that need to be addressed in the design of an efficient heterogeneous MPEG-2/H.264 transcoder are: the inter-frame prediction, the transform coding and the intra-frame prediction. In this paper, we focus our attention on the inter-frame prediction, the most computationally intensive task involved in the transcoding process. This paper presents a novel macroblock (MB) mode decision algorithm for P-frame prediction based on machine learning techniques to be used as part of a very low complexity MPEG-2 to H.264 video transcoder. Since coding mode decisions take up the most resources in video transcoding, a fast MB mode estimation would lead to reduced complexity. The proposed approach is based on the hypothesis that MB coding mode decisions in H.264 video have a correlation with the distribution of the motion compensated residual in MPEG-2 video. We use machine learning tools to exploit the correlation and construct decision trees to classify the incoming MPEG-2 MBs into one of the several coding modes in H.264. The proposed approach reduces the H.264 MB mode computation process into a decision tree lookup with very low complexity. Experimental results show that the proposed approach reduces the MB mode selection complexity by as much as 95% while maintaining the coding efficiency. Finally, we conduct a comparative study with some of the most prominent fast inter-prediction methods for H.264 presented in the literature. Our results show that the proposed approach achieves the best results for video transcoding applications.
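As an illustration of turning the mode decision into a tree lookup, the sketch below trains a decision tree on simple residual statistics and then classifies a macroblock with a single prediction call. scikit-learn's DecisionTreeClassifier is used here as a stand-in for the machine-learning tools of the paper, and the feature set and mode list are simplified assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8"]   # simplified H.264 inter modes

def residual_features(residual16):
    """Features of a 16x16 motion-compensated residual block: overall
    mean/variance plus the variance of each 8x8 sub-block."""
    r = residual16.astype(np.float64)
    subs = [r[i:i + 8, j:j + 8] for i in (0, 8) for j in (0, 8)]
    return [r.mean(), r.var()] + [s.var() for s in subs]

def train_mode_classifier(residuals, best_modes):
    """Train once, offline, on residuals whose best H.264 mode is known.
    best_modes: integer indices into MODES."""
    X = np.array([residual_features(r) for r in residuals])
    y = np.array(best_modes)
    return DecisionTreeClassifier(max_depth=6).fit(X, y)

def predict_mode(tree, residual16):
    """At transcoding time the mode decision becomes a single tree lookup."""
    return MODES[int(tree.predict([residual_features(residual16)])[0])]
```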

41 citations


Patent
27 May 2008
TL;DR: In this paper, an encode control strategy for variable bit rate encoding of a sequence of video frames in a single pass is provided for determining whether a video frame has a complexity level statistically outside a defined range.
Abstract: An encode control strategy is provided for variable bit rate encoding of a sequence of video frames in a single pass. The control strategy includes determining whether a video frame has a complexity level statistically outside a defined range from a complexity level of at least one preceding frame of the sequence of video frames, and if so, determining a new average bit rate target for the video frame. The new average bit rate for the video frame is determined employing at least one of spatial complexity and temporal complexity of the video frame. The new average bit rate target for the video frame is used to set frame level bit rate control parameter(s), and the video frame is encoded using the set frame level bit rate control parameter(s).
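A minimal sketch of the statistical test and re-targeting step described above, assuming complexity is summarised by a single number per frame and that the "defined range" is mean ± k standard deviations of recent frames (both are assumptions; the patent does not prescribe these exact rules).

```python
import numpy as np

def update_bit_rate_target(complexities, new_complexity, base_rate, k=2.0):
    """Single-pass VBR control (sketch): if a frame's complexity lies
    statistically outside the range of recent frames, return a new average
    bit-rate target scaled by the complexity ratio; otherwise keep the base rate.
    complexities: recent per-frame complexity measures."""
    hist = np.asarray(complexities, dtype=float)
    mean, std = hist.mean(), hist.std()
    if std > 0 and abs(new_complexity - mean) > k * std:
        return base_rate * new_complexity / mean   # re-target for the outlier frame
    return base_rate

# e.g. a sudden scene change with much higher complexity raises the target
print(update_bit_rate_target([100, 110, 95, 105], 400, base_rate=2_000_000))
```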

Patent
09 Apr 2008
TL;DR: In this paper, a motion detection method, a device and an intelligent monitor system are presented, in which a background difference image is obtained from the current input image and the current background image, and an interframe difference image is obtained from the current input image and the preceding input frame.
Abstract: The invention provides a motion detection method, a device and an intelligent monitor system. A background difference image is obtained from the current input image and the current background image, and an interframe difference image is obtained from the current input image and the preceding input frame. Because the background difference image contains foreground information distinguishing the current input image from the background, and the interframe difference image contains motion information in the current input image, the invention combines the two kinds of information to obtain the moving foreground.
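The combination of the two difference images can be sketched in a few lines of Python on grey-level frames; the thresholds and the running-average background update are illustrative assumptions.

```python
import numpy as np

def moving_foreground(cur, prev, background, bg_thresh=25, fd_thresh=15, alpha=0.05):
    """Combine a background-difference mask (foreground vs. background) with an
    inter-frame-difference mask (motion between consecutive frames) to obtain
    the moving foreground, and update the background with a running average.
    All inputs: (H, W) grey-level arrays."""
    cur_f = cur.astype(np.float32)
    bg_mask = np.abs(cur_f - background) > bg_thresh                  # foreground information
    fd_mask = np.abs(cur_f - prev.astype(np.float32)) > fd_thresh     # motion information
    foreground = bg_mask & fd_mask                                    # moving foreground

    # update the background only where no moving foreground was detected
    new_background = np.where(foreground, background,
                              (1 - alpha) * background + alpha * cur_f)
    return foreground, new_background
```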

Patent
22 Aug 2008
TL;DR: A frame type determination unit (15B) counts, as the frame data amount of each frame, the number of TS packets included in the frame based on a frame start position included in an input TS packet of video communication, and determines a frame type based on the relative magnitudes of the frame data amounts of the frames.
Abstract: A frame type determination unit (15B) counts, as the frame data amount of each frame, the number of TS packets included in the frame based on a frame start position included in an input TS packet of video communication, and determines a frame type based on the relative magnitudes of the frame data amounts of the frames. A video quality estimation unit (15C) estimates the video quality of the video communication based on the frame type of each frame obtained by the frame type determination unit (15B), the frame structure (14A) of an elementary stream read out from a storage unit (14), and a TS packet loss state detected from the TS packets of the video communication.
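A toy Python sketch of classifying frames by their data amounts within one group of pictures; the largest-frame and above-median rules are illustrative stand-ins for the patent's size-relationship tests.

```python
def classify_frames(packet_counts):
    """Classify frames within one GOP from their data amounts (here, the number
    of TS packets per frame): the largest frame is taken as the I frame, frames
    at or above the median as P frames and the rest as B frames.
    These thresholds are illustrative, not the patent's exact rules."""
    counts = list(packet_counts)
    median = sorted(counts)[len(counts) // 2]
    i_index = counts.index(max(counts))
    types = []
    for idx, c in enumerate(counts):
        if idx == i_index:
            types.append("I")
        elif c >= median:
            types.append("P")
        else:
            types.append("B")
    return types

print(classify_frames([120, 35, 14, 12, 40, 15, 11, 38]))
# -> ['I', 'P', 'B', 'B', 'P', 'B', 'B', 'P']
```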

Patent
03 Sep 2008
TL;DR: In this paper, a method for estimating the ego-motion of a camera mounted on a vehicle that uses infra-red images is disclosed, comprising the steps of extracting features from the previous frame and the current frame, finding correspondences between extracted features, and estimating the relative pose of the camera by minimizing reprojection errors from the correspondences to the anchor frame.
Abstract: A method for estimating egomotion of a camera mounted on a vehicle that uses infra-red images is disclosed, comprising the steps of (a) receiving a pair of frames from a plurality of frames from the camera, the first frame being assigned to a previous frame and an anchor frame and the second frame being assigned to a current frame; (b) extracting features from the previous frame and the current frame; (c) finding correspondences between extracted features from the previous frame and the current frame; and (d) estimating the relative pose of the camera by minimizing reprojection errors from the correspondences to the anchor frame. The method can further comprise the steps of (e) assigning the current frame as the anchor frame when a predetermined amount of image motion between the current frame and the anchor frame is observed; (f) assigning the current frame to the previous frame and assigning a new frame from the plurality of frames to the current frame; and (g) repeating steps (b)-(f) until there are no more frames from the plurality of frames to process. Step (c) is based on an estimation of the focus of expansion between the previous frame and the current frame.
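A much-simplified illustration of step (d), fitting a planar rigid motion to feature correspondences by least-squares minimisation of the reprojection error. The actual method estimates a full camera pose from infra-red imagery; scipy.optimize.least_squares is used here only as a convenient solver, and the 2-D parameterisation is an assumption for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def estimate_relative_pose(pts_prev, pts_cur):
    """Estimate a 2-D rigid motion (angle, tx, ty) between matched feature
    points of the previous and current frame by minimising the reprojection
    error -- a planar simplification of the full 6-DoF camera ego-motion.
    pts_prev, pts_cur: (N, 2) arrays of corresponding feature coordinates."""
    pts_prev = np.asarray(pts_prev, float)
    pts_cur = np.asarray(pts_cur, float)

    def residuals(params):
        theta, tx, ty = params
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        projected = pts_prev @ R.T + [tx, ty]      # reproject previous points
        return (projected - pts_cur).ravel()       # reprojection error

    result = least_squares(residuals, x0=[0.0, 0.0, 0.0])
    return result.x   # (theta, tx, ty)
```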

Patent
30 Jun 2008
TL;DR: In this paper, a rate controller for allocating a bit budget for video frames to be encoded is disclosed, which considers many different factors when determining the frame bit budget including: desired video quality, target bit rate, frame type (intra-frame or inter-frame), frame duration, intra-frame frequency, frame complexity, intra-block frequency within an intra-frame, buffer overflow, buffer underflow, and the encoded video frame quality for a possible second pass.
Abstract: A rate controller for allocating a bit budget for video frames to be encoded is disclosed. The rate controller of the present invention considers many different factors when determining the frame bit budget including: desired video quality, target bit rate, frame type (intra-frame or inter-frame), frame duration, intra-frame frequency, frame complexity, intra-block frequency within an intra-frame, buffer overflow, buffer underflow, and the encoded video frame quality for a possible second pass.

Journal ArticleDOI
TL;DR: A novel macroblock (MB) mode decision algorithm for interframe prediction based on data mining techniques to be used as part of a very low complexity heterogeneous video transcoder and shows that the proposed data mining-based approach achieves the best results for video transcoding applications.
Abstract: Recent developments have given birth to H.264/AVC: a video coding standard offering better bandwidth to video quality ratios than previous standards (such as H.263, MPEG-2, MPEG-4, etc.), due to its improved inter- and intraprediction modes at the expense of higher computation complexity. It is expected that H.264/AVC will take over the digital video market, replacing the use of previous standards in most digital video applications. This creates an important need for heterogeneous video transcoding technologies from older standards to H.264. In this paper, we focus our attention on the interframe prediction, the most computationally intensive task involved in the heterogeneous video transcoding process. This paper presents a novel macroblock (MB) mode decision algorithm for interframe prediction based on data mining techniques to be used as part of a very low complexity heterogeneous video transcoder. The proposed approach is based on the hypothesis that MB coding mode decisions in H.264 video have a correlation with the distribution of the motion compensated residual in the decoded video. We use data mining tools to exploit the correlation and derive decision trees to classify the incoming decoded MBs into one of the several coding modes in H.264. The proposed approach reduces the H.264 MB mode computation process into a decision tree lookup with very low complexity. For general validation purposes, we apply our algorithm to two of the most important heterogeneous video transcoders: MPEG-2 to H.264 and H.263 to H.264. Our results show that our data-mining-based transcoding algorithm is able to maintain a good video quality while considerably reducing the computational complexity by 72% on average when applied in MPEG-2 to H.264 transcoders, and by 62% on average when applied in H.263 to H.264 transcoders. Finally, we conduct a comparative study with some of the most prominent fast interprediction methods for H.264 presented in the literature. Our results show that the proposed data mining-based approach achieves the best results for video transcoding applications.

Patent
04 Dec 2008
TL;DR: A system and method for encoding interactive low-latency video using interframe coding are described, in which the server detects whether the maximum data rate of the communication channel would be exceeded if a particular frame of the sequence were transmitted to the client.
Abstract: A system and method are described below for encoding interactive low-latency video using interframe coding. For example, one embodiment of a computer-implemented method for performing video compression comprises: detecting a maximum data rate of a communication channel between a server and a client; transmitting a video stream comprising a series of sequential frames from the server to the client; detecting that the maximum data rate will be exceeded if a particular frame of the sequence of frames is transmitted from the server to the client over the communication channel; and in lieu of transmitting the frame which could cause the maximum data rate to be exceeded, causing the client to re-render the previous frame of the sequence of frames, thereby effectively reducing the frame rate of the video stream rendered on the client.
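A server-side sketch of the frame-dropping decision in Python, assuming the data rate is measured over a one-second sliding window and that a short marker tells the client to re-render the previous frame (both are assumptions for illustration).

```python
import time
from collections import deque

class RateGovernor:
    """Server-side sketch: track bytes sent over a sliding one-second window
    and, when sending a frame would exceed the channel's maximum data rate,
    emit a 'repeat previous frame' marker instead of the frame itself."""

    def __init__(self, max_bytes_per_sec):
        self.max_rate = max_bytes_per_sec
        self.window = deque()                      # (timestamp, bytes) pairs

    def _bytes_in_last_second(self, now):
        while self.window and now - self.window[0][0] > 1.0:
            self.window.popleft()
        return sum(b for _, b in self.window)

    def submit_frame(self, encoded_frame):
        now = time.monotonic()
        if self._bytes_in_last_second(now) + len(encoded_frame) > self.max_rate:
            return b"REPEAT_PREV"                  # client re-renders the previous frame
        self.window.append((now, len(encoded_frame)))
        return encoded_frame
```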

Proceedings ArticleDOI
18 Jun 2008
TL;DR: A novel bidirectionally decodable Wyner-Ziv video coding scheme is proposed which relaxes the inter frame decoding dependency and outperforms H.264 by up to 4 dB at the same bitrate in error-resilience tests.
Abstract: Inter frame prediction significantly improves compression efficiency in hybrid video coding schemes. However, this technique makes the decoding of each inter frame dependent on all of its reference frames. This dependency complicates the reverse play operation, one of the most common video cassette recorder (VCR) functions, and also causes error propagation when the video is transmitted over an error-prone channel. In this paper, we propose a novel bidirectionally decodable Wyner-Ziv video coding scheme which relaxes this inter frame dependency. The proposed bidirectionally decodable Wyner-Ziv frame can be decoded using either forward prediction or backward prediction as side information at the decoder, i.e. the proposed stream supports forward decoding and backward decoding simultaneously. Compared with other schemes that support reverse playback, our scheme requires much lower bandwidth and smaller storage space. In error-resilience tests, our scheme outperforms H.264 by up to 4 dB at the same bitrate. The proposed frames also support video splicing and stream switching at arbitrary time points, like I-frames.

Patent
Yu-Wen Huang1
10 Mar 2008
TL;DR: In this paper, a method for encoding a video signal comprising a plurality of reference frames and non-reference frames is proposed, which is based on determining if at least a portion of a reference frame that is a backward reference frame of the nonreference frame has no scene change.
Abstract: A method for encoding a video signal comprising a plurality of reference frames and non-reference frames includes: for a non-reference frame to be encoded, determining if at least a portion of a reference frame that is a backward reference frame of the non-reference frame has no scene change; and when the portion of the reference frame has no scene change, scaling down a search range for block matching of the portion of the non-reference frame.

Patent
19 Dec 2008
TL;DR: In this paper, a hierarchical frame structure is constructed using bi-directional P frames to better accommodate low-complexity decoding profiles, and multilayered encoded video bitstreams can be generated based on the hierarchical frame structures.
Abstract: Embodiments of the present invention provide systems, methods and apparatuses for generating forward, backward or bi-directional P frames. Prior to encoding a sequence of video frames, P frames within the video sequence can be reordered to include causal and/or non-causal references to one or more reference frames. This allows any block partition of a bi-directional P frame to include a single reference to a reference frame that is temporally displayed either before or after the bi-directional P frame. Compression and visual quality can therefore be improved. Hierarchical frame structures can be constructed using bi-directional P frames to better accommodate low complexity decoding profiles. Multilayered encoded video bitstreams can be generated based on the hierarchical frame structures and can include a first layer of anchor frames and one or more second layers that include bi-directional P frames that reference the anchor frames and/or any frame in any lower level layer.

Patent
20 Oct 2008
TL;DR: In this paper, a video scalable encoding method calculates a weight coefficient which includes a proportional coefficient and an offset coefficient and indicates brightness variation between an encoding target image region and a reference image region in an upper layer.
Abstract: A video scalable encoding method calculates a weight coefficient which includes a proportional coefficient and an offset coefficient and indicates brightness variation between an encoding target image region and a reference image region in an upper layer, calculates a motion vector by applying the weight coefficient to an image signal of a reference image region as a search target and executing motion estimation, and generates a prediction signal by applying the weight coefficient to a decoded signal of a reference image region indicated by the motion vector and executing motion compensation. Based on encoding information of an immediately-lower image region in an immediately-lower layer, which is present at spatially the same position as the encoding target image region, a data structure of the weight coefficient is determined. When the immediately-lower image region performed interframe prediction in the immediately-lower layer, the method identifies an immediately-lower layer reference image region that the immediately-lower image region used as a prediction reference for motion prediction, and calculates the weight coefficient by applying a weight coefficient that the immediately-lower image region used in weighted motion prediction to a DC component of an image region in the upper layer, which is present at spatially the same position as the immediately-lower layer reference image region, and assuming a result of the application as a DC component of the immediately-lower image region.

Patent
21 May 2008
TL;DR: In this article, an apparatus for displaying video signals includes an input source (23) to receive video signals having different frame frequencies and provide an input video signal from among the received video signals.
Abstract: An apparatus for displaying video signals includes an input source (23) to receive video signals having different frame frequencies and provide an input video signal from among the received video signals, a frame frequency detector (5) to detect, as a first frame frequency, the frame frequency of the input video signal, a determiner (6) to determine a second frame frequency according to the first frame frequency and provide a frame frequency conversion rate (K), the second frame frequency being higher than the first frame frequency, an interpolation frame generator (7) to generate interpolation frames according to the frame frequency conversion rate, and a frame frequency converter (8) to convert the input video signal into a video signal having the second frame frequency by interpolating the generated interpolation frames between original frames of the input video signal.

Patent
20 Feb 2008
TL;DR: An image display apparatus is provided that reduces judder while deliberately moderating the degree of judder reduction when converting the frame rate of a film signal using motion compensation.
Abstract: The present invention provides an image display apparatus capable of reducing judder while moderating the degree of judder reduction when converting the frame rate of a film signal using motion compensation. The frame rate of a video signal is converted by adding N (N: an integer of 2 or larger) interpolation frames, generated from the original-frame images by motion compensation, between neighboring original frames along the time base. The interpolation positions of the video images in the N interpolation frames are set to deviated positions that are closer to the nearest original-frame image, rather than to the positions obtained by equally dividing the magnitude of video-image motion between an earlier original frame and a following original frame into (N+1) portions along the time base.
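One possible way to express the biased interpolation positions, shown as a small Python calculation; the single strength parameter pulling each position toward the nearer original frame is an illustrative assumption, not the patent's exact formulation.

```python
def interpolation_positions(n, strength=0.5):
    """Temporal positions (0..1) of n interpolated frames between two original
    frames.  strength = 0 gives the equal division k/(n+1) (maximum judder
    reduction); larger values pull each position toward the nearer original
    frame, deliberately retaining some of the film judder."""
    positions = []
    for k in range(1, n + 1):
        equal = k / (n + 1)                    # equally divided position
        nearest = 0.0 if equal < 0.5 else 1.0  # nearer original frame
        positions.append(equal + strength * (nearest - equal))
    return positions

print(interpolation_positions(3, strength=0.0))   # [0.25, 0.5, 0.75]
print(interpolation_positions(3, strength=0.5))   # [0.125, 0.75, 0.875]
```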

Patent
19 Jun 2008
TL;DR: In this article, a method for controlling bitrate in video coding of a sequence of frames including series of Inter frames separated by Intra frames is presented. But the method is not suitable for video streaming.
Abstract: A device and method of controlling bitrate in video coding of a sequence of frames including series of Inter frames separated by Intra frames, wherein the coded frames are validated in a video buffering device prior to transmission of the coded frames and wherein the method comprises: measuring frame complexity in the sequence of frames; for each Inter frame, calculating a target buffer level of the video buffering device in relation to a distance between the Inter frame and a next Intra frame; for each Inter frame, calculating a target frame size in relation to the distance between the Inter frame and the next Intra frame, the measured frame complexity, a current buffer level of the video buffering device and the calculated target buffer level of the video buffering device; and using the calculated target frame size to control bitrate in video coding of the sequence of frames.
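A rough sketch of how the target buffer level and target frame size could be combined, using only the quantities named in the abstract; the linear drain toward the next Intra frame, the 30-frame horizon and the gain factor are illustrative assumptions, not the patent's formulas.

```python
def inter_frame_target_size(avg_bits_per_frame, complexity, avg_complexity,
                            buffer_level, buffer_size, frames_to_next_intra,
                            intra_extra_bits, gain=0.2):
    """Per-Inter-frame rate control (sketch).  The target buffer level is
    lowered linearly as the next Intra frame approaches, reserving room for
    its extra bits; the target frame size scales the average frame budget by
    relative complexity and corrects toward the target buffer level."""
    # Target buffer level: drain toward (buffer_size/2 - intra_extra_bits)
    # as frames_to_next_intra approaches zero.
    target_level = buffer_size / 2 - intra_extra_bits * max(
        0.0, 1.0 - frames_to_next_intra / 30.0)

    target_size = avg_bits_per_frame * (complexity / avg_complexity)
    target_size += gain * (target_level - buffer_level)   # steer buffer to target
    return max(target_size, 0.0)
```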

Journal ArticleDOI
TL;DR: Experimental results show that the proposed algorithm has good performance for object tracking in crowded scenes on stairs, in airports, or at train stations in the presence of object translation, rotation, small scaling, and occlusion.
Abstract: We propose a new algorithm for object tracking in crowded video scenes by exploiting the properties of undecimated wavelet packet transform (UWPT) and interframe texture analysis. The algorithm is initialized by the user through specifying a region around the object of interest at the reference frame. Then, coefficients of the UWPT of the region are used to construct a feature vector (FV) for every pixel in that region. Optimal search for the best match is then performed by using the generated FVs inside an adaptive search window. Adaptation of the search window is achieved by interframe texture analysis to find the direction and speed of the object motion. This temporal texture analysis also assists in tracking of the object under partial or short-term full occlusion. Moreover, the tracking algorithm is robust to Gaussian and quantization noise processes. Experimental results show that the proposed algorithm has good performance for object tracking in crowded scenes on stairs, in airports, or at train stations in the presence of object translation, rotation, small scaling, and occlusion.

Patent
27 Mar 2008
TL;DR: In this article, a video encoder selects first search modes by using optimized inter mode information of a correlation macroblock having the same position as a current macroblock in a previous frame, in order to determine the inter mode.
Abstract: The present invention relates to a method for a video encoder to determine an inter mode. The video encoder selects first search modes by using optimized inter mode information of a correlation macroblock having the same position as a current macroblock in a previous frame, in order to determine the inter mode. The video encoder compares a rate-distortion cost of the correlation macroblock and a rate-distortion cost of the mode that is selected as the minimum cost mode from among the first search modes, and determines whether to terminate an inter mode determination process early. When the early termination condition is satisfied, the video encoder determines the search mode having the minimum rate-distortion cost from among the first search modes as the optimized inter mode of the current macroblock, and terminates the inter mode determination process early. When the early termination condition is not satisfied, the video encoder selects second search modes to additionally perform an inter prediction process, and determines the corresponding search mode having the minimum rate-distortion cost as the optimized inter mode of the current macroblock.

Patent
30 Dec 2008
TL;DR: In this article, a method for adaptive frame averaging is proposed, which divides a current frame image into a plurality of sub-blocks, and then determines a frame averaging coefficient for each sub-block based on the characteristic of the current frame and a characteristic image of a previous frame image.
Abstract: A method for adaptive frame averaging includes dividing a current frame image into a plurality of sub-blocks; obtaining a characteristic for each of the plurality of sub-blocks to obtain a characteristic image of the current frame image; determining a frame averaging coefficient for each of the plurality of sub-blocks based on the characteristic image of the current frame image and a characteristic image of a previous frame image; and frame-averaging a resultant frame-averaged image of the previous frame image and the current frame image by using the frame averaging coefficient of each of the plurality of sub-blocks to obtain a resultant frame-averaged image of the current frame image.
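A minimal per-block sketch of the adaptive averaging in Python, using the block mean as the characteristic and a simple inverse mapping from block change to averaging coefficient (both are assumptions; frame dimensions are assumed divisible by the block size).

```python
import numpy as np

def adaptive_frame_average(cur, prev_avg, prev_char, block=8, max_coef=0.8):
    """Per-block adaptive frame averaging (sketch).
    cur: current frame; prev_avg: previous frame-averaged result;
    prev_char: characteristic image (per-block means) of the previous frame.
    Blocks that changed a lot get a small averaging coefficient (less smoothing)."""
    H, W = cur.shape
    out = np.empty_like(cur, dtype=np.float32)
    cur_char = np.empty_like(prev_char, dtype=np.float32)

    for by in range(0, H, block):
        for bx in range(0, W, block):
            blk = cur[by:by + block, bx:bx + block].astype(np.float32)
            i, j = by // block, bx // block
            cur_char[i, j] = blk.mean()                       # block characteristic
            change = abs(cur_char[i, j] - prev_char[i, j])
            coef = max_coef / (1.0 + change)                  # more change -> less averaging
            out[by:by + block, bx:bx + block] = (
                coef * prev_avg[by:by + block, bx:bx + block] + (1 - coef) * blk)
    return out, cur_char
```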

Patent
20 Feb 2008
TL;DR: An image display device is provided which can suppress degradation of image quality attributable to motion vector detection accuracy when performing video signal processing for a predetermined image quality improvement.
Abstract: Provided is an image display device which can suppress degradation of image quality attributable to motion vector detection accuracy when performing video signal processing for a predetermined image quality improvement. Taking into account the reliability of a motion vector mv detected by a motion vector detector (44), an interpolation unit (45) performs video signal processing in an imaging blur suppressing unit (13) and an overdrive unit (10). More specifically, the processing amount applied during video signal processing is set larger as the reliability becomes higher and smaller as the reliability becomes lower. Even if the motion vector search range (the range over which block matching is performed) is exceeded when performing video signal processing using a motion vector, the processing can still be performed in accordance with the motion vector detection accuracy.

Proceedings ArticleDOI
01 Dec 2008
TL;DR: This paper presents the work on automatically detecting moving rigid text in digital videos, which achieves both detection and tracking of moving text at the same time.
Abstract: This paper presents our work on automatically detecting moving rigid text in digital videos. The temporal information is obtained by dividing a video frame into sub-blocks and calculating an inter-frame motion vector for each sub-block. Text blocks are then extracted through both intra-frame classification and inter-frame spatial relationship checking. Unlike previous works, our method achieves both detection and tracking of moving text at the same time. The method works very well for detecting scrolling text in news clips and movies, and is robust to low resolution and complex backgrounds. The computational efficiency of the method is also discussed.

Patent
25 Mar 2008
TL;DR: In this paper, a method and system for real-time processing of a sequence of video frames is presented, where a current frame in the sequence and at least one frame occurring prior to the current frame is analyzed.
Abstract: A method and system for real time processing of a sequence of video frames. A current frame in the sequence and at least one frame in the sequence occurring prior to the current frame is analyzed. Each frame includes a two-dimensional array of pixels. The sequence of video frames is received in synchronization with a recording of the video frames in real time. The analyzing includes performing a background subtraction on the at least one frame, which determines a background image and a static region mask associated with a static region consisting of a contiguous distribution of pixels in the current frame. The static region mask identifies each pixel in the static region upon the static region mask being superimposed on the current frame. The background image includes the array of pixels and a background model of the at least one frame and does not include any moving object.
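A compact Python sketch of background subtraction with a static-region mask, assuming a running-average background model and a per-pixel "unchanged for N frames" rule for declaring a region static; thresholds and the hold count are illustrative.

```python
import numpy as np

class StaticRegionDetector:
    """Sketch of background subtraction with a static-region mask: pixels that
    differ from the background model but have not moved for `hold` consecutive
    frames are marked static (e.g. an abandoned object)."""

    def __init__(self, first_frame, alpha=0.01, diff_thresh=25, hold=30):
        self.background = first_frame.astype(np.float32)
        self.prev = first_frame.astype(np.float32)
        self.still_count = np.zeros(first_frame.shape, dtype=np.int32)
        self.alpha, self.diff_thresh, self.hold = alpha, diff_thresh, hold

    def update(self, frame):
        f = frame.astype(np.float32)
        foreground = np.abs(f - self.background) > self.diff_thresh
        unchanged = np.abs(f - self.prev) <= self.diff_thresh

        # count how long each foreground pixel has stayed unchanged
        self.still_count = np.where(foreground & unchanged, self.still_count + 1, 0)
        static_mask = self.still_count >= self.hold

        # update the background model only outside the foreground
        self.background = np.where(foreground, self.background,
                                   (1 - self.alpha) * self.background + self.alpha * f)
        self.prev = f
        return static_mask
```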