
Showing papers on "Inter frame published in 2015"


Journal ArticleDOI
TL;DR: A cosegmentation framework to discover and segment out common object regions across multiple frames and multiple videos in a joint fashion, which introduces a spatio-temporal scale-invariant feature transform (SIFT) flow descriptor to integrate across-video correspondence from conventional SIFT flow into interframe motion flow from optical flow.
Abstract: With ever-increasing volumes of video data, automatic extraction of salient object regions has become even more significant for visual analytic solutions. This surge has also opened up opportunities for taking advantage of collective cues encapsulated in multiple videos in a cooperative manner. However, it also brings up major challenges, such as handling drastic appearance, motion-pattern, and pose variations of foreground objects, as well as indiscriminate backgrounds. Here, we present a cosegmentation framework to discover and segment out common object regions across multiple frames and multiple videos in a joint fashion. We incorporate three types of cues, i.e., intraframe saliency, interframe consistency, and across-video similarity, into an energy optimization framework that does not make restrictive assumptions on foreground appearance and motion model, and does not require objects to be visible in all frames. We also introduce a spatio-temporal scale-invariant feature transform (SIFT) flow descriptor to integrate across-video correspondence from the conventional SIFT flow into interframe motion flow from optical flow. This novel spatio-temporal SIFT flow generates reliable estimations of common foregrounds over the entire video data set. Experimental results show that our method outperforms the state-of-the-art on a new extensive data set (ViCoSeg).

144 citations


Journal ArticleDOI
TL;DR: This paper presents a novel video stabilization and moving object detection system based on camera motion estimation that uses local feature extraction and matching to estimate global motion and demonstrates that Scale Invariant Feature Transform (SIFT) keypoints are suitable for the stabilization task.
Abstract: Aerial surveillance systems provide a large amount of data compared with traditional surveillance systems, but they usually suffer from undesired camera motion, which presents new challenges that must be overcome before such video can be widely used. In this paper, we present a novel video stabilization and moving object detection system based on camera motion estimation. We use local feature extraction and matching to estimate global motion, and we demonstrate that Scale Invariant Feature Transform (SIFT) keypoints are suitable for the stabilization task. After estimating the global camera motion parameters using an affine transformation, we detect moving objects by Kalman filtering. For motion smoothing, we use a median filter to retain the desired motion. Finally, motion compensation is carried out to obtain a stabilized video sequence. A number of aerial video examples demonstrate the effectiveness of our proposed system. We use VirtualDub with the Deshaker plugin for test purposes. For objective evaluation, we use Interframe Transformation Fidelity for the video stabilization task and Detection Ratio for the moving object detection task.

55 citations
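The pipeline above (match keypoints, fit a global affine model, median-filter the motion for smoothing) can be sketched in a few lines. This is an illustrative reconstruction under our own assumptions, not the authors' code: the function names are invented, the affine model is fit by plain least squares, and the keypoint matching step (SIFT) is assumed to have already produced point correspondences.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2x3 affine transform mapping src points to dst points."""
    n = len(src)
    A = np.zeros((2 * n, 6))
    b = np.zeros(2 * n)
    for i, ((x, y), (u, v)) in enumerate(zip(src, dst)):
        A[2 * i]     = [x, y, 1, 0, 0, 0]
        A[2 * i + 1] = [0, 0, 0, x, y, 1]
        b[2 * i], b[2 * i + 1] = u, v
    p, *_ = np.linalg.lstsq(A, b, rcond=None)
    return p.reshape(2, 3)

def smooth_trajectory(params, radius=2):
    """Median-filter a sequence of per-frame motion parameters to keep the
    intended camera motion and suppress jitter spikes."""
    params = np.asarray(params, dtype=float)
    out = np.empty_like(params)
    for i in range(len(params)):
        lo, hi = max(0, i - radius), min(len(params), i + radius + 1)
        out[i] = np.median(params[lo:hi], axis=0)
    return out
```

A pure translation by (2, 3) should be recovered exactly, and a one-frame jitter spike in a parameter trajectory is removed by the median filter.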


Proceedings ArticleDOI
07 Jun 2015
TL;DR: The proposed video deblurring method effectively leverages the information distributed across multiple video frames due to camera motion, jointly estimating the motion between consecutive frames and blur within each frame.
Abstract: Camera motion introduces motion blur, degrading the quality of video. A video deblurring method is proposed based on two observations: (i) camera motion within capture of each individual frame leads to motion blur; (ii) camera motion between frames yields inter-frame mis-alignment that can be exploited for blur removal. The proposed method effectively leverages the information distributed across multiple video frames due to camera motion, jointly estimating the motion between consecutive frames and blur within each frame. This joint analysis is crucial for achieving effective restoration by leveraging temporal information. Extensive experiments are carried out on synthetic data as well as real-world blurry videos. Comparisons with several state-of-the-art methods verify the effectiveness of the proposed method.

36 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed efficient and effective wide-view video stitching method outperforms existing methods in terms of overall stitching quality and computational efficiency.
Abstract: In computer vision, video stitching is a very challenging problem. In this paper, we propose an efficient and effective wide-view video stitching method based on fast structure deformation that is capable of simultaneously achieving quality stitching and computational efficiency. For a group of synchronized frames, firstly, an effective double-seam selection scheme is designed to search two distinct but structurally corresponding seams in the two original images. The seam location of the previous frame is further considered to preserve interframe consistency. Secondly, along the double seams, 1-D feature detection and matching is performed to capture the structural relationship between the two adjacent views. Thirdly, after feature matching, we propose an efficient algorithm to linearly propagate the deformation vectors to eliminate structure misalignment. At last, image intensity misalignment is corrected by rapid gradient fusion based on the successive over-relaxation iteration (SORI) solver. A principled solution to the initialization of the SORI significantly reduces the number of iterations required. We have compared our method favorably with seven state-of-the-art image and video stitching algorithms as well as traditional ones. Experimental results show that our method outperforms the existing methods in terms of overall stitching quality and computational efficiency.

35 citations
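Gradient-domain fusion reduces to a large sparse linear system, which SOR solves iteratively. The sketch below is our own illustration of plain SOR on a tiny system, not the paper's solver; it also shows why a good initial guess (the paper's "principled initialization") cuts the iteration count.

```python
import numpy as np

def sor_solve(A, b, omega=1.5, x0=None, tol=1e-10, max_iter=10000):
    """Successive over-relaxation for A x = b (A should be SPD or
    diagonally dominant for convergence). Returns (x, iterations used)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    for it in range(max_iter):
        for i in range(n):
            # Gauss-Seidel update, over-relaxed by omega
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(A @ x - b) < tol:
            return x, it + 1
    return x, max_iter
```

Warm-starting from a near-solution converges almost immediately, which mirrors the paper's point that initialization dominates the iteration budget.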


Proceedings ArticleDOI
03 Dec 2015
TL;DR: It is observed that perceived video quality generally increases with frame rate, but the gain saturates at high rates, and such gain also depends on the interactions between quantization level, spatial resolution, and spatial and motion complexities.
Abstract: High frame rate video has been a hot topic in the past few years driven by a strong need in the entertainment and gaming industry. Nevertheless, progress on perceptual quality assessment of high frame rate video remains limited, making it difficult to evaluate the exact perceptual gain by switching from low to high frame rates. In this work, we first conduct a subjective quality assessment experiment on a database that contains videos compressed at different frame rates, quantization levels and spatial resolutions. We then carry out a series of analysis on the subjective data to investigate the impact of frame rate on perceived video quality and its interplay with quantization level, spatial resolution, spatial complexity, and motion complexity. We observe that perceived video quality generally increases with frame rate, but the gain saturates at high rates. Such gain also depends on the interactions between quantization level, spatial resolution, and spatial and motion complexities.

31 citations


Journal ArticleDOI
TL;DR: The proposed MC models are harnessed and a comprehensive analysis of the system is presented to qualitatively predict the experimental results; the theory is shown to explain the empirical behavior qualitatively.
Abstract: Block-based motion estimation (ME) and motion compensation (MC) techniques are widely used in modern video processing algorithms and compression systems. The great variety of video applications and devices results in diverse compression specifications, such as frame rates and bit rates. In this paper, we study the effect of frame rate and compression bit rate on block-based ME and MC as commonly utilized in inter-frame coding and frame rate up-conversion (FRUC). This joint examination yields a theoretical foundation for comparing MC procedures in coding and FRUC. First, the video signal is locally modeled as a noisy translational motion of an image. Then, we theoretically model the motion-compensated prediction of available and absent frames as in coding and FRUC applications, respectively. The theoretic MC-prediction error is studied further and its autocorrelation function is calculated, yielding useful separable simplifications for the coding application. We argue that a linear relation exists between the variance of the MC-prediction error and temporal distance. While the relevant distance in MC coding is between the predicted and reference frames, MC-FRUC is affected by the distance between the frames available for interpolation. We compare our estimates with experimental results and show that the theory explains the empirical behavior qualitatively. Then, we use the proposed models to analyze a system for improving video coding at low bit rates using spatio-temporal scaling. Although this concept is practically employed in various forms, so far it has lacked a theoretical justification. Here we harness the proposed MC models and present a comprehensive analysis of the system to qualitatively predict the experimental results.

31 citations
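The claimed linear relation between MC-prediction-error variance and temporal distance can be checked with a toy simulation: if each frame adds an independent innovation (a crude stand-in for the paper's noisy-translational model), the prediction error over distance d accumulates d innovations, so its variance grows roughly like d·σ². This is our own illustration, not the paper's derivation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_error_variance(distance, num_frames=2000, sigma=1.0):
    """Empirical variance of the (perfectly motion-compensated) prediction
    error when each frame adds an independent innovation of variance sigma^2."""
    innovations = rng.normal(0.0, sigma, size=num_frames)
    signal = np.cumsum(innovations)  # random-walk frame content
    err = signal[distance:] - signal[:-distance]
    return err.var()

# Variance grows roughly linearly with temporal distance (~1x vs ~4x here).
v1, v4 = mc_error_variance(1), mc_error_variance(4)
```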


Patent
13 Feb 2015
TL;DR: A method is presented for identifying a set of key video frames from a video sequence by extracting feature vectors for each video frame and applying a group sparsity algorithm to represent the feature vector for a particular video frame as a group-sparse combination of the feature vectors of the other video frames.
Abstract: A method for identifying a set of key video frames from a video sequence, comprising extracting feature vectors for each video frame and applying a group sparsity algorithm to represent the feature vector for a particular video frame as a group sparse combination of the feature vectors for the other video frames. Weighting coefficients associated with the group sparse combination are analyzed to determine video frame clusters of temporally-contiguous, similar video frames. The video sequence is segmented into scenes by identifying scene boundaries based on the determined video frame clusters.

24 citations


Journal ArticleDOI
TL;DR: A novel algorithm, which is called mixed lossy and lossless (MLL) reference frame recompression, is proposed in this paper, which differs from its previous designs and achieves a much higher compression ratio.
Abstract: Frame recompression is an efficient way to reduce the huge external memory bandwidth of a video encoder, especially for P/B frame compression. A novel algorithm, called mixed lossy and lossless (MLL) reference frame recompression, is proposed in this paper. The bandwidth reduction comes from two sources in our scheme, which differs from previous designs and achieves a much higher compression ratio. First, it comes from pixel truncation. We use truncated pixels (PR) for integer motion estimation (IME) and acquire truncated residuals for fractional motion estimation (FME) and motion compensation (MC). Because the pixel access of IME is much larger than that of FME and MC, this saves about 37.5% bandwidth under 3-b truncation. Second, embedded compression of PR helps to further reduce data. The truncated pixels in the first stage greatly help to achieve a higher compression ratio than current designs. From our experiments, 3-b truncated PR can be compressed to 15.4% of the original data size, while most current embedded compressions can only achieve around 50%. For PR compression, two methods are proposed: in-block prediction and small-value optimized variable length coding. With these experiments, the total bandwidth can be reduced to 25.5%. Our proposed MLL is a hardware/software-friendly and fast-IME-algorithm-friendly frame recompression scheme. It is more suitable for working together with the data-reuse strategy than the previous schemes, and the video quality degradation is controllable and negligible.

24 citations
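The pixel-truncation arithmetic is easy to make concrete: dropping the 3 least-significant bits of an 8-bit pixel saves 3/8 = 37.5% of the bandwidth, at the cost of a bounded quantization error. The sketch below is a generic illustration of truncation and mid-rise reconstruction, not the paper's MLL pipeline (which additionally compresses the truncated data).

```python
import numpy as np

BITS_TOTAL, BITS_TRUNC = 8, 3

def truncate_pixels(frame, bits=BITS_TRUNC):
    """Drop the `bits` least-significant bits of each 8-bit pixel."""
    return np.asarray(frame, dtype=np.uint8) >> bits

def reconstruct(truncated, bits=BITS_TRUNC):
    """Shift back and add half a quantization step to centre the error."""
    return (truncated.astype(np.uint8) << bits) + (1 << (bits - 1))

# The 37.5% figure from the paper: 3 of 8 bits are never transferred.
bandwidth_saving = BITS_TRUNC / BITS_TOTAL
```

With 3 truncated bits the reconstruction error is at most half a step plus rounding, i.e. 4 grey levels.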


Patent
30 Jan 2015
TL;DR: In this article, techniques for coding an ambient higher order ambisonic coefficient are described for decoding an audio bitstream with a memory and a processor, where the memory may store a first frame of a bitstream and a second frame of the bitstream.
Abstract: In general, techniques are described for coding an ambient higher order ambisonic coefficient. An audio decoding device comprising a memory and a processor may perform the techniques. The memory may store a first frame of a bitstream and a second frame of the bitstream. The processor may obtain, from the first frame, one or more bits indicative of whether the first frame is an independent frame that includes additional reference information to enable the first frame to be decoded without reference to the second frame. The processor may further obtain, in response to the one or more bits indicating that the first frame is not an independent frame, prediction information for first channel side information data of a transport channel. The prediction information may be used to decode the first channel side information data of the transport channel with reference to second channel side information data of the transport channel.

23 citations


Patent
23 Sep 2015
TL;DR: In this paper, an inter-frame prediction method in a hybrid video coding standard was proposed to further improve the coding performance of a video by obtaining motion information of a plurality of adjacent coded blocks around a current coding block.
Abstract: An inter-frame prediction method in a hybrid video coding standard belongs to the field of video coding. In order to effectively process deformation movement existing in a video sequence, the present invention puts forward an inter-frame prediction method in a hybrid video coding standard for further improving the coding performance of a video. The inter-frame prediction method comprises the steps of: obtaining motion information of a plurality of adjacent coded blocks around a current coding block; obtaining a reference index of each dividing unit in the current coding block according to the obtained reference indexes of the adjacent coded blocks; and processing motion vectors of the adjacent coded blocks according to the obtained reference indexes of the adjacent coded blocks and the obtained reference index of each dividing unit in the current coding block, so as to obtain a motion vector of each dividing unit in the current coding block. According to the inter-frame prediction method of the present invention, motion information of the current coding block is predicted from the motion information of the adjacent coded blocks, so that deformation movement existing in the video sequence can be effectively described, and coding efficiency can be further improved.

18 citations
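One common ingredient of such neighbour-based MV prediction is scaling each neighbour's motion vector by the ratio of reference-frame distances before combining them. The sketch below is a simplified illustration of that idea only; the function name, inputs, and the plain averaging rule are our invention, not the patent's exact procedure.

```python
import numpy as np

def derive_unit_mv(neighbor_mvs, neighbor_refs, unit_ref):
    """Scale each neighbour MV to the unit's reference distance, then average.

    neighbor_mvs : list of (mvx, mvy) from adjacent coded blocks
    neighbor_refs: temporal distance of each neighbour's reference frame
    unit_ref     : temporal distance of this dividing unit's reference frame
    """
    scaled = [np.array(mv, dtype=float) * (unit_ref / ref)
              for mv, ref in zip(neighbor_mvs, neighbor_refs)]
    return np.mean(scaled, axis=0)
```

For example, a neighbour moving (2, 0) per frame over distance 1 and one moving (4, 0) over distance 2 both imply (4, 0) at distance 2.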


Patent
Ximin Zhang1, Sang-Hee Lee1
25 Mar 2015
TL;DR: Techniques are discussed for determining a quantization parameter for a frame of a video sequence and modifying the quantization parameter based on a spatial complexity or a temporal complexity associated with the video frame.
Abstract: Techniques related to constant quality video coding are discussed. Such techniques may include determining a quantization parameter for a frame of a video sequence, modifying the quantization parameter based on a spatial complexity or a temporal complexity associated with the video frame, and generating a block level quantization parameter for a block of the video frame based on the modified frame level quantization parameter, a complexity of the block, and a complexity of the video frame.

Patent
Neelesh N. Gokhale1
03 Aug 2015
TL;DR: A decoder adapted to generate an intermediate decoded version of a video frame from an encoded version of the video frame, determine either an amount of high-frequency basis functions or coefficients below a quantization threshold for at least one block of the frame, and generate a final decoded version of the video frame based at least in part on the intermediate decoded version and the determined amount(s) for the one or more blocks.
Abstract: A decoder adapted to generate an intermediate decoded version of a video frame from an encoded version of the video frame, determine either an amount of high frequency basis functions or coefficients below a quantization threshold for at least one block of the video frame, and generate a final decoded version of the video frame based at least in part on the intermediate decoded version of the video frame and the determined amount(s) for the one or more blocks of the video frame, is disclosed. In various embodiments, the decoder may be incorporated as a part of a video system.

Patent
18 Mar 2015
TL;DR: In this article, the quality of static image frames having a relatively long residence time in a frame buffer on a sink device is improved by encoding additional information to improve the representation of the now static frame.
Abstract: Systems, apparatus, methods, and computer-readable media are described for improving the quality of static image frames having a relatively long residence time in a frame buffer on a sink device. Where a compressed data channel links a source and sink, the source may encode additional frame data to improve the quality of a static frame presented by a sink display. A display source may encode frame data at a nominal quality and transmit a packetized stream of the compressed frame data. In the absence of a timely frame buffer update, the display source encodes additional information to improve the image quality of the representation of the now-static frame. A display sink device presents a first representation of the frame at the nominal image quality, and presents a second representation of the frame at the improved image quality upon subsequently receiving the frame quality improvement data.

Patent
06 May 2015
TL;DR: A video coder and an inter-frame mode selection method and device thereof are described; the problem of high calculation complexity of mode selection in the prior art is solved, reducing complexity and improving coding speed.
Abstract: The invention discloses a video coder, a method and a device, and an inter-frame mode selection method and device thereof. When the initial value of the current depth Depth is 1, the inter-frame mode selection method comprises the following steps: S701, calling S703 to S705 if the calculation of the coding expenditure for coding the coding unit CUDepth is skipped, and calling S702 to S705 if the calculation is not skipped; S702, determining the current optimal coding mode and coding expenditure of the coding unit CUDepth; S703, dividing the coding unit CUDepth into a plurality of coding sub-units, recursively executing S701 to S705 until the depths of the coding sub-units reach the maximum or meet a division end condition, and determining the optimal coding mode and coding expenditure of each coding sub-unit; S704, comparing the sum of the coding expenditures of the multiple coding sub-units with the current coding expenditure of the coding unit CUDepth; S705, determining the mode corresponding to the smaller value in S704 as the optimal coding mode. The problem of high calculation complexity of mode selection in the prior art is solved, reducing the complexity and improving the coding speed.
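The recursive split decision in S701 to S705 boils down to comparing the cost of coding a CU whole against the summed cost of its four sub-units. A minimal sketch with a stand-in cost function (our illustration under simplified assumptions, not the disclosed implementation):

```python
def select_cu_mode(cost_fn, depth, max_depth):
    """Recursively choose between coding a CU whole or splitting it in four.

    cost_fn(depth) returns the RD cost of coding one CU at `depth`
    (a stand-in for the real mode decision). Returns (best_cost, split?).
    """
    whole = cost_fn(depth)
    if depth == max_depth:
        return whole, False            # leaf: cannot split further
    split_cost = sum(select_cu_mode(cost_fn, depth + 1, max_depth)[0]
                     for _ in range(4))
    return (split_cost, True) if split_cost < whole else (whole, False)
```

If the top-level CU is expensive relative to its sub-units, splitting wins; otherwise coding it whole wins.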

Journal ArticleDOI
TL;DR: The proposed video stabilization technique based on optimized dynamic time warping (DTW) proves robust for various applications, such as moving platforms and night shooting, over existing intensity-based motion estimation techniques.
Abstract: In this paper, a video stabilization technique based on optimized dynamic time warping (DTW) is proposed. The method uses integral frame projection warping (IFPW) for relative motion estimation between consecutive frames. DTW is implemented using dynamic programming, giving a significant reduction in memory and processing time. Most motion estimation techniques are adversely affected by blurring, low intensity, and large displacements. The proposed method gives accurate results under these conditions and proves robust for various applications, such as moving platforms and night shooting, over existing intensity-based motion estimation techniques. The efficiency of the proposed IFPW technique for dark video and blurry frames is measured using motion estimation error analysis. Overall performance evaluation of the stabilization system is done using interframe transformation fidelity and processing time.
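DTW over 1-D integral frame projections can be sketched directly with dynamic programming. This is a minimal, unoptimized version for illustration; it omits the paper's memory and processing-time reductions:

```python
import numpy as np

def row_projection(frame):
    """Integral (row-wise) projection of a 2-D frame: one value per row."""
    return np.asarray(frame, dtype=float).sum(axis=1)

def dtw_distance(a, b):
    """Dynamic-time-warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible warping steps
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

Identical sequences have distance 0, and so do time-warped copies of each other, which is exactly what makes DTW tolerant to uneven motion.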

Book ChapterDOI
07 Oct 2015
TL;DR: An efficient forensic method based on a motion vector pyramid (MVP) and its variation factor (VF) is proposed to detect frame deletion and duplication in videos with static backgrounds; results show that the proposed method is efficient at forgery identification and localization.
Abstract: Frame deletion and duplication are common inter-frame tampering methods in digital videos. In this paper, an efficient forensic method based on a motion vector pyramid (MVP) and its variation factor (VF) is proposed to detect frame deletion and duplication in videos with static backgrounds. The method is composed of two parts: feature extraction and discontinuity point detection. In the feature extraction stage, each frame of the video is first converted to a grayscale image. Then, the motion vector pyramid (MVP) sequence and its corresponding variation factor (VF) are calculated for every two adjacent frames. In the discontinuity point detection stage, the forgery type is identified and the tampering point is localized by performing a modified generalized ESD test. Experimental results show that the proposed method is efficient at forgery identification and localization. Compared with other existing methods for inter-frame forgery detection, our proposed method is more generic.
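The variation-factor idea, flagging frames where motion strength changes abruptly, can be sketched with a plain z-score outlier test standing in for the modified generalized ESD test. This is a deliberate simplification of ours: the real method operates on motion vector pyramids, not scalar motion strengths.

```python
import numpy as np

def variation_factor(motion_strength):
    """Relative change of inter-frame motion strength between neighbours
    (a simplified stand-in for the paper's MVP variation factor)."""
    m = np.asarray(motion_strength, dtype=float)
    return np.abs(np.diff(m)) / (m[:-1] + 1e-9)

def discontinuity_points(vf, z_thresh=3.0):
    """Flag positions whose variation factor is an outlier (simplified
    substitute for the generalized ESD test)."""
    z = (vf - vf.mean()) / (vf.std() + 1e-9)
    return np.flatnonzero(z > z_thresh)
```

A sudden motion spike between frames 9 and 10 of an otherwise static sequence is flagged at position 9.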

Proceedings ArticleDOI
06 Aug 2015
TL;DR: An inter-frame dependent rate-distortion optimization scheme is proposed and implemented on the newest video coding standard High Efficiency Video Coding (HEVC) platform, which obtains a significantly higher coding gain than the multiple QP (±3) optimization technique.
Abstract: It is known that, in the current hybrid video coding structure, spatial and temporal prediction techniques are extensively used, which introduces strong dependency among coding units. Such dependency poses a great challenge to performing a global rate-distortion optimization (RDO) when encoding a video sequence. RDO is usually performed in a way that the coding efficiency of each coding unit is optimized independently without considering dependency among coding units, leading to a suboptimal coding result for the whole sequence. In this paper, we investigate inter-frame dependent RDO, where the impact of the coding performance of the current coding unit on that of the following frames is considered. Accordingly, an inter-frame dependent rate-distortion optimization scheme is proposed and implemented on the newest video coding standard High Efficiency Video Coding (HEVC) platform. Experimental results show that the proposed scheme can achieve about 3.19% BD-rate saving on average over the state-of-the-art HEVC codec (HM15.0) in the low-delay B coding structure, with no extra encoding time. It obtains a significantly higher coding gain than the multiple QP (±3) optimization technique, which would greatly increase the encoding time by a factor of about 6. Coupled with the multiple QP optimization, the proposed scheme can further achieve higher BD-rate savings of 5.57% and 4.07% on average over the HEVC codec and the multiple QP optimization enabled HEVC codec, respectively.

Journal ArticleDOI
TL;DR: Simulation results confirm that these novel fuzzy frameworks outperform other state-of-the-art techniques in terms of objective criteria, as well as subjective visual perception in the various color sequences.

Patent
31 Mar 2015
TL;DR: In this paper, video encoding parameters employed by the video data senders, including at least the video frame size and/or video frame rate, can be dynamically adapted to the available bandwidths of the video receivers, taking into account possible effects of spatial and temporal scaling of video frames on the resulting video QoE.
Abstract: Improved systems and methods of performing multimedia communications over multimedia communications networks, in which video data senders can maintain high video quality of experience (QoE) levels with increased reliability despite changes in available bandwidths of video data receivers. In the disclosed systems and methods, video encoding parameters employed by the video data senders, including at least the video frame size and/or the video frame rate, can be dynamically adapted to the available bandwidths of the video data receivers, taking into account possible effects of spatial scaling and/or temporal scaling of video frames on the resulting video QoE.

Journal ArticleDOI
TL;DR: The three-step search (TSS) block matching algorithm is implemented on different types of video sequences, and it is shown that the three-step search algorithm produces better quality performance and less computational time compared with the exhaustive full search algorithm.
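A minimal three-step search, probing eight neighbours plus the centre and halving the step each pass, with SAD as the matching criterion. This is our sketch of the classic algorithm, not the paper's implementation:

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return np.abs(block_a.astype(int) - block_b.astype(int)).sum()

def three_step_search(ref, cur, top, left, block=8, step=4):
    """Find the motion vector (dy, dx) of the block at (top, left) in `cur`
    within `ref`: probe 8 neighbours + centre, recentre, halve the step."""
    target = cur[top:top + block, left:left + block]
    cy, cx = top, left
    while step >= 1:
        best = (sad(ref[cy:cy + block, cx:cx + block], target), cy, cx)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                y, x = cy + dy, cx + dx
                if (0 <= y <= ref.shape[0] - block
                        and 0 <= x <= ref.shape[1] - block):
                    best = min(best, (sad(ref[y:y + block, x:x + block],
                                          target), y, x))
        _, cy, cx = best
        step //= 2
    return cy - top, cx - left
```

With steps 4, 2, 1 the search visits at most 25 candidates instead of the 225 a ±7 full search needs, which is where the speed-up in the paper comes from.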

Proceedings ArticleDOI
01 Dec 2015
TL;DR: This paper first derives a residual energy model, considering the major factors that may impact the motion vector resolution, including the texture complexity, motion scale, inter-frame noise, and quantization parameter.
Abstract: In classical block-based video coding, a motion vector is derived for each coding block to remove the inter-frame redundancy. However, the motion vector resolution is usually restricted to be identical, typically 1/4-pixel resolution, regardless of the different video contents. In this paper, we propose an algorithm that can adaptively select the optimal motion vector resolution at the frame level according to the characteristics of the video contents. We first derive a residual energy model, in which the major factors that may impact the motion vector resolution are considered, including the texture complexity, motion scale, inter-frame noise, and quantization parameter. Experimental results show that the proposed scheme can achieve 1.8% BD-rate gain on average with no increase in complexity.

Patent
Ximin Zhang1, Sang-Hee Lee1
13 May 2015
TL;DR: Techniques related to designating golden frames and to determining frame sizes and/or quantization parameters for video coding are discussed; such techniques may include designating a frame as a golden frame or a non-golden frame based on whether the frame is a scene change frame, the distance of the frame to a previous golden frame, and an average temporal distortion.
Abstract: Techniques related to designating golden frames and to determining frame sizes and/or quantization parameters for golden and non-golden frames in video coding are discussed. Such techniques may include designating a frame as a golden frame or a non-golden frame based on whether the frame is a scene change frame, the distance of the frame to a previous golden frame, and an average temporal distortion of the frame, and determining a frame size and/or quantization parameter for the frame based on the designation and a temporal distortion of the frame.

Book ChapterDOI
01 Jan 2015
TL;DR: The task of digital video stabilization in static scenes is investigated; the use of a fuzzy Takagi-Sugeno-Kang model for detecting the best local and global motion vectors is the novelty of the approach.
Abstract: In recent years, digital video stabilization, improving the results of hand-held shooting or shooting from mobile platforms, has become a very popular approach. In this chapter, the task of digital video stabilization in static scenes is investigated. The unwanted motion caused by camera jitter or vibration ought to be separated from the motion of objects in a scene. Our contribution concerns the development of a deblurring method to find and improve the blurred frames, which have a strong negative influence on the following processing results. The use of a fuzzy Takagi-Sugeno-Kang model for detecting the best local and global motion vectors is the novelty of our approach. The quality of test video stabilization was estimated by Peak Signal to Noise Ratio (PSNR) and Interframe Transformation Fidelity (ITF) metrics. Experimental data confirmed that the average ITF estimations increase by 3–4 dB, or 15–20%, relative to the original video sequences.
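ITF, used as the quality metric here and in several of the stabilization papers above, is simply the mean PSNR between consecutive frames: the steadier the video, the more alike consecutive frames are and the higher the score. A direct sketch:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two frames, in dB."""
    mse = np.mean((np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def itf(frames):
    """Interframe Transformation Fidelity: mean PSNR over consecutive pairs."""
    return np.mean([psnr(frames[i], frames[i + 1])
                    for i in range(len(frames) - 1)])
```

A perfectly static sequence has infinite ITF; a constant offset of 5 grey levels between two frames gives 10·log10(255²/25) ≈ 34.15 dB.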

Patent
Aki Kuusela1
21 Sep 2015
TL;DR: Low-latency two-pass video coding may include identifying an input frame from an input video stream, determining a reduced frame from the input frame, encoding the reduced frame using a first encoder, generating a plurality of encoding parameters based on encoding the reduced frame, generating an encoded frame by encoding the input frame using the first encoder and the plurality of encoding parameters, including the encoded frame in an output bitstream, and storing or transmitting the output bitstream.
Abstract: Low-latency two-pass video coding may include identifying an input frame from an input video stream, determining a reduced frame from the input frame, the reduced frame having a size smaller than a size of the input frame, encoding the reduced frame using a first encoder, generating a plurality of encoding parameters based on encoding the reduced frame, generating an encoded frame by encoding the input frame using the first encoder and the plurality of encoding parameters, including the encoded frame in an output bitstream, and storing or transmitting the output bitstream.

Patent
20 May 2015
TL;DR: In this article, a quick HEVC (High Efficiency Video Coding) inter-frame prediction mode selection method was proposed, where the correlation of the video texture direction and an interframe prediction angle were fully considered.
Abstract: The invention discloses a quick HEVC (High Efficiency Video Coding) inter-frame prediction mode selection method. After coarse selection of inter-frame prediction modes, the statistical properties of the Hadamard-transform-based cost values corresponding to the coarsely selected inter-frame prediction modes are fully utilized, and the correlation of the video texture direction with the inter-frame prediction mode angle is fully considered. For prediction units of different sizes, the coarsely selected inter-frame prediction modes are quickly screened through a threshold method, or their continuity is calculated to reflect the texture direction of the prediction unit, so that unnecessary coarsely selected inter-frame prediction modes are screened out without introducing much extra calculation. In the verification process of the most probable prediction mode, the correlation of the coarsely selected inter-frame prediction modes with the most probable prediction mode and the spatial correlation of the video image itself are fully considered, the final optimal inter-frame prediction mode is quickly obtained, and the inter-frame coding complexity is reduced on the premise that the video coding quality is guaranteed.

Patent
15 Jan 2015
TL;DR: Watermark data is converted to watermark coefficients, which may be embedded in an image by converting the image to a frequency domain, embedding the watermark in image coefficients corresponding to medium-frequency components, and converting the modified coefficients back to the spatial domain.
Abstract: Watermark data is converted to watermark coefficients, which may be embedded in an image by converting the image to a frequency domain, embedding the watermark in image coefficients corresponding to medium-frequency components, and converting the modified coefficients to the spatial domain. The watermark data is extracted from the modified image by converting the modified image to a frequency domain, extracting the watermark coefficients from the image coefficients, and determining the watermark data from the watermark coefficients. The watermark data may be truncated image data bits such as truncated least significant data bits. After extraction from the watermark, the truncated image data bits may be combined with data bits representing the original image to increase the bit depth of the image. Watermark data may include audio data portions corresponding to a video frame, reference frames temporally proximate to a video frame, high-frequency content, sensor calibration information, or other image data.
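A minimal round-trip sketch of the frequency-domain embedding described above, assuming a 4x4 DCT block and quantization-index-modulation (QIM) at two mid-frequency positions; the positions `MID` and step size `STEP` are illustrative choices, not the patent's:

```python
import math

# Orthonormal 4x4 DCT-II basis matrix.
N = 4
C = [[math.sqrt((1 if u == 0 else 2) / N) *
      math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dct2(block):   return matmul(matmul(C, block), transpose(C))
def idct2(coefs):  return matmul(matmul(transpose(C), coefs), C)

MID = [(1, 2), (2, 1)]   # assumed mid-frequency coefficient positions
STEP = 8.0               # assumed quantization step for embedding

def embed(block, bits):
    F = dct2(block)
    for (u, v), bit in zip(MID, bits):
        q = round(F[u][v] / STEP)
        if q % 2 != bit:          # force quantizer-index parity to carry the bit
            q += 1
        F[u][v] = q * STEP
    return idct2(F)

def extract(block):
    F = dct2(block)
    return [round(F[u][v] / STEP) % 2 for (u, v) in MID]

img = [[16 * (x + y) for x in range(N)] for y in range(N)]
marked = embed(img, [1, 0])
```

Mid-frequency positions are the usual compromise: low frequencies carry too much visible energy to perturb, while high frequencies are the first casualties of compression.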

Posted Content
TL;DR: In this article, an iterative rate-matching process was proposed for inter-frame decoding in broadcast wireless communication, such that the code rate of each frame is progressively lowered to or below the appropriate value, prior to applying or re-applying conventional physical-layer channel decoding on it.
Abstract: A novel inter-frame coding approach to the problem of varying channel-state conditions in broadcast wireless communication is developed in this paper; this problem causes the appropriate code-rate to vary across different transmitted frames and different receivers as well. The main aspect of the proposed approach is that it incorporates an iterative rate-matching process into the decoding of the received set of frames, such that: throughout inter-frame decoding, the code-rate of each frame is progressively lowered to or below the appropriate value, prior to applying or re-applying conventional physical-layer channel decoding on it. This iterative rate-matching process is asymptotically analyzed in this paper. It is shown to be optimal, in the sense defined in the paper. Consequently, the data-rates achievable by the proposed scheme are derived. Overall, it is concluded that, compared to the existing solutions, inter-frame coding presents a better complexity versus data-rate tradeoff. In terms of complexity, the overhead of inter-frame decoding includes operations that are similar in type and scheduling to those employed in the relatively simple iterative erasure decoding. In terms of data-rates, compared to the state-of-the-art two-stage scheme involving both error-correcting and erasure coding, inter-frame coding increases the data-rate by a factor that reaches up to 1.55x.
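The abstract compares the decoder's scheduling to iterative erasure decoding; a toy peeling decoder conveys that flavour. Here parity "frames" are XORs of subsets of data frames, and each iteration recovers any frame that is the only unknown in some parity equation. This is a generic illustration, not the paper's actual rate-matching scheme:

```python
# Peeling decoder over XOR parity equations across frames.

def peel(known, parities):
    """known: dict frame_id -> payload (int); parities: list of (frame_ids, xor)."""
    progress = True
    while progress:
        progress = False
        for ids, xor_val in parities:
            unknown = [i for i in ids if i not in known]
            if len(unknown) == 1:
                acc = xor_val
                for i in ids:
                    if i in known:
                        acc ^= known[i]       # cancel known payloads
                known[unknown[0]] = acc       # the residue is the missing frame
                progress = True
    return known

frames = {0: 0b1010, 1: 0b0110, 2: 0b1111}
parities = [([0, 1], frames[0] ^ frames[1]),
            ([1, 2], frames[1] ^ frames[2])]
received = {0: frames[0]}                     # frames 1 and 2 erased
recovered = peel(dict(received), parities)
```

Recovering frame 1 from the first parity makes the second parity solvable for frame 2, which is the cascading behaviour the iterative process exploits.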

Book ChapterDOI
01 Jan 2015
TL;DR: The proposed application of fuzzy logic operators improves the separation results between the unwanted motion and the real motion of rigid objects and the corrective algorithm compensates the unwantedmotion in frames; thereby the scene is aligned.
Abstract: Digital video stabilization removes unintentional motion from video sequences caused by camera vibrations under external conditions, by the motion of robot-stabilized platforms over rugged terrain or at sea, or by jitter during non-professional hand-held shooting. The approaches to digital video stabilization in static and dynamic scenes are similar; however, the analysis of dynamic scenes objectively requires more advanced intelligent methods. Several sequential stages are involved: the choice of key frames, local and global motion estimation, the jitter compensation algorithm, the inpainting of frame boundaries, and the restoration of blurred frames, for which novel methods and algorithms were developed. The proposed application of fuzzy logic operators improves the separation between the unwanted motion and the real motion of rigid objects. The corrective algorithm compensates the unwanted motion in frames, thereby aligning the scene. The quality of stabilization in test video sequences was estimated with the Peak Signal to Noise Ratio (PSNR) and Interframe Transformation Fidelity (ITF) metrics. In the experiments, PSNR and ITF estimates were obtained for six video sequences captured by a static camera and eight captured by a moving camera. The ITF estimates increased by 3–4 dB, or 15–20 %, relative to the original video sequences.
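The two quality metrics named above can be sketched directly; frames are assumed to be flat lists of 8-bit grayscale samples. ITF is the mean inter-frame PSNR over consecutive frame pairs, so a well-stabilized sequence, which changes less between frames, scores higher:

```python
import math

def psnr(a, b, peak=255.0):
    """Peak Signal to Noise Ratio between two equally sized frames, in dB."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return float("inf") if mse == 0 else 10 * math.log10(peak * peak / mse)

def itf(frames):
    """Interframe Transformation Fidelity: average PSNR of adjacent frames."""
    return sum(psnr(a, b) for a, b in zip(frames, frames[1:])) / (len(frames) - 1)

shaky  = [[10, 20, 30], [30, 10, 20], [20, 30, 10]]   # large frame-to-frame change
steady = [[10, 20, 30], [11, 20, 30], [10, 21, 30]]   # small frame-to-frame change
```

On these toy sequences `itf(steady)` exceeds `itf(shaky)`, matching the paper's use of ITF gains as evidence of stabilization quality.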

Journal ArticleDOI
28 Dec 2015
TL;DR: This paper proposes the use of Hadamard-based Sum of Absolute Transformed Differences (SATD), in replacement of the traditionally used Sum of Absolute Differences (SAD), as a means of improving the efficiency of video coding.
Abstract: State-of-the-art video coding tools are submitted to severe performance and energy consumption requirements resulting from the high complexity of video standards and from the limited energy budgets of portable mobile devices. While providing most of the compression gains, inter-frame and intra-frame prediction techniques are the most demanding steps, since they compare a huge number of blocks. In such a process, the similarity metric employed affects both the quality of compression and the calculation effort. In this paper we propose the use of Hadamard-based Sum of Absolute Transformed Differences (SATD), in replacement of the traditionally used Sum of Absolute Differences (SAD), as a means of improving the efficiency of video coding. To allow that, we explore two Hadamard Transform methods to design efficient SATD architectures: one using the Fast Hadamard Transform (FHT) butterfly and another using the so-called Transform-Exempted (TE) SATD algorithm. Those methods were combined with architectural decisions (full parallelism, full parallelism with pipelining, or multi-cycling) to build a total of six Hadamard-based SATD architectures that were synthesized for a commercial 45nm standard cell library at two operating frequencies. The architectures were simulated with pixel block data to obtain realistic dynamic power and energy estimates. The TE-SATD architectures achieved the lowest energy results: down to 13.13 pJ/SATD in the case of the parallel architecture with pipeline. However, when area is considered alongside energy, the best results are given by both methods using multi-cycling (transpose buffer): nearly 20.75 pJ/SATD with up to 63.54% smaller area compared with fully parallel architectures.
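The SAD-versus-SATD contrast can be shown on a 4x4 residual block using the FHT butterfly mentioned above (additions and subtractions only, no multiplications). SATD scaling conventions vary between codecs; the raw transformed sum is used here:

```python
def sad(a, b):
    """Sum of Absolute Differences between two 4x4 blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def fht4(v):
    """4-point fast Hadamard butterfly: two stages of adds/subtracts."""
    a, b, c, d = v
    s0, s1, s2, s3 = a + b, a - b, c + d, c - d
    return [s0 + s2, s1 + s3, s0 - s2, s1 - s3]

def satd(a, b):
    """Hadamard-based Sum of Absolute Transformed Differences."""
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    rows = [fht4(r) for r in diff]            # transform rows...
    cols = [fht4(col) for col in zip(*rows)]  # ...then columns (separable 2D FHT)
    return sum(abs(v) for col in cols for v in col)

cur = [[i + j for j in range(4)] for i in range(4)]   # current block
ref = [[0] * 4 for _ in range(4)]                     # reference block
```

Because SATD measures the residual after a transform similar to the codec's own, it predicts the actual coding cost better than SAD, at the price of the butterfly's extra additions.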

Patent
29 Apr 2015
TL;DR: In this paper, a rapid inter-frame transcoding method for reducing video resolution based on HEVC is proposed, which is based on analyzing CU partitions and the similarity of PU encoding modes of HEVC high resolution video and corresponding low-resolution video.
Abstract: The invention relates to a rapid inter-frame transcoding method for reducing video resolution based on HEVC, and belongs to the technical field of video resolution reduction. The method starts by analyzing the similarity of the CU partitions and PU coding modes between HEVC high-resolution video and the corresponding low-resolution video, and determines the CU partition and PU prediction mode of the low-resolution video at the HEVC encoder end by utilizing prediction information, such as the CU partition and motion vectors (MVs) of the high-resolution video, available at the HEVC decoder end, so that the computational complexity of mode decision in the inter-frame transcoding process is reduced. With this method, the computational complexity of encoding can be greatly reduced while the bit-rate and video-quality losses remain very small, thus saving encoding time.
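The reuse idea can be illustrated with CU quadtree depths: when transcoding to half resolution, the depths chosen at the co-located high-resolution area bound the depths worth searching at a low-resolution CU. The depth maps and the 2:1 scale factor below are assumptions for illustration, not the patent's exact mapping:

```python
# Bound the low-resolution CU search depth by the co-located
# high-resolution CU depths, shrinking the mode decision.

def colocated_depth(hr_depths, x, y, scale=2):
    """Max CU depth over the high-res region co-located with low-res (x, y)."""
    return max(hr_depths[y * scale + dy][x * scale + dx]
               for dy in range(scale) for dx in range(scale))

def search_depths(hr_depths, x, y):
    """Candidate depths to evaluate at the low-res CU: 0..co-located max."""
    return list(range(colocated_depth(hr_depths, x, y) + 1))

hr = [[0, 0, 1, 2],
      [0, 0, 1, 1],
      [2, 2, 0, 0],
      [2, 3, 0, 0]]
depths = search_depths(hr, 1, 0)
```

A flat high-resolution region (depth 0) thus prunes the low-resolution search to a single depth, which is where most of the mode-decision savings come from.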