Showing papers on "Inter frame" published in 2016


Journal ArticleDOI
TL;DR: A fast reference frame selection algorithm for the HEVC encoder that reduces the encoding complexity of the best-reference-frame decision process with negligible rate-distortion performance degradation.

91 citations


Journal ArticleDOI
TL;DR: This paper investigates RDO with inter-frame dependency, where the impact of the coding performance of the current CU on that of the following frames is considered, and proposes an RDO scheme that accounts for this dependency by adapting the Lagrangian multiplier.
Abstract: Rate–distortion optimization (RDO) is widely used in video coding and plays a critical role in enhancing coding efficiency. Currently, the RDO process maximizes the coding efficiency of each coding unit (CU) independently, without considering the dependency among CUs. However, the current hybrid video coding structure makes extensive use of spatial/temporal prediction techniques, which introduce strong dependency among CUs. In this paper, we investigate RDO with inter-frame dependency, where the impact of coding performance of the current CU on that of the following frames is considered. Accordingly, an RDO scheme taking the inter-frame dependency into account is proposed by adapting the Lagrangian multiplier. The experimental results show that the proposed scheme can achieve about 3.22% and 3.19% BD-rate savings on average over the state-of-the-art High Efficiency Video Coding (HEVC) reference software HM15.0 in the low-delay P (LDP) and low-delay B (LDB) coding structures, respectively, with no extra encoding time. The proposed scheme obtains a significantly higher coding gain than the multiple quantization parameter (MQP) (±3) optimization technique, which increases the encoding time by a factor of about six. Coupled with MQP optimization, the proposed scheme can further achieve about 5.96% and 5.57% BD-rate savings on average over the HEVC and about 4.03% and 4.07% over the HEVC with MQP optimization, under the specified common test conditions for LDP and LDB coding structures, respectively.

82 citations
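
As a rough illustration of what adapting the Lagrangian multiplier means in the entry above, the sketch below picks a coding mode by minimizing J = D + λR. The `propagation_factor` and the rule λ' = λ / (1 + k) are assumptions made for this example only (they follow from weighting the distortion by 1 + k when it propagates to later frames); the abstract does not state the paper's actual adaptation formula, and the candidate numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    distortion: float  # D, e.g. sum of squared errors for this CU
    rate: float        # R, bits needed to code this CU


def rd_cost(cand, lam):
    """Lagrangian cost J = D + lambda * R."""
    return cand.distortion + lam * cand.rate


def adapted_lambda(base_lambda, propagation_factor):
    """Hypothetical adaptation, not the paper's formula.

    If the distortion of this CU is assumed to count (1 + k) times because
    later frames predict from it, minimizing (1 + k) * D + lambda * R is
    equivalent to minimizing D + (lambda / (1 + k)) * R.
    """
    return base_lambda / (1.0 + propagation_factor)


def best_mode(candidates, lam):
    """Return the candidate with the lowest Lagrangian cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Invented example values for one CU.
candidates = [
    Candidate("skip", distortion=100.0, rate=2.0),
    Candidate("inter", distortion=60.0, rate=10.0),
    Candidate("intra", distortion=40.0, rate=30.0),
]

independent = best_mode(candidates, 10.0)                    # CU treated in isolation
dependent = best_mode(candidates, adapted_lambda(10.0, 3.0))  # CU that is heavily referenced
```

With these made-up numbers, a lower multiplier shifts the decision toward spending more bits on a CU that later frames depend on.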


Journal ArticleDOI
TL;DR: This paper proposes an effective scheme for reversible data hiding in encrypted H.264/AVC video bitstreams and presents a theoretical analysis of the picture distortion caused by data embedding and the subsequent inter-frame distortion drift.

57 citations


Journal ArticleDOI
TL;DR: Extensive experiments demonstrate that the proposed method, which uses in-frame and inter-frame information to detect infrared moving small targets, achieves satisfactory detection effectiveness and robustness under complex cloud backgrounds.

50 citations


Patent
Jian Yanmei, Yi Qinglin, Yu Jie, Zheng Kai, Liu Sai 
17 Aug 2016
TL;DR: An ORB key frame closed-loop detection SLAM method that improves the consistency of the position and pose of the robot, yields a higher-quality environmental map, and offers high optimization efficiency.
Abstract: The invention discloses an ORB key frame closed-loop detection SLAM method capable of improving the consistency of the position and the pose of a robot. The ORB key frame closed-loop detection SLAM method comprises the following steps of, firstly, acquiring color information and depth information of the environment by adopting an RGB-D sensor, and extracting the image features by using the ORB features; then, estimating the position and the pose of the robot by an algorithm based on RANSAC-ICP interframe registration, and constructing an initial position and pose graph; and finally, constructing BoVW (bag of visual words) by extracting the ORB features in a Key Frame, carrying out similarity comparison on the current key frame and words in the BoVW to realize closed-loop key frame detection, adding constraint of the position and pose graph through key frame interframe registration detection, and obtaining the global optimal position and pose of the robot. The invention provides the ORB key frame closed-loop detection SLAM method with capability of improving the consistency of the position and the pose of the robot, higher constructing quality of an environmental map and high optimization efficiency.

46 citations


Journal ArticleDOI
TL;DR: A new passive approach is proposed for tampering detection and localization in MPEGx coded videos that can detect frame insertion or deletion and double compression with different GOP structures and lengths and reduce the effect of motion on residual errors of P frames.
Abstract: In this paper, a new passive approach is proposed for tampering detection and localization in MPEGx coded videos. The proposed algorithm can detect frame insertion or deletion and double compression with different GOP structures and lengths. To devise the proposed algorithm, the traces of quantization error on residual errors of P frames are mathematically studied. Then, based on the obtained guidelines, a new algorithm is proposed to detect quantization-error-rich areas in the P frames and reduce the effect of motion on residual errors of P frames. Subsequently, a wavelet-based algorithm is proposed to enrich the traces of quantization error in the frequency domain. Finally, the processed and spatially constrained residual errors of P frames are employed to detect and localize video forgery in the temporal domain. Experimental results and a comparison of the proposed method with an existing approach show the efficiency of the proposed algorithm, especially for videos with high compression rates. A new algorithm for inter-frame video forgery detection and localization is proposed. The traces of quantization error on residual errors of P frames are studied. The algorithm can detect inter-frame forgeries in videos with different GOP lengths. We detect quantization-error-rich areas in P frames to reduce the effect of motion.

43 citations


Patent
Eunyong Son, Park Seoungwook, Yongjoon Jeon, Heo Jin, Koo Moonmo, Sunmi Yoo
25 Aug 2016
TL;DR: In this article, a prediction sample can be adaptively filtered according to the intra prediction mode, and intra prediction performance can be improved by applying filtering to the prediction sample on the basis of the filtering reference samples.
Abstract: An intra prediction method according to the present invention comprises the steps of: acquiring intra prediction mode information from a bitstream; deriving neighboring samples of a current block; determining an intra prediction mode for the current block on the basis of the intra prediction mode information; deriving a prediction sample of the current block on the basis of the intra prediction mode and the neighboring samples; determining filtering reference samples for the prediction sample on the basis of the intra prediction mode; and deriving a filtered prediction sample by applying filtering to the prediction sample on the basis of the filtering reference samples. According to the present invention, a prediction sample can be adaptively filtered according to the intra prediction mode, and intra prediction performance can be improved.

36 citations


Patent
Ho-Sang Sung, Nam-Suk Lee
04 Mar 2016
TL;DR: A frame error concealment (FEC) method selects an FEC mode for a time domain signal generated after time-frequency inverse transform processing and applies the corresponding time domain concealment to the current frame, which is either an error frame or a normal frame that follows an error frame.
Abstract: Disclosed are a frame error concealment method and apparatus and an audio decoding method and apparatus. The frame error concealment (FEC) method includes: selecting an FEC mode based on at least one of a state of at least one frame and a phase matching flag, with regard to a time domain signal generated after time-frequency inverse transform processing; and performing corresponding time domain error concealment processing on the current frame based on the selected FEC mode, wherein the current frame is an error frame or the current frame is a normal frame when the previous frame is an error frame.

34 citations


Posted Content
TL;DR: This work considers the problem of next frame prediction from video input with a recurrent convolutional neural network trained to predict depth from monocular video input, and produces results which are visually and numerically superior to existing methods that directly predict the next frame.
Abstract: We consider the problem of next frame prediction from video input. A recurrent convolutional neural network is trained to predict depth from monocular video input, which, along with the current video image and the camera trajectory, can then be used to compute the next frame. Unlike prior next-frame prediction approaches, we take advantage of the scene geometry and use the predicted depth for generating the next frame prediction. Our approach can produce rich next frame predictions which include depth information attached to each pixel. Another novel aspect of our approach is that it predicts depth from a sequence of images (e.g. in a video), rather than from a single still image. We evaluate the proposed approach on the KITTI dataset, a standard dataset for benchmarking tasks relevant to autonomous driving. The proposed method produces results which are visually and numerically superior to existing methods that directly predict the next frame. We show that the accuracy of depth prediction improves as more prior frames are considered.

30 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed forensic and counter anti-forensic methods not only outperform existing methods in detecting frame deletion and anti-forensics, but also outperform them in the VIF game.
Abstract: Among different types of video manipulations, video inter-frame forgery is a powerful and common tampering operation. Several forensic and anti-forensic techniques have been proposed to deal with this challenge. In this paper, we first improve an existing video frame deletion detection algorithm. The improvement is attributed to the combination of two properties resulting from video frame deletion: the periodicity and the magnitude of the fingerprint in the P-frame prediction error. We then analyze a typical anti-forensic method of video frame deletion, and prove that the fingerprint of frame deletion can still be discovered after being anti-forensically modified. We thus further propose a counter anti-forensics approach by estimating the true prediction error and comparing it with the prediction error stored in videos. We show that the detection algorithm is useful not only for detecting video frame deletion, but also for detecting video frame insertion. Compared with the existing counter anti-forensics, our proposed approach is robust when different motion estimation algorithms are used in the initial compression. Furthermore, the forensics and counter anti-forensics are combined to perform a two-phase test to detect video inter-frame forgery. A Video Inter-frame Forgery (VIF) game, which is zero-sum and simultaneous-move, is defined to analyze the interplay between the forger and the investigator. Mixed strategy Nash equilibrium is introduced to solve the VIF game, yielding the optimal strategies for both players. Experimental results show that the proposed forensic and counter anti-forensic methods not only outperform existing methods in detecting frame deletion and anti-forensics, but also outperform them in the VIF game.

25 citations
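
The entry above solves a zero-sum, simultaneous-move game with a mixed strategy Nash equilibrium. The abstract does not give the VIF payoff matrix or its size, so the sketch below only shows the textbook closed-form solution for a generic 2x2 zero-sum game. The function name, the reading of rows as the maximizing player, and the example payoffs are all assumptions for illustration.

```python
def solve_2x2_zero_sum(payoff):
    """Solve a 2x2 zero-sum game for the row player (the maximizer).

    payoff[i][j] is what the row player wins when row i meets column j.
    Returns (p, q, value): p is the probability of playing row 0,
    q is the probability of playing column 0, value is the game value.
    """
    (a, b), (c, d) = payoff
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    maximin = max(row_mins)
    minimax = min(col_maxs)

    # A saddle point means both players have an optimal pure strategy.
    if maximin == minimax:
        p = 1.0 if row_mins[0] == maximin else 0.0
        q = 1.0 if col_maxs[0] == minimax else 0.0
        return p, q, float(maximin)

    # Otherwise each player mixes so that the opponent is indifferent.
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


# Invented payoffs, not taken from the paper.
p, q, value = solve_2x2_zero_sum([[3, -1], [-2, 4]])
```

For the invented matrix, each player's mix makes the other indifferent between their two options, which is the defining property of the mixed equilibrium.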


Journal ArticleDOI
TL;DR: The authors have shown that they can estimate motion for frames with time intervals as short as 5 s using nonattenuation corrected reconstructed FDG PET brain images, and find that their method is able to compensate for both gradual and step-like motions.
Abstract: Purpose: Head motion during PET brain imaging can cause significant degradation of image quality. Several authors have proposed ways to compensate for PET brain motion to restore image quality and improve quantitation. Head restraints can reduce movement but are unreliable, hence the need for alternative strategies such as data-driven motion estimation or external motion tracking. Herein, the authors present a data-driven motion estimation method using a preprocessing technique that allows the usage of very short duration frames, thus reducing the intraframe motion problem commonly observed in the multiple frame acquisition method. Methods: The list mode data for PET acquisition is uniformly divided into 5-s frames and images are reconstructed without attenuation correction. Interframe motion is estimated using a 3D multiresolution registration algorithm and subsequently compensated for. For this study, the authors used 8 PET brain studies that used F-18 FDG as the tracer and contained minor or no initial motion. After reconstruction and prior to motion estimation, known motion was introduced to each frame to simulate head motion during a PET acquisition. To investigate the trade-off in motion estimation and compensation with respect to frames of different length, the authors summed 5-s frames accordingly to produce 10-s and 60-s frames. Summed images generated from the motion-compensated reconstructed frames were then compared to the original PET image reconstruction without motion compensation. Results: The authors found that their method is able to compensate for both gradual and step-like motions using frame times as short as 5 s with a spatial accuracy of 0.2 mm on average. Complex volunteer motion involving all six degrees of freedom was estimated with lower accuracy (0.3 mm on average) than the other types investigated. Preprocessing of 5-s images was necessary for successful image registration. Since their method utilizes nonattenuation corrected frames, it is not susceptible to motion introduced between CT and PET acquisitions. Conclusions: The authors have shown that they can estimate motion for frames with time intervals as short as 5 s using nonattenuation corrected reconstructed FDG PET brain images. Intraframe motion in 60-s frames causes degradation of accuracy to about 2 mm based on the motion type.

Patent
17 Nov 2016
TL;DR: In this article, a reduced-bandwidth wireless 3D video transmission method is proposed, which includes receiving initial first-eye frame data, reprojecting the next second-eye frame data to the first eye, and performing infilling on the next first-eye frame data with the perspective-warped initial first-eye frame data.
Abstract: A method for reduced-bandwidth wireless 3D video transmission includes receiving initial first-eye frame data, reprojecting the initial first-eye frame data to the second eye (which creates initial second-eye frame data), receiving sensor data, time-warping the initial first-eye and second-eye frame data, and receiving next second-eye frame data. The method can additionally or alternatively include perspective warping the initial first-eye frame data; reprojecting the next second-eye frame data to the first eye (which creates next first-eye frame data); performing infilling on the next first-eye frame data with the perspective-warped initial first-eye frame data; time-warping the next first-eye and second-eye frame data; and/or encoding transmitted frame data with sensor data.

Journal ArticleDOI
TL;DR: A method based on the consistency of the quotient of mean structural similarity (QoMSSIM) is proposed; compared with another method, it has higher classification accuracy, lower computational complexity, and robustness against recompression and white Gaussian noise.
Abstract: Inter-frame forgery is a common type of video forgery in digital videos. In this paper, a method based on the consistency of the quotient of mean structural similarity (QoMSSIM) is proposed. For original videos, the QoMSSIM values are consistent, but in forgeries the consistency is destroyed. First, the mean structural similarity (MSSIM) between every two adjacent frames is extracted, and then the quotients between every two sequential MSSIM values are calculated. Finally, the quotient of mean SSIM after post-processing, normalization, and quantization is used as the distinguishing feature to identify inter-frame forgeries. Experiments are conducted on a large database, and a support vector machine (SVM) is used to distinguish original videos from inter-frame forgeries. Experimental results show that the proposed method is efficient in differentiating original videos and forgeries. The proposed method also performs well in differentiating frame deletion and insertion forgeries. Compared with the other method, the proposed method has higher classification accuracy, lower computational complexity, and robustness against recompression and white Gaussian noise. Copyright © 2016 John Wiley & Sons, Ltd.
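
The first two steps in the abstract above (MSSIM between adjacent frames, then quotients of sequential MSSIM values) can be sketched as follows. This is a simplified illustration, not the paper's implementation: it uses a single global SSIM window per frame instead of a mean over local windows, takes the quotient in the direction `m[i] / m[i + 1]` as an assumption, and omits the post-processing, normalization, quantization, and SVM stages. Frames are flat lists of grey levels, and all function names are hypothetical.

```python
def global_ssim(x, y, dynamic_range=255.0):
    """SSIM computed over one global window (a stand-in for MSSIM)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * dynamic_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    )


def adjacent_similarities(frames):
    """Similarity between every two adjacent frames."""
    return [global_ssim(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]


def sequential_quotients(values):
    """Quotient of each similarity value and the next one (assumes nonzero values)."""
    return [values[i] / values[i + 1] for i in range(len(values) - 1)]


# Toy video: three copies of one frame followed by two copies of another,
# mimicking a discontinuity such as the one left by frame deletion.
frame_a = [10, 20, 30, 40]
frame_b = [200, 180, 220, 210]
frames = [frame_a, frame_a, frame_a, frame_b, frame_b]

similarities = adjacent_similarities(frames)
quotients = sequential_quotients(similarities)
```

In this toy sequence the quotients stay near 1 where the content is continuous and depart from 1 around the discontinuity, which is the kind of inconsistency the feature is meant to expose.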

Patent
06 Apr 2016
TL;DR: An inter-frame noise reduction method based on motion detection is proposed: moving targets are extracted by a multi-Gaussian mixture background model method and an overlapping stationary area between two adjacent frames is found, inter-frame accumulative filtering is performed on that area, and moving target areas and non-moving target areas in non-overlapping areas are replaced by background models established by an intra-frame filtering algorithm and the multi-Gaussian mixture background model method, respectively.
Abstract: The invention discloses an inter-frame noise reduction method based on motion detection. According to the method, moving targets are extracted by a multi-Gaussian mixture background model method and an overlapping stationary area between two adjacent frames is found; inter-frame accumulative filtering is then performed on that area, and moving target areas and non-moving target areas in non-overlapping areas are replaced by background models established by an intra-frame filtering algorithm and the multi-Gaussian mixture background model method, respectively. Meanwhile, the algorithm can also self-adaptively adjust the number of stack frames and has a multistage adjustable function. The innovative points are that moving target detection is performed on the images first, then an AND operation is performed on two successive frames of foreground images containing only the moving targets, and the inter-frame filtering algorithm, the intra-frame filtering algorithm, or a background model replacement algorithm is selected according to the result of the AND operation, so that the edge virtual images, pseudo images, and even loss of the moving targets caused by the conventional inter-frame filtering algorithm can be avoided while a strong noise reduction effect on the moving images is achieved.

Patent
23 Mar 2016
TL;DR: In this paper, different fast inter-frame mode decision methods applied to transcoding from H.264 to HEVC are adopted for coding units (CUs) of different HEVC depths respectively according to H.264 code stream information and prediction mode information of different depths of the HEVC inter-frame CUs.
Abstract: Different fast inter-frame mode decision methods applied to transcoding from H.264 to HEVC are adopted for coding units (CUs) of different HEVC depths respectively according to H.264 code stream information and prediction mode information of different depths of the HEVC inter-frame CUs. For CUs of which the depths are 0 and 1, information obtained by decoding an H.264 code stream is processed with a classifier, and decision threshold values used for Skip mode judgement and CU division respectively are calculated. First, Skip mode judgement of a CU is carried out in advance according to the Skip mode judgement threshold value, and then whether the CU is required to be divided into sub-CUs or not is predicted. The methods combine the statistical characteristic of CU division and distribution and the H.264 code stream information, so as to judge a Skip mode in advance, and effectively predict whether CU division is required to be stopped or continued. Therefore, a quadtree can be pruned effectively and unnecessary coding branches are skipped. The methods can remarkably lower the calculation complexity of inter-frame coding during transcoding from H.264 to HEVC.

Journal ArticleDOI
TL;DR: Experimental results confirm that the proposed 3DME-McFIS technique outperforms the existing HEVC-3D coding standard, improving PSNR by 0.90 dB on average, reducing computational time by 50%, and reducing the RAFD problem.

Proceedings ArticleDOI
01 Jan 2016
TL;DR: This paper proposes to discover motion models and their associated masks and then use these models and masks to form a prediction of the current frame and shows that a savings in bit rate of 2.3% is achievable over standalone HEVC if this predicted frame is used as an additional reference frame.
Abstract: Traditional video coding uses the motion model to approximate geometric boundaries of moving objects where motion discontinuities occur. Motion hints based inter-frame prediction paradigm moves away from this redundant approach and employs an innovative framework consisting of motion hint fields that are continuous and invertible, at least, over their respective domains. However, estimation of motion hint is computationally demanding, in particular for high resolution video sequences. In this paper, we propose to discover motion models and their associated masks over the current frame and then use these models and masks to form a prediction of the current frame. The prediction process is computationally simpler and experimental results show that a savings in bit rate of 2.3% is achievable over standalone HEVC if this predicted frame is used as an additional reference frame.

Journal ArticleDOI
TL;DR: The nonlocal sparsity and hierarchical GOP structure is used to propose a novel CS based soft video broadcast scheme and the experimental results show that the proposed scheme provides better performance compared with the traditional SoftCast with up to 8 dB coding gain for some channel conditions.
Abstract: Video broadcasting over wireless networks has become a very popular application. However, the conventional digital video broadcasting framework can hardly accommodate heterogeneous users with diverse channel conditions, a problem known as the cliff effect. To overcome the cliff effect and provide graceful degradation to multiple receivers, in this paper we use nonlocal sparsity and a hierarchical GOP structure to propose a novel CS based soft video broadcast scheme. CS minimizes bandwidth consumption and generates measurements of equal importance, which are exactly the properties needed by soft video broadcast. In the proposed scheme, the measurement data are generated by block-wise compressive sensing (BCS), and the measurement data packets are then sent over a highly dense constellation through an OFDM channel to achieve a simple encoder. Ideally, with the GOP structure, inter frames have a lower sampling rate than intra frames to achieve better compression efficiency. At the decoder side, owing to the equally important packets and the property of soft broadcast, each user can receive the noise-corrupted measurements matching its channel condition and reconstruct the video. The hierarchical GOP structure is used to exploit the correlation and non-local sparsity among video frames during the recovery process. Additionally, using non-local sparsity, group based CS reconstruction with adaptive dictionaries is proposed to improve decoding quality. The experimental results show that the proposed scheme provides better performance compared with the traditional SoftCast, with up to 8 dB coding gain for some channel conditions.

Proceedings ArticleDOI
01 May 2016
TL;DR: The proposed method uses a correlation of colour channels to extract key frames that summarize video content, which is important for video data applications such as surveillance, where a lot of redundancy is present.
Abstract: Video surveillance is widely used in many applications, most importantly public safety and theft protection. Such a system needs efficient transmission and storage of large video data. Key frame extraction is a simple and powerful way to accomplish this objective. Key frame extraction is also called video summarization because it retains only the important content of the video. It is therefore very important for video data applications, especially surveillance, where a lot of redundancy is present. Key frames can be determined by treating motion as an important feature, which can be calculated using the inter frame difference. The proposed method uses a correlation of colour channels to summarize video content.
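
The abstract above mentions inter frame difference as a motion cue for choosing key frames. The sketch below illustrates only that generic idea; it is not the paper's colour-channel correlation method, whose details the abstract does not give. The mean absolute difference measure, the threshold, and the function names are assumptions, and frames are flat lists of grey levels.

```python
def mean_abs_difference(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def select_key_frames(frames, threshold):
    """Return indices of frames that differ strongly from their predecessor.

    The first frame is always kept so that the summary has a starting point.
    """
    if not frames:
        return []
    key_indices = [0]
    for i in range(1, len(frames)):
        if mean_abs_difference(frames[i - 1], frames[i]) > threshold:
            key_indices.append(i)
    return key_indices


# Toy sequence: a static scene, then a sudden change that persists.
frames = [[0, 0], [0, 0], [100, 100], [100, 100]]
key_frames = select_key_frames(frames, threshold=10)
```

Redundant frames of a static surveillance scene produce near-zero differences and are skipped, so only the frames where something changes end up in the summary.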

Patent
15 Nov 2016
TL;DR: In this article, a joint machine learning and game theory framework for video coding rate control (RC) is described, where a machine learning based R-D model classification scheme is provided to facilitate improved RC prediction accuracy.
Abstract: Systems and methods which provide a joint machine learning and game theory modeling (MLGT) framework for video coding rate control (RC) are described. A machine learning based R-D model classification scheme may be provided to facilitate improved R-D model prediction accuracy and a mixed R-D model based game theory approach may be implemented to facilitate improved RC performance. For example, embodiments may provide inter frame Coding Tree Units (CTUs) level bit allocation and RC optimization in HEVC. Embodiments provide for the CTUs being classified into a plurality of categories, such as by using a support vector machine (SVM) based multi-classification scheme. An iterative solution search method may be implemented for the mixed R-D models based bit allocation method. Embodiments may additionally or alternatively refine the intra frame QP determination and the adaptive bit ratios among frames to facilitate improving the coding quality smoothness.

Proceedings ArticleDOI
20 Mar 2016
TL;DR: Three types of 3 Dimensional WPP (3D-WPP) algorithms that can significantly improve the parallelism, while achieving good tradeoffs between implementation complexity, determinism, and rate-distortion (RD) performance are proposed.
Abstract: Although wavefront parallel processing (WPP) proposed in the HEVC standard and various inter frame WPP algorithms can achieve comparatively high parallelism, their scalability is still very limited due to various dependencies introduced by spatial and temporal prediction in HEVC. In this paper, we propose three types of 3 Dimensional WPP (3D-WPP) algorithms that can significantly improve the parallelism, while achieving good tradeoffs between implementation complexity, determinism, and rate-distortion (RD) performance. Experimental results show that the proposed algorithms can lead to up to a 2.8× speed-up compared with existing inter frame WPP methods. While the Simple 3D-WPP and Static 3D-WPP algorithms may introduce a BD-rate loss between 0 and 4.9% compared with existing algorithms, the more complex Dynamic 3D-WPP algorithm achieves better parallelism with virtually no coding performance loss.

Proceedings ArticleDOI
27 Jul 2016
TL;DR: Simulation results demonstrate that the method of this paper can detect the forgery and locate its position.
Abstract: Nowadays videos are widely used in every aspect of society, such as transport, security, and judicial identification. Thus, the authenticity and integrity of video are very important. This paper proposes a new method to detect forgeries in videos with a static background. In general, adjacent frames in a video with the same background are strongly correlated. If the video has been tampered with, the continuity of the frame correlation is disturbed. In this method, pixel lines are obtained by intercepting the sequence of video frames in the horizontal or vertical direction. Every four consecutive pixel lines make up a pixel belt. Then, the correlation between pixel belts is calculated using the histogram intersection method. The simulations show that if the video has been tampered with, outliers appear in the correlation coefficients. Simulation results demonstrate that the proposed method can detect the forgery and locate its position.
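
The pixel-belt correlation described above can be sketched with a normalized histogram intersection. This is a minimal illustration under stated assumptions: belts are taken as non-overlapping groups of four lines, histograms use 16 bins over 8-bit grey levels, and the outlier detection step is left out. The abstract does not specify these choices, and all function names are hypothetical.

```python
def normalized_histogram(pixels, bins=16, max_value=256):
    """Histogram of grey levels, scaled so that the bins sum to 1."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_intersection(hist_a, hist_b):
    """Sum of bin-wise minima: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def belt_correlations(pixel_lines, lines_per_belt=4):
    """Group pixel lines into belts and compare each belt with the next one."""
    belts = [
        sum(pixel_lines[i:i + lines_per_belt], [])  # concatenate the lines of one belt
        for i in range(0, len(pixel_lines) - lines_per_belt + 1, lines_per_belt)
    ]
    hists = [normalized_histogram(belt) for belt in belts]
    return [histogram_intersection(hists[i], hists[i + 1]) for i in range(len(hists) - 1)]


# Toy data: one pixel line per frame; the last four frames come from different content.
pixel_lines = [[10, 10, 10]] * 8 + [[200, 200, 200]] * 4
correlations = belt_correlations(pixel_lines)
```

A sharp drop in the correlation sequence marks the belt boundary where the content stops being continuous, which is where a tampering point would be suspected.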

Patent
24 Aug 2016
TL;DR: In this article, a change in brightness in ambient lighting is detected when an electronic apparatus operates in a dual-camera mode utilizing images captured by a first image sensor and a second image sensor.
Abstract: Techniques and examples pertaining to frame synchronization for dynamic frame rate in dual-camera applications are described. A change in brightness in ambient lighting is detected when an electronic apparatus operates in a dual-camera mode utilizing images captured by a first image sensor and a second image sensor. In response to the detected change in brightness in the ambient lighting, exposure times and frame rates of the first image sensor and the second image sensor are adjusted and the frame rates are synchronized.

Journal ArticleDOI
TL;DR: This paper summarizes all the proposed techniques involved in digital video inter-frame forgery detection for MPEG-1, 2, 4 and H.264/AVC encoded videos and finds that double MPEG compression was best detected using the technique which utilizes Benford’s law.
Abstract: Objectives: This paper summarizes all the proposed techniques involved in digital video inter-frame forgery detection for MPEG-1, 2, 4 and H.264/AVC encoded videos. Methods/Statistical Analysis: Double compression detection techniques are classified here on the basis of the footprints analyzed during detection. The detection methods designed for videos that use a fixed GOP structure for the first and any subsequent compression differ from those for videos that use different GOP structures. Video inter-frame forgery detection techniques are then analyzed on the basis of the type of forgery they detect and the type of codec used for video encoding. Findings: Digital videos often provide forensic evidence in legal, medical and surveillance applications but are prone to inter-frame forgeries, which are easy to perform and difficult to detect. The analysis of the literature ascertained that the majority of the proposed techniques depend on the number of frames tampered with and the video codec used to encode the videos. Among these proposed techniques, double MPEG compression was best detected using the technique that utilizes Benford’s law, proposed by Chen and Shi for MPEG-1 and MPEG-2 encoded videos. On the other hand, Wang et al. gave sound results for all kinds of inter-frame forgeries on MPEG-2 encoded videos by utilizing the measure of optical flow consistency. Very few authors have focussed on forgery detection in MPEG-4 encoded videos, so such techniques have not been discussed in many survey papers. Moreover, digital cameras, especially surveillance cameras, which generate massive amounts of video these days, have a built-in MPEG-4 codec because it offers a better compression rate. Application/Improvements: The video forensics domain, therefore, is in dire need of a technique that will detect any kind of inter-frame forgery in MPEG-4 encoded videos.

Patent
21 Sep 2016
TL;DR: In this article, a fast interframe prediction method based on motion estimation and temporal-spatial correlation is proposed, which is not limited to video sequences with specific features, and does not rely too much on the image resolution, texture and other features.
Abstract: The invention discloses a fast inter-frame prediction method based on motion estimation and temporal-spatial correlation, comprising the following steps: (1) in terms of motion estimation, two rounds of diamond search, at a step length of 1 and at a step length of 2 respectively, are carried out with a medium value MV as the initial search point; motion estimation is stopped if the obtained optimal MV is the initial medium value MV, otherwise the newly obtained optimal MV is used instead of the medium value MV to conduct a TZSearch process; and (2) in terms of PU mode selection and CU depth choice, the coding mode and rate-distortion cost information of nine adjacent blocks in the spatial and temporal neighborhoods of the current coding unit are used: if a certain number of neighborhood coding units adopt a skip mode and the rate-distortion cost of the current coding unit is less than a threshold, the current block is judged to be in a motion flat area, and the PU mode traversal process and CU partition are stopped in advance. The time for high-definition video coding is reduced greatly. The method is not limited to video sequences with specific features, and does not rely too much on the image resolution, texture and other features.
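
Step (2) of the abstract above describes an early-termination rule based on neighbouring coding units. The sketch below shows one way such a rule could look. The abstract says only "a certain number" of skip-mode neighbours and "a threshold", so the default values of `min_skip_neighbors` and `rd_threshold`, the mode labels, and the function name are placeholders invented for this example.

```python
def should_stop_early(neighbor_modes, current_rd_cost,
                      min_skip_neighbors=6, rd_threshold=1000.0):
    """Decide whether to stop PU mode traversal and CU partitioning early.

    neighbor_modes: coding modes of the nine spatial and temporal neighbours.
    The block is treated as lying in a motion flat area when enough
    neighbours use skip mode and the current RD cost is below the threshold.
    Both default limits are placeholders, not values from the patent.
    """
    skip_count = sum(1 for mode in neighbor_modes if mode == "skip")
    return skip_count >= min_skip_neighbors and current_rd_cost < rd_threshold


# Invented neighbourhood: seven of the nine neighbours were coded in skip mode.
neighbors = ["skip"] * 7 + ["inter", "intra"]
stop_in_flat_area = should_stop_early(neighbors, current_rd_cost=500.0)
stop_with_high_cost = should_stop_early(neighbors, current_rd_cost=1500.0)
```

Requiring both conditions keeps the shortcut conservative: a high RD cost overrides the neighbourhood vote and the full mode search still runs.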

Patent
21 Sep 2016
TL;DR: An adaptive H264-to-HEVC (High Efficiency Video Coding) inter-frame fast transcoding method and apparatus is presented, which combines adaptive mode mapping with reasonable reuse of motion vectors.
Abstract: The invention relates to the field of video coding and decoding, and in particular to an adaptive H264-to-HEVC (High Efficiency Video Coding) inter-frame fast transcoding method and apparatus. To address the problems in the prior art, the invention provides a fast transcoding method and apparatus. For the problem of accelerating inter-frame prediction in H264-to-HEVC transcoding, the invention discloses a scheme that combines adaptive mode mapping with reasonable reuse of motion vectors. In the inter-frame prediction stage, a series of quadtree-based segmentations may be carried out until CU8s (Coding Units with a size of 8*8) are obtained. The inter-frame prediction may also be terminated early: the segmentation of some CUs need not extend down to CU8, some CUs need not be segmented down to CU16s (Coding Units with a size of 16*16), and, ideally, the CU64s (Coding Units with a size of 64*64) are coded directly as a whole. The CU32, CU16 and CU8 inter-frame predictions each obtain a temporary optimal mode; after the inter-frame prediction of the entire CU64 has ended, the optimal coding mode is obtained and coding is carried out according to it.

Journal ArticleDOI
TL;DR: An inter-frame coding approach for broadcast wireless communication is proposed that incorporates an iterative rate-matching process into inter-frame decoding; the process is analyzed asymptotically and shown to be optimal in the sense defined in the paper.
Abstract: A novel inter-frame coding approach to the problem of varying channel-state conditions in broadcast wireless communication is developed in this paper; this problem causes the appropriate code-rate to vary across different transmitted frames and across different receivers as well. The main aspect of the proposed approach is that it incorporates an iterative rate-matching process into the decoding of the received set of frames, such that, throughout inter-frame decoding, the code-rate of each frame is progressively lowered to or below the appropriate value prior to applying or re-applying conventional physical-layer channel decoding on it. This iterative rate-matching process is asymptotically analyzed in this paper. It is shown to be optimal, in the sense defined in the paper. Consequently, the data-rates achievable by the proposed scheme are derived. Overall, it is concluded that, compared to the existing solutions, inter-frame coding presents a better complexity versus data-rate tradeoff. In terms of complexity, the overhead of inter-frame decoding includes operations that are similar in type and scheduling to those employed in the relatively simple iterative erasure decoding. In terms of data-rates, compared to the state-of-the-art two-stage scheme involving both error-correcting and erasure coding, inter-frame coding increases the data-rate by a factor that reaches up to $1.55\times$.

Patent
13 Apr 2016
TL;DR: A complexity-aware, macro-block-level parallel scheduling method for video decoding is proposed. It builds a linear model that predicts macro-block decoding complexity from known macro-block information, and groups macro-blocks for parallel execution on a GPU according to their predicted complexity.
Abstract: The invention discloses a complexity-aware, macro-block-level parallel scheduling method for video decoding. The method comprises two key techniques. The first establishes a linear prediction model of macro-block decoding complexity from the macro-block information available after entropy decoding and reordering, such as the number of non-zero coefficients, the macro-block inter-frame predictive coding type and the motion vectors; it performs complexity analysis on each module and makes full use of the known macro-block information to improve parallel efficiency. The second combines macro-block decoding complexity with parallel computation, subject to the macro-block decoding dependences being satisfied: macro-blocks are executed in parallel in groups according to an ordering result, the group size is determined dynamically from the computing capability of the GPU, and the number of groups is determined dynamically from the number of macro-blocks currently available for parallel processing, so that the launch frequency of kernel functions is controlled while full utilization of the GPU and efficient parallelism are achieved. In addition, cooperative parallel operation of the CPU and the GPU is realized through a buffer area, so that resources are fully utilized and idle waiting is reduced.

Patent
20 Jul 2016
TL;DR: A data mining-based HEVC inter-frame fast mode selection method is proposed that significantly reduces the coding computation complexity of HEVC while keeping rate-distortion performance essentially constant.
Abstract: The invention provides a data mining-based HEVC inter-frame fast mode selection method. The method includes the following steps: data are collected from videos with different resolutions and different textures; the data are analyzed to identify useful information; the useful information is used to build a training sample set; and the training sample set is used to build a decision tree, which can then be tested. With the data mining-based HEVC inter-frame fast mode selection method of the invention, the coding computation complexity of HEVC is significantly reduced and coding time is greatly shortened, while coding rate-distortion performance is kept essentially constant.

Proceedings ArticleDOI
01 Dec 2016
TL;DR: Experimental results show that the algorithm, which combines block-wise inter-frame differencing with graph-based superpixel segmentation of the detected target regions, works well and effectively improves the integrity of target detection.
Abstract: Inter-frame differencing produces ghosting, holes in the target area and other problems. To address them, we propose a moving object detection method based on block-wise inter-frame differencing and graph-based segmentation. First, the current frame and the adjacent frame are divided into blocks of the same size. Second, a threshold is set by Otsu's method and the two images are differenced to obtain the approximate area of the target. Then, superpixels are obtained by graph-based segmentation of the target regions found. Finally, the target is obtained by comparing the difference image with the clustering image according to the rule proposed in this paper. Experimental results show that the algorithm works well and effectively improves the integrity of target detection.
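The first two steps (block partitioning and thresholded differencing) can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' algorithm: frames are 2D lists of 8-bit grayscale values, each block is scored by its mean absolute difference, and Otsu's method is applied to those block scores. The function names are hypothetical, and the graph-based superpixel step and the final comparison rule are omitted.

```python
def otsu_threshold(values, levels=256):
    """Return the level that maximizes between-class variance (Otsu's method).
    If all values are equal, return levels - 1 so nothing exceeds it."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = levels - 1, -1.0
    w0 = 0      # number of samples at or below the candidate threshold
    sum0 = 0    # sum of those samples
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def block_scores(prev, cur, block):
    """Mean absolute inter-frame difference of each block (integer, 0..255)."""
    rows, cols = len(cur) // block, len(cur[0]) // block
    scores = []
    for br in range(rows):
        row = []
        for bc in range(cols):
            total = 0
            for y in range(br * block, (br + 1) * block):
                for x in range(bc * block, (bc + 1) * block):
                    total += abs(cur[y][x] - prev[y][x])
            row.append(total // (block * block))
        scores.append(row)
    return scores


def moving_blocks(prev, cur, block=4):
    """Mark blocks whose difference score exceeds the Otsu threshold."""
    scores = block_scores(prev, cur, block)
    threshold = otsu_threshold([s for row in scores for s in row])
    return [[s > threshold for s in row] for row in scores]
```

The returned boolean grid gives only the approximate target area at block resolution; refining its boundary is where a superpixel step would come in.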