
Showing papers on "Inter frame published in 2017"


Journal ArticleDOI
Chen Zhao1, Siwei Ma1, Jian Zhang1, Ruiqin Xiong1, Wen Gao1 
TL;DR: A novel two-phase algorithm for reconstructing videos from CS measurements, together with a scheme based on the split Bregman iteration algorithm to solve the formulated weighted $\ell_1$ minimization problem.
Abstract: The compressive sensing (CS) theory indicates that robust reconstruction of signals can be obtained from far fewer measurements than those required by the Nyquist–Shannon theorem. Thus, CS has great potential in video acquisition and processing, considering that it makes the subsequent complex data compression unnecessary. In this paper, we propose a novel algorithm for effectively reconstructing videos from CS measurements. The algorithm comprises two phases: the first exploits intra-frame correlation and provides a good initial recovery for each frame, and the second iteratively enhances reconstruction quality by alternating inter-frame multihypothesis (MH) prediction and sparsity modeling of residuals in a weighted manner. The weights of residual coefficients are updated in each iteration using a statistical method based on the MH predictions. These procedures are performed on overlapped patches so that potential blocking artifacts can be effectively suppressed through averaging. In addition, we devise an effective scheme based on the split Bregman iteration algorithm to solve the formulated weighted $\ell_1$ minimization problem. The experimental results demonstrate that the proposed algorithm outperforms the state-of-the-art methods in both objective and subjective reconstruction quality.
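As a rough illustration of the inner solver, the split Bregman treatment of a weighted $\ell_1$ term reduces, per coefficient, to a weighted soft-thresholding (shrinkage) step. The sketch below is our own minimal version of that operator, not the authors' code; the coefficient and weight values are invented:

```python
# Weighted soft-thresholding (shrinkage): the element-wise solution of
#   min_x  sum_i w_i|x_i| + (mu/2)||x - c||^2
# which appears as the core subproblem when split Bregman is applied
# to a weighted l1 objective.

def weighted_shrink(coeffs, weights, mu):
    out = []
    for c, w in zip(coeffs, weights):
        t = w / mu                      # per-coefficient threshold
        if c > t:
            out.append(c - t)
        elif c < -t:
            out.append(c + t)
        else:
            out.append(0.0)             # small coefficients are zeroed
    return out

residual = [2.0, -0.5, 0.1, -3.0]       # hypothetical residual coefficients
weights  = [1.0, 1.0, 2.0, 0.5]         # hypothetical per-coefficient weights
shrunk = weighted_shrink(residual, weights, mu=2.0)
```

Heavier weights shrink their coefficients more aggressively, which is how the statistically updated weights steer the sparsity model between iterations.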

99 citations


Journal ArticleDOI
TL;DR: A novel video forgery detection algorithm for inter-frame forgery, based on Zernike opponent chromaticity moments and coarseness feature analysis with coarse-to-fine matching, is put forward.
Abstract: Inter-frame forgery is the most common type of video forgery. However, few algorithms have been suggested for detecting this type of forgery, and existing detection methods cannot ensure both detection speed and accuracy. In this paper, we put forward a novel algorithm for detecting inter-frame forgery based on Zernike opponent chromaticity moments and coarseness feature analysis, matching from coarse to fine. Coarse detection, which extracts abnormal points, is carried out first: each frame is converted from the 3D RGB color space into a 2D opponent chromaticity space combined with the Zernike moment correlation. The tampered points are then identified exactly among the abnormal points using a Tamura coarseness feature analysis for fine detection. Coarse detection not only has a high detection speed, but also a low omission ratio; however, it is accompanied by mistaken identifications, and its precision is not ideal. Therefore, fine detection is proposed to make up the difference in precision. The experimental results prove that this algorithm has higher efficiency and accuracy than previous algorithms.
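The coarse-detection stage can be pictured as flagging frames whose feature correlation with the neighboring frame drops abruptly. The following toy sketch uses made-up low-dimensional features in place of the paper's Zernike opponent chromaticity moments:

```python
# Flag "abnormal points": frame boundaries where the correlation of a
# per-frame feature vector with the previous frame falls below a threshold.

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb)

def abnormal_points(frame_features, threshold=0.9):
    """Indices where inter-frame feature correlation drops below threshold."""
    return [i + 1
            for i, (f, g) in enumerate(zip(frame_features, frame_features[1:]))
            if pearson(f, g) < threshold]

features = [[1.0, 2.0, 3.0],
            [1.1, 2.1, 3.1],   # smooth continuation
            [3.0, 1.0, 2.0],   # abrupt change -> candidate forgery point
            [3.1, 1.1, 2.1]]
points = abnormal_points(features)
```

In the paper these candidates then pass through the Tamura coarseness analysis for fine detection; here we stop at the coarse stage.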

57 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the proposed MLGT-based RC method achieves much better R-D performance, quality smoothness, bit rate accuracy, buffer control, and subjective visual quality than other state-of-the-art one-pass RC methods, and its R-D performance is very close to the performance limits of the FixedQP method.
Abstract: In this paper, a joint machine learning and game theory modeling (MLGT) framework is proposed for inter-frame coding tree unit (CTU) level bit allocation and rate control (RC) optimization in high efficiency video coding (HEVC). First, a support vector machine-based multi-classification scheme is proposed to improve the prediction accuracy of the CTU-level rate-distortion (R-D) model. The learning-based R-D model is proposed to overcome the legacy “chicken-and-egg” dilemma in video coding. Second, a cooperative bargaining game based on a mixed R-D model is formulated for bit allocation optimization, where the convexity of the mixed R-D model-based utility function is proved, and the Nash bargaining solution is achieved by the proposed iterative solution search method. The minimum utility is adjusted by the reference coding distortion and frame-level quantization parameter (QP) change. Finally, the intra-frame QP and the inter-frame adaptive bit ratios are adjusted to give inter frames more bit resources, maintaining smooth quality and bit consumption in the bargaining game optimization. Experimental results demonstrate that the proposed MLGT-based RC method achieves much better R-D performance, quality smoothness, bit rate accuracy, buffer control, and subjective visual quality than other state-of-the-art one-pass RC methods, and its R-D performance is very close to the performance limits of the FixedQP method.

53 citations


Journal ArticleDOI
TL;DR: This paper presents a comprehensive study and analysis of numerous cutting-edge video steganography methods and their performance evaluations from the literature, and suggests current research directions and recommendations to improve existing video steganography techniques.
Abstract: In the last two decades, the science of covertly concealing and communicating data has acquired tremendous significance due to technological advancement in communication and digital content. Steganography is the art of concealing secret data in a particular interactive media transporter, e.g., text, audio, image, or video data, in order to build covert communication between authorized parties. Nowadays, video steganography techniques have become important in many video-sharing and social networking applications such as live streaming, YouTube, Twitter, and Facebook because of the notable growth of digital video over the Internet. The performance of any steganographic method ultimately relies on its imperceptibility, hiding capacity, and robustness. In the past decade, many video steganography methods have been proposed; however, the literature lacks sufficient survey articles that discuss all techniques. This paper presents a comprehensive study and analysis of numerous cutting-edge video steganography methods and their performance evaluations from the literature. Both compressed and raw video steganography methods are surveyed. In the compressed domain, video steganography techniques are categorized according to the video compression stages used as venues for data hiding, such as intra-frame prediction, inter-frame prediction, motion vectors, transformed and quantized coefficients, and entropy coding. On the other hand, raw video steganography methods are classified into spatial and transform domains. This survey suggests current research directions and recommendations to improve existing video steganography techniques.

51 citations


Journal ArticleDOI
TL;DR: The proposed Spatial-Temporal Recurrent Residual Network (STR-ResNet) is able to efficiently reconstruct videos with diversified contents and complex motions, which outperforms the existing video SR approaches and offers new state-of-the-art performances on benchmark datasets.

48 citations


Journal ArticleDOI
TL;DR: Results of extensive experimentation in diverse and realistic forensic set-ups show that the proposed technique can detect and locate tampering with an average accuracy of 83% and 80% respectively, regardless of the number of frames inserted, removed or duplicated.
Abstract: In the midst of low cost and easy-to-use multimedia editing software, which make it exceedingly simple to tamper with digital content, the domain of digital multimedia forensics has attained considerable significance. This research domain deals with production of tools and techniques that enable authentication of digital evidence prior to its use in various critical and consequential matters, such as politics, criminal investigations, defense planning. This paper presents a forensic scheme for detection of frame-based tampering in digital videos, especially those captured by surveillance cameras. Frame-based tampering, which involves insertion, removal or duplication of frames into or from video sequences, is usually very difficult to detect via simple visual inspection. Such forgeries, however, disturb the temporal correlation among successive frames of the tampered video. These disturbances, when analyzed in an appropriate manner, help reveal the evidence of forgery. The forensic technique presented in this paper relies on objective analysis of prediction residual and optical flow gradients for the detection of frame-based tampering in MPEG-2 and H.264 encoded videos. The proposed technique is also capable of determining the exact location of the forgery in the given video sequence. Results of extensive experimentation in diverse and realistic forensic set-ups show that the proposed technique can detect and locate tampering with an average accuracy of 83% and 80% respectively, regardless of the number of frames inserted, removed or duplicated.
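The underlying cue, a disturbance of temporal correlation that spikes at the splice point, can be sketched with a toy residual-energy signal; the values and the median-based threshold below are ours, not the paper's detector:

```python
# Frame insertion/removal disturbs temporal correlation, so the prediction
# residual energy spikes at the tampering point. Flag frames whose residual
# energy exceeds a multiple of the median residual.

def tamper_candidates(residual_energy, factor=3.0):
    ordered = sorted(residual_energy)
    median = ordered[len(ordered) // 2]
    return [i for i, e in enumerate(residual_energy) if e > factor * median]

energy = [1.0, 1.2, 0.9, 6.5, 1.1, 1.0]   # spike at the splice point
candidates = tamper_candidates(energy)
```

The paper combines such prediction-residual evidence with optical flow gradients; this sketch shows only the residual-spike half of that idea.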

32 citations


Journal ArticleDOI
TL;DR: The proposed key frame extraction method for video copyright protection has advantages in computational complexity and robustness across several video formats, video resolutions, and so on.
Abstract: This paper proposes a key frame extraction method for video copyright protection. The fast and robust method is based on frame difference with low-level features, including a color feature and a structure feature. A two-stage method is used to extract accurate key frames that cover the content of the whole video sequence. First, a candidate sequence is obtained based on the color characteristic difference between adjacent frames of the original sequence. Second, by analyzing the structural characteristic difference between adjacent frames of the candidate sequence, the final key frame sequence is obtained. Then, an optimization step is added based on the number of final key frames in order to ensure the effectiveness of key frame extraction. Compared with previous methods, the proposed method has advantages in computational complexity and robustness across several video formats, video resolutions, and so on.
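The two-stage selection can be sketched with scalar stand-ins for the color and structure descriptors; the feature values and thresholds below are invented for illustration:

```python
# Stage 1 keeps frames whose color difference from the previously kept frame
# exceeds a threshold (the candidate sequence); stage 2 re-filters those
# candidates with a structural difference to obtain the key frames.

def select(indices, feature, threshold):
    kept = [indices[0]]
    for i in indices[1:]:
        if abs(feature[i] - feature[kept[-1]]) > threshold:
            kept.append(i)
    return kept

color =     [0.0, 0.1, 0.9, 1.0, 0.2, 0.25]   # toy per-frame color feature
structure = [0.0, 0.0, 0.8, 0.8, 0.7, 0.7]    # toy per-frame structure feature

candidates = select(list(range(len(color))), color, threshold=0.5)
key_frames = select(candidates, structure, threshold=0.5)
```

Note how the structural stage prunes a candidate that the coarser color stage admitted, mirroring the coarse-then-fine design.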

32 citations


Journal ArticleDOI
TL;DR: Video-coding motion vectors are reused to recover camera motion for video stabilization: the vectors are regularized into a spatially smoothed motion field, named CodingFlow, on which a grid-based 2D stabilization method with spatially-variant motion compensation is built.
Abstract: Video coding focuses on reducing the data size of videos. Video stabilization aims to remove shaky camera motions. In this paper, we enable video coding for video stabilization by constructing the camera motions based on the motion vectors employed in the video coding. The existing stabilization methods rely heavily on image features for the recovery of camera motions. However, feature tracking is time-consuming and prone to errors. On the other hand, nearly all captured videos have been compressed before any further processing, and such compression has produced a rich set of block-based motion vectors that can be utilized for estimating the camera motion. More specifically, video stabilization requires camera motions between two adjacent frames. However, motion vectors extracted from video coding may refer to non-adjacent frames. We first show that these non-adjacent motions can be transformed into adjacent motions such that each coding block within a frame contains a motion vector referring to its adjacent previous frame. Then, we regularize these motion vectors to yield a spatially smoothed motion field at each frame, named CodingFlow, which is optimized for a spatially-variant motion compensation. Based on CodingFlow, we finally design a grid-based 2D method to accomplish the video stabilization. Our method is evaluated in terms of efficiency and stabilization quality, both quantitatively and qualitatively, which shows that it can achieve high-quality results compared with the state-of-the-art feature-based methods.
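One building block described above, converting a motion vector that references a non-adjacent frame into a one-frame motion, can be sketched under a constant-velocity assumption; the actual transform in the paper follows the coding reference structure, so this is only an illustrative simplification:

```python
# A coding-block motion vector that references a frame `ref_distance` steps
# away is scaled down to an equivalent adjacent (one-step) motion, assuming
# roughly constant velocity over that span.

def to_adjacent(mv, ref_distance):
    """Scale a motion vector referencing `ref_distance` frames back
    to an equivalent one-frame motion."""
    dx, dy = mv
    return (dx / ref_distance, dy / ref_distance)

# A block moved (6, -3) pixels relative to a reference 3 frames back:
adj = to_adjacent((6.0, -3.0), ref_distance=3)
```

After every block carries an adjacent-frame vector, spatial regularization of those vectors yields the smoothed CodingFlow field.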

31 citations


Proceedings ArticleDOI
05 May 2017
TL;DR: This paper presents a comprehensive study and analysis of numerous cutting-edge video steganography methods and their performance evaluations from the literature, and suggests current research directions and recommendations to improve existing video steganographic techniques.
Abstract: Nowadays, video steganography has become important in many security applications. The performance of any steganographic method ultimately relies on its imperceptibility, hiding capacity, and robustness. In the past decade, many video steganography methods have been proposed; however, the literature lacks sufficient survey articles that discuss all techniques. This paper presents a comprehensive study and analysis of numerous cutting-edge video steganography methods and their performance evaluations from the literature. Both compressed and raw video steganographic methods are surveyed. In the compressed domain, video steganographic techniques are categorized according to the video compression stages used as venues for data hiding, such as intra-frame prediction, inter-frame prediction, motion vectors, transformed and quantized coefficients, and entropy coding. On the other hand, raw video steganographic methods are classified into spatial and transform domains. This survey suggests current research directions and recommendations to improve existing video steganographic techniques.

29 citations


Journal ArticleDOI
TL;DR: A new algorithm is proposed for forgery detection in MPEG videos, using spatial and time domain analysis of the quantization effect on DCT coefficients of I frames and residual errors of P frames to identify malicious inter-frame forgery comprising frame insertion or deletion.
Abstract: In this paper, a new algorithm is proposed for forgery detection in MPEG videos using spatial and time domain analysis of the quantization effect on DCT coefficients of I frames and residual errors of P frames. The proposed algorithm consists of three modules: double compression detection, malicious tampering detection, and decision fusion. The double compression detection module employs spatial domain analysis using the first significant digit distribution of DCT coefficients in I frames to distinguish single from double compressed videos using an SVM classifier. Double compression does not necessarily imply the existence of malignant tampering in the video. Therefore, the malicious tampering detection module utilizes time domain analysis of the quantization effect on residual errors of P frames to identify malicious inter-frame forgery comprising frame insertion or deletion. Finally, the decision fusion module is used to classify input videos into three categories: single compressed videos, double compressed videos without malicious tampering, and double compressed videos with malicious tampering. The experimental results and the comparison with other methods show the efficiency of the proposed algorithm.
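The spatial-domain feature of the double compression detection module, the first-significant-digit distribution of DCT coefficients, can be sketched as a 9-bin histogram. The coefficient values below are invented; in the paper such histograms are fed to an SVM classifier:

```python
# Build the first-significant-digit (Benford-style) histogram of nonzero
# DCT coefficient magnitudes; recompression skews this distribution.

def first_digit(x):
    x = abs(x)
    while x >= 10:
        x //= 10
    return int(x)

def fsd_histogram(coeffs):
    hist = [0] * 9                      # bins for leading digits 1..9
    for c in coeffs:
        if c != 0:                      # zero coefficients carry no digit
            hist[first_digit(c) - 1] += 1
    total = sum(hist)
    return [h / total for h in hist]    # normalized distribution

hist = fsd_histogram([12, -3, 0, 145, 9, -27, 31])
```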

26 citations


Journal ArticleDOI
TL;DR: An interleaved Hamming coding scheme is proposed to compensate for the inter-frame gap (IFG) of an image sensor system, that is, the time gap between consecutive image frames.

Journal ArticleDOI
TL;DR: The proposed GHEVC decoder is fully compliant with the HEVC standard, where explicit synchronization points ensure the correct HEVC module execution order and its processing efficiency was highly optimized by keeping the decompressed frames in the GPU memory for subsequent inter frame prediction.
Abstract: The high compression efficiency provided by the high efficiency video coding (HEVC) standard comes at the cost of a significant increase of the computational load at the decoder. Such an increased burden is a limiting factor for accomplishing real-time decoding, especially for high definition video sequences (e.g., Ultra HD 4K). In this scenario, a highly parallel HEVC decoder for state-of-the-art graphics processing units (GPUs) is presented, i.e., GHEVC. In contrast to our previous contributions, the data-parallel GHEVC decoder integrates the whole decompression pipeline (except for the entropy decoding), both for intra- and inter-frames. Furthermore, its processing efficiency was highly optimized by keeping the decompressed frames in the GPU memory for subsequent inter-frame prediction. The proposed GHEVC decoder is fully compliant with the HEVC standard, where explicit synchronization points ensure the correct HEVC module execution order. Moreover, the GPU-based HEVC decoder is experimentally evaluated for different GPU devices and an extensive range of recommended HEVC configurations and video sequences, where average frame rates of 145, 318, and 605 frames per second were obtained for Ultra HD 4K, WQXGA, and Full HD, respectively, in the Random Access configuration with the NVIDIA GeForce GTX TITAN X GPU.

Journal ArticleDOI
Lin Ding1, Yonghong Tian1, Hongfei Fan1, Yaowei Wang1, Tiejun Huang1 
TL;DR: A deep feature coding (DFC) framework with a rate-performance-loss optimization model is proposed; extensive experiments show that DFC can significantly reduce the bitrate of deep features in videos while maintaining retrieval accuracy.
Abstract: With the explosion in the use of cameras in mobile phones and video surveillance systems, it is impossible to transmit the large number of videos captured over a wide area into a cloud for big data analysis and retrieval. Instead, a feasible solution is to extract and compress features from videos and then transmit the compact features to the cloud. Meanwhile, many recent studies also indicate that features extracted from deep convolutional neural networks lead to high performance in various analysis and recognition tasks. However, how to compress video deep features while maintaining the analysis or retrieval performance still remains open. To address this problem, we propose a high-efficiency deep feature coding (DFC) framework in this paper. In the DFC framework, we define three types of features in a group-of-features (GOF) according to their coding modes (i.e., I-feature, P-feature, and S-feature). We then design two prediction structures for the features in a GOF: a sequential prediction structure and an adaptive prediction structure. Similar to video coding, it is important for P-feature residual coding optimization to make a tradeoff between feature bitrate and analysis/retrieval performance when encoding residuals. To do so, we propose a rate-performance-loss optimization model. To evaluate various feature coding methods for large-scale video retrieval, we construct a video feature coding data set, called VFC-1M, which consists of uncompressed videos from different scenarios captured by real-world surveillance cameras, with 1M visual objects in total. Extensive experiments show that the proposed DFC can significantly reduce the bitrate of deep features in the videos while maintaining the retrieval accuracy.
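P-feature coding against an I-feature reference can be pictured as quantized residual coding, as in this toy sketch. The quantization step and feature values are ours, and the paper's rate-performance-loss optimization is not modeled here:

```python
# A P-feature is coded as the quantized residual against its reference
# I-feature; when consecutive features are similar, the residual is small
# and cheap to transmit.

def encode_p_feature(feature, reference, step=0.01):
    return [round((f - r) / step) for f, r in zip(feature, reference)]

def decode_p_feature(residual, reference, step=0.01):
    return [r + q * step for q, r in zip(residual, reference)]

i_feature = [0.50, 0.20, 0.80]          # hypothetical reference feature
p_feature = [0.52, 0.18, 0.80]          # hypothetical predicted feature
residual = encode_p_feature(p_feature, i_feature)
decoded = decode_p_feature(residual, i_feature)
```

A coarser `step` lowers the bitrate at the cost of reconstruction error, which is exactly the tradeoff the rate-performance-loss model optimizes.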

Journal ArticleDOI
TL;DR: The experimental results establish the effectiveness of the proposed approach in colorizing videos having different types of motion and visual content even in the presence of occlusion, in terms of accuracy and computational requirements.
Abstract: We propose a new technique for video colorization based on spatiotemporal color propagation in the 3D video volume, utilizing the dominant orientation response obtained from the steerable pyramid decomposition of the video. The volumetric color diffusion from the sources that are marked by scribbles occurs across spatiotemporally smooth regions, and the prevention of leakage is facilitated by the spatiotemporal discontinuities in the output of steerable filters, representing the object boundaries and motion boundaries. Unlike most existing methods, our approach dispenses with the need of motion vectors for interframe color transfer and provides a general framework for image and video colorization. The experimental results establish the effectiveness of the proposed approach in colorizing videos having different types of motion and visual content even in the presence of occlusion, in terms of accuracy and computational requirements.

Journal ArticleDOI
TL;DR: An improved inter prediction algorithm for video coding based on fractal theory and H.264 is proposed, which achieves a slight increase in Peak Signal-to-Noise Ratio and a significant decrease in bitrate, while the time consumed for compression remains less than 60% of that using JM19.0.
Abstract: Video compression has become more and more important along with the increasing use of video sequences and their rapidly growing resolution. H.264 is a widely applied video coding standard for academic and commercial purposes, and fractal theory is one of the most active branches of modern mathematics, which has shown great potential in compression. In this paper, this study proposes an improved inter prediction algorithm for video coding based on fractal theory and H.264. This study takes the same approach to intra prediction as H.264 and adopts fractal theory for inter prediction. Several improvements are introduced in this algorithm. First, luminance and chrominance components are coded separately, and their partitions are no longer associated as in H.264. Second, the partition mode for chrominance components has been changed, and the block size now ranges from 16×16 to 4×4, the same as for luminance components. Third, this study introduces an adaptive quantization parameter offset, changing the offset for every frame in the quantization process to acquire a better reconstructed image. Comparison between the improved algorithm, the original fractal compression algorithm, and JM19.0 (the latest H.264/AVC reference software) confirms a slight increase in Peak Signal-to-Noise Ratio and a significant decrease in bitrate, while the time consumed for compression remains less than 60% of that using JM19.0.

Book ChapterDOI
23 Aug 2017
TL;DR: An AAC steganalysis scheme is proposed to detect steganographies that embed secret information by modifying MDCT coefficients; the detection accuracy is above 85.34% when the relative embedding rate is over 50%, clearly better than previously reported methods.
Abstract: AAC (Advanced Audio Coding), an efficient audio codec, has been used widely in mobile Internet applications. Steganographies based on AAC are emerging and bringing new challenges to information content security. In this paper, an AAC steganalysis scheme is proposed to detect steganographies that embed secret information by modifying MDCT coefficients. The modification of MDCT coefficients simultaneously changes the statistical characteristics of the inter-frame and intra-frame differences. Based on this idea, we propose a scheme that extracts combination features to classify cover and stego audio. There are 16 groups of sub-features representing the correlation characteristics between the multi-order differential coefficients of intra and inter frames (MDI2); each sub-feature's performance is analyzed in this paper, and an ensemble classifier is used to realize the steganalyzer. Experimental results show that the detection accuracy of the proposed scheme is above 85.34% when the relative embedding rate is over 50%, clearly better than previously reported methods. Due to the similarity of the coding principles of AAC and MP3, the proposed features can also be applied to MP3 steganalysis.

Journal ArticleDOI
TL;DR: Results suggest that the proposed sketch attack can generate the outline image of the original frame for not only intra frames but also inter frames.
Abstract: In this paper, we propose a novel sketch attack for H.264 advanced video coding (H.264/AVC) format-compliant encrypted video. We briefly describe the notion of a sketch attack, review the conventional sketch attacks designed for discrete cosine transform (DCT)-based compressed images, and identify their shortcomings when applied to compressed video. Specifically, the conventional DCT-based sketch attacks are incapable of sketching outlines for inter frames, which are deployed to significantly reduce temporal redundancy in video compression. To sketch directly from inter frames, we put forward a sketch attack that considers the partially decoded information of the H.264/AVC compressed video, namely, the number of bits spent on coding a macroblock. To evaluate the sketch image, we consider the Canny edge map as the ideal outline image. Experiments are conducted to verify the performance of the proposed sketch attack using the ICADR2013, High Efficiency Video Coding dash, and Xiph video data sets. Results suggest that the proposed sketch attack can generate the outline image of the original frame for not only intra frames but also inter frames.
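The key observation, that macroblocks on object outlines cost more bits to code, already yields a rough sketch image once per-macroblock bit counts are normalized. A toy illustration with invented bit counts:

```python
# Normalize a 2D grid of per-macroblock bit counts to the [0, 255] range;
# high-bit-cost blocks (edges) become bright pixels of the "sketch".

def bits_to_sketch(bit_grid):
    lo = min(min(row) for row in bit_grid)
    hi = max(max(row) for row in bit_grid)
    span = (hi - lo) or 1               # avoid division by zero on flat grids
    return [[round(255 * (b - lo) / span) for b in row] for row in bit_grid]

bit_grid = [[ 40,  42, 200],
            [ 41, 210,  45],
            [220,  44,  43]]            # a diagonal edge spends more bits
sketch = bits_to_sketch(bit_grid)
```

Because bit counts are available from partial decoding alone, this works on format-compliant encrypted streams without recovering the pixel data.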

Journal ArticleDOI
TL;DR: It is shown that, given an estimate of the IFC matrix, the proposed approach results in a convex quadratic optimization problem with respect to the reverberation prediction weights, and a closed-form solution can be accordingly derived.

Journal ArticleDOI
TL;DR: A frame-level complexity-based bit-allocation-balancing method that jointly considers the correlation between an intra frame and the previously encoded inter frame is proposed, so that the smoothness of visual quality between adjacent inter and intra frames can be improved.

Journal ArticleDOI
Fu Caimei1, Youming Li1, Yu-Cheng He2, Ming Jin1, Gang Wang1, Lei Peng1 
TL;DR: Theoretical analysis and simulation results show that the proposed algorithm outperforms the ED algorithm, and an algorithm for searching the optimal dynamic double thresholds is derived with very low complexity according to the Neyman-Pearson (NP) test criterion.
Abstract: This paper proposes an inter-frame dynamic double threshold (IF-DDT) spectrum sensing algorithm in order to improve the sensing performance based on energy detection (ED) in cognitive radios (CRs). Based on both the activity model of the primary user (PU) and the sensing mechanism of the secondary user (SU), the proposed algorithm exploits the relationship between two adjacent sensing frames and designs dynamic double thresholds for each sensing frame to enhance spectrum sensing performance when the received energy cannot give a reliable local decision. The detection probability and false alarm probability of the proposed sensing scheme are analyzed, and an algorithm for searching the optimal dynamic double thresholds is derived with very low complexity according to the Neyman-Pearson (NP) test criterion. Theoretical analysis and simulation results show that the proposed algorithm outperforms the ED algorithm.
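The double-threshold mechanism can be sketched as follows; the thresholds and energy values are arbitrary illustrative numbers, not the NP-optimal thresholds derived in the paper:

```python
# Double-threshold energy detection with inter-frame memory: when the
# received energy falls between the two thresholds, the decision of the
# previous sensing frame is reused instead of forcing an unreliable one.

def sense(energy, prev_decision, low, high):
    """Return 1 (primary user present) or 0 (absent)."""
    if energy >= high:
        return 1
    if energy <= low:
        return 0
    return prev_decision        # ambiguous region: rely on the previous frame

decisions = []
prev = 0
for e in [0.4, 1.8, 1.1, 0.3]:  # energies of successive sensing frames
    prev = sense(e, prev, low=0.5, high=1.5)
    decisions.append(prev)
```

The third frame (energy 1.1) lands in the ambiguous band and inherits the previous "present" decision, which is the inter-frame coupling the scheme exploits.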

Patent
04 Jan 2017
TL;DR: In this paper, a synchronous location and mapping method is proposed, which comprises the following steps: based on magnitude of interframe motion, selecting a key frame sequence from image frames, based on an interframe matching result of the key frame sequences, introducing the image frames into an attitude graph, and constructing one or more effective detection loops for performing bundle adjustment and correcting the image frame based on the attitude graph after the bundle adjustment.
Abstract: The embodiments of the invention relate to a synchronous location and mapping method. The method comprises the following steps: based on magnitude of interframe motion, selecting a key frame sequence from image frames, based on an interframe matching result of the key frame sequence, introducing the image frames into an attitude graph, and based on the attitude graph, constructing one or more effective detection loops for performing bundle adjustment and correcting the image frames based on the attitude graph after the bundle adjustment. The method can effectively reduce the scale of optimization calculation.

Proceedings ArticleDOI
10 Mar 2017
TL;DR: This paper proposes a comprehensive tamper detection method that exploits abnormalities in the spatio-temporal domain along with that in the compressed domain and shows that the proposed method outperforms an existing method in terms of accuracy.
Abstract: Validating the authenticity and integrity of videos appearing in mass and social media is a challenging problem, given the rapid technological advancement in video editing software. Video editing can be accomplished without much effort; however, the consequences can be severe. Tampering with digital video in the temporal domain involves a number of techniques, such as replication, insertion, shuffling, and removal of frames, which can be performed even by naive users. Evidence presented in video format is more appealing and acceptable in a court of law. Therefore, it is important to devise techniques for video forgery detection in digital forensics. In this paper, we propose a comprehensive tamper detection method that exploits abnormalities in the spatio-temporal domain along with those in the compressed domain. Frame shuffling detection, which has long remained an unexplored area, is addressed in our work. We are able to localize and differentiate the type of tampering present in a video. Experimental results on 78 videos show that the proposed method outperforms an existing method in terms of accuracy.

Journal Article
TL;DR: A robust and efficient video stabilization algorithm based on inter-frame image matching score is proposed; it is effective at stabilizing translational, rotational, and zooming jitter, robust to local motions, and has state-of-the-art processing speed to meet the needs of real-time equipment.
Abstract: Video stabilization is an important video enhancement technology that aims at removing annoying shaky motion from videos. In this paper, we propose a robust and efficient video stabilization algorithm based on an inter-frame image matching score. First, image matching is performed by a method combining the Maximally Stable Extremal Regions (MSER) detection algorithm and the Features from Accelerated Segment Test (FAST) corner detection algorithm, which yields the matching score and the motion parameters of each frame. Then, the matching score is filtered to remove its high-frequency component and keep the low-frequency component. Motion compensation is performed on the current frame according to the ratio of the matching score before and after filtering, retaining the global motion and removing the local jitter. Various classical corner detection operators and region matching operators are compared in experiments, and the results illustrate that the proposed method is effective at stabilizing translational, rotational, and zooming jitter, is robust to local motions, and has state-of-the-art processing speed to meet the needs of real-time equipment.
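The filtering-and-compensation step can be sketched with a simple moving-average low-pass filter over a toy per-frame score signal; the window size and values are ours, as the paper does not commit to this particular filter:

```python
# Low-pass filter the per-frame score with a moving average; the per-frame
# compensation is the ratio of smoothed to raw score, so slow (global)
# motion passes through while high-frequency jitter is suppressed.

def moving_average(signal, window=3):
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

raw = [1.0, 1.2, 0.6, 1.1, 1.0]          # jittery per-frame score
smooth = moving_average(raw)
compensation = [s / r for s, r in zip(smooth, raw)]
```

A compensation ratio near 1 leaves a frame untouched; ratios far from 1 mark frames whose motion is dominated by jitter.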

Journal ArticleDOI
TL;DR: The methodology was evaluated on four vital parameters, namely file size, computational time, SSIM (Structural Similarity Index), and PSNR (Peak Signal-to-Noise Ratio), which demonstrated the superiority of the proposed technique.

Proceedings ArticleDOI
01 Aug 2017
TL;DR: A novel Multi-Level Subtraction (MLS) approach is proposed for video frame insertion forgery detection, achieving a recall rate of 93.92% and a precision rate of 100% on a forensically realistic video database.
Abstract: Video forgery acts have been increasing in recent years due to the easy accessibility of sophisticated video editing software. Criminals may use video tampering as a way to get acquitted on the basis that the video evidence presented in court could not prove that they performed the crime at a particular time or place. For high-profile criminal court cases, it is likely that a video that is altered even slightly will be considered unacceptable. A highly accurate forgery detection system can therefore help in ensuring the authenticity of video evidence. In this paper, a novel Multi-Level Subtraction (MLS) approach is proposed for video frame insertion forgery detection, achieving a recall rate of 93.92% and a precision rate of 100% on a forensically realistic video database. Experiments also demonstrated the superior accuracy of the MLS system over other modern video forgery detection techniques available today.

Patent
24 May 2017
TL;DR: In this article, a video inter-frame prediction enhancement method based on a deep neural network is proposed. Video sequences with different contents are grouped and compressed with different quantization parameters, generating a number of training-set and test-set sequence pairs at different compression rates, which are then grouped by compression rate.
Abstract: The invention discloses a video inter-frame prediction enhancement method based on a deep neural network. The method comprises the following steps: grouping video sequences with different contents and compressing them with different quantization parameters, thus generating a number of training-set and test-set sequence pairs at different compression rates, grouped by compression rate; extracting pictures from all video sequence pairs of each group to form training-set and validation-set image pairs for each compression-rate group; training a video inter-frame prediction enhancement model for each group based on a deep convolutional neural network; testing the validity of the inter-frame prediction enhancement model and, if valid, integrating the trained model into the inter-frame prediction enhancement module of an encoder; and parallelizing the test network with a GPU-based parallel development tool, compiling the program into a dynamic link library, and importing the library into the encoder to reduce time complexity. This avoids training a separate model for each quantization parameter individually and also improves robustness in usage scenarios.
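The dataset-construction step (grouping degraded/original pairs by quantization parameter) can be sketched as follows. This is an illustrative skeleton, not the patent's implementation: `codec` stands in for the actual video encoder, and here it is mocked with simple quantization.

```python
import numpy as np

def build_qp_groups(sequences, qps, codec):
    """For each quantization parameter, compress every sequence with `codec`
    and pair each degraded frame with its pristine original, grouping the
    resulting (degraded, original) training pairs by QP."""
    groups = {}
    for qp in qps:
        pairs = []
        for seq in sequences:
            degraded = [codec(frame, qp) for frame in seq]
            pairs.extend(zip(degraded, seq))
        groups[qp] = pairs
    return groups
```

Each group would then train its own enhancement model, one per compression-rate bucket rather than one per individual QP value.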

Patent
29 Mar 2017
TL;DR: In this article, a real-time video stabilization method based on a homography matrix was proposed for stabilizing video sequences obtained by devices such as a handheld DV and an unmanned aerial vehicle.
Abstract: The invention discloses a real-time video stabilization method based on a homography matrix, used for stabilizing video sequences obtained by devices such as a handheld DV camera or an unmanned aerial vehicle. The method comprises the following steps: A, extracting corner points uniformly distributed over the image; B, calculating the inter-frame optical flow vectors and using them to track the corner points across frames; C, calibrating the tracked corner points with a layered affine calibration algorithm; D, solving the inter-frame homography matrix with the random sample consensus (RANSAC) algorithm; E, separating the intentional inter-frame motion component from the distortion correction component with a Kalman filter; and F, applying the stabilizing transform to each video frame using the homography matrix, the motion compensation component, and the distortion correction component to obtain the stabilized current frame. The method effectively removes shake from video sequences, has relatively low algorithmic complexity and fast running speed, and is of high practical value for real-time video processing systems.
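Step E's separation of intentional motion from jitter can be illustrated with a one-dimensional constant-position Kalman filter applied to a single motion parameter (a deliberate simplification of the patent's homography-based filtering; the noise constants `q` and `r` are illustrative):

```python
import numpy as np

def kalman_smooth(z, q=1e-3, r=1e-1):
    """Filter a motion-parameter trajectory z: the filtered state tracks the
    intentional camera motion, and the residual z - x is treated as jitter
    to be compensated away."""
    x, p = float(z[0]), 1.0
    out = np.empty(len(z), dtype=np.float64)
    for i, zi in enumerate(z):
        p += q                  # predict: process noise inflates uncertainty
        k = p / (p + r)         # Kalman gain
        x += k * (zi - x)       # update toward the measurement
        p *= (1.0 - k)
        out[i] = x
    return out
```

A one-frame motion spike (jitter) is strongly attenuated in the filtered trajectory, while slow drifts (intentional panning) pass through.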

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed scheme can efficiently suppress I frame flicker and maintain smooth subjective quality, achieving an average PSNR gain of 0.26 dB over the rate control scheme adopted by the HEVC reference software HM15.0.
Abstract: At I frame switching points, the subjective quality of I frames and P frames usually fluctuates noticeably due to their different coding methods. This periodic temporal fluctuation causes video flicker. Extensive experiments show that I frame flicker exhibits strong regional characteristics, with different regions flickering to different degrees. Based on this observation, a region-based I frame rate control scheme is proposed to suppress I frame flicker according to the different characteristics of moving and non-moving regions. First, by jointly considering the inter-frame dependency between the I frame and subsequent unencoded P frames and the inter-frame correlation between the I frame and the previously encoded P frame, an optimization model is proposed to obtain the optimal QPs for different regions. Second, a region-based inter-frame dependency model is proposed to describe the inter-frame dependency of each region separately and more accurately. The experimental results demonstrate that the proposed scheme can efficiently suppress I frame flicker and maintain smooth subjective quality. Moreover, it achieves an average PSNR gain of 0.26 dB over the rate control scheme adopted by the HEVC reference software HM15.0.
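The effect of weighting a region's distortion by its inter-frame dependency can be sketched with a toy rate-distortion grid search (all constants and the exponential model are illustrative, not from the paper): a strongly referenced non-moving region has its distortion weighted up, which pushes its QP down relative to a moving region.

```python
def region_qp(base_qp, dependency_weight, offsets=range(-4, 5)):
    """Pick a per-region QP offset minimizing a toy cost: distortion grows
    exponentially with QP and is amplified by how strongly later P frames
    depend on this region; rate shrinks exponentially with QP."""
    def cost(o):
        distortion = (1.0 + dependency_weight) * 2.0 ** (o / 3.0)
        rate = 2.0 ** (-o / 3.0)
        return distortion + rate
    return base_qp + min(offsets, key=cost)
```

With no dependency the offset stays at zero; with heavy dependency the region receives a finer quantizer, which is the qualitative behavior the scheme relies on.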

Patent
26 Oct 2017
TL;DR: In this article, a method for obtaining two consecutive video frames at a global motion estimation function for execution on a processor is proposed, wherein the video frames comprise a current video frame and a previous video frame, and estimating motion between the two consecutive videos frames by matching a set of feature points common to both video frames.
Abstract: A method includes obtaining two consecutive video frames at a global motion estimation function for execution on a processor, wherein the video frames comprise a current video frame and a previous video frame, and estimating motion between the two consecutive video frames by matching a set of feature points common to both video frames. The set of feature points is maintained by tracking the number of feature points in the current video frame, refreshing the feature points if their number falls below a refresh threshold, and replenishing the feature points if their number falls below a replenish threshold. Motion filtering may be performed by buffering the homogeneous transformations from the global motion estimation, calculating the geometric mean of the buffered motions, and estimating the intentional camera trajectory based on that geometric mean.
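The two-threshold maintenance policy from the claim can be sketched as follows (a minimal illustration; the threshold values, the `detect` callback, and the target set size are assumptions, not from the patent):

```python
def maintain_features(tracked, detect, refresh_n=20, replenish_n=50, target=100):
    """Maintain the tracked feature set: below the refresh threshold,
    discard everything and re-detect from scratch; below the replenish
    threshold, top the set back up with newly detected points; otherwise
    keep the set as-is."""
    n = len(tracked)
    if n < refresh_n:
        return detect(target)          # full refresh
    if n < replenish_n:
        return tracked + detect(target - n)  # partial top-up
    return tracked
```

Keeping surviving tracks during a replenish (rather than always refreshing) preserves long feature trajectories, which stabilizes the motion estimate.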

Patent
31 May 2017
TL;DR: In this article, a robot inter-frame pose estimation method based on a convolutional neural network feature descriptor is proposed. The method first uses a feature point extraction algorithm to extract feature points in the current frame image, then crops a local image patch centered on each feature point, feeds the patch into a convolutional neural network with a middle layer, and takes the output vector of that middle layer as the feature descriptor of the point; feature descriptors in two adjacent frame images are then matched
Abstract: The invention relates to a robot inter-frame pose estimation method based on a convolutional neural network feature descriptor. The method comprises the following steps: first, using a feature point extraction algorithm to extract feature points in the current frame image; then cropping a local image patch centered on each feature point, feeding the patch into a convolutional neural network with a middle layer, and taking the output vector of that middle layer as the feature descriptor of the feature point; matching the feature descriptors of two adjacent frame images; and using an inter-frame motion estimation algorithm to estimate the change in the robot's pose between the two adjacent frames from the obtained feature matching relationship.
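The patch-cropping and descriptor-matching steps can be sketched as follows (a minimal NumPy illustration; the CNN itself is omitted, keypoints are assumed far enough from the image border, and the ratio-test matcher is a standard substitute, not necessarily the patent's matching rule):

```python
import numpy as np

def extract_patches(image, keypoints, size=32):
    """Crop a size x size patch centered on each (x, y) keypoint; each
    patch would be fed to the CNN to produce a middle-layer descriptor."""
    h = size // 2
    return [image[y - h:y + h, x - h:x + h] for (x, y) in keypoints]

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test on L2 distances:
    accept a match only if the best distance is clearly smaller than the
    second best, rejecting ambiguous correspondences."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[k]:
            matches.append((i, int(j)))
    return matches
```

The resulting index pairs feed the inter-frame motion estimation step that recovers the pose change.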