
Showing papers on "Inter frame published in 2018"


Journal ArticleDOI
TL;DR: A new feature extractor, Bi-Weighted Oriented Optical Flow (Bi-WOOF) is proposed to encode essential expressiveness of the apex frame of a video, with a proposed technique achieving a state-of-the-art F1-score recognition performance.
Abstract: Despite recent interest and advances in facial micro-expression research, there is still plenty of room for improvement in terms of micro-expression recognition. Conventional feature extraction approaches for micro-expression video consider either the whole video sequence or a part of it, for representation. However, with the high-speed video capture of micro-expressions (100–200 fps), are all frames necessary to provide a sufficiently meaningful representation? Is the luxury of data a bane to accurate recognition? A novel proposition is presented in this paper, whereby we utilize only two images per video, namely, the apex frame and the onset frame. The apex frame of a video contains the highest intensity of expression changes among all frames, while the onset is the perfect choice of a reference frame with neutral expression. A new feature extractor, Bi-Weighted Oriented Optical Flow (Bi-WOOF) is proposed to encode essential expressiveness of the apex frame. We evaluated the proposed method on five micro-expression databases: CAS(ME)², CASME II, SMIC-HS, SMIC-NIR and SMIC-VIS. Our experiments lend credence to our hypothesis, with our proposed technique achieving state-of-the-art F1-score recognition performance of 0.61 and 0.62 in the high frame rate CASME II and SMIC-HS databases respectively.
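Bi-WOOF's central idea, weighting optical-flow orientations by their magnitudes for the flow computed between the onset (reference) and apex frames, can be illustrated with a minimal numpy sketch. The paper's actual bi-weighting scheme is richer than this single histogram; the function name and bin count below are illustrative.

```python
import numpy as np

def weighted_orientation_histogram(flow, bins=8):
    """Histogram of flow orientations, each pixel voting with its magnitude.

    flow: (H, W, 2) array of per-pixel displacements (dx, dy), e.g. the
    optical flow estimated between the onset and apex frames.
    """
    dx, dy = flow[..., 0].ravel(), flow[..., 1].ravel()
    mag = np.hypot(dx, dy)
    ang = np.arctan2(dy, dx) % (2 * np.pi)               # orientation in [0, 2*pi)
    bin_idx = np.minimum((ang / (2 * np.pi) * bins).astype(int), bins - 1)
    hist = np.zeros(bins)
    np.add.at(hist, bin_idx, mag)                        # magnitude-weighted vote
    return hist / (hist.sum() + 1e-12)

# A flow field moving uniformly to the right concentrates all mass in bin 0.
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
h = weighted_orientation_histogram(flow)
```

Pixels with stronger motion thus dominate the descriptor, which matches the intuition that expressive regions of the apex frame should drive recognition.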

212 citations


Journal ArticleDOI
TL;DR: The experimental results on 80 videos from two datasets indicate the superior performance of the proposed key frame extraction approach, which aims to find the representative frames of the video and filter out similar frames from the representative frame set.
Abstract: Key frame extraction is an efficient way to create the video summary which helps users obtain a quick comprehension of the video content. Generally, the key frames should be representative of the video content, meanwhile, diverse to reduce the redundancy. Based on the assumption that the video data are near a subspace of a high-dimensional space, a new approach, named as key frame extraction in the summary space, is proposed for key frame extraction in this paper. The proposed approach aims to find the representative frames of the video and filter out similar frames from the representative frame set. First of all, the video data are mapped to a high-dimensional space, named as summary space. Then, a new representation is learned for each frame by analyzing the intrinsic structure of the summary space. Specifically, the learned representation can reflect the representativeness of the frame, and is utilized to select representative frames. Next, the perceptual hash algorithm is employed to measure the similarity of representative frames. As a result, the key frame set is obtained after filtering out similar frames from the representative frame set. Finally, the video summary is constructed by assigning the key frames in temporal order. Additionally, the ground truth, created by filtering out similar frames from human-created summaries, is utilized to evaluate the quality of the video summary. Compared with several traditional approaches, the experimental results on 80 videos from two datasets indicate the superior performance of our approach.
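The similarity-filtering step depends on a perceptual hash. The abstract does not say which one, so the sketch below uses average hash (aHash), one of the simplest perceptual hashes, purely for illustration:

```python
import numpy as np

def average_hash(img, size=8):
    """Average hash: block-average down to size x size, threshold at the mean."""
    h, w = img.shape
    img = img[:h - h % size, :w - w % size]              # crop to a multiple of size
    small = img.reshape(size, img.shape[0] // size,
                        size, img.shape[1] // size).mean(axis=(1, 3))
    return (small > small.mean()).ravel()                # 64-bit boolean hash

def hamming(a, b):
    """Number of differing hash bits; a small distance marks near-duplicate frames."""
    return int(np.count_nonzero(a != b))

frame = np.arange(64, dtype=float).reshape(8, 8)
dup = frame.copy()                                       # a redundant frame
inverted = frame.max() - frame                           # a very different frame
```

Representative frames whose pairwise Hamming distance falls below a threshold would be collapsed into a single key frame.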

42 citations


Journal ArticleDOI
TL;DR: This paper mainly calculates H-S and S-V color histograms of every frame in a video shot and compares the similarity between histograms to detect and locate tampered frames in the shot and utilizes SURF feature extraction and FLANN matching to confirm the forgery types in the tampered locations.
Abstract: Frame insertion, deletion and duplication are common inter-frame tampering operations in digital videos. In this paper, based on similarity analysis, a passive-blind forensics scheme for video shots is proposed to detect inter-frame forgeries. This method is composed of two parts: HSV (Hue-Saturation-Value) color histogram comparison and SURF (Speeded Up Robust Features) feature extraction together with FLANN (Fast Library for Approximate Nearest Neighbors) matching for double-checking. We mainly calculate H-S and S-V color histograms of every frame in a video shot and compare the similarity between histograms to detect and locate tampered frames in the shot. Then we utilize SURF feature extraction and FLANN matching to further confirm the forgery types in the tampered locations. Experimental results demonstrate that the proposed detection method is efficient and accurate in terms of forgery identification and localization. In contrast to other inter-frame forgery detection methods, our scheme can detect three kinds of forgery operations and has its own superiority and applicability as a passive-blind detection method.
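The first stage of this scheme, comparing color histograms of adjacent frames, can be sketched as follows. This is a simplified single-histogram variant of the paper's H-S/S-V pair; the bin counts and the similarity measure are illustrative:

```python
import numpy as np

def hs_histogram(hsv, h_bins=16, s_bins=16):
    """Normalised 2-D Hue-Saturation histogram of one frame (channels in [0, 1])."""
    hist, _, _ = np.histogram2d(hsv[..., 0].ravel(), hsv[..., 1].ravel(),
                                bins=[h_bins, s_bins],
                                range=[[0, 1], [0, 1]])
    return hist / hist.sum()

def intersection(h1, h2):
    """Histogram intersection in [0, 1]; a dip between adjacent frames
    suggests a discontinuity such as frame insertion or deletion."""
    return float(np.minimum(h1, h2).sum())

red = np.zeros((8, 8, 3)); red[..., 1] = 0.9             # hue ~ 0, saturated
blue = np.full((8, 8, 3), 0.6)                           # different hue/saturation
```

Frames whose similarity to their neighbour drops below a threshold become candidates for the SURF + FLANN double-check described in the abstract.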

39 citations


Journal ArticleDOI
TL;DR: An adaptive frame-level QP selection algorithm is proposed for the H.265/HEVC random access coding by taking into account the inter-frame dependency, and results show that in comparison with HM-16.0, the proposed algorithm reduces the BD-rate by 3.49% with negligible increase of encoding time.
Abstract: Rate-distortion optimization (RDO) is widely applied in video coding, aiming to minimize the coding distortion at a target bitrate. Conventionally, RDO is performed independently on each individual frame to avoid high computational complexity. However, extensive use of temporal/spatial predictions results in strong coding dependencies among neighboring frames, which makes such frame-independent RDO suboptimal. To further improve video coding performance, it would be desirable to perform global RDO among a group of neighboring frames while maintaining approximately the same coding complexity. In this paper, the problem of global RDO is studied by jointly determining the quantization parameters (QPs) for a group of neighboring frames. Specifically, an adaptive frame-level QP selection algorithm is proposed for H.265/HEVC random access coding by taking into account the inter-frame dependency. To measure the inter-frame dependency, a model based on the energy of prediction residuals is first established. With the help of the model, the problem of global RDO is then analyzed for the hierarchical coding structure in H.265/HEVC. Finally, the QP and the corresponding Lagrangian multiplier for each coding frame are determined adaptively by considering the total impact of its coding distortion on that of future frames in the encoding order. Experimental results show that in comparison with HM-16.0, the proposed algorithm reduces the BD-rate by 3.49% on average with a negligible increase in encoding time. In addition, the quality fluctuation of the coded video under the proposed algorithm is lower than that under HM-16.0.
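For context, HM-style encoders couple the Lagrange multiplier in the RD cost J = D + λR to the QP roughly exponentially. The rule of thumb below is the fixed mapping that adaptive schemes like this one refine; α = 0.85 is a typical default, not the paper's adapted value:

```python
def hevc_lambda(qp, alpha=0.85):
    """Rule-of-thumb Lagrange multiplier for the RD cost J = D + lambda * R.

    The paper instead adapts lambda per frame using its inter-frame
    dependency model; this fixed exponential mapping is the baseline.
    """
    return alpha * 2.0 ** ((qp - 12) / 3.0)

# Each +3 step in QP roughly doubles lambda, shifting the RD trade-off
# further toward saving rate at the cost of distortion.
lam_12 = hevc_lambda(12)
lam_15 = hevc_lambda(15)
```

The frame-level QP selection in the paper effectively perturbs this mapping per frame according to how much a frame's distortion propagates to future frames.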

29 citations


Journal ArticleDOI
TL;DR: This study proposes the use of graph matching (GM) to enable 3D motion capture for Indian sign language recognition and demonstrates that the approach increases the accuracy of recognizing signs in continuous sentences.
Abstract: A machine cannot easily understand and interpret three-dimensional (3D) data. In this study, we propose the use of graph matching (GM) to enable 3D motion capture for Indian sign language recognition. The sign classification and recognition problem for interpreting 3D motion signs is considered an adaptive GM (AGM) problem. However, the current models for solving an AGM problem have two major drawbacks. First, spatial matching can be performed only on a fixed set of frames with a fixed number of nodes. Second, temporal matching divides the entire 3D dataset into a fixed number of pyramids. The proposed approach solves these problems by employing inter-frame GM for spatial matching and multiple intra-frame GM for temporal matching. To test the proposed model, a 3D sign language dataset is created that involves 200 continuous sentences in the sign language, captured through a motion capture setup with eight cameras. The method is also validated on the 3D motion capture benchmark action datasets HDM05 and CMU. We demonstrate that our approach increases the accuracy of recognizing signs in continuous sentences.

25 citations


Journal ArticleDOI
TL;DR: This paper proposes a method for zoom detection and incorporates it into video tampering detection; the approach is capable of differentiating various inter-frame tampering events and localizing them in the temporal domain.

20 citations


Book ChapterDOI
17 Dec 2018
TL;DR: This paper proposes a deep learning based digital forensic technique using a 3D Convolutional Neural Network (3D-CNN) for detection of the above form of video forgery, and shows that the performance efficiency of the proposed 3D-CNN model is 97% on average and that it is applicable to a wide range of video quality.
Abstract: With the present-day rapid growth in the use of low-cost yet efficient video manipulation software, it has become extremely crucial to authenticate and check the integrity of digital videos before they are used in sensitive contexts, for example, CCTV footage acting as the primary source of evidence for a crime scene. In this paper, we deal with a specific class of video forgery detection, viz., inter-frame forgery detection. We propose a deep learning based digital forensic technique using a 3D Convolutional Neural Network (3D-CNN) for detection of the above form of video forgery. In the proposed model, we introduce a difference layer in the CNN, which mainly targets the extraction of temporal information from the videos. This, in turn, helps in efficient inter-frame video forgery detection, given that temporal information constitutes the most suitable form of features for inter-frame anomaly detection. Our experimental results show that the performance efficiency of the proposed deep learning 3D-CNN model is 97% on average, and that it is applicable to a wide range of video quality.
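The difference layer at the heart of this model subtracts consecutive frames so the network sees inter-frame residuals rather than raw pixels. A standalone numpy sketch of that operation, outside any network (the function name and shapes are illustrative):

```python
import numpy as np

def difference_layer(clip):
    """Temporal difference layer: subtract consecutive frames so that
    downstream 3-D convolutions operate on inter-frame residuals.

    clip: (T, H, W) grayscale frame stack; returns (T-1, H, W) residuals.
    """
    return clip[1:].astype(int) - clip[:-1].astype(int)

# A sudden content change at frame 2 shows up as a spike in the residuals,
# which is exactly the signature of inter-frame insertion/deletion.
clip = np.zeros((4, 2, 2))
clip[2] = 10
res = difference_layer(clip)
```

In the full model these residuals feed 3-D convolutions, so the network learns which residual patterns correspond to tampering rather than natural motion.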

17 citations


Journal ArticleDOI
01 Aug 2018
TL;DR: The proposed STTFR algorithm aims to verify video integrity through the creation of a 128 bit message digest from the input video of variable length that will be unique to that video and acts as a fingerprint.
Abstract: This paper discusses a novel approach to detect inter frame and intra frame video forgery using content based signature. A novel technique called the “Spatio Temporal Triad Feature Relationship” (STTFR) is employed to generate a unique content based signature – value for any given video sequence. The proposed STTFR algorithm aims to verify video integrity through the creation of a 128 bit message digest from the input video of variable length that will be unique to that video and acts as a fingerprint. Change in the video sequence, either at the spatial or at the temporal level will result in a different fingerprint than the one obtained originally. The knowledge of the signature will not enable any person/entity to recreate the original video as the signature is generated by combining spatial and temporal fingerprints in an orderly and systematic approach. We have verified our technique with standard datasets and found accurate results.

16 citations


Journal ArticleDOI
TL;DR: This paper proposes a fusion of audio forensics detection methods for video inter-frame forgery, combining the results of the audio channel and the video frame sequence channel and using the QDCT feature to finely locate the suspected forgery positions.

14 citations


Journal ArticleDOI
TL;DR: A region-based multiple description coding scheme is proposed for robust 3-D video communication in this paper, in which two descriptions are formed by setting the left and right view as dominant in the first and second description, respectively.
Abstract: Inter-frame and inter-view predictions are widely employed in multiview video coding. These techniques improve the coding efficiency, but they also increase the vulnerability of the coded bitstream: one packet loss will affect many subsequent frames in the same view and probably in other referenced views. To address this problem, a region-based multiple description coding scheme is proposed for robust 3-D video communication in this paper, in which two descriptions are formed by setting the left and right view as dominant in the first and second description, respectively. This approach exploits the fact that most regions in the reference view can be synthesized from the base view; hence, these regions can be skipped or only coarsely encoded. In our work, the disoccluded regions, illumination-affected regions, and remaining regions are first determined and extracted. By assigning different quantization parameters to these three different regions according to the network status, an efficient multiple description scheme is formed. Experimental results demonstrate that the proposed scheme achieves considerably better performance compared with the traditional approach.

14 citations


Journal ArticleDOI
TL;DR: It is found that, in MDC streams, the best policy is to encode selective frames as I-frame instead of coding some macroblocks of frames in intra mode, and a cost function based on which intra/inter frame type is decided is developed.
Abstract: Multiple description coding (MDC) is a technique for video transmission over error prone networks where the descriptions are routed over multiple paths. Intra coding such as MDC provides error resiliency but coding in this mode must be decided with care since it degrades the compression ratio. In this paper, we present our investigation results for a new intra coding approach in MDC. We have found that, in MDC streams, the best policy is to encode selective frames as I-frame instead of coding some macroblocks of frames in intra mode. In order to find the most suitable I-frame positions within a given video stream, we developed a cost function based on which intra/inter frame type is decided. The MDC scheme with the proposed intra coding criterion, with and without redundancy optimization, is implemented in the H.264/AVC reference software, JM16.0. Based on the experimental performance evaluation, we show that our method achieves higher average PSNR compared to the other optimized MDCs found in the literature.

Proceedings ArticleDOI
25 Jul 2018
TL;DR: Aiming at the detection of moving objects in video sequences, a moving object detection algorithm based on the background difference method and the inter-frame difference method is proposed; it overcomes the false detections and holes seen in previous detection algorithms.
Abstract: Aiming at the detection of moving objects in video sequences, a moving object detection algorithm based on the background difference method and the inter-frame difference method is proposed. A new background update method is proposed to update the unchanged background areas into the background frame. Experiments show that this method overcomes the false detections and holes of previous detection algorithms. The method can meet the need for real-time detection and tracking of moving targets, with the advantages of high accuracy and fast calculation speed.
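The fusion of background subtraction and inter-frame differencing, together with an "unchanged pixels update the background" rule, can be sketched as follows. The threshold and blending rate are illustrative, not taken from the paper:

```python
import numpy as np

def detect_and_update(bg, prev, cur, thr=25, alpha=0.05):
    """Fuse background subtraction with inter-frame differencing.

    A pixel is foreground when it differs from the background model AND
    from the previous frame; static pixels slowly blend into the background,
    mirroring the 'update unchanged areas into the background frame' idea.
    """
    bg_mask = np.abs(cur.astype(int) - bg.astype(int)) > thr
    fd_mask = np.abs(cur.astype(int) - prev.astype(int)) > thr
    fg = bg_mask & fd_mask
    # Update the background model only where nothing is moving.
    new_bg = np.where(fg, bg, (1 - alpha) * bg + alpha * cur)
    return fg, new_bg

bg = np.zeros((8, 8))
prev = np.zeros((8, 8))
cur = np.zeros((8, 8))
cur[2:4, 2:4] = 255                                  # a bright object enters
fg, new_bg = detect_and_update(bg, prev, cur)
```

The AND combination is what suppresses the ghost detections that either method alone tends to produce.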

Journal ArticleDOI
TL;DR: Experimental results demonstrate that the scheme has high detection and localization accuracy and the algorithm is composed of feature extraction and abnormal point localization.
Abstract: Surveillance systems are ubiquitous in our lives, and surveillance videos are often used as significant evidence in judicial forensics. However, the authenticity of surveillance videos is difficult to guarantee, and ascertaining it is an urgent problem. Inter-frame forgery is one of the most common ways of video tampering. The forgery reduces the correlation between adjacent frames at the tampering position, so this correlation can be used to detect the tampering operation. The algorithm is composed of feature extraction and abnormal point localization. During feature extraction, we extract the 2-D phase congruency of each frame, since it is a good image characteristic, and then calculate the correlation between adjacent frames. In the second phase, abnormal points are detected using the k-means clustering algorithm, clustering the normal and abnormal points into two categories. Experimental results demonstrate that the scheme has high detection and localization accuracy.
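The two phases, adjacent-frame correlation followed by 2-means anomaly clustering, can be sketched as follows. A toy 1-D 2-means stands in for a full k-means implementation, and plain arrays stand in for the paper's phase congruency maps:

```python
import numpy as np

def adjacent_correlations(features):
    """Pearson correlation between feature maps of adjacent frames
    (the paper uses 2-D phase congruency maps as the per-frame feature)."""
    return np.array([float(np.corrcoef(a.ravel(), b.ravel())[0, 1])
                     for a, b in zip(features[:-1], features[1:])])

def flag_abnormal(corrs, iters=10):
    """Toy 1-D 2-means: separate low (suspicious) from high (normal) correlations."""
    c_lo, c_hi = corrs.min(), corrs.max()
    for _ in range(iters):
        low = np.abs(corrs - c_lo) < np.abs(corrs - c_hi)
        if low.any():
            c_lo = corrs[low].mean()
        if (~low).any():
            c_hi = corrs[~low].mean()
    return low                                       # True = candidate tampering position

feats = [np.arange(9.0).reshape(3, 3)] * 3           # three identical frames
cc = adjacent_correlations(feats)
corrs = np.array([0.99, 0.98, 0.20, 0.97])           # a correlation dip at index 2
flags = flag_abnormal(corrs)
```

A dip like the one at index 2 is exactly what frame insertion or deletion leaves behind at the splice point.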

Proceedings ArticleDOI
01 Nov 2018
TL;DR: A two stream CNN framework for video-based driving behaviour recognition, in which spatial stream CNN captures appearance information from still frames, whilst temporal streamCNN captures motion information with pre-computed optical flow displacement between a few adjacent video frames is employed.
Abstract: Abnormal driving behaviour is one of the leading causes of terrible traffic accidents endangering human life. Therefore, research on driving behaviour surveillance has become essential to traffic security and public management. In this paper, we conduct this promising research and employ a two-stream CNN framework for video-based driving behaviour recognition, in which the spatial stream CNN captures appearance information from still frames, whilst the temporal stream CNN captures motion information with pre-computed optical flow displacement between a few adjacent video frames. We investigate different spatial-temporal fusion strategies to combine the intra-frame static clues and inter-frame dynamic clues for final behaviour recognition. To validate the effectiveness of the designed spatial-temporal deep learning based model, we create a simulated driving behaviour dataset containing 1237 videos with 6 different driving behaviours for recognition. Experimental results show that our proposed method obtains noticeable performance improvements compared to the existing methods.

Journal ArticleDOI
TL;DR: Experimental results show the improvement of the proposed approach over other block matching algorithms in terms of the performance measures.
Abstract: Block matching (BM) motion estimation plays an inevitable role in video coding applications. BM approaches are used for data compression, achieved by removing the temporal redundancy in video sequences. In the BM process, each video frame is subdivided into macroblocks, and each macroblock in the current frame is compared with the previous frame. The main objective is to minimize the sum of absolute differences. In this work, some modifications have been performed on the conventional artificial bee colony algorithm to improve conventional BM systems. An initial pattern is used in the proposed algorithm to reduce the computational cost, which is represented in terms of search points and convergence time. Experimental results show the improvement of the proposed approach over other block matching algorithms in terms of the performance measures.
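As a baseline for any BM optimizer, exhaustive full search over a window minimising the sum of absolute differences looks like this. The block size and search radius are illustrative; the paper replaces this exhaustive loop with an ABC-guided search that visits far fewer candidates:

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two macroblocks."""
    return int(np.abs(a.astype(np.int64) - b.astype(np.int64)).sum())

def full_search(cur, ref, top, left, size=8, radius=4):
    """Exhaustive block matching: the motion vector minimising SAD.

    ABC-based matchers explore the same search window with far fewer
    SAD evaluations; full search is the reference baseline.
    """
    block = cur[top:top + size, left:left + size]
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                continue
            cost = sad(block, ref[y:y + size, x:x + size])
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv, best_cost

ref = np.arange(24 * 24).reshape(24, 24)
cur = np.zeros_like(ref)
cur[2:, 1:] = ref[:-2, :-1]                          # content shifted down 2, right 1
mv, cost = full_search(cur, ref, top=8, left=8)
```

The search correctly recovers the motion vector (-2, -1) with zero residual, which is the cost surface any faster search strategy is trying to navigate.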

Patent
25 Sep 2018
TL;DR: In this paper, an intra-frame and inter-frame combined prediction method for P frames or B frames is proposed, which consists of self-adaptively selecting by means of a rate-distortion optimization (RDO) decision whether to use the intra frame and inter frame combined prediction or not.
Abstract: An intra-frame and inter-frame combined prediction method for P frames or B frames. The method comprises: self-adaptively selecting by means of a rate-distortion optimization (RDO) decision whether to use the intra-frame and inter-frame combined prediction or not; using a method for weighting an intra prediction block and an inter prediction block in the intra-frame and inter-frame combined prediction to obtain a final prediction block; and obtaining the weighting coefficient of the intra prediction block and the inter prediction block according to prediction distortion statistics of the prediction method. Therefore, prediction precision can be improved, and coding and decoding efficiency of the prediction blocks are improved. The advantages of intra prediction and inter prediction are fully utilized in the present invention; and the optimal prediction parts of the two methods are selected to be combined, so that to a certain extent, areas with excessive distortion can be removed out of the intra prediction block and the inter prediction block, thus obtaining a better prediction effect and achieving excellent practicality and robustness.
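The weighting idea, blending an intra and an inter prediction block and letting a rate-distortion decision choose between the candidates, can be sketched as follows. The weights, rate figures and λ are illustrative placeholders:

```python
import numpy as np

def combined_prediction(intra_pred, inter_pred, w_intra=0.5):
    """Pixel-wise weighted blend of an intra and an inter prediction block."""
    return w_intra * intra_pred + (1.0 - w_intra) * inter_pred

def rd_cost(pred, target, rate_bits, lam=1.0):
    """RDO cost J = D + lambda * R with SSD distortion."""
    return float(((pred - target) ** 2).sum()) + lam * rate_bits

# When the true block lies between the two predictions, the blend wins the
# RD decision, which is the situation the combined mode is designed for.
target = np.full((4, 4), 15.0)
intra = np.full((4, 4), 10.0)
inter = np.full((4, 4), 20.0)
j_combined = rd_cost(combined_prediction(intra, inter), target, rate_bits=8)
j_intra = rd_cost(intra, target, rate_bits=8)
```

The patent additionally derives the weighting coefficient from prediction distortion statistics rather than fixing it at 0.5 as here.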

Journal ArticleDOI
TL;DR: A novel 3D global matching algorithm to handle the challenging reconstruction of RGB-D datasets whose inter-frame overlap is small due to insufficient temporal sampling or fast camera movement, and a novel global model for alignment pruning and pose optimization is proposed.
Abstract: We present a novel 3D global matching algorithm, Sparse3D, to handle the challenging reconstruction of RGB-D datasets whose inter-frame overlap is small due to insufficient temporal sampling or fast camera movement. To support a more reliable reconstruction, two major technical components are proposed: (1) pairwise alignment using a set of complementary features, and (2) a novel global model for alignment pruning and pose optimization. We examine the effectiveness of our algorithm on multiple benchmark datasets under various inter-frame overlap, and demonstrate it better reliability over existing RGB-D reconstruction algorithms.

Proceedings ArticleDOI
14 Jun 2018
TL;DR: The experimental results exemplify that the algorithm not only provides higher security but also good video quality, and can withstand attacks.
Abstract: In this paper, a novel video steganography scheme is proposed based on random integer generation in the DCT domain. The proposed technique detects the carrier frames using scene change detection, where a scene change is identified by the inter-frame difference of the DCT coefficients. Once detected, the carrier frame is divided into sub-images. The DCT coefficients of the sub-images are estimated and the 8 least significant DCT coefficients are replaced by the threshold value, which depends on the confidential information to be hidden, either 0 or 1. The position of the confidential information depends on the random integer generated; the confidential information is shuffled based on this randomly generated integer, which increases the security. The experimental results exemplify that the algorithm not only provides higher security but also good video quality, and can withstand attacks.
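Scene-change detection via the inter-frame difference of DCT coefficients can be sketched with a small orthonormal DCT. The threshold is illustrative; a real detector would normalise it by frame size:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    m[0] /= np.sqrt(2.0)
    return m

def dct2(block):
    """2-D DCT via separable matrix multiplication."""
    d = dct_matrix(block.shape[0])
    return d @ block @ d.T

def is_scene_change(prev, cur, thr=1000.0):
    """Flag a candidate carrier frame when the inter-frame DCT
    coefficient difference exceeds a threshold (thr is illustrative)."""
    return float(np.abs(dct2(cur) - dct2(prev)).sum()) > thr

a = np.zeros((8, 8))                                 # two frames of one scene
b = np.full((8, 8), 255.0)                           # an abrupt content change
```

Frames flagged this way become the carriers into which the shuffled confidential bits are embedded.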

Journal ArticleDOI
Zhong Luan1, Hao Zeng1, Yuanyuan Shang1, Zhuhong Shao, Hui Ding 
TL;DR: The proposed algorithm greatly improves the efficiency of video dehazing and avoids halos and block effects; a new quad-tree method to estimate the atmospheric light is also proposed.
Abstract: To reduce the computational complexity and maintain the effect of video dehazing, a fast and accurate video dehazing method is presented. The preliminary transmission map is estimated by the minimum channel of each pixel. An adjustment parameter is designed to fix the transmission map to reduce color distortion in the sky area. We propose a new quad-tree method to estimate the atmospheric light. In the video dehazing stage, we keep the atmospheric light unchanged in the same scene by a simple but efficient parameter which describes the similarity of the inter-frame image content. By using this method, unexpected flickers are effectively eliminated. Experimental results show that the proposed algorithm greatly improves the efficiency of video dehazing and avoids halos and block effects.
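The minimum-channel transmission estimate described above can be sketched in a few lines. The omega factor and the lower clamp are conventional dark-channel-style dehazing values, not necessarily the paper's:

```python
import numpy as np

def transmission_map(img, atmos, omega=0.95, t_min=0.05):
    """Coarse per-pixel transmission from the minimum colour channel,
    dark-channel style: t = 1 - omega * min_c(I_c / A_c).

    img: (H, W, 3) hazy image in [0, 1]; atmos: (3,) atmospheric light.
    """
    min_channel = (img / atmos).min(axis=2)
    return np.clip(1.0 - omega * min_channel, t_min, 1.0)

atmos = np.array([0.9, 0.9, 0.9])
img = np.zeros((2, 2, 3))
img[0, 0] = 0.0                                      # dark, haze-free pixel
img[1, 1] = atmos                                    # pixel saturated by haze
t = transmission_map(img, atmos)
```

Keeping the atmospheric light fixed across similar adjacent frames, as the paper does, is what prevents this per-frame estimate from flickering over a video.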

Patent
29 Jun 2018
TL;DR: Wang et al. as discussed by the authors proposed a convolutional neural network fusion-based ship video detection method, which comprises the four parts of preprocessing a video; obtaining an ROI of each frame and extracting low layer features; obtaining high layer features of image by using a modified VGG16 network; and predicting a ship saliency map of the ROI and extracting a ship target.
Abstract: The invention discloses an inter-frame difference and convolutional neural network fusion-based ship video detection method. The method comprises four parts: preprocessing a video; obtaining an ROI of each frame and extracting low layer features; obtaining high layer features of each frame of image by using a modified VGG16 network; and predicting a ship saliency map of the ROI of each frame and extracting a ship target. The relationship between continuous video frames is fully utilized; the interference of the background is reduced; a moving ship is accurately located; and a ship moving region is obtained. Compared with ship image saliency detection using only the low layer features, the method not only can be directly applied to ship video detection but also reduces incomplete ship detection, has higher adaptability to complex inland river moving ship scenes, has higher detection precision, solves the problem of inaccurate inland river ship target saliency detection, and has extremely high practical application value.

Proceedings ArticleDOI
01 Oct 2018
TL;DR: A new method to apply deep learning to bidirectional interframe prediction in video compression to create an interpolated frame by the geometric transformation matrices estimated by CNN whose inputs are temporally previous and future frames.
Abstract: In this paper, we propose a new method to apply deep learning to bidirectional inter-frame prediction in video compression. The novelty of the proposed method is to create an interpolated frame using geometric transformation matrices estimated by a CNN whose inputs are the temporally previous and future frames. The proposed method can achieve considerably higher efficiency for bidirectional prediction because the geometric transformation matrix estimated by learning can express parallel translation, zoom in/out and change of blurriness with arbitrary accuracy. Experimental results show a prediction error reduction of over 30% compared with H.265/HEVC, especially for video sequences with small motion.

Proceedings ArticleDOI
01 Nov 2018
TL;DR: This review paper provides a comprehensive review of the state-of-the-art fast ME algorithms for HEVC inter coding, for both integer-pixel and fractional-pixel ME algorithms.
Abstract: High Efficiency Video Coding (HEVC), the latest video coding standard, is becoming popular due to its excellent coding performance, in particular for high-resolution video applications. However, the significant gain in performance is achieved at the cost of substantially higher encoding complexity than its predecessor H.264/AVC, in which motion estimation (ME) is one of the most time-consuming parts that effectively removes temporal redundancy. Since the release of H.265/HEVC, plenty of fast ME algorithms have been developed to reduce motion estimation complexity for better application of HEVC in practical real-time video systems. This paper provides a comprehensive review of the state-of-the-art fast ME algorithms for HEVC inter coding, covering both integer-pixel and fractional-pixel ME. Hopefully it may provide valuable leads for the improvement, implementation and application of HEVC inter prediction, as well as for the ongoing development of the next-generation video coding standard.

Journal ArticleDOI
TL;DR: An improved moving object detection method based on the four inter-frame difference method and an optical flow algorithm is proposed, which increases the processing speed of the optical flow method and reduces the effects of environmental illumination.
Abstract: To solve the problem of multiple-target detection and tracking in complex environments, an improved moving object detection method is proposed in this paper based on the four inter-frame difference method and an optical flow algorithm. Firstly, the four inter-frame difference method is used to process the video sequences. Then objects in the video are detected accurately by the optical flow algorithm applied to the resulting video sequences. This improved method increases the processing speed of the optical flow method and reduces the effects of environmental illumination. Finally, the paper compares the proposed algorithm with the particle filter and ViBe algorithms under different scenarios with different moving targets and numbers of individuals. The improved method is shown not only to have good robustness, but also to work more quickly and accurately on target detection and tracking.
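Four inter-frame differencing combines the pairwise differences of four consecutive frames; the exact rule varies between papers and is not spelled out in the abstract, so the variant below (ANDing the unions of differences on each side of the middle frame pair) is purely illustrative:

```python
import numpy as np

def four_frame_diff(f1, f2, f3, f4, thr=25):
    """One common four-frame differencing variant.

    ANDing the unions of differences on each side of the middle pair
    suppresses the ghosting left behind by plain two-frame differencing.
    """
    d12 = np.abs(f2.astype(int) - f1.astype(int)) > thr
    d23 = np.abs(f3.astype(int) - f2.astype(int)) > thr
    d34 = np.abs(f4.astype(int) - f3.astype(int)) > thr
    return (d12 | d23) & (d23 | d34)

z = np.zeros((8, 8))
obj = np.zeros((8, 8))
obj[3:5, 3:5] = 255                                  # an object visible in f2 and f3
static = four_frame_diff(z, z, z, z)
moving = four_frame_diff(z, obj, obj, z)
```

The resulting mask would then seed the optical flow stage, which refines the detection on the flagged regions.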

Proceedings ArticleDOI
01 Sep 2018
TL;DR: Through experiments, it was demonstrated that when the object to be detected moves in the field of view of a camera, the proposed method can distinguish between the presence and absence of the object.
Abstract: In this paper, we describe a detection algorithm for detecting moving objects in a video frame. The proposed method utilizes the interframe difference and applies dynamic binarization using discriminant analysis. Through experiments, it was demonstrated that when the object to be detected moves in the field of view of a camera, the proposed method can distinguish between the presence and absence of the object. The positions of the moving object in the image are determined by observing the histograms of each frame.
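The dynamic binarization by discriminant analysis referred to here is Otsu's method; applied to an inter-frame difference image, it can be sketched as:

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Otsu's discriminant-analysis threshold: maximise the between-class
    variance of the two classes induced by each candidate threshold."""
    hist, _ = np.histogram(values, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    omega = np.cumsum(p)                             # class-0 probability
    mu = np.cumsum(p * np.arange(bins))              # class-0 first moment
    mu_t = mu[-1]
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan                       # ignore degenerate splits
    sigma_b = (mu_t * omega - mu) ** 2 / denom
    return int(np.nanargmax(sigma_b))

# A bimodal inter-frame difference image: background ~10, moving object ~200.
diff = np.array([10] * 100 + [200] * 100)
t = otsu_threshold(diff)
mask = diff > t
```

Because the threshold is recomputed per frame from the difference histogram, the binarization adapts automatically as lighting or object contrast changes.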

Proceedings ArticleDOI
01 Dec 2018
TL;DR: An inter prediction technique based on the ICP algorithm and variable-size macroblocks that can be used to significantly reduce the number of bits required to represent a point cloud video sequence by inter coding the geometry information.
Abstract: In recent years, 3D point clouds have gained more attention with the possibility of applications such as virtual reality, autonomous vehicles and 3D mapping of historical artifacts, among others. However, raw point clouds generate very large amounts of data. Thus, compression is essential to enable emerging 3D systems for communication and storage. This paper presents an inter prediction technique based on the ICP algorithm and variable-size macroblocks that can be used to significantly reduce the number of bits required to represent a point cloud video sequence by inter coding the geometry information. Since consecutive frames in dynamic point cloud sequences are not guaranteed to fill the exact same 3D volume, a spatial alignment step before motion estimation is required to increase the likelihood of good matchings and thus generate a high number of inter-coded macroblocks. A decision step is also included in order to select the most favorable coding mode: intra-coding, inter-coding or inter-coding with macroblock subdivision. The proposed technique was tested in the PCC MPEG reference software and four MPEG test sequences, obtaining average bitrate reductions of about 8% with PSNR gains up to 1 dB.

Patent
Zhang Hao, Lei Shizhe, Wang Saibo, Mou Fan, Fu Ting 
15 Jun 2018
TL;DR: In this article, an inter-frame fast mode selection method based on a decision tree is proposed, which obtains CU information at specific locations with good correlation; carries out decision tree prediction to obtain the predicted optimum coding mode, obtaining some information in real time after current CU coding; and, by utilizing the correlation of temporal and spatial information combined with relevant information of surrounding CUs, carries out fine tuning on the number and order of inter-frame coding modes.
Abstract: The invention discloses an inter-frame fast mode selection method based on a decision tree. The method is characterized by obtaining CU information at specific locations with good correlation; carrying out decision tree prediction to obtain the predicted optimum coding mode, and obtaining some information in real time after current CU coding; and, by utilizing the correlation of temporal and spatial information, combined with relevant information of surrounding CUs, carrying out fine tuning on the number and order of inter-frame coding modes. The method can predict the inter-frame mode in advance, adjust the mode order in real time during inter-frame mode prediction, and skip unnecessary mode predictions, thereby greatly reducing inter-frame mode prediction time and coding time. The method is simple and feasible, and facilitates the industrial adoption of a new video coding standard.

Proceedings ArticleDOI
01 Oct 2018
TL;DR: Two core contributions are proposed to improve mapping performance: an adaptive baseline matching cost that exploits the information in multi-baseline observations, and a propagated depth filter that integrates the sequential depth estimates of the same physical point in a robust probabilistic manner.
Abstract: State-of-the-art monocular dense mapping methods usually divide the image sequence into several separate multi-view stereo problems and thus make limited use of the information in multi-baseline observations and sequential depth estimations. In this paper, two core contributions are proposed to better exploit this information. The first is an adaptive baseline matching cost computation that uses the sequential input images to provide each pixel with wide-baseline observations. The second is a frame-to-frame propagated depth filter which integrates the sequential depth estimates of the same physical point in a robust probabilistic manner. The two contributions are integrated into a monocular dense mapping system that generates depth maps in real time for both pinhole and fisheye cameras. Our system is fully parallelized and can run at more than 25 fps on an Nvidia Jetson TX2. We compare our work with state-of-the-art methods on a public dataset. Onboard UAV mapping and handheld experiments are also used to demonstrate the performance of our method. For the benefit of the community, we make the implementation open source: https://github.com/HKUST-Aerial-Robotics/Pinhole-Fisheye-Mapping.
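The heart of a propagated depth filter is fusing a propagated per-pixel depth estimate with a new measurement. A minimal sketch, assuming both estimates are modelled as Gaussians, is the standard product-of-Gaussians update; the paper's filter additionally models outlier measurements, which this sketch omits:

```python
def fuse_depth(mu_a, var_a, mu_b, var_b):
    """Fuse two Gaussian (inverse-)depth estimates of the same pixel
    via the product-of-Gaussians update: the result is the
    variance-weighted mean, with reduced variance.
    """
    var = var_a * var_b / (var_a + var_b)
    mu = (mu_a * var_b + mu_b * var_a) / (var_a + var_b)
    return mu, var
```

Repeating this update as each new frame arrives is what lets sequential observations of the same physical point converge to a confident depth value.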

Journal ArticleDOI
TL;DR: Experimental results show that the proposed GBRL method can achieve bitrate error reduction and peak signal to noise ratio (PSNR) improvement especially for the sequences with large motion, compared to the state-of-the-art rate control methods.
Abstract: In order to meet the emerging demands of high-fidelity video services, a new video coding standard, High Efficiency Video Coding (HEVC), was developed to improve the compression performance of high-definition (HD) videos, halving the bitrate for the same perceptual video quality compared with H.264/Advanced Video Coding (AVC). Rate control still plays a significant role in HD video transmission over communication channels. However, the R-lambda-model-based HEVC rate control algorithm does not take the relationship between encoding complexity and the Human Visual System (HVS) into account; moreover, the convergence of the Least Mean Square (LMS) algorithm it relies on is slow. In this paper, an adaptive gradient-information and Broyden-Fletcher-Goldfarb-Shanno (BFGS) based R-lambda model (GBRL) is proposed for inter-frame rate control, where the Sobel-operator gradient effectively measures frame-content complexity and the BFGS algorithm converges faster than the LMS algorithm. Experimental results show that the proposed GBRL method achieves bitrate error reduction and peak signal-to-noise ratio (PSNR) improvement, especially for sequences with large motion, compared to state-of-the-art rate control methods. In addition, if an optimal initial quantization parameter (QP) prediction model based on linear regression is incorporated into the proposed GBRL method, rate control performance can be further improved.
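A Sobel-based frame-complexity measure of the kind fed into the R-lambda model can be sketched as the mean gradient magnitude of a luma frame (the exact normalization and how it enters the model are the paper's; this particular weighting is ours):

```python
import numpy as np

def sobel_complexity(frame):
    """Mean Sobel gradient magnitude of a 2D luma frame, usable as a
    frame-content complexity proxy for rate control.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T                                  # vertical Sobel kernel
    h, w = frame.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(3):                         # sliding 3x3 window, 'valid' region
        for j in range(3):
            patch = frame[i:i + h - 2, j:j + w - 2]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return float(np.mean(np.hypot(gx, gy)))
```

Frames with large motion or detailed texture score high and can be allotted more bits, which is the intuition behind tying the complexity measure to the rate-control model.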

Patent
29 Jun 2018
TL;DR: In this article, a video object segmentation method is described, which is based on transferring the segmentation result of a reference frame to at least one other frame in the video.
Abstract: The embodiment of the invention discloses a video object segmentation method and device, electronic equipment, a storage medium and a program. The method comprises the following steps: starting from a reference frame, propagating the reference frame's object segmentation result frame by frame to obtain segmentation results for the other frames; identifying frames in which objects present in the reference frame's segmentation have been lost; taking those frames as target frames and re-segmenting the lost objects to update the target frames' segmentation results; and propagating each target frame's updated segmentation result to at least one other frame in the video. With this embodiment of the invention, the accuracy of video object segmentation results is improved.
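The lost-object check at the core of the pipeline reduces to a set difference between the object ids in the reference frame's segmentation and those surviving in a propagated frame. A minimal sketch, assuming label 0 denotes background:

```python
def lost_objects(ref_labels, frame_labels, background=0):
    """Object ids present in the reference frame's segmentation but
    missing from a propagated frame's result. Frames where this is
    non-empty become target frames for re-segmentation.
    """
    return sorted(set(ref_labels) - set(frame_labels) - {background})
```

After re-segmenting the returned objects in a target frame, the updated result is propagated onward, which is how the method recovers objects dropped during inter-frame transfer.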

Patent
05 Jun 2018
TL;DR: In this paper, the authors proposed a fast selection method for inter-frame prediction, which comprises: judging whether the current coding unit is a minimal coding unit of preset depth; if not, dividing the current coding unit into four sub-coding units; computing the rate-distortion cost of the current coding unit in Split mode and the minimal rate-distortion cost among the candidate non-split modes; and determining the best prediction mode of the current coding unit by comparing the Split-mode cost with the minimal non-split cost.
Abstract: The invention provides a fast selection method and device for the inter-frame prediction mode, and electronic equipment. The method comprises: judging whether the current coding unit is a minimal coding unit of preset depth; if not, dividing the current coding unit into four sub-coding units; computing the rate-distortion cost of the current coding unit in Split mode and the minimal rate-distortion cost among the candidate non-split modes; and determining the best prediction mode of the current coding unit from the Split-mode cost and the minimal non-split cost. Because the best mode is chosen by comparing these two costs directly, the method effectively achieves both good coding quality and high coding speed: speed can be greatly improved while quality is preserved, alleviating the difficulty existing methods have in achieving both at once.
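The final decision step amounts to comparing the Split-mode rate-distortion cost against the cheapest non-split candidate. A minimal sketch with placeholder cost values (how the costs themselves are computed is the encoder's RD machinery, not shown here):

```python
def choose_prediction_mode(split_cost, unsplit_costs):
    """Pick between subdividing the CU and the cheapest non-split mode
    by comparing rate-distortion costs, as in the described selection.
    unsplit_costs maps candidate mode names to their RD costs.
    """
    best_mode = min(unsplit_costs, key=unsplit_costs.get)
    if unsplit_costs[best_mode] <= split_cost:
        return best_mode
    return "SPLIT"
```

When "SPLIT" wins, the same decision recurses into each of the four sub-coding units until the minimal CU depth is reached.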