
Showing papers on "Block-matching algorithm published in 2022"


Journal ArticleDOI
TL;DR: Experimental results show that the algorithm can effectively extract the key frames of aerobics video actions: the fidelity of the extracted key frames is higher than 0.9, and the precision and recall are both higher than 99%.

26 citations



Book ChapterDOI
TL;DR: In this article, a neural field architecture for representing and compressing videos is proposed that deliberately removes data redundancy through the use of motion information across video frames; because motion information is typically smoother and less complex than color signals, it requires far fewer parameters.
Abstract: Neural fields have emerged as a powerful paradigm for representing various signals, including videos. However, research on improving the parameter efficiency of neural fields is still in its early stages. Even though neural fields that map coordinates to colors can be used to encode video signals, this scheme does not exploit the spatial and temporal redundancy of video signals. Inspired by standard video compression algorithms, we propose a neural field architecture for representing and compressing videos that deliberately removes data redundancy through the use of motion information across video frames. Maintaining motion information, which is typically smoother and less complex than color signals, requires far fewer parameters. Furthermore, reusing color values through motion information further improves the network parameter efficiency. In addition, we suggest using more than one reference frame for video frame reconstruction, and separate networks, one for optical flows and the other for residuals. Experimental results have shown that the proposed method outperforms the baseline methods by a significant margin.

9 citations
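To picture how "reusing color values through motion information" saves parameters, here is a minimal backward-warping sketch; the nearest-neighbor sampling and function name are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def backward_warp(reference: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Reconstruct the current frame by sampling colors from a reference
    frame along a dense flow field (reference: (H, W, 3), flow: (H, W, 2)
    holding per-pixel (dy, dx) offsets into the reference). Nearest-neighbor
    sampling keeps the sketch short; a real system would sample bilinearly
    and represent the flow with a small neural field."""
    H, W = reference.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]
    src_y = np.clip(np.round(ys + flow[..., 0]), 0, H - 1).astype(int)
    src_x = np.clip(np.round(xs + flow[..., 1]), 0, W - 1).astype(int)
    return reference[src_y, src_x]

# A residual network then only has to model current - backward_warp(ref, flow),
# which is typically much sparser than the full color signal.
```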


Journal ArticleDOI
TL;DR: In this article, a method of dance motion recognition and video key frame extraction based on multi-feature fusion is designed to recognize complicated and changeable dance motions, and the video sequences are clustered by scene using a clustering algorithm.
Abstract: The purpose of video key frame extraction is to use as few video frames as possible to represent as much video content as possible, reducing redundant video frames and the amount of computation, so as to facilitate quick browsing, content summarization, indexing, and retrieval of videos. In this paper, a method of dance motion recognition and video key frame extraction based on multi-feature fusion is designed to recognize complicated and changeable dance motions. Firstly, multiple features are fused, and then the similarity is measured. Then, the video sequences are clustered by the clustering algorithm according to the scene. Finally, the key frames are extracted according to the minimum amount of motion. Quantitative analysis of the simulation results of different models shows that the proposed model achieves high performance and stability. A breakthrough in video clip retrieval technology is bound to effectively promote the inheritance and development of dance, which is of great theoretical significance and practical value.

4 citations
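A toy version of the pipeline above (cluster by scene similarity, then keep the least-motion frame per cluster) might look like this; the histogram-intersection threshold is an illustrative guess, not the paper's fused-feature similarity measure.

```python
import numpy as np

def _hist(frame, bins=32):
    h, _ = np.histogram(frame, bins=bins, range=(0, 256))
    return h / max(h.sum(), 1)

def extract_keyframes(frames, sim_thresh=0.8):
    """Toy key-frame extractor: group consecutive frames into scene clusters
    via histogram intersection, then keep the frame with the least motion
    (mean absolute difference to its predecessor) in each cluster.
    frames: list of (H, W) uint8 arrays, at least two frames long."""
    def motion(i):
        j = i - 1 if i > 0 else 1          # frame 0 borrows its successor's diff
        return np.abs(frames[i].astype(int) - frames[j].astype(int)).mean()

    boundaries = [0]
    for i in range(1, len(frames)):
        if np.minimum(_hist(frames[i]), _hist(frames[i - 1])).sum() < sim_thresh:
            boundaries.append(i)           # scene change detected
    boundaries.append(len(frames))

    return [min(range(a, b), key=motion)   # least-motion frame per cluster
            for a, b in zip(boundaries[:-1], boundaries[1:])]
```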


Journal ArticleDOI
TL;DR: This work describes a video copy detection strategy that constructs features in the spatial-temporal domain and evaluates those features in terms of both accuracy and efficiency.
Abstract: The detection of video piracy has emerged as a popular issue in the field of digital video copyright protection because a sequence of videos often comprises a huge amount of data. The major difficulty in achieving efficient and simple video copy detection is identifying compressed and discriminative video characteristics. To do this, we describe a video copy detection strategy that constructs features in the spatial-temporal domain. The first step is to separate each video sequence into individual video frames, and then extract the boundaries of each video frame using PCA-SIFT and Hessian-Laplace. Next, for each video frame, SVM and KNN are applied to features in the spatial and temporal domains to measure their performance in feature extraction. Finally, the global features used in video copy detection are computed uniquely and efficiently. Experiments were arranged on the commonly used VCDB 2014 video dataset. The proposed approach is compared against various copy detection algorithms and evaluated in terms of both accuracy and efficiency.

3 citations


Proceedings ArticleDOI
06 Feb 2022
TL;DR: This work estimates optical flows between a pair of input frames and predicts future motions using two schemes: motion doubling and motion mirroring, and develops a synthesis network to generate a future frame from the warped frames.
Abstract: We propose a novel video frame extrapolation algorithm based on future motion estimation. First, we estimate optical flows between a pair of input frames and predict future motions using two schemes: motion doubling and motion mirroring. Then, we forward warp the input frames by employing the two kinds of predicted motion fields, respectively. Finally, we develop a synthesis network to generate a future frame from the warped frames. Experimental results show that the proposed algorithm outperforms recent video frame extrapolation algorithms on various datasets.

2 citations
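Our reading of the two future-motion schemes can be sketched in a few lines; this is a simplified interpretation, not the authors' code, and it ignores occlusion effects.

```python
import numpy as np

def predict_future_flows(flow_01: np.ndarray):
    """flow_01: optical flow from frame 0 to frame 1, shape (H, W, 2).
    Two candidate motions toward the unseen frame 2:
      - doubling:  constant velocity, so the flow from frame 0 to frame 2
                   is assumed to be twice the flow from frame 0 to frame 1
      - mirroring: frame 1 is treated as a midpoint, so the flow from
                   frame 1 to frame 2 mirrors the negated flow 1 -> 0,
                   i.e. it equals flow_01 again, but anchored at frame 1
    Each predicted field is then used to forward-warp its input frame,
    and a synthesis network fuses the warped results."""
    doubled = 2.0 * flow_01      # anchor: frame 0
    mirrored = flow_01.copy()    # anchor: frame 1
    return doubled, mirrored
```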



Book ChapterDOI
28 Oct 2022
TL;DR: In this paper, the authors propose an area-efficient motion estimator using JAYA optimization-based block matching (JAYA-BM) that does not compromise search speed; the design is implemented in Verilog and synthesized for different FPGA device families in the Xilinx tool.
Abstract: We live in a modern world and prefer modern services such as high-definition television, video-on-demand, video email, and video conferencing. Modern communication systems exchange information via audio, image, video, and graphics formats over long distances. Video coding formats such as MPEG-1, MPEG-2, MPEG-4, H.261, H.263, H.264, Theora, RealVideo RV40, VP9, and AVI are used for these services. Because of temporal and spatial redundancy, codecs require a motion estimation algorithm. This paper proposes an area-efficient motion estimator using JAYA optimization-based block matching (JAYA-BM) that does not compromise search speed. The JAYA-BM algorithm searches for the motion vector in a superfast manner. The field-programmable gate array (FPGA) design of the proposed motion estimator is implemented in the Verilog language and synthesized with different FPGA device families in the Xilinx tool. The simulation results show that our motion estimator improves hardware usage and search speed with better quality.
Keywords: FPGA; Superfast motion estimator; JAYA optimization; Block matching

2 citations
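As a rough sketch of how a JAYA-style search could drive block matching: the paper targets an FPGA in Verilog, so the Python below only illustrates the search rule; the population size, iteration count, and SAD cost are illustrative assumptions.

```python
import numpy as np

def sad(cur, ref, mv, x, y, B=16):
    """SAD between the current BxB block at (y, x) and the reference block
    displaced by motion vector mv (rounded to integer pel)."""
    dy, dx = int(round(mv[0])), int(round(mv[1]))
    H, W = ref.shape
    yy = int(np.clip(y + dy, 0, H - B))
    xx = int(np.clip(x + dx, 0, W - B))
    return np.abs(cur[y:y+B, x:x+B].astype(int) -
                  ref[yy:yy+B, xx:xx+B].astype(int)).sum()

def jaya_block_match(cur, ref, x, y, pop=8, iters=10, search=7, B=16):
    """Candidate motion vectors evolve toward the best-scoring candidate
    and away from the worst, using the canonical JAYA update
        x' = x + r1 * (best - |x|) - r2 * (worst - |x|)."""
    rng = np.random.default_rng(0)
    cand = rng.uniform(-search, search, size=(pop, 2))
    for _ in range(iters):
        cost = np.array([sad(cur, ref, c, x, y, B) for c in cand])
        best, worst = cand[cost.argmin()], cand[cost.argmax()]
        r1, r2 = rng.random((pop, 2)), rng.random((pop, 2))
        cand = cand + r1 * (best - np.abs(cand)) - r2 * (worst - np.abs(cand))
        cand = np.clip(cand, -search, search)
    cost = np.array([sad(cur, ref, c, x, y, B) for c in cand])
    return np.round(cand[cost.argmin()]).astype(int)
```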


Proceedings ArticleDOI
28 May 2022
TL;DR: Wang et al. as mentioned in this paper proposed a deep video compression method for P-frames in sub-sampled color spaces, namely YUV420, which has been widely adopted in many state-of-the-art hybrid video compression standards, in an effort to achieve high compression performance.
Abstract: In this paper, we propose a deep video compression method for P-frames in sub-sampled color spaces, namely YUV420, which has been widely adopted in many state-of-the-art hybrid video compression standards, in an effort to achieve high compression performance. We adopt motion estimation and motion compression to facilitate the inter prediction of videos in the YUV420 color format, shrinking the total data volume of the motion information. Moreover, a motion compensation module for YUV420 is incorporated to enhance the quality of the compensated frame while accounting for resolution alignment in the sub-sampled color spaces. To exploit the cross-component correlation, the residual encoder-decoder is equipped with two head branches and color information fusion. Additionally, a weighted loss emphasizing the Y component is utilized to enhance compression efficiency. Experimental results show that the proposed method achieves 19.82% bit rate reductions on average compared to the deep video compression (DVC) method in terms of the combined PSNR, with predominant gains on the Y component.

1 citation
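The "weighted loss emphasizing the Y component" can be pictured with a small sketch like the following; the 6:1:1 weighting mirrors a common convention for combined YUV metrics and is an assumption here, not the paper's published weights.

```python
import numpy as np

def weighted_yuv_loss(y_rec, u_rec, v_rec, y_org, u_org, v_org, wy=0.75):
    """Y-weighted distortion for YUV420 training. Y is full resolution,
    U and V are quarter resolution (2x2 subsampled). The 6:1:1 split
    (wy=0.75, wu=wv=0.125) is an illustrative choice."""
    mse = lambda a, b: float(np.mean((np.asarray(a, np.float64) -
                                      np.asarray(b, np.float64)) ** 2))
    wu = wv = (1.0 - wy) / 2.0
    return wy * mse(y_rec, y_org) + wu * mse(u_rec, u_org) + wv * mse(v_rec, v_org)
```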


Journal ArticleDOI
TL;DR: Wang et al. as mentioned in this paper proposed a feature-space video coding framework (FVC), which performs all major operations in the feature space, including motion estimation, motion compression, motion compensation and residual compression.
Abstract: Deep video compression is attracting increasing attention from both the deep learning and video processing communities. Recent learning-based approaches follow the hybrid coding paradigm and perform pixel-space operations to reduce redundancy along both spatial and temporal dimensions, which leads to inaccurate motion estimation or less effective motion compensation. In this work, we propose a feature-space video coding framework (FVC), which performs all major operations (i.e., motion estimation, motion compression, motion compensation and residual compression) in the feature space. Specifically, a new deformable compensation module, which consists of motion estimation, motion compression and motion compensation, is proposed for more effective motion compensation. In our deformable compensation module, we first perform motion estimation in the feature space to produce the motion information (i.e., the offset maps). Then the motion information is compressed by using an auto-encoder style network. After that, we use the deformable convolution operation to generate the predicted feature for motion compensation. Finally, the residual information between the feature from the current frame and the predicted feature from the deformable compensation module is also compressed in the feature space. Motivated by conventional codecs, in which blocks of different sizes are used for motion estimation, we additionally propose two new modules called resolution-adaptive motion coding (RaMC) and resolution-adaptive residual coding (RaRC) to automatically cope with different types of motion and residual patterns at different spatial locations. Comprehensive experimental results demonstrate that our proposed framework achieves state-of-the-art performance on three benchmark datasets: HEVC, UVG and MCL-JCV.

1 citation


Proceedings ArticleDOI
01 Mar 2022
TL;DR: In this paper, an adaptive bilateral matching technique for decoder-side motion vector refinement in video coding is presented, which allows the encoder to choose not only the conventional bilateral matching mode with a symmetric motion vector difference but also asymmetric alternatives.
Abstract: This paper presents an adaptive bilateral matching technique for decoder-side motion vector refinement in video coding. It allows the encoder to choose not only the conventional bilateral matching mode with a symmetric motion vector difference but also asymmetric alternatives. To study the efficiency of the proposed technique, the method is integrated into the Versatile Video Coding Test Model 11.0. The experimental results report an overall luma Bjontegaard Delta rate of −2.78% for the random-access configurations. The compression efficiency on top of the Enhanced Compression Model, which goes beyond the VVC capability, is also reported.
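To make the "symmetric motion vector difference" concrete, here is a minimal sketch of conventional bilateral matching refinement, where one delta is added to the forward motion vector and subtracted from the backward one; the window size and SAD cost are illustrative, and the paper's asymmetric modes are deliberately omitted.

```python
import numpy as np

def bilateral_refine(ref0, ref1, mv0, mv1, x, y, B=16, radius=2):
    """Search a small window of deltas; each delta is added to the forward
    MV and subtracted from the backward MV (symmetric MVD), keeping the
    motion trajectory straight. The best delta minimizes the SAD between
    the two motion-compensated reference blocks."""
    def block(img, bx, by):
        H, W = img.shape
        by = int(np.clip(by, 0, H - B))
        bx = int(np.clip(bx, 0, W - B))
        return img[by:by+B, bx:bx+B].astype(int)

    best, best_cost = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            b0 = block(ref0, x + mv0[1] + dx, y + mv0[0] + dy)
            b1 = block(ref1, x + mv1[1] - dx, y + mv1[0] - dy)  # mirrored delta
            cost = np.abs(b0 - b1).sum()
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best  # refinement to add to mv0 and subtract from mv1
```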

Proceedings ArticleDOI
21 Aug 2022
TL;DR: Wang et al. as discussed by the authors proposed a motion approximation scheme to utilize the motion vector between the reference frames, which is able to generate additional compensated frames to further refine the missing details in the target frame.
Abstract: In recent years, various methods have been proposed to tackle the compressed video quality enhancement problem. It aims at restoring the distorted information in low-quality target frames from high-quality reference frames in the compressed video. Most methods for video quality enhancement contain two key stages, i.e., the synchronization and the fusion stages. The synchronization stage synchronizes the input frames by compensating the estimated motion vector to reference frames. The fusion stage reconstructs each frame with the compensated frames. However, the synchronization stage in previous works merely estimates the motion vector between the reference frame and the target frame. Due to the quality fluctuation of frames and region occlusion of objects, the missing detail information cannot be adequately replenished. To make full use of the temporal motion between input frames, we propose a motion approximation scheme to utilize the motion vector between the reference frames. It is able to generate additional compensated frames to further refine the missing details in the target frame. In the fusion stage, we propose a deep neural network to extract frame features with blended attention to the texture details and the quality discrepancy at different times. The experimental results show the effectiveness and robustness of our method.

Proceedings ArticleDOI
01 Dec 2022
TL;DR: In this article , the authors proposed a next-frame prediction model that predicts the next frame using the previous five frames, which achieves a peak signal-to-noise ratio (PSNR) of 29.35dB with a mean square error loss of 0.003 for the UCF101 dataset video sequence.
Abstract: High Efficiency Video Coding (HEVC) is a video compression standard that compresses video sequences with 50% less bit rate compared to ancestor H.264 standard. In HEVC, the motion compensation block utilizes the motion vectors to generate the motion compensated frame. The motion vectors are generated using the motion estimation process that improves the efficiency of HEVC at the expense of high computational complexity. The next-frame prediction technique can be used to predict the motion compensated frame. This paper proposes the next-frame prediction model that predicts the next frame using the previous five frames. The experimental results show that the proposed method achieves a Peak Signal-to-Noise Ratio(PSNR) of 29.35dB with a mean square error loss of 0.003 for the UCF101 dataset video sequence, which is better than state-of-the-art methods.
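For reference, PSNR relates to the mean square error as PSNR = 10 * log10(peak^2 / MSE); a minimal helper:

```python
import numpy as np

def psnr(pred: np.ndarray, target: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB. 'peak' is 1.0 for frames
    normalized to [0, 1] and 255.0 for 8-bit frames."""
    mse = float(np.mean((pred.astype(np.float64) -
                         target.astype(np.float64)) ** 2))
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```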


Journal ArticleDOI
TL;DR: In this article, the authors propose an all-direction search (ADS) pattern, which searches for the best block in all possible directions; for further improvement in search speed, a halfway-stop technique is applied in the search process.
Abstract: The search performance of block-matching motion estimation algorithms, i.e., search speed and motion estimation quality, mainly depends on the shape and size of the search pattern. Most motion estimation algorithms employ search patterns of square, diamond, hexagonal, cross diamond or cross hexagonal shapes. These search patterns achieve good results for video sequences with simple motion activity, but unfavourable results for video sequences with complex motion activity. After a thorough investigation of the effect of the search pattern on search performance, this paper proposes an all-direction search (ADS) pattern, which searches for the best block in all possible directions. For further improvement in search speed, a halfway-stop technique is applied in the search process. The results show that the proposed ADS algorithm outperforms other state-of-the-art and eminent motion estimation algorithms. The higher the direction complexity in the neighbouring motion vectors, the better the prediction quality of ADS, which is further confirmed by the simulation results.
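One common form of a halfway-stop (early termination) rule is to abandon a candidate as soon as its partial cost already exceeds the best cost found so far; the sketch below illustrates that idea and may differ in detail from the paper's technique.

```python
import numpy as np

def sad_with_halfway_stop(cur_blk, ref_blk, best_so_far):
    """Row-wise SAD with early termination: stop accumulating as soon as
    the partial sum exceeds the best cost found so far, since this
    candidate can no longer win."""
    total = 0
    for r in range(cur_blk.shape[0]):
        total += int(np.abs(cur_blk[r].astype(int) -
                            ref_blk[r].astype(int)).sum())
        if total >= best_so_far:
            return best_so_far      # caller keeps its current best
    return total
```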

Journal ArticleDOI
TL;DR: In this article, a motion estimation criterion is proposed that takes into account the noise generated in the frame, which otherwise results in inaccurate block selection or rejection; it is compared to four previous criteria in terms of PSNR, the average number of evaluated search points per block, and the mean MAD per picture.
Abstract: The size of video data is growing exponentially worldwide, and hence there is a need for better video coding standards. MPEG and H.26X have provided several standards for video coding; the latest and most effective are AVC, HEVC, and AV1. MPEG and H.26X employ block matching approaches for temporal coding, and the mean absolute difference (MAD) is commonly used as the block matching parameter. MAD is very simple and has very little implementation complexity, but it sometimes results in spurious selection of the matched block due to transformations between sequential images and noise introduced in the frame. To overcome this problem, many different matching criteria have been proposed, such as the vector matching criterion (VMC), the smooth constrained mean absolute error (SCMAE) based on the DCT, and the scaled value criterion (SVC) based on modified pixel values. The criteria described thus far do not take into account the noise generated in the frame, resulting in inaccurate block selection or rejection. A novel motion estimation criterion is proposed in this study and compared to the previous four criteria in terms of PSNR, the average number of evaluated search points per block, and the mean MAD per picture. The suggested matching criterion improves the number of evaluated search points by about 33.92%, the PSNR value by about 9%, and the average MAD per pixel by about 70%.
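For concreteness, the baseline MAD criterion that the proposed criterion improves upon is simply:

```python
import numpy as np

def mad(cur_blk: np.ndarray, ref_blk: np.ndarray) -> float:
    """Mean absolute difference between two equally sized blocks,
    the classic block-matching criterion:
        MAD = (1 / N) * sum(|cur - ref|)
    Its weakness, as the abstract notes, is sensitivity to frame noise."""
    return float(np.mean(np.abs(cur_blk.astype(np.float64) -
                                ref_blk.astype(np.float64))))
```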

Journal ArticleDOI
TL;DR: In this article, a modified full search block matching algorithm (MFSA) is introduced to find the appropriate motion of the moving object in the video frame; it is used to track the object (motion estimation) after the object has been identified in the segmentation phase.
Abstract: Video surveillance has risen as one of the most promising methods for people who live alone in their dwellings. A few video surveillance innovations have recently been introduced. However, due to various changes in illumination, abrupt shifts in target appearance, identical non-target artifacts in the background, and occlusions, developing a reliable video surveillance algorithm remains a difficult challenge. This work introduces a new framework for moving object detection and tracking with four major phases: "Video-to-Frame Conversion, Pre-Processing, Background Subtraction, Feature-Based Multi-object Detection, Multi-object Tracking by Filtering". Initially, in the Video-to-Frame Conversion process, the recorded input video clips are transformed into distinct frames. During pre-processing, noise is removed from the video frames using a filtering approach, thereby enhancing the quality of the images; in the proposed work, a Wiener filter is used to remove noise and other undesirable features. Then, to distinguish the frontal areas of objects, background subtraction is performed using the neutrosophic set on the noiseless (pre-processed) video frames. The objects in the background-subtracted frames are separated using an Improved Region Growing (IRG) segmentation model in the Feature-Based Multi-object Detection phase, and the objects in the frames are determined from this segmented image. The modified full search block matching algorithm (MFSA) is introduced in this research work to find the appropriate motion: it is used to track the object (motion estimation) in the video frames after the object has been identified in the segmentation phase. Promising results have been obtained by the proposed work, and the mathematical superiority of the new method over other state-of-the-art models is also demonstrated.
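As background, a plain (unmodified) full-search block matcher looks like the sketch below; the paper's modifications to it are not reproduced here.

```python
import numpy as np

def full_search(cur, ref, x, y, B=16, p=7):
    """Baseline full-search block matching: exhaustively test every integer
    displacement within +/- p pixels and return the motion vector with the
    smallest SAD for the BxB block at (y, x)."""
    H, W = ref.shape
    cur_blk = cur[y:y+B, x:x+B].astype(int)
    best_mv, best_cost = (0, 0), None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            yy, xx = y + dy, x + dx
            if yy < 0 or xx < 0 or yy + B > H or xx + B > W:
                continue  # candidate block falls outside the reference frame
            cost = np.abs(cur_blk - ref[yy:yy+B, xx:xx+B].astype(int)).sum()
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv
```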

Proceedings ArticleDOI
21 Sep 2022
TL;DR: In this paper, the authors propose an algorithm that estimates the motion vectors while taking into account the impact of the decoded residual image on the quality of the compensated image, which provides a higher PSNR for a given bit rate compared to the traditional method.
Abstract: In most video coding standards, the reduction of temporal redundancy in a video is based on the traditional block-matching algorithm (BMA). It first estimates the motion vectors that minimize the distortion between the original image and its predicted version. The difference between these two images, i.e. residual image, is then encoded and its decoded version compensates the predicted image. This paper proposes an algorithm that estimates the motion vectors while taking into account the impact of the decoded residual image on the quality of the compensated image. This algorithm provides a higher PSNR for a given bit rate compared to the traditional method. This proof-of-concept shows the importance of taking into account the compensated image in the motion vector estimation process and should help in the design of solutions based on deep neural networks.
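The core idea, scoring a motion vector by the quality of the compensated image (prediction plus decoded residual) rather than by prediction error alone, can be sketched as follows; the uniform quantizer is a stand-in for a real residual codec and is an assumption of this sketch.

```python
import numpy as np

def reconstruction_aware_cost(cur_blk, pred_blk, qstep=8):
    """Score a motion candidate by the quality of the *compensated* block,
    i.e. prediction + decoded residual, instead of by prediction error.
    Residual coding is simulated with uniform quantization (a toy codec)."""
    residual = cur_blk.astype(int) - pred_blk.astype(int)
    decoded_residual = np.round(residual / qstep) * qstep
    reconstructed = pred_blk.astype(int) + decoded_residual
    return float(np.mean((cur_blk.astype(int) - reconstructed) ** 2))
```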

Proceedings ArticleDOI
01 Dec 2022
TL;DR: In this paper, fixed-threshold target detection in sports video is examined: a region whose prediction probability exceeds the threshold is taken as a real target and one below it as a false target, but a fixed threshold struggles to distinguish high-probability false targets from low-probability true targets.
Abstract: Panoramic synthesis of sports video combined with recurrent neural networks is an important topic in the field of information analysis, and shot boundary detection is an important support for video analysis. Research on video motion information is widely used in video splicing, video segmentation, video compression, video matching, video surveillance and other fields, and in recent years video motion information analysis has become a research direction that attracts much attention in computer vision. However, in practical applications, factors such as background motion, camera shake, and irregular motion of foreground objects caused by camera motion bring great difficulties to the analysis of motion information in videos. For object detection, the last step is to predict the probability that certain regions contain specific objects. Given a fixed threshold, if the prediction probability is higher than the threshold, the region is considered a real target; if it is lower, it is judged a false target. However, the fixed threshold is problematic: when scenes differ greatly or multiple targets in the image have different sizes, a fixed threshold struggles to distinguish high-probability false targets from low-probability true targets.

Book ChapterDOI
14 Sep 2022
TL;DR: In this article, the authors propose a technique for automatic measurement of motion activity using the accumulation of quantized pixel differences among the frames of a given video segment, where the accumulated motions of a shot are represented as a two-dimensional matrix.
Abstract: Recently, motion activity, which is defined as the amount of motion in a video sequence, has been included as a descriptor in the MPEG-7 standard. The motion activity descriptors (MADs) that describe this motion activity need to enable efficient content analysis, indexing, browsing, and querying of video data. To address this issue, we first propose a novel technique for automatic measurement of motion activity using the accumulation of quantized pixel differences among the frames of a given video segment. As a result, the accumulated motions of a shot are represented as a two-dimensional matrix. We also investigate an efficient and scalable technique to compare these matrices and generate MADs that effectively represent the various motions of shots. Not only the degrees (amounts) but also the locations of motions are computed and presented accurately. Our preliminary experimental studies indicate that the proposed techniques are effective in capturing and comparing motion activities.
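A minimal sketch of accumulating quantized pixel differences into a two-dimensional activity matrix, under our reading of the descriptor (the quantization step is illustrative):

```python
import numpy as np

def motion_activity_matrix(frames, q=16):
    """Accumulate quantized pixel differences over the frames of a shot,
    yielding a 2-D matrix whose entries indicate where and how much motion
    occurred. frames: list of (H, W) grayscale arrays."""
    acc = np.zeros(frames[0].shape, dtype=np.int64)
    for a, b in zip(frames[:-1], frames[1:]):
        diff = np.abs(b.astype(int) - a.astype(int))
        acc += diff // q          # quantize each per-pixel difference
    return acc                    # encodes both amount and location of motion
```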

Posted ContentDOI
02 Dec 2022
TL;DR: In this article, a motion estimation method for real-world fisheye videos is presented by combining perspective projection with knowledge about the underlying fisheye projection, which is obtained by camera calibration since actual lenses rarely follow exact models.
Abstract: Fisheye cameras prove a convenient means in surveillance and automotive applications as they provide a very wide field of view for capturing their surroundings. Contrary to typical rectilinear imagery, however, fisheye video sequences follow a different mapping from the world coordinates to the image plane which is not considered in standard video processing techniques. In this paper, we present a motion estimation method for real-world fisheye videos by combining perspective projection with knowledge about the underlying fisheye projection. The latter is obtained by camera calibration since actual lenses rarely follow exact models. Furthermore, we introduce a re-mapping for ultra-wide angles which would otherwise lead to wrong motion compensation results for the fisheye boundary. Both concepts extend an existing hybrid motion estimation method for equisolid fisheye video sequences that decides between traditional and fisheye block matching in a block-based manner. Compared to that method, the proposed calibration and re-mapping extensions yield gains of up to 0.58 dB in luminance PSNR for real-world fisheye video sequences. Overall gains amount to up to 3.32 dB compared to traditional block matching.
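The projection models involved can be made concrete with the textbook equisolid mapping r = 2 f sin(theta/2) versus the perspective mapping r = f tan(theta); the helper below converts an equisolid radius to its perspective counterpart and also shows why ultra-wide angles are problematic, since tan(theta) diverges as theta approaches 90 degrees. The paper relies on calibrated lens data rather than this exact model.

```python
import numpy as np

def equisolid_to_perspective_radius(r_fisheye, f):
    """Map an image radius under the equisolid fisheye model,
    r = 2 f sin(theta/2), to the radius the same ray would have under
    perspective projection, r = f tan(theta). Near theta = 90 degrees the
    perspective radius diverges, which is why the paper introduces a
    dedicated re-mapping for ultra-wide angles."""
    theta = 2.0 * np.arcsin(np.clip(r_fisheye / (2.0 * f), -1.0, 1.0))
    return f * np.tan(theta)
```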

Proceedings ArticleDOI
22 Aug 2022
TL;DR: In this article, the authors present a study of motion coding schemes for learned video compression, which include signaling an incremental flow map between a coding frame and a motion-compensated frame derived from the flow map predictor.
Abstract: This paper presents a study of motion coding schemes for learned video compression. Most learned video compression systems explicitly signal optical flow maps to characterize motion between video frames for motion compensation. The flow maps, usually of the same size as the video frames, represent a considerable portion of the compressed bitstream. This work studies several schemes to make a non-linear prediction of the flow maps for efficient motion coding. These include signaling an incremental flow map between a coding frame and a motion-compensated frame derived from the flow map predictor. In forming the flow map predictor, we propose a learned motion extrapolation module and a motion forward warping scheme. They are further incorporated into two novel approaches, termed double warping and frame synthesis with motion forward warping, in creating an inter-frame predictor by combining the incremental flow and the flow map predictor. Extensive experiments are conducted to analyze the merits and faults of these variants, and demonstrate their superiority to predictive motion coding and intra motion coding.
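The idea of signaling an incremental flow on top of a flow-map predictor can be sketched as follows; here the predictor is simply an input, and uniform quantization stands in for the learned entropy bottleneck, both simplifying assumptions.

```python
import numpy as np

def code_incremental_flow(flow_true, flow_pred, qstep=0.25):
    """Toy incremental motion coding: signal only the difference between
    the true flow and a flow-map predictor (in the paper the predictor
    comes from learned motion extrapolation or forward warping). The
    quantizer mimics the lossy bottleneck of a real codec."""
    increment = flow_true - flow_pred
    increment_hat = np.round(increment / qstep) * qstep   # "compressed" delta
    flow_decoded = flow_pred + increment_hat
    bits_proxy = np.abs(increment_hat).sum()  # crude proxy: smaller deltas cost fewer bits
    return flow_decoded, bits_proxy
```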

Journal ArticleDOI
TL;DR: In this paper, the estimation of the accuracy of image reconstruction by lifting filters is described; in video codecs, the main compression of the video stream is achieved by eliminating inter-frame redundancy using motion compensation of image fragments across adjacent frames.
Abstract: The article describes the estimation of the accuracy of image reconstruction by lifting filters. In video codecs, the main compression of the video stream is achieved by eliminating inter-frame redundancy using motion compensation of image fragments across adjacent frames. However, motion compensation methods require the formation of additional data (metadata) containing information about the types of image blocks used, the coordinates of their movement, etc. At the same time, increasing the compression of the video stream without compromising its quality requires higher motion compensation accuracy, which leads to an increase in the number of blocks and, accordingly, in the volume of metadata, reducing the effectiveness of motion compensation. This is the main problem of compressing streaming video without degrading image quality. In addition, higher accuracy of positioning blocks with motion compensation dramatically reduces the speed of image processing, which is not always feasible in real-time systems.
Keywords: Images; Metadata; Streaming video; Compression

Journal ArticleDOI
01 Oct 2022-Sensors
TL;DR: The experimental results show that the proposed frame selection strategy ensures the maximum safe frame removal under the premise of continuous video content at different vehicle speeds in various halation scenes.
Abstract: In order to address the discontinuity caused by directly applying the infrared and visible image fusion anti-halation method to video, an efficient night vision anti-halation method based on video fusion is proposed. The designed frame selection based on inter-frame difference determines the optimal cosine angle threshold by analyzing the relationship between the cosine angle threshold, the nonlinear correlation information entropy, and the de-frame rate. The proposed time-mark-based adaptive motion compensation constructs the same number of interpolated frames as removed redundant frames by taking the retained frame number as a time stamp. Taking the motion vector of two adjacent retained frames as the benchmark, adaptive weights are constructed according to the inter-frame differences between the interpolated frame and the last retained frame, and the motion vector of the interpolated frame is then estimated. The experimental results show that the proposed frame selection strategy ensures the maximum safe frame removal under the premise of continuous video content at different vehicle speeds in various halation scenes. The frame numbers and playing duration of the fused video are consistent with those of the original video, and the content of each interpolated frame is highly synchronized with that of the corresponding original frames. The average FPS of video fusion in this work is about six times that of frame-by-frame fusion, which effectively improves the anti-halation processing efficiency of video fusion.

Proceedings ArticleDOI
16 Feb 2022
TL;DR: In this article, motion estimation and its typical algorithms in the H.264 video coding standard are studied, focusing on UMHexagonS (hereinafter referred to as UMH); the initial prediction order of the motion vector and the 5×5 spiral full search template are optimized, greatly improving the search efficiency.
Abstract: H.264 is an advanced video coding standard with wide application prospects. It has excellent compression performance and excellent image coding quality, and it is widely used in many common fields such as video transmission, video storage and video editing. H.264 involves a large amount of complex computation, of which motion estimation takes up the largest share, so an efficient motion estimation algorithm is of great significance for improving the efficiency of video coding. In this paper, motion estimation and its typical algorithms in the H.264 video coding standard are studied, focusing on UMHexagonS (hereinafter referred to as UMH); the initial prediction order of the motion vector and the 5×5 spiral full search template are optimized, greatly improving the search efficiency.

Posted ContentDOI
12 Jan 2022
TL;DR: In this paper, the authors propose a neural field architecture for representing and compressing videos that deliberately removes data redundancy through the use of motion information across video frames; because motion information is typically smoother and less complex than color signals, it requires far fewer parameters.
Abstract: Neural fields have emerged as a powerful paradigm for representing various signals, including videos. However, research on improving the parameter efficiency of neural fields is still in its early stages. Even though neural fields that map coordinates to colors can be used to encode video signals, this scheme does not exploit the spatial and temporal redundancy of video signals. Inspired by standard video compression algorithms, we propose a neural field architecture for representing and compressing videos that deliberately removes data redundancy through the use of motion information across video frames. Maintaining motion information, which is typically smoother and less complex than color signals, requires far fewer parameters. Furthermore, reusing color values through motion information further improves the network parameter efficiency. In addition, we suggest using more than one reference frame for video frame reconstruction, and separate networks, one for optical flows and the other for residuals. Experimental results have shown that the proposed method outperforms the baseline methods by a significant margin. The code is available at https://github.com/daniel03c1/eff_video_representation

Proceedings ArticleDOI
13 Dec 2022
TL;DR: In this paper, an enhanced motion list reordering (EMLR) approach is proposed, in which the refined motion information is used in the motion list reordering, together with a dedicated motion refinement that uses a simplified version of the motion refinement process.
Abstract: In video coding, motion information consisting of motion vectors and a reference index is typically involved in motion compensation. A motion list is widely used to efficiently compress the motion information, whereby a motion index indicating the chosen motion information is signaled, and the compression efficiency can be improved by template-matching-based motion list reordering. Besides, the motion information is further refined before being used in motion compensation by motion refinement processes such as decoder-side motion vector refinement and template matching. However, the original motion information of the motion list, rather than the refined motion information, is used in the motion list reordering, which limits the coding performance. Therefore, an enhanced motion list reordering (EMLR) approach is proposed in this paper, in which the refined motion information is used in the motion list reordering. To derive the refined motion information, a dedicated motion refinement with a simplified version of the motion refinement process is proposed. Furthermore, a simplified version of EMLR with two fast algorithms (EMLR-S) is proposed. Experimental results demonstrate that EMLR achieves 0.19% BD-rate saving on average, and EMLR-S achieves 0.1% BD-rate saving with negligible coding complexity change compared to ECM-4.0 under the random access configuration.
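As a rough illustration of template-matching-based motion list reordering, the sketch below ranks candidate motion vectors by how well each predicts an L-shaped template of previously reconstructed pixels, so that likelier candidates receive shorter indices; the template thickness and SAD cost are assumptions, and the paper's refinement-aware variant would additionally refine each candidate before ranking.

```python
import numpy as np

def reorder_motion_list(candidates, cur, ref, x, y, B=16, T=4):
    """Rank candidate MVs (list of (dy, dx)) by the SAD between the current
    block's L-shaped template of already reconstructed pixels and the
    corresponding template in the reference frame. Assumes the block is not
    on the frame border (x, y >= T and candidates stay in bounds)."""
    def template(img, bx, by):
        top = img[by - T:by, bx - T:bx + B]   # strip above, incl. corner
        left = img[by:by + B, bx - T:bx]      # strip to the left
        return np.concatenate([top.ravel(), left.ravel()]).astype(int)

    cur_tpl = template(cur, x, y)
    costs = [np.abs(cur_tpl - template(ref, x + dx, y + dy)).sum()
             for dy, dx in candidates]
    order = np.argsort(costs)
    return [candidates[i] for i in order]     # cheapest template cost first
```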

Journal ArticleDOI
TL;DR: A fast compensation algorithm for the interframe motion of multimedia video based on the Manhattan distance achieves a high peak signal-to-noise ratio and a high bit rate, effectively improving the visual quality of the video.
Abstract: To improve the video quality, aiming at the problems of low peak signal-to-noise ratio, poor visual effect, and low bit rate of traditional methods, this paper proposes a fast compensation algorithm for the interframe motion of multimedia video based on Manhattan distance. The absolute median difference based on wavelet transform is used to estimate the multimedia video noise. According to the Gaussian noise variance estimation result, the active noise mixing forensics algorithm is used to preprocess the original video for noise mixing, and the fuzzy C-means clustering method is used to smoothly process the noisy multimedia video and obtain significant information from the multimedia video. The block-based motion idea is to divide each frame of the video sequence into nonoverlapping macroblocks, find the best position of the block corresponding to the current frame in the reference frame according to the specific search range and specific rules, and obtain the relative Manhattan distance between the current frame and the background of multimedia video using the Manhattan distance calculation formula. Then, the motion between the multimedia video frames is compensated. The experimental results show that the algorithm in this paper has a high peak signal-to-noise ratio and a high bit rate, which effectively improves the visual effect of the video.
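The connection to block matching is direct: the sum of absolute differences used to match blocks is exactly the Manhattan (L1) distance between the two blocks viewed as vectors:

```python
import numpy as np

def manhattan_block_distance(cur_blk, ref_blk):
    """L1 (Manhattan) distance between two blocks flattened to vectors,
    identical to the SAD block-matching criterion:
        d(a, b) = sum_i |a_i - b_i|
    Minimizing it over candidate positions yields the motion vector."""
    return int(np.abs(cur_blk.astype(int) - ref_blk.astype(int)).sum())
```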

Journal ArticleDOI
TL;DR: The authors propose using data formed by the compression codecs used on the most common video hosting platforms (YouTube, Vimeo, etc.) to build an automated system for comparing video sequences.
Abstract: The use of motion vectors for identifying video sequences has been well studied (in the framework of research on CBCD, Content-Based Copy Detection: detecting copies of videos based on content analysis). This makes it possible to check the similarity of two video fragments or to search for a fragment in a larger video sequence. Existing and well-known methods for forming identification datasets typically require complete decoding of the video stream. The authors suggest using the motion vectors of a compressed video stream, which reduces the computational costs of identifying video sequences and uses simplified algorithms to generate identification data. Unlike previously proposed methods, which rely on either modified or obsolete video codecs, the authors propose using data formed by the compression codecs used on the most common video hosting platforms (YouTube, Vimeo, etc.). The construction of an automated system for comparing video sequences, along with its capabilities and limitations, will be studied in future work.

Posted ContentDOI
28 Mar 2022
TL;DR: Wang et al. as mentioned in this paper proposed an adaptive-parameter-based visual background extractor (APViBe) algorithm for background modelling, and all motion sequences are identified by using an improved fast compressive tracking (FCT) algorithm based on an adaptive learning rate and measurement matrix (ALMFCT).
Abstract: Abstract We propose a novel algorithm for detecting the duplication of moving objects in video and locating forged motion sequences. First, the algorithm constructs an energy factor ( EF ) curve to identify the suspect frames of the video. Second, an adaptive-parameter-based Visual Background Extractor (ViBe) algorithm (APViBe) is employed for background modelling. Moreover, all motion sequences are identified by using an improved fast compressive tracking (FCT) algorithm based on an adaptive learning rate and measurement matrix (ALMFCT). Third, a similarity-analysis-based scheme (SAS) is designed to search for pairs of suspect motion sequences. Finally, the flip-invariant scale-invariant feature transform (FISIFT) algorithm is used to match the feature points of moving objects in the pairs of suspect motion sequences, based on which the forged motion sequences in the video are confirmed. Experimental results show that the proposed approach outperforms previous algorithms in computational efficiency, accuracy and robustness.