Journal ArticleDOI

Efficient Parallel Framework for HEVC Motion Estimation on Many-Core Processors

TL;DR: This paper analyzes the ME structure in HEVC and proposes a parallel framework that decouples ME for different partitions on many-core processors, achieving more than 30× and 40× speedups for 1920 × 1080 and 2560 × 1600 video sequences, respectively.
Abstract: High Efficiency Video Coding (HEVC) provides superior coding efficiency to previous video coding standards at the cost of increased encoding complexity. The complexity increase of the motion estimation (ME) procedure is rather significant, especially considering the complicated partitioning structure of HEVC. Fully exploiting the coding efficiency brought by HEVC therefore requires a huge amount of computation. In this paper, we analyze the ME structure in HEVC and propose a parallel framework to decouple ME for different partitions on many-core processors. Based on the local parallel method (LPM), we first use a directed acyclic graph (DAG)-based order to parallelize coding tree units (CTUs) and adopt an improved LPM (ILPM) within each CTU (DAGILPM), which exploits CTU-level and prediction unit (PU)-level parallelism. Then, we find that there exist completely independent PUs (CIPUs) and partially independent PUs (PIPUs). When the degree of parallelism (DP) is smaller than the maximum DP of DAGILPM, we process the CIPUs and PIPUs, which further increases the DP. The data dependencies and coding efficiency stay the same as with LPM. Experiments show that on a 64-core system, compared with serial execution, our proposed scheme achieves speedups of more than 30× and 40× for 1920 × 1080 and 2560 × 1600 video sequences, respectively.
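As a rough illustration of the CTU-level scheduling described above, the sketch below processes a grid of CTUs in a DAG-based order, assuming each CTU depends on its left and upper-right neighbors (a common wavefront-style dependency pattern). The grid size, the dependency rule, and the encode_ctu stub are illustrative assumptions rather than the paper's actual implementation, and for simplicity the DAG is consumed in synchronized waves of ready CTUs.

    # Minimal sketch: DAG-ordered parallel CTU processing.
    # Assumptions: an 8 x 15 CTU grid; each CTU depends on its left and
    # upper-right neighbors; encode_ctu is a stand-in for per-CTU motion estimation.
    from concurrent.futures import ThreadPoolExecutor

    ROWS, COLS = 8, 15

    def deps(r, c):
        """CTUs that must be encoded before CTU (r, c)."""
        d = []
        if c > 0:
            d.append((r, c - 1))          # left neighbor
        if r > 0 and c + 1 < COLS:
            d.append((r - 1, c + 1))      # upper-right neighbor
        return d

    def encode_ctu(node):
        pass                              # placeholder for ME over all PU partitions of this CTU

    def run_dag(workers=8):
        remaining = {(r, c): len(deps(r, c)) for r in range(ROWS) for c in range(COLS)}
        children = {n: [] for n in remaining}
        for node in remaining:
            for parent in deps(*node):
                children[parent].append(node)
        ready = [n for n, k in remaining.items() if k == 0]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            while ready:
                list(pool.map(encode_ctu, ready))   # encode all currently ready CTUs in parallel
                nxt = []
                for node in ready:                  # release successors whose dependencies are met
                    for child in children[node]:
                        remaining[child] -= 1
                        if remaining[child] == 0:
                            nxt.append(child)
                ready = nxt

    run_dag()

In the paper's scheme, the PU-level parallelism inside each CTU (ILPM) and the extra CIPUs/PIPUs would add further work items per node; here each CTU is treated as a single task for brevity.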
Citations
Journal ArticleDOI
TL;DR: A deep HSI sharpening method is presented for the fusion of an LR-HSI with an HR-MSI, which directly learns the image priors via deep convolutional neural network-based residual learning.
Abstract: Hyperspectral image (HSI) sharpening, which aims at fusing an observable low spatial resolution (LR) HSI (LR-HSI) with a high spatial resolution (HR) multispectral image (HR-MSI) of the same scene to acquire an HR-HSI, has recently attracted much attention. Most recent HSI sharpening approaches are based on image prior modeling, which is usually sensitive to parameter selection and time-consuming. This paper presents a deep HSI sharpening method (named DHSIS) for the fusion of an LR-HSI with an HR-MSI, which directly learns the image priors via deep convolutional neural network-based residual learning. The DHSIS method incorporates the learned deep priors into the LR-HSI and HR-MSI fusion framework. Specifically, we first initialize the HR-HSI from the fusion framework by solving a Sylvester equation. Then, we map the initialized HR-HSI to the reference HR-HSI via deep residual learning to learn the image priors. Finally, the learned image priors are returned to the fusion framework to reconstruct the final HR-HSI. Experimental results demonstrate the superiority of the DHSIS approach over existing state-of-the-art HSI sharpening approaches in terms of reconstruction accuracy and running time.
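The initialization step mentioned above, solving a Sylvester equation within the fusion framework, can be sketched as follows. The dimensions, the spectral response R, the spatial degradation B, the prior weight mu, and the zero prior estimate are illustrative assumptions rather than the actual DHSIS operators, and the deep-prior refinement stage is not reproduced.

    # Minimal sketch of the Sylvester-equation initialization of an
    # LR-HSI + HR-MSI fusion (all operators and sizes are assumed toy values).
    import numpy as np
    from scipy.linalg import solve_sylvester

    L, l = 31, 3        # hyperspectral / multispectral band counts (assumed)
    N, n = 256, 64      # full-resolution / low-resolution pixel counts (assumed)
    rng = np.random.default_rng(0)

    X_true = rng.random((L, N))          # ground-truth HR-HSI (bands x pixels)
    R = rng.random((l, L))               # assumed spectral response of the MSI sensor
    B = rng.random((N, n))               # assumed spatial blurring + downsampling operator
    Y_m = R @ X_true                     # observed HR-MSI
    Y_h = X_true @ B                     # observed LR-HSI
    X_prior = np.zeros((L, N))           # stand-in for the CNN-learned prior estimate
    mu = 1e-2                            # weight of the prior term

    # Normal equations of
    #   min_X ||Y_m - R X||_F^2 + ||Y_h - X B||_F^2 + mu ||X - X_prior||_F^2
    # give the Sylvester equation (R^T R + mu I) X + X (B B^T) = R^T Y_m + Y_h B^T + mu X_prior.
    A = R.T @ R + mu * np.eye(L)
    C = B @ B.T
    Q = R.T @ Y_m + Y_h @ B.T + mu * X_prior
    X0 = solve_sylvester(A, C, Q)        # initialized HR-HSI, later refined by the learned deep prior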

302 citations


Cites background from "Efficient Parallel Framework for HE..."

  • ...selections, which may need parallel computing [15], [16] to speed up....


Journal ArticleDOI
TL;DR: A snapshot of the fast-growing deep learning field for microscopy image analysis, which explains the architectures and the principles of convolutional neural networks, fully convolutional networks, recurrent neural networks, stacked autoencoders, and deep belief networks and their formulations or modelings for specific tasks on various microscopy images.
Abstract: Computerized microscopy image analysis plays an important role in computer aided diagnosis and prognosis. Machine learning techniques have powered many aspects of medical investigation and clinical practice. Recently, deep learning is emerging as a leading machine learning tool in computer vision and has attracted considerable attention in biomedical image analysis. In this paper, we provide a snapshot of this fast-growing field, specifically for microscopy image analysis. We briefly introduce the popular deep neural networks and summarize current deep learning achievements in various tasks, such as detection, segmentation, and classification in microscopy image analysis. In particular, we explain the architectures and the principles of convolutional neural networks, fully convolutional networks, recurrent neural networks, stacked autoencoders, and deep belief networks, and interpret their formulations or modelings for specific tasks on various microscopy images. In addition, we discuss the open challenges and the potential trends of future research in microscopy image analysis using deep learning.

235 citations

Journal ArticleDOI
TL;DR: A novel model, the stacked autoencoder Levenberg-Marquardt model, a deep neural network architecture aimed at improving forecasting accuracy, is presented, together with an optimized structure of the traffic flow forecasting model obtained with a deep learning approach.
Abstract: Forecasting accuracy is an important issue for successful intelligent traffic management, especially in the domain of traffic efficiency and congestion reduction. The dawning of the big data era brings opportunities to greatly improve prediction accuracy. In this paper, we propose a novel model, stacked autoencoder Levenberg-Marquardt model, which is a type of deep architecture of neural network approach aiming to improve forecasting accuracy. The proposed model is designed using the Taguchi method to develop an optimized structure and to learn traffic flow features through layer-by-layer feature granulation with a greedy layerwise unsupervised learning algorithm. It is applied to real-world data collected from the M6 freeway in the U.K. and is compared with three existing traffic predictors. To the best of our knowledge, this is the first time that an optimized structure of the traffic flow forecasting model with a deep learning approach is presented. The evaluation results demonstrate that the proposed model with an optimized structure has superior performance in traffic flow forecasting.
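The greedy layer-wise unsupervised pretraining described above can be sketched as follows; the synthetic traffic-flow series, the layer widths, and the use of Adam (in place of the paper's Taguchi-optimized structure and Levenberg-Marquardt training) are assumptions for illustration only.

    # Minimal sketch: greedy layer-wise pretraining of a stacked autoencoder
    # on a toy traffic-flow series, then a supervised forecasting head.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    T, window = 2000, 12
    flow = torch.sin(torch.arange(T) * 0.1) + 0.1 * torch.randn(T)     # toy flow series
    X = torch.stack([flow[i:i + window] for i in range(T - window)])   # past window
    y = flow[window:].unsqueeze(1)                                     # next-step target

    sizes = [window, 32, 16]            # assumed hidden-layer widths
    encoders, inputs = [], X
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        enc, dec = nn.Linear(d_in, d_out), nn.Linear(d_out, d_in)
        opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-2)
        for _ in range(200):            # train this layer to reconstruct its own input
            recon = dec(torch.sigmoid(enc(inputs)))
            loss = nn.functional.mse_loss(recon, inputs)
            opt.zero_grad(); loss.backward(); opt.step()
        encoders.append(enc)
        inputs = torch.sigmoid(enc(inputs)).detach()   # encoded features feed the next layer

    # Supervised forecasting head on top of the pretrained stack, fine-tuned end to end
    model = nn.Sequential(*[nn.Sequential(e, nn.Sigmoid()) for e in encoders],
                          nn.Linear(sizes[-1], 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(300):
        loss = nn.functional.mse_loss(model(X), y)
        opt.zero_grad(); loss.backward(); opt.step()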

216 citations


Cites background from "Efficient Parallel Framework for HE..."

  • ...In terms of efficiency concerns [39], [40], the computational time is also an important factor in traffic flow forecasting....


Journal ArticleDOI
TL;DR: Inspired by kernel learning, a kernel version of ML-ELM is developed, namely, multilayer kernel ELM (ML-KELM), whose contributions are the elimination of manual tuning of the number of hidden nodes in every layer and the removal of the random projection mechanism, so as to obtain optimal model generalization.
Abstract: Recently, multilayer extreme learning machine (ML-ELM) was applied to stacked autoencoder (SAE) for representation learning. In contrast to traditional SAE, the training time of ML-ELM is significantly reduced from hours to seconds with high accuracy. However, ML-ELM suffers from several drawbacks: 1) manual tuning of the number of hidden nodes in every layer is an uncertain factor for training time and generalization; 2) random projection of input weights and bias in every layer of ML-ELM leads to suboptimal model generalization; 3) the pseudoinverse solution for output weights in every layer incurs relatively large reconstruction error; and 4) the storage and execution time for transformation matrices in representation learning are proportional to the number of hidden layers. Inspired by kernel learning, a kernel version of ML-ELM is developed, namely, multilayer kernel ELM (ML-KELM), whose contributions are: 1) elimination of manual tuning of the number of hidden nodes in every layer; 2) no random projection mechanism, so that optimal model generalization is obtained; 3) an exact inverse solution for the output weights is guaranteed under an invertible kernel matrix, resulting in a smaller reconstruction error; and 4) all transformation matrices are unified into two matrices only, so that storage can be reduced and model execution time may be shortened. Benchmark data sets of different sizes have been employed for the evaluation of ML-KELM. Experimental results have verified the contributions of the proposed ML-KELM. The improvement in accuracy over benchmark data sets is up to 7%.
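The kernel replacement for random projections can be sketched in a few lines: the output weights are obtained from a regularized kernel system instead of a random hidden layer. The RBF kernel, the regularization constant C, and the toy data are assumptions; the multilayer stacking that defines ML-KELM is not reproduced here.

    # Minimal sketch: kernel ELM-style output weights, beta = (K + I/C)^{-1} T,
    # replacing random hidden-layer projections.
    import numpy as np

    def rbf_kernel(A, B, gamma=0.5):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
        return np.exp(-gamma * d2)

    rng = np.random.default_rng(0)
    X_train = rng.random((200, 8))                          # toy training inputs
    T_train = np.sin(X_train.sum(axis=1, keepdims=True))    # toy regression targets
    C = 100.0                                               # assumed regularization constant

    K = rbf_kernel(X_train, X_train)
    beta = np.linalg.solve(K + np.eye(len(K)) / C, T_train) # exact solution for output weights

    X_test = rng.random((20, 8))
    pred = rbf_kernel(X_test, X_train) @ beta               # kernel-based prediction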

156 citations


Cites background from "Efficient Parallel Framework for HE..."

  • ...For execution, the final output X_final is calculated directly from the unified transformation matrices, which alleviates both issues of memory storage and execution time for time-critical applications [18]–[20]....


Journal ArticleDOI
TL;DR: This paper proposes a switching-type controller, in which a nonlinear controller with two parameters to be tuned is first designed by adding a power integrator, and then a switching mechanism is proposed to tune the parameters online to finite-time stabilize the system.
Abstract: In this paper, global adaptive finite-time stabilization is investigated by logic-based switching control for a class of uncertain nonlinear systems with the powers of positive odd rational numbers. Parametric uncertainties entering the state equations nonlinearly can be fast time-varying or jumping at unknown time instants, and the control coefficient appearing in the control channel can be unknown. The bounds of the parametric uncertainties and the unknown control coefficient are not required to be known a priori. Our proposed controller is a switching-type one, in which a nonlinear controller with two parameters to be tuned is first designed by adding a power integrator, and then a switching mechanism is proposed to tune the parameters online to finite-time stabilize the system. An example is provided to demonstrate the effectiveness of the proposed result.

121 citations


Cites background from "Efficient Parallel Framework for HE..."

  • ...It was demonstrated that finite-time stable systems usually have some desired features such as faster convergence rates, higher accuracies, and better disturbance rejection properties [4], [5]....


References
Book
01 Jan 1974
TL;DR: This text introduces the basic data structures and programming techniques often used in efficient algorithms, and covers use of lists, push-down stacks, queues, trees, and graphs.
Abstract: From the Publisher: With this text, you gain an understanding of the fundamental concepts of algorithms, the very heart of computer science. It introduces the basic data structures and programming techniques often used in efficient algorithms. Covers use of lists, push-down stacks, queues, trees, and graphs. Later chapters go into sorting, searching and graph algorithms, string-matching algorithms, and the Schönhage-Strassen integer-multiplication algorithm. Provides numerous graded exercises at the end of each chapter.

9,262 citations


"Efficient Parallel Framework for HE..." refers background or methods in this paper

  • ...In this section, on the premise of keeping data dependencies and coding efficiency the same as LPM, we will first generate a DAG [30] to capture the dependency relationships among neighboring CTUs....


  • ...We generate a DAG [30], [31] to capture the depen-...


  • ...CTUs and the precedence constraints among the CTUs [30]....


Journal ArticleDOI
TL;DR: The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards, in the range of 50% bit-rate reduction for equal perceptual video quality.
Abstract: High Efficiency Video Coding (HEVC) is currently being prepared as the newest video coding standard of the ITU-T Video Coding Experts Group and the ISO/IEC Moving Picture Experts Group. The main goal of the HEVC standardization effort is to enable significantly improved compression performance relative to existing standards, in the range of 50% bit-rate reduction for equal perceptual video quality. This paper provides an overview of the technical features and characteristics of the HEVC standard.

7,383 citations


"Efficient Parallel Framework for HE..." refers background in this paper

  • ...High Efficiency Video Coding (HEVC) is the state-of-the-art video coding standard [2]–[5]....


  • ...The current PU may have data dependencies on its neighboring left, left-down, upper, upper-left, and upper-right PUs, whose motion data may be available for the current PU [2]....


  • ...The size of CTU is usually set as 64 × 64 [2]....


01 Jan 2001

4,379 citations


"Efficient Parallel Framework for HE..." refers methods in this paper

  • ...The coding efficiency of all the methods is compared in terms of combined Bjøntegaard delta bitrates (BD-rate) [38], [39], which is calculated by the average PSNR...

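For context, the BD-rate quoted above is conventionally computed by fitting a cubic polynomial to log-bitrate as a function of PSNR for each encoder, integrating the gap over the overlapping PSNR range, and converting the average difference to a percentage. A minimal sketch of that standard calculation follows; the four RD points per curve are made-up sample values.

    # Minimal sketch of the Bjontegaard delta bit-rate (BD-rate) calculation.
    import numpy as np

    def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
        lr_ref, lr_test = np.log10(rate_ref), np.log10(rate_test)
        p_ref = np.polyfit(psnr_ref, lr_ref, 3)     # cubic fit: log-rate vs. PSNR
        p_test = np.polyfit(psnr_test, lr_test, 3)
        lo = max(min(psnr_ref), min(psnr_test))      # overlapping PSNR range
        hi = min(max(psnr_ref), max(psnr_test))
        int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
        int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
        avg_diff = (int_test - int_ref) / (hi - lo)
        return (10 ** avg_diff - 1) * 100            # average bit-rate change in percent

    # Hypothetical RD points (kbps, dB) for two encoder configurations
    rate_a = [1000, 2000, 4000, 8000]; psnr_a = [34.0, 36.5, 38.8, 40.6]
    rate_b = [ 950, 1900, 3900, 7900]; psnr_b = [34.1, 36.6, 38.9, 40.7]
    print(f"BD-rate: {bd_rate(rate_a, psnr_a, rate_b, psnr_b):.2f}%")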

Journal ArticleDOI
TL;DR: Experimental data are presented that clearly demonstrate the scope of application of peak signal-to-noise ratio (PSNR) as a video quality metric, and it is shown that as long as the video content and the codec type are not changed, PSNR is a valid quality measure.
Abstract: Experimental data are presented that clearly demonstrate the scope of application of peak signal-to-noise ratio (PSNR) as a video quality metric. It is shown that as long as the video content and the codec type are not changed, PSNR is a valid quality measure. However, when the content is changed, correlation between subjective quality and PSNR is highly reduced. Hence PSNR cannot be a reliable method for assessing the video quality across different video contents.
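As a reminder of the metric being discussed, PSNR is computed from the mean squared error relative to the peak signal value; a minimal sketch for 8-bit frames follows (the random frames are placeholders for real reference and decoded frames).

    # Minimal sketch: PSNR between two 8-bit frames, 10 * log10(MAX^2 / MSE).
    import numpy as np

    def psnr(reference, distorted, peak=255.0):
        mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
        return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

    rng = np.random.default_rng(0)
    ref = rng.integers(0, 256, (1080, 1920), dtype=np.uint8)                  # placeholder reference frame
    dist = np.clip(ref + rng.integers(-3, 4, ref.shape), 0, 255).astype(np.uint8)  # lightly distorted copy
    print(f"PSNR: {psnr(ref, dist):.2f} dB")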

1,899 citations