Proceedings ArticleDOI

Hierarchical Parallelization of an H.264/AVC Video Encoder

TLDR
A hierarchical parallelization of H.264 encoders, well suited to low-cost clusters, is proposed. It offers a compromise between speed-up and latency, so a broader spectrum of applications can be covered.
Abstract
The latest generation of video encoding standards increases computing demands in order to reach the limits of compression efficiency. This is particularly true of the H.264/AVC specification, which is gaining interest in industry. We are interested in applying parallel processing to H.264 encoders in order to meet the computation requirements imposed by demanding applications such as video on demand, videoconferencing and live broadcast. For a given delivered video quality and bit rate, the main complexity parameters are image resolution, frame rate and latency. These parameters can still be pushed to the point where special-purpose hardware solutions are not available. Parallel processing based on off-the-shelf components is a more flexible, general-purpose alternative. In this work we propose a hierarchical parallelization of H.264 encoders that is very well suited to low-cost clusters. Our proposal uses MPI message-passing parallelization at two levels: GOP and frame. The GOP level encodes several groups of consecutive frames simultaneously, and the frame level encodes several slices of one frame in parallel. In previous work we found that GOP parallelism alone gives good speed-up but imposes very high latency; frame parallelism, on the other hand, is less efficient but has low latency. By combining both approaches we obtain a compromise between speed-up and latency, so a broader spectrum of applications can be covered.
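The two levels described in the abstract can be illustrated with a small sketch of how a flat set of process ranks might be split into GOP groups and, within each group, into slice workers. This is only an illustration under stated assumptions: the abstract does not say how processes are grouped, how GOPs are distributed, or how slices are sized, so the round-robin GOP distribution, the row-based slices, and every name below (`assign`, `gops_for_group`, `slice_rows`) are hypothetical. The sketch uses plain Python rather than MPI; in an MPI program the `gop_group` and `slice_index` values could plausibly serve as the color and key arguments of `MPI_Comm_split`.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    rank: int         # global process rank
    gop_group: int    # GOP-level group this rank belongs to (hypothetical)
    slice_index: int  # slice of each frame this rank encodes (hypothetical)


def assign(rank: int, slices_per_frame: int) -> Assignment:
    """Map a flat rank onto the two levels: consecutive ranks share a GOP group."""
    return Assignment(rank, rank // slices_per_frame, rank % slices_per_frame)


def gops_for_group(gop_group: int, n_groups: int, n_gops: int) -> list[int]:
    """Assumed round-robin distribution of GOP indices over the groups."""
    return list(range(gop_group, n_gops, n_groups))


def slice_rows(slice_index: int, slices_per_frame: int, mb_rows: int) -> range:
    """Contiguous macroblock rows for one slice; leftover rows go to the first slices."""
    base, extra = divmod(mb_rows, slices_per_frame)
    start = slice_index * base + min(slice_index, extra)
    return range(start, start + base + (1 if slice_index < extra else 0))


if __name__ == "__main__":
    # 8 processes, 4 slices per frame -> 2 GOP groups of 4 slice workers each.
    n_procs, slices_per_frame = 8, 4
    n_groups = n_procs // slices_per_frame
    for rank in range(n_procs):
        a = assign(rank, slices_per_frame)
        gops = gops_for_group(a.gop_group, n_groups, n_gops=6)
        rows = slice_rows(a.slice_index, slices_per_frame, mb_rows=45)
        print(a, "GOPs:", gops, "MB rows:", (rows.start, rows.stop))
```

With all processes in one group the scheme reduces to pure frame (slice) parallelism, and with one process per group it reduces to pure GOP parallelism, which is the trade-off between latency and speed-up that the abstract describes.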


Citations
Journal ArticleDOI

Parallel Scalability of Video Decoders

TL;DR: This work investigates the parallelism available in video decoders, an important application domain now and in the future, and proposes a new parallelization strategy, called Dynamic 3D-Wave, which allows certain MBs of consecutive frames to be decoded in parallel.
Patent

Method and system for parallel encoding of a video

TL;DR: In this article, a method and system for parallel encoding of frames in a video are described, exploiting parallel processing at both frame and slice levels, and a significant speedup in comparison to the sequential encoding approach is achieved while maintaining high visual quality for output video.
Proceedings ArticleDOI

Scalability of Macroblock-level Parallelism for H.264 Decoding

TL;DR: This study presents a quantitative analysis of the main bottlenecks of the application and estimates the acceleration levels that are required to make the MB-level parallel decoder scalable.
Proceedings ArticleDOI

Designing multi-leader-based Allgather algorithms for multi-core clusters

TL;DR: This work proposes a novel and scalable multi-leader-based hierarchical Allgather design that allows better cache sharing for Non-Uniform Memory Access (NUMA) machines and makes better use of the network speed available with high performance interconnects such as InfiniBand.
Book ChapterDOI

Parallel H.264 Decoding on an Embedded Multicore Processor

TL;DR: This work presents an implementation of the 3D-Wave parallelization strategy on a multicore architecture composed of NXP TriMedia TM3270 embedded processors and shows that the parallel H.264 implementation scales very well, achieving a speedup of more than 54 on a 64-core processor.