Proceedings Article
Hierarchical Parallelization of an H.264/AVC Video Encoder
A. Rodriguez, A. Gonzalez, Manuel P. Malumbres
- pp 363-368
TLDR
A hierarchical parallelization of H.264 encoders, well suited to low-cost clusters, is proposed. It offers a compromise between speed-up and latency, so a broader spectrum of applications can be covered.
Abstract:
Recent video encoding standards increase computing demands in order to approach the limits of compression efficiency. This is particularly true of the H.264/AVC specification, which is gaining interest in industry. We are interested in applying parallel processing to H.264 encoders in order to meet the computational requirements imposed by demanding applications such as video on demand, videoconferencing, and live broadcast. For a given delivered video quality and bit rate, the main complexity parameters are image resolution, frame rate, and latency. These parameters can still be pushed to a point where special-purpose hardware solutions are not available. Parallel processing based on off-the-shelf components is a more flexible, general-purpose alternative. In this work we propose a hierarchical parallelization of H.264 encoders that is well suited to low-cost clusters. Our proposal uses MPI message-passing parallelization at two levels: GOP (group of pictures) and frame. The GOP level encodes several groups of consecutive frames simultaneously, and the frame level encodes several slices of one frame in parallel. In previous work we found that GOP parallelism alone gives good speed-up but imposes very high latency, whereas frame parallelism is less efficient but has low latency. Combining both approaches yields a compromise between speed-up and latency, so a broader spectrum of applications can be covered.
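The two-level decomposition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the round-robin GOP distribution, and the even split of macroblock rows into slices are all assumptions made for illustration. The code only computes which process would handle which GOP and which slice. In an actual MPI program the same grouping could plausibly be obtained by splitting the global communicator with the GOP group as the color (for example with `MPI_Comm_split`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Position of one process in the two-level hierarchy (hypothetical layout)."""
    rank: int         # global process rank
    gop_group: int    # GOP level: which group of processes this rank belongs to
    slice_index: int  # frame level: which slice of each frame this rank encodes


def assign_roles(num_procs: int, procs_per_group: int) -> list[Role]:
    """Split num_procs ranks into groups of procs_per_group processes each.

    Each group encodes whole GOPs; inside a group, every process
    encodes one slice of the current frame.
    """
    if procs_per_group <= 0 or num_procs % procs_per_group != 0:
        raise ValueError("num_procs must be a positive multiple of procs_per_group")
    return [
        Role(rank, rank // procs_per_group, rank % procs_per_group)
        for rank in range(num_procs)
    ]


def gops_for_group(gop_group: int, num_groups: int, num_gops: int) -> list[int]:
    """GOP indices handled by one group (assumed round-robin distribution)."""
    return list(range(gop_group, num_gops, num_groups))


def slice_rows(slice_index: int, num_slices: int, mb_rows: int) -> range:
    """Contiguous macroblock rows for one slice, split as evenly as possible."""
    base, extra = divmod(mb_rows, num_slices)
    start = slice_index * base + min(slice_index, extra)
    stop = start + base + (1 if slice_index < extra else 0)
    return range(start, stop)


if __name__ == "__main__":
    # 8 processes arranged as 4 GOP groups of 2 slice workers each.
    roles = assign_roles(num_procs=8, procs_per_group=2)
    num_groups = 8 // 2
    for role in roles:
        gops = gops_for_group(role.gop_group, num_groups, num_gops=10)
        rows = slice_rows(role.slice_index, num_slices=2, mb_rows=18)
        print(role, "GOPs:", gops, "MB rows:", list(rows))
```

With `procs_per_group` equal to 1 the layout reduces to pure GOP parallelism, and with a single group it reduces to pure frame (slice) parallelism, which are the two extremes the abstract contrasts.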
Citations
Journal Article
Parallel Scalability of Video Decoders
TL;DR: This work investigates the parallelism available in video decoders, an important application domain now and in the future, and proposes a new parallelization strategy, called Dynamic 3D-Wave, which allows certain MBs of consecutive frames to be decoded in parallel.
Patent
Method and system for parallel encoding of a video
TL;DR: In this article, a method and system for parallel encoding of frames in a video are described, exploiting parallel processing at both frame and slice levels, and a significant speedup in comparison to the sequential encoding approach is achieved while maintaining high visual quality for output video.
Proceedings Article
Scalability of Macroblock-level Parallelism for H.264 Decoding
Mauricio Alvarez Mesa, Alex Ramirez, Arnaldo Azevedo, Cor Meenderinck, Ben Juurlink, Mateo Valero
TL;DR: This study presents a quantitative analysis of the main bottlenecks of the application and estimates the acceleration levels that are required to make the MB-level parallel decoder scalable.
Proceedings Article
Designing multi-leader-based Allgather algorithms for multi-core clusters
TL;DR: This work proposes a novel and scalable multi-leader-based hierarchical Allgather design that allows better cache sharing for Non-Uniform Memory Access (NUMA) machines and makes better use of the network speed available with high performance interconnects such as InfiniBand.
Book Chapter
Parallel H.264 Decoding on an Embedded Multicore Processor
Arnaldo Azevedo, Cor Meenderinck, Ben Juurlink, Andrei Terechko, Jan Hoogerbrugge, Mauricio Alvarez, Alex Ramirez
TL;DR: This work presents an implementation of the 3D-Wave parallelization strategy on a multicore architecture composed of NXP TriMedia TM3270 embedded processors and shows that the parallel H.264 implementation scales very well, achieving a speedup of more than 54 on a 64-core processor.
References
Book
Parallel programming with MPI
TL;DR: This chapter discusses the design and coding of parallel programs, performance, and grouping data for communication in the context of parallel computing.
Book
Parallel Programming in OpenMP
TL;DR: Aimed at the working researcher or scientific C/C++ or Fortran programmer, this text introduces the competent research programmer to a new vocabulary of idioms and techniques for parallelizing software using OpenMP.
Book
Quantitative system performance: computer system analysis using queueing network models
TL;DR: This book covers quantitative system performance: the analysis of computer systems using queueing network models.
Journal Article
Symbolic performance modeling of parallel systems
TL;DR: This work presents an analytic performance modeling approach aimed to minimize prediction cost, while providing a prediction accuracy that is sufficient to enable major code and data mapping decisions, based on a performance simulation language called PAMELA.