High speed architecture for Variable Block Size Motion Estimation in H.264

doi:10.1109/ICE-CCN.2013.6528478

Citations

Proceedings Article•DOI•

2-Dimensional systolic architecture for H.264/AVC variable block size motion estimation

P. Jayakrishnan¹, Harish M. Kittur¹•Institutions (1)

VIT University¹

01 Nov 2014

TL;DR: A new design for Full-Search (FS) Variable Block Size (VBS) Motion Estimation (ME) is presented in which the Sum of Absolute Differences (SAD) is computed by re-using outputs; the design is efficient in terms of operating frequency and reduces hardware complexity.

Abstract: Video coding is used for many multimedia purposes such as video conferencing, digital storage media, Internet streaming and television broadcasting. This paper presents a new design for the implementation of Full-Search (FS) Variable Block Size (VBS) Motion Estimation (ME), which is a key part of video compression standards such as MPEG-1, MPEG-2, MPEG-4 Visual, H.261, H.263 and H.264. The FS algorithm is widely used to implement ME in video compression algorithms. The design is fully parametric in terms of block size, which is variable, and the Sum of Absolute Differences (SAD) is computed by re-using the outputs. The design features high efficiency in terms of operating frequency and reduction in hardware complexity. The architectures are designed using Verilog Hardware Description Language (HDL) and their functionality is verified using the ModelSim simulator. Two designs, namely 1-D and 2-D systolic architectures, are analyzed in terms of frequency, gate count and total power. The designs are synthesized using the Cadence RTL Compiler with a TSMC 90 nm standard cell library. The operating frequency is 323.20 MHz for the 1-D design and 166.67 MHz for the 2-D design; the gate count is around 5k for the 1-D design and around 21k for the 2-D design. Both designs can handle up to 41 motion vectors.
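As a rough software illustration of the full-search, SAD-based block matching that the abstract describes, the sketch below tests every candidate displacement inside a search window and keeps the one with the lowest SAD. It is not the paper's hardware design: the function names (`sad`, `full_search`), the list-of-rows frame layout and the `search_range` parameter are assumptions made for this example.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n x n blocks.

    cur, ref: frames stored as lists of pixel rows (assumed layout).
    (cx, cy): top-left corner of the block in the current frame.
    (rx, ry): top-left corner of the candidate block in the reference frame.
    """
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, cx, cy, n, search_range):
    """Exhaustively test every displacement within +/- search_range.

    Returns ((dx, dy), best_sad). Candidates that fall outside the
    reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best_sad is None or cost < best_sad:
                best_mv, best_sad = (dx, dy), cost
    return best_mv, best_sad
```

A hardware systolic array evaluates many of these absolute differences in parallel; the nested loops here only show which values are being compared and accumulated.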

References

Journal Article•DOI•

An efficient VLSI architecture for H.264 variable block size motion estimation

Chien-Min Ou, Chian-Feng Le¹, Wen-Jyi Hwang•Institutions (1)

National Taiwan Normal University¹

01 Nov 2005-IEEE Transactions on Consumer Electronics

TL;DR: A novel flexible VLSI architecture for the implementation of variable block size motion estimation (VBSME) that has lower latency and higher throughput than other existing VBSME architectures for the hardware implementation of H.264 encoders.

Abstract: This paper proposes a novel flexible VLSI architecture for the implementation of variable block size motion estimation (VBSME). The architecture is able to perform a full motion search on integral multiples of 4×4 block sizes. To use the architecture, each 16×16 macroblock of the source frames should be partitioned into sixteen 4×4 non-overlapping subblocks, called primitive subblocks. The architecture contains sixteen modules and one VBSME processor. Each module, realized by cascading 1-D systolic arrays, is responsible for the block-matching operations of a different primitive subblock. The realization has the advantages of high throughput, high flexibility and 100% processing element (PE) utilization. The motion estimation of all the primitive subblocks is performed in parallel. Because these primitive subblocks can be used to form the 41 subblocks of different sizes specified by H.264, the VBSME processor is employed to concurrently compute the sums of absolute differences (SADs) of all 41 subblocks from the SADs of the primitive subblocks. This new architecture has lower latency and higher throughput than other existing VBSME architectures for the hardware implementation of H.264 encoders.
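The way the 41 subblock SADs follow from the sixteen primitive 4×4 SADs can be sketched in a few lines. This is only an illustration of the SAD re-use idea for one candidate displacement, not the paper's VBSME processor; the function name `combine_sads`, the 4×4 grid input and the width×height partition labels are assumptions made for this example.

```python
def combine_sads(sad4x4):
    """Build all 41 H.264 subblock SADs from sixteen 4x4 SADs.

    sad4x4: 4 rows x 4 columns of SADs, one per primitive subblock of a
    16x16 macroblock, all for the same candidate displacement.
    Returns a dict keyed by partition size, written as width x height.
    """
    s = sad4x4
    out = {}
    out["4x4"] = [s[r][c] for r in range(4) for c in range(4)]
    # 8 wide, 4 tall: two horizontally adjacent primitive subblocks.
    out["8x4"] = [s[r][c] + s[r][c + 1] for r in range(4) for c in (0, 2)]
    # 4 wide, 8 tall: two vertically adjacent primitive subblocks.
    out["4x8"] = [s[r][c] + s[r + 1][c] for r in (0, 2) for c in range(4)]
    # 8x8 quadrants in the order top-left, top-right, bottom-left, bottom-right.
    out["8x8"] = [
        s[r][c] + s[r][c + 1] + s[r + 1][c] + s[r + 1][c + 1]
        for r in (0, 2)
        for c in (0, 2)
    ]
    q = out["8x8"]
    out["16x8"] = [q[0] + q[1], q[2] + q[3]]  # top half, bottom half
    out["8x16"] = [q[0] + q[2], q[1] + q[3]]  # left half, right half
    out["16x16"] = [sum(q)]
    return out
```

The partition counts are 16 + 8 + 8 + 4 + 2 + 2 + 1 = 41, which matches the 41 subblocks mentioned in the abstract. Every larger SAD is obtained by addition only, so no pixel difference is computed twice.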

75 citations

"High speed architecture for Variabl..." refers background in this paper

...The comparisons between 1-D and 2-D [2], [3], [4] architectures are given in the Table-I....

Proceedings Article•DOI•

A VLSI architecture for advanced video coding motion estimation

S.Y. Yap¹, John V. McCanny¹•Institutions (1)

Queen's University Belfast¹

24 Jun 2003

TL;DR: This work proposes a new 1-D VLSI architecture for full search variable block size motion estimation (FSVBSME), which can process up to 41 motion vector subblocks (within a macroblock) in a comparable number of clock cycles.

Abstract: With the advent of new video standards such as MPEG-4 part-10 and H.264/H.26L, demands for advanced video coding (AVC), particularly in the area of variable block searching motion estimation (VBSME), are increasing. This has led to research into suitable flexible hardware architectures to perform the various types of VBSME. We propose a new 1-D VLSI architecture for full search variable block size motion estimation (FSVBSME). The variable block size sum of absolute differences (SAD) computation is performed by reusing the results of smaller subblock computations. These are permuted and combined by incorporating a shuffling mechanism within each processing element (PE). Whereas a conventional 1-D architecture can process only one motion vector, this architecture can process up to 41 motion vector (MV) subblocks (within a macroblock) in a comparable number of clock cycles.

42 citations

"High speed architecture for Variabl..." refers background in this paper

...The design has an increase in speed as compared to [1] with the highest working frequency of 323....
...Reference [1] shows similar for block b1, it contains pixels p4 - p7, p20 - p23, p36 - p39 and p52 - p55, and so on (refer Fig....
...Reference [1] Pixel values in the blocks (namely, p0 and p1)...
...It needs similar amount of clock cycles as compared with any other 1-D architecture [1], [4]....

Journal Article•DOI•

A Memory-Efficient and Highly Parallel Architecture for Variable Block Size Integer Motion Estimation in H.264/AVC

Chao-Yang Kao¹, Youn-Long Lin¹•Institutions (1)

National Tsing Hua University¹

01 Jun 2010-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: A memory-efficient and highly parallel VLSI architecture for full search VBSME (FSVBSME), which can save 98% of on-chip memory access with only 25% of local memory overhead, and a novel data reuse scheme to reduce memory access.

Abstract: Variable block size motion estimation (VBSME) is one of several contributors to H.264/AVC's excellent coding efficiency. However, its high computational complexity and huge memory traffic make design difficult. In this paper, we propose a memory-efficient and highly parallel VLSI architecture for full search VBSME (FSVBSME). Our architecture consists of 16 2-D arrays, each consisting of 16 × 16 processing elements (PEs). Four arrays form a group to match in parallel four reference blocks against one current block. Four groups perform block matching for four current blocks in a pipelined fashion. Taking advantage of overlapping among multiple reference blocks of a current block and between search windows of adjacent current blocks, we propose a novel data reuse scheme to reduce memory access. Compared with the popular Level C data reuse scheme, our approach can save 98% of on-chip memory access with only 25% of local memory overhead. Synthesized into a TSMC 180-nm CMOS cell library, our design is capable of processing 1920 × 1088 30 fps video when running at 130 MHz. The architecture is scalable for wider search range, multiple reference frames and pixel truncation as well as down sampling. We suggest a criterion called design efficiency for comparing different works. It shows that the proposed design is 72% more efficient than the best design to date.
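The figures quoted in this abstract allow a quick back-of-the-envelope check of the scale involved. The sketch below derives the total PE count and the average number of clock cycles available per 16×16 macroblock at the stated resolution, frame rate and clock. The helper name `cycles_per_macroblock` is hypothetical, and the calculation assumes every clock cycle is available to motion estimation, which the abstract does not state.

```python
def cycles_per_macroblock(width, height, fps, clock_hz, mb_size=16):
    """Average clock cycles available per macroblock (assumed budget model)."""
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    mbs_per_second = mbs_per_frame * fps
    return clock_hz / mbs_per_second


# 16 arrays of 16 x 16 processing elements, as described in the abstract.
total_pes = 16 * 16 * 16  # 4096 PEs

# 1920 x 1088 at 30 fps with a 130 MHz clock, as described in the abstract.
budget = cycles_per_macroblock(1920, 1088, 30, 130_000_000)
print(total_pes, round(budget))  # prints: 4096 531
```

A 1920 × 1088 frame holds 120 × 68 = 8160 macroblocks, so 30 fps means 244,800 macroblocks per second and roughly 531 cycles per macroblock on average under this simple model.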

29 citations

Journal Article•DOI•

Very large scale integration (VLSI) implementation of low-complexity variable block size motion estimation for H.264/AVC coding

Shih-Chang Hsia¹, Po-Yi Hong¹•Institutions (1)

National Kaohsiung First University of Science and Technology¹

09 Sep 2010-IET Circuits, Devices & Systems

TL;DR: The proposed fast algorithm can reduce motion searching time by about 90%, whereas PSNR decreases by only about 0.02 dB on average, and a VLSI architecture is designed with a parallel structure and pipeline timing schedule to achieve a high throughput rate for the HDTV system.

Abstract: This study presents a fast algorithm and its very large scale integration (VLSI) design to implement variable block size motion estimation. The fast algorithm is proposed with a hardware-oriented concept for regular VLSI design. Simulations show that the proposed algorithm can reduce motion searching time by about 90%, whereas PSNR decreases by only about 0.02 dB on average. Based on the fast algorithm, a VLSI architecture is designed with a parallel structure and pipeline timing schedule to achieve a high throughput rate for the HDTV system. The chip can compute 41 vectors for various block sizes in 24-240 cycles using only 96 processing elements. Compared with contemporary VLSI architectures, this chip can offer higher processing speed, wider searching range and lower circuit complexity.

10 citations
