Author
Hongtu Jiang
Bio: Hongtu Jiang is an academic researcher from Lund University. The author has contributed to research in topics: Field-programmable gate array & Memory bandwidth. The author has an h-index of 5 and has co-authored 11 publications receiving 182 citations.
Papers
TL;DR: To achieve real-time performance with high resolution video streams, a dedicated hardware architecture with streamlined dataflow and memory access reduction schemes are developed to implement a video segmentation unit used for embedded automated video surveillance systems.
Abstract: This paper presents the implementation of a video segmentation unit used for embedded automated video surveillance systems. Various aspects of the underlying segmentation algorithm are explored and modifications are made with potential improvements of segmentation results and hardware efficiency. In addition, to achieve real-time performance with high resolution video streams, a dedicated hardware architecture with streamlined dataflow and memory access reduction schemes are developed. The whole system is implemented on a Xilinx field-programmable gate array platform, capable of real-time segmentation with VGA resolution at 25 frames per second. Substantial memory bandwidth reduction of more than 70% is achieved by utilizing pixel locality as well as wordlength reduction. The hardware platform is intended as a real-time testbench, especially for observations of long term effects with different parameter settings.
57 citations
01 Jul 2008
TL;DR: A memory reduction scheme for the video segmentation unit, reducing bandwidth by more than 70%, and a low complexity morphology architecture that only requires memory proportional to the input image width, have been developed.
Abstract: This paper presents the design of an embedded automated digital video surveillance system with real-time performance. Hardware accelerators for video segmentation, morphological operations, labeling and feature extraction are required to achieve the real-time performance while tracking will be handled in software in an embedded processor. By implementing a complete embedded system, bottlenecks in computational complexity and memory requirements can be identified and addressed. Accordingly, a memory reduction scheme for the video segmentation unit, reducing bandwidth by more than 70%, and a low complexity morphology architecture that only requires memory proportional to the input image width, have been developed. On a system level, it is shown that a labeling unit based on a contour tracing technique does not require unique labels, resulting in more than 50% memory reduction. The hardware accelerators provide the tracking software with image object properties, i.e. features, thereby decoupling the tracking algorithm from the image stream. A prototype of the embedded system is running in real-time, 25 fps, on a field programmable gate array development board. Furthermore, the system scalability for higher image resolution is evaluated.
47 citations
23 May 2005
TL;DR: A hardware accelerator is proposed, with a dedicated architecture aimed at addressing both computation and memory bandwidth demands, and a controller synthesis tool is used to relieve the effort for the manual design of the complex control unit which schedules the operations of the whole system.
Abstract: Among many of the algorithms for video segmentation, one based on a statistical background model (Stauffer, C. and Grimson, W., Proc. IEEE Conf. Computer Vision and Pattern Recognition, 1999) was developed with the unique feature of robustness in multi-modal background scenarios. However, with a large number of calculations due to the pixel-wise processing of each frame, such an algorithm could only achieve a low frame rate, far from real-time requirements, on computers. A hardware accelerator is proposed, with a dedicated architecture aimed at addressing both computation and memory bandwidth demands. The whole system is targeted to an FPGA platform, which serves as a real-time test bench where long term effects caused by fixed point quantization and various parameter settings can be studied. Meanwhile, memory bandwidth as well as memory size are investigated, and reduction by up to 60 percent, through similarity exploitation for neighboring Gaussian parameters, is envisioned. Furthermore, a controller synthesis tool is used to relieve the effort for the manual design of the complex control unit which schedules the operations of the whole system.
43 citations
22 Nov 2006
TL;DR: To achieve real-time performance with high resolution video streams, a dedicated hardware architecture with streamlined dataflow and memory access reduction schemes are developed and the whole system is implemented on a Xilinx FPGA platform.
Abstract: This paper presents the implementation of a video segmentation unit used for embedded automated video surveillance systems. Various aspects of the underlying segmentation algorithm are explored and modifications are made with potential improvements of segmentation results and hardware efficiency. In addition, to achieve real-time performance with high resolution video streams, a dedicated hardware architecture with streamlined dataflow and memory access reduction schemes are developed. The whole system is implemented on a Xilinx FPGA platform, capable of real-time segmentation with VGA resolution at 25 frames per second. Substantial memory bandwidth reduction of more than 70% is achieved by utilizing pixel locality as well as wordlength reduction. The hardware platform is intended as a real-time testbench for observations of long term effects with different parameter settings, which is hard to achieve on a PC platform.
12 citations
15 Dec 2003
TL;DR: A customized image convolution processor with three level memory hierarchy is implemented on Xilinx VirtexE FPGAs that has the performance close to that of TI highest performance C64x processor at less than 1/8 of the clock frequency with substantial I/O bandwidth reductions.
Abstract: In this paper, a customized image convolution processor with three level memory hierarchy is implemented on Xilinx VirtexE FPGAs. Due to its fully pipelined datapath for calculations and streamlined data flow architecture, the processor has the performance close to that of TI highest performance C64x processor at less than 1/8 of the clock frequency with substantial I/O bandwidth reductions. Furthermore, potential power savings are envisioned in future ASIC implementations by meaningful memory hierarchy explorations. In addition, a dedicated controller composed of Finite State Machine with incremental branch optimization architecture is developed to control all the operations in calculations and data transfer.
7 citations
Cited by
TL;DR: The purpose of this paper is to provide a survey and an original classification of improvements of the original MOG, and to discuss relevant issues to reduce the computation time.
Abstract: Mixture of Gaussians is a widely used approach for background modeling to detect moving objects from static cameras. Numerous improvements of the original method developed by Stauffer and Grimson [1] have been proposed over the recent years and the purpose of this paper is to provide a survey and an original classification of these improvements. We also discuss relevant issues to reduce the computation time. First, the original MOG is recalled and discussed in light of the challenges met in video sequences. Then, we categorize the different improvements found in the literature. We have classified them in terms of the strategies used to improve the original MOG and we have discussed them in terms of the critical situations they claim to handle. After analyzing the strategies and identifying their limitations, we conclude with several promising directions for future research.
495 citations
TL;DR: An extended and updated survey of the recent researches and patents which concern statistical background modeling to achieve a comparative evaluation and to conclude with several promising directions for future research.
Abstract: Background modeling is currently used to detect moving objects in video acquired from static cameras. Numerous statistical methods have been developed over the recent years. The aim of this paper is firstly to provide an extended and updated survey of the recent research and patents which concern statistical background modeling and secondly to achieve a comparative evaluation. For this, we first classified the statistical methods in terms of category. Then, the original methods are recalled and discussed in light of the challenges met in video sequences. We classified their respective improvements in terms of strategies used. Furthermore, we discussed them in terms of the critical situations they claim to handle. Finally, we conclude with several promising directions for future research. The survey also discusses relevant patents.
339 citations
Book
14 Dec 2005
TL;DR: A one-of-a-kind survey of the field of Reconfigurable Computing gives a comprehensive introduction to a discipline that offers a 10X-100X acceleration of algorithms over microprocessors.
Abstract: A one-of-a-kind survey of the field of Reconfigurable Computing. Gives a comprehensive introduction to a discipline that offers a 10X-100X acceleration of algorithms over microprocessors. Discusses the impact of reconfigurable hardware on a wide range of applications: signal and image processing, network security, bioinformatics, and supercomputing. Includes the history of the field as well as recent advances. Includes an extensive bibliography of primary sources.
178 citations
TL;DR: Two hardware implementations of the OpenCV version of the Gaussian mixture model (GMM), a background identification algorithm, are proposed, able to perform real-time background identification on high definition (HD) video sequences with frame size 1920 × 1080.
Abstract: Background identification is a common feature in many video processing systems. This paper proposes two hardware implementations of the OpenCV version of the Gaussian mixture model (GMM), a background identification algorithm. The implemented version of the algorithm allows a fast initialization of the background model while an innovative, hardware-oriented, formulation of the GMM equations makes the proposed circuits able to perform real-time background identification on high definition (HD) video sequences with frame size 1920 × 1080. The first of the two circuits is designed with commercial field-programmable gate-array (FPGA) devices as target. When implemented on Virtex6 vlx75t, the proposed circuit processes 91 HD fps (frames per second) and uses 3% of FPGA logic resources. The second circuit is oriented to the implementation in UMC-90 nm CMOS standard cell technology, and is proposed in two versions. Both versions can process at a frame rate higher than 60 HD fps. The first version uses the constant voltage scaling technique to provide a low power implementation. It provides silicon area occupation of 28847 μm² and energy dissipation per pixel of 15.3 pJ/pixel. The second version is designed to reduce silicon area utilization and occupies 21847 μm² with an energy dissipation of 49.4 pJ/pixel.
84 citations
TL;DR: Experimental results show that the proposed background subtraction method is a good solution to obtain high accuracy and low resource requirements simultaneously, and is preferable for implementation in real-time embedded systems such as smart cameras.
Abstract: This letter proposes a background subtraction method for Bayer-pattern image sequences. The proposed method models the background in a Bayer-pattern domain using a mixture of Gaussians (MoG) and classifies the foreground in an interpolated red, green, and blue (RGB) domain. This method can achieve almost the same accuracy as MoG using RGB color images while maintaining computational resources (time and memory) similar to MoG using grayscale images. Experimental results show that the proposed method is a good solution to obtain high accuracy and low resource requirements simultaneously. This improvement is important for a low-level task like background subtraction since its accuracy affects the performance of high-level tasks, and is preferable for implementation in real-time embedded systems such as smart cameras.
61 citations