
Showing papers in "Journal of Real-Time Image Processing" in 2019


Journal ArticleDOI
TL;DR: The dynamic mode decomposition is a regression technique that integrates two of the leading data analysis methods in use today, Fourier transforms and singular value decomposition; the quality of the resulting background model is competitive, as quantified by the F-measure, recall and precision.
Abstract: We introduce the method of compressed dynamic mode decomposition (cDMD) for background modeling. The dynamic mode decomposition is a regression technique that integrates two of the leading data analysis methods in use today: Fourier transforms and singular value decomposition. Borrowing ideas from compressed sensing and matrix sketching, cDMD eases the computational workload of high-resolution video processing. The key principle of cDMD is to obtain the decomposition on a (small) compressed matrix representation of the video feed. Hence, the cDMD algorithm scales with the intrinsic rank of the matrix, rather than the size of the actual video (data) matrix. Selection of the optimal modes characterizing the background is formulated as a sparsity-constrained sparse coding problem. Our results show that the quality of the resulting background model is competitive, as quantified by the F-measure, recall and precision. A graphics processing unit accelerated implementation is also presented, which further boosts the computational performance of the algorithm.
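As a concrete illustration of the DMD-for-background idea, here is a minimal NumPy sketch of compressed DMD: the video matrix is sketched with a random projection, the low-rank operator is estimated on the compressed snapshots, and the mode with eigenvalue closest to 1 is taken as the background. Function names, the compressed dimension p and the rank threshold are illustrative assumptions, not the authors' code.

```python
import numpy as np

def cdmd_background(video, p=64, seed=0):
    """video: (n_pixels, n_frames) data matrix; p: compressed dimension."""
    rng = np.random.default_rng(seed)
    X, Y = video[:, :-1], video[:, 1:]            # time-shifted snapshot pairs
    C = rng.standard_normal((p, video.shape[0]))  # random sketching matrix
    Xc, Yc = C @ X, C @ Y                         # compressed snapshots
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    r = int((s > 1e-10 * s[0]).sum())             # effective rank (assumed cutoff)
    U, s, Vt = U[:, :r], s[:r], Vt[:r]
    A = (U.conj().T @ Yc) @ Vt.conj().T / s       # low-rank operator
    evals, W = np.linalg.eig(A)
    Phi = (Y @ Vt.conj().T / s) @ W               # modes lifted back to pixel space
    k = np.argmin(np.abs(evals - 1.0))            # background: eigenvalue near 1
    b = np.linalg.lstsq(Phi, video[:, 0].astype(complex), rcond=None)[0]
    return np.abs(Phi[:, k] * b[k])               # per-pixel background model
```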

103 citations


Journal ArticleDOI
TL;DR: A vehicle type classification system based on deep learning is proposed that uses Faster R-CNN to solve the task and is tested on an NVIDIA Jetson TK1 board with 192 CUDA cores.
Abstract: Vehicle type classification technology plays an important role in intelligent transport systems nowadays. With the development of image processing, pattern recognition and deep learning, vehicle type classification technology based on deep learning has attracted increasing attention. In the last few years, convolutional neural networks, especially Faster Region-based convolutional neural networks (Faster R-CNN), have shown great advantages in image classification and object detection, outperforming traditional machine learning methods by a large margin. In this paper, a vehicle type classification system based on deep learning is proposed. The system uses Faster R-CNN to solve the task. Experimental results show that the method is not only time-saving, but also more robust and more accurate. For cars and trucks, it reaches 90.65% and 90.51% accuracy, respectively. Finally, we test the system on an NVIDIA Jetson TK1 board with 192 CUDA cores, which is envisioned to be a forerunner computational brain for computer vision, robotics and self-driving cars. Experimental results show that it takes around 0.354 s to detect an image and maintains a high accuracy rate with the network embedded on the NVIDIA Jetson TK1.
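For readers who want to try a comparable pipeline, below is a minimal inference sketch using torchvision's pretrained Faster R-CNN (COCO weights, whose label set happens to include car, truck and bus). The paper trains its own network, so the model, weights and score threshold here are stand-in assumptions.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pretrained COCO detector as a stand-in for the paper's trained model
model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

def detect_vehicles(path, score_thresh=0.5):
    img = to_tensor(Image.open(path).convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]                 # dict of boxes/labels/scores
    keep = out["scores"] >= score_thresh      # illustrative threshold
    return out["boxes"][keep], out["labels"][keep], out["scores"][keep]
```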

67 citations


Journal ArticleDOI
TL;DR: The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations, and assesses the coding efficiency of each profile extensively through subjective assessments, using 24 naïve subjects to evaluate 20 images, and objective evaluations.
Abstract: Standards play an important role in providing a common set of specifications and allowing inter-operability between devices and systems. Until recently, no standard for high-dynamic-range (HDR) image coding had been adopted by the market, and HDR imaging relies on proprietary and vendor-specific formats which are unsuitable for storage or exchange of such images. To resolve this situation, the JPEG Committee is developing a new coding standard called JPEG XT that is backward compatible to the popular JPEG compression, allowing it to be implemented using standard 8-bit JPEG coding hardware or software. In this paper, we present design principles and technical details of JPEG XT. It is based on a two-layer design, a base layer containing a low-dynamic-range image accessible to legacy implementations, and an extension layer providing the full dynamic range. The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations. We assess the coding efficiency of each profile extensively through subjective assessments, using 24 naive subjects to evaluate 20 images, and objective evaluations, using 106 images with five different tone-mapping operators and at 100 different bit rates. The objective results (based on benchmarking with subjective scores) demonstrate that JPEG XT can encode HDR images at bit rates varying from 1.1 to 1.9 bit/pixel for estimated mean opinion score (MOS) values above 4.5 out of 5, which is considered fully transparent in many applications. This corresponds to a 23-fold bitstream reduction compared to lossless OpenEXR PIZ compression.
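To make the two-layer idea concrete, here is a purely conceptual NumPy sketch of a base-plus-ratio HDR coding scheme, loosely in the spirit of JPEG XT; the actual profiles differ substantially in how the extension layer is formed and coded, so treat this only as an illustration. `tonemap` is a caller-supplied operator.

```python
import numpy as np

def encode_two_layer(hdr, tonemap, eps=1e-6):
    """hdr: float array of linear radiance; tonemap: caller-supplied LDR operator."""
    base = np.clip(tonemap(hdr), 0.0, 1.0)     # base layer: legacy LDR image
    ratio = hdr / np.maximum(base, eps)        # extension layer restores the range
    return base, np.log2(ratio + eps)          # log-encode the ratio image

def decode_two_layer(base, log_ratio, eps=1e-6):
    # invert the encoder: ratio back from log domain, then rescale the base
    return np.maximum(base, eps) * (2.0 ** log_ratio - eps)
```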

65 citations


Journal ArticleDOI
TL;DR: An RGB-D single-object tracker is proposed, built upon the extremely fast RGB-only KCF tracker, which exploits depth information to handle scale changes, occlusions, and shape changes.
Abstract: We propose an RGB-D single-object tracker, built upon the extremely fast RGB-only KCF tracker, that is able to exploit depth information to handle scale changes, occlusions, and shape changes. Despite the computational demands of the extra functionalities, we still achieve real-time performance rates of 35–43 fps in MATLAB and 187 fps in our C++ implementation. Our proposed method includes fast depth-based target object segmentation that enables (1) efficient scale change handling within the KCF core functionality in the Fourier domain, (2) the detection of occlusions by temporal analysis of the target’s depth distribution, and (3) the estimation of a target’s change of shape through the temporal evolution of its segmented silhouette. Finally, we provide an in-depth analysis of the factors affecting the throughput and precision of our proposed tracker and perform extensive comparative analysis. Both the MATLAB and C++ versions of our software are available in the public domain.
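The occlusion cue described in (2) can be illustrated with a small sketch: compare the depth values inside the tracked box against the current target depth estimate and flag frames where a large fraction of the box is suddenly nearer. The margin and fraction thresholds are illustrative assumptions, not the paper's values.

```python
import numpy as np

def occlusion_flag(depth_roi, target_depth, near_margin=0.15, frac=0.4):
    """depth_roi: depth values (meters) inside the tracked box."""
    valid = depth_roi[np.isfinite(depth_roi) & (depth_roi > 0)]
    if valid.size == 0:
        return True                      # no depth support: treat as occluded
    nearer = (valid < target_depth * (1.0 - near_margin)).mean()
    return nearer > frac                 # a closer surface covers the target
```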

51 citations


Journal ArticleDOI
TL;DR: This work proposes a novel block-level Canny edge detection algorithm that detects edges without any loss; it uses the Sobel operator and approximation methods to compute gradient magnitude and orientation, replacing complex operations at reduced hardware cost.
Abstract: The Canny edge detection algorithm significantly outperforms existing edge detection techniques in many computer vision applications. However, it is a complex, time-consuming process with high hardware cost. To overcome these issues, a novel block-level Canny edge detection algorithm is proposed to detect edges without any loss. It uses the Sobel operator and approximation methods to compute gradient magnitude and orientation, replacing complex operations at reduced hardware cost, together with existing non-maximum suppression, block classification for adaptive thresholding and existing hysteresis thresholding. Pipelining is introduced to reduce latency. The proposed algorithm is implemented on a Xilinx Virtex-5 FPGA and provides better performance compared to the frame-level Canny edge detection algorithm. The synthesized architecture reduces execution time by 6.8% and utilizes fewer resources to detect edges of a 512 × 512 image compared to the existing distributed Canny edge detection algorithm.
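The approximations mentioned above can be sketched as follows: the magnitude |Gx| + |Gy| avoids the square root, and the Canny orientation bins are obtained with comparisons against tan(22.5°) instead of an explicit arctangent. The exact constants and bin logic in the paper's hardware may differ.

```python
import numpy as np
from scipy import ndimage

TAN22 = 0.4142  # tan(22.5 deg); comparisons replace an explicit arctan

def approx_gradient(img):
    gx = ndimage.sobel(img.astype(np.float32), axis=1)
    gy = ndimage.sobel(img.astype(np.float32), axis=0)
    mag = np.abs(gx) + np.abs(gy)            # L1 approximation of sqrt(gx^2+gy^2)
    ax, ay = np.abs(gx), np.abs(gy)
    sector = np.zeros(img.shape, np.uint8)   # 0: near-horizontal gradient
    sector[ay > TAN22 * ax] = 1              # diagonal band
    sector[ay > ax / TAN22] = 2              # near-vertical gradient
    # split the diagonal band by gradient sign to get the two diagonals
    diag = sector == 1
    sector[diag & (np.sign(gx) != np.sign(gy))] = 3
    return mag, sector
```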

49 citations


Journal ArticleDOI
TL;DR: This paper proposes the implementation in reconfigurable hardware of the principal component analysis (PCA) algorithm to carry out the dimensionality reduction in hyperspectral images, and demonstrates that the hardware version of the PCA algorithm significantly outperforms a commercial software version.
Abstract: Remotely sensed hyperspectral imaging is a very active research area, with numerous contributions in the recent scientific literature. The analysis of these images represents an extremely complex procedure from a computational point of view, mainly due to the high dimensionality of the data and the inherent complexity of the state-of-the-art algorithms for processing hyperspectral images. This computational cost represents a significant disadvantage in applications that require real-time response, such as fire tracing, prevention and monitoring of natural disasters, chemical spills, and other environmental pollution. Many of these algorithms consider, as one of their fundamental stages to fully process a hyperspectral image, a dimensionality reduction in order to remove noise and redundant information in the hyperspectral images under analysis. Therefore, it is possible to significantly reduce the size of the images, and hence, alleviate data storage requirements. However, this step is not exempt from computationally complex matrix operations, such as the computation of the eigenvalues and the eigenvectors of large and dense matrices. Hence, for the aforementioned applications in which prompt replies are mandatory, this dimensionality reduction must be considerably accelerated, typically through the utilization of high-performance computing platforms. For this purpose, reconfigurable hardware solutions such as field-programmable gate arrays have become consolidated in recent years as one of the standard choices for the fast processing of hyperspectral remotely sensed images due to their smaller size, weight and power consumption when compared with other high-performance computing systems. In this paper, we propose the implementation in reconfigurable hardware of the principal component analysis (PCA) algorithm to carry out the dimensionality reduction in hyperspectral images. Experimental results demonstrate that our hardware version of the PCA algorithm significantly outperforms a commercial software version, which makes our reconfigurable system appealing for onboard hyperspectral data processing. Furthermore, our implementation exhibits real-time performance with regard to the time that the targeted hyperspectral instrument takes to collect the image data.
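The PCA stage itself reduces to a covariance eigendecomposition and a projection, as in this compact NumPy sketch; the FPGA design implements the same mathematics with hardware arithmetic units, and the function and parameter names here are illustrative.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """cube: (rows, cols, bands) hyperspectral image."""
    r, c, b = cube.shape
    X = cube.reshape(-1, b).astype(np.float64)
    X -= X.mean(axis=0)                      # band-wise mean removal
    cov = X.T @ X / (X.shape[0] - 1)         # b x b covariance matrix
    evals, evecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    W = evecs[:, ::-1][:, :n_components]     # top principal directions
    return (X @ W).reshape(r, c, n_components)
```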

45 citations


Journal ArticleDOI
TL;DR: The proposed system ‘Deepgender’ registered 98% accuracy with the combined use of both databases and the specific preprocessing procedure, i.e., performing alignment before resizing; experiments suggest that accuracy is nearly 100% with frontal and non-blurred facial images.
Abstract: Face recognition, expression identification, age determination, racial binding and gender classification are common examples of image processing automation. Gender classification is straightforward for humans: we can tell whether a person is male or female by their hair, nose, eyes, mouth and skin with a relatively high degree of confidence and accuracy; however, can we program a computer to perform just as well at gender classification? This very problem is the main focus of this research. The conventional pipeline for recent real-time facial image processing consists of five steps: face detection, noise removal, face alignment, feature representation and classification. With the aim of human gender classification, the face alignment and feature vector extraction stages have been re-examined, keeping in view the application of the system on smartphones. Face alignment is performed by locating 83 facial landmarks and fitting a 3-D facial model in order to apply an affine transformation. Furthermore, the feature representation is obtained through a proposed modification of a multilayer deep neural network, hence named Deepgender. This convolutional deep neural network consists of locally connected hidden layers without the shared kernel weights of legacy layered architectures. This specific case study involves deep learning with four convolutional layers, three max-pool layers (for downsizing of unrelated data), two fully connected layers (connecting the outcome to all inputs) and a single layer of multinomial logistic regression. Training was performed using CAS-PEAL and FEI, which contain 99,594 face images of 1040 people and 2800 face images of 200 individuals, respectively. These images are either in different poses or taken under uncontrolled conditions, close to the real-world facial images a gender classification application receives as input. The proposed system Deepgender registered 98% accuracy with the combined use of both databases and the specific preprocessing procedure, i.e., performing alignment before resizing. Experiments suggest that accuracy is nearly 100% with frontal and non-blurred facial images. State-of-the-art steps have been taken to overcome memory and battery constraints on mobile devices.
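A rough PyTorch sketch of the quoted layer recipe (four convolutional layers, three max-pool layers, two fully connected layers feeding a softmax classifier) is given below. The paper's locally connected layers do not share kernel weights; ordinary Conv2d layers are used here as a stand-in, and all channel widths and the input size are assumptions.

```python
import torch
import torch.nn as nn

class GenderNet(nn.Module):
    def __init__(self, in_size=96):             # assumed input resolution
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
        )
        feat = 128 * (in_size // 8) ** 2         # after three 2x poolings
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(feat, 256), nn.ReLU(),
            nn.Linear(256, 2),   # male/female logits; softmax lives in the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```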

38 citations


Journal ArticleDOI
TL;DR: A novel kernel-based method is proposed for fast, highly accurate and numerically stable computation of polar harmonic transforms (PHT) in polar coordinates, which results in highly improved image reconstruction capabilities.
Abstract: A novel kernel-based method is proposed for fast, highly accurate and numerically stable computation of polar harmonic transforms (PHT) in polar coordinates. Euler's formula is used to derive a novel trigonometric formula, which is then used in kernel generation. The simplified radial and angular kernels are used for efficient computation of PHTs. The proposed method removes the numerical approximation errors involved in conventional methods and provides highly accurate PHT coefficients, which results in highly improved image reconstruction capabilities. Numerical experiments are performed and the results are compared with those of recent existing methods. In addition to the tremendous reduction in computational time, the obtained results clearly show a significant improvement in rotational invariance.
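For concreteness, the following sketch computes PCET-style polar harmonic moments directly on a polar sampling grid, which is the setting the kernel-based method operates in. The grid, normalization and the specific PHT variant are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def pcet_moments(f_polar, r, theta, n_max, m_max):
    """f_polar: image resampled to (len(r), len(theta)) polar samples, r in [0,1]."""
    dr, dt = r[1] - r[0], theta[1] - theta[0]
    R, T = np.meshgrid(r, theta, indexing="ij")
    M = np.zeros((2 * n_max + 1, 2 * m_max + 1), complex)
    for n in range(-n_max, n_max + 1):
        radial = np.exp(-2j * np.pi * n * R ** 2)   # conjugate radial kernel
        for m in range(-m_max, m_max + 1):
            kern = radial * np.exp(-1j * m * T)     # conjugate angular kernel
            # inner product over the unit disk in polar coordinates
            M[n + n_max, m + m_max] = (f_polar * kern * R).sum() * dr * dt / np.pi
    return M
```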

37 citations


Journal ArticleDOI
TL;DR: A multi-purpose method based on densely connected convolutional neural networks is proposed for the simultaneous detection of 11 different types of image manipulations; it achieves better overall performance when tested on different databases, as well as better robustness against JPEG compression, even at low quality.
Abstract: Multi-purpose forensics is attracting increasing attention worldwide. In this paper, we propose a multi-purpose method based on densely connected convolutional neural networks (CNNs) for simultaneous detection of 11 different types of image manipulations. An efficient CNN structure has been specifically designed for forensics by considering vital architecture components, including the number of convolutional layers, the size of convolutional kernels, the nonlinear activations, and the type of pooling layer. The dense connectivity pattern, which has better parameter efficiency than the traditional pattern, is explored to strengthen the propagation of features related to image manipulation detection. When compared with four state-of-the-art methods, our experiments demonstrate that the proposed CNN architecture achieves better performance in detecting multiple operations at different image sizes, especially on small image patches. Consequently, the proposed method can accurately detect local image manipulations. It also achieves better overall performance when tested on different databases, as well as better robustness against JPEG compression, even at low quality factors.

34 citations


Journal ArticleDOI
TL;DR: A robust computer vision approach for identifying abnormal activity at ATM premises in real time is proposed, in which different window sizes are used to record the magnitude of pixel intensity using a root-of-sum-of-squares method.
Abstract: Automated Teller Machine (ATM) transactions are quick and convenient, but the machines and the areas surrounding them make people and ATMs vulnerable to felonious activities if not properly protected. Responsibility for providing security needs to be assigned; however, most machines have little or no security. It is imperative to develop a security framework that identifies events as they happen. In this paper we propose a robust computer vision approach for identifying abnormal activity at ATM premises in real time. For effective identification of activity, we propose a novel method in which different window sizes are used to record the magnitude of pixel intensity using a root-of-sum-of-squares method. To describe this pattern, a histogram of gradients is used. Further, a random forest is applied to infer the most likely class. The average accuracy of our security system is 93.1%. For validation of our approach, we have tested it on two standard datasets, HMDB and Caviar. Our approach achieved 52.12% accuracy on the HMDB dataset and 81.48% on the Caviar dataset.
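One plausible reading of this pipeline is sketched below with scikit-image and scikit-learn: a temporal window of frames is collapsed into a root-of-sum-of-squares motion magnitude map, described with a histogram of gradients, and classified with a random forest. The window handling, HOG settings and forest size are assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.ensemble import RandomForestClassifier

def motion_magnitude(frames):
    """frames: (t, h, w) grayscale window; root of sum of squared frame diffs."""
    d = np.diff(frames.astype(np.float32), axis=0)
    return np.sqrt((d ** 2).sum(axis=0))

def describe(frames):
    # HOG over the motion magnitude map; cell/block sizes are assumptions
    return hog(motion_magnitude(frames), orientations=9,
               pixels_per_cell=(16, 16), cells_per_block=(2, 2))

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# clf.fit(np.stack([describe(w) for w in train_windows]), train_labels)
```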

31 citations


Journal ArticleDOI
TL;DR: The current paper builds upon the characteristics of the L-SEABI SR method to introduce parallelization techniques for GPUs and FPGAs for super-resolution reconstruction, and confirms the benefits of the proposed acceleration techniques by employing them on a different category of image processing algorithms.
Abstract: Super-Resolution (SR) techniques constitute a key element in image applications, which need high-resolution reconstruction while, in the worst case, only a single low-resolution observation is available. SR techniques involve computationally demanding processes and thus researchers are currently focusing on SR performance acceleration. Aiming at improving the SR performance, the current paper builds upon the characteristics of the L-SEABI Super-Resolution (SR) method to introduce parallelization techniques for GPUs and FPGAs. The proposed techniques accelerate GPU reconstruction of Ultra-High Definition content, achieving performance three times (3x) faster than real time on mid-range and previous-generation devices and at least nine times (9x) faster than real time on high-end GPUs. The FPGA design leads to a scalable architecture performing four times (4x) faster than real time on low-end Xilinx Virtex 5 devices and sixty-nine times (69x) faster than real time on the Virtex 2000t. Moreover, we confirm the benefits of the proposed acceleration techniques by employing them on a different category of image-processing algorithms: window-based disparity functions, for which the proposed GPU technique shows an improvement over the CPU performance ranging from 14 times (14x) to 64 times (64x), while the proposed FPGA architecture provides 29x acceleration.

Journal ArticleDOI
TL;DR: This paper presents an approach for real-time raindrops detection which is based on cellular neural networks (CNN) and support vector machines (SVM) and proves the possibility of transforming the support vectors into CNN templates.
Abstract: A core aspect of advanced driver assistance systems (ADAS) is to support the driver with information about the current environmental situation of the vehicle. Bad weather conditions such as rain might occlude regions of the windshield or a camera lens and therefore affect the visual perception. Hence, the automated detection of raindrops has a significant importance for video-based ADAS. The detection of raindrops is highly time-critical since video pre-processing stages are required to improve the image quality and to provide their results in real time. This paper presents an approach for real-time raindrop detection which is based on cellular neural networks (CNN) and support vector machines (SVM). The major idea is to prove the possibility of transforming the support vectors into CNN templates. The advantage of CNN is its ultra-fast processing on embedded platforms such as FPGAs and GPUs. The proposed approach is capable of detecting raindrops that might negatively affect the vision of the driver. Different classification features were extracted to evaluate and compare the performance of the proposed approach against other approaches.

Journal ArticleDOI
TL;DR: A family of switching filters designed for impulsive noise removal in color images is analyzed; the results show that the novel filters outperform existing techniques in terms of both denoising accuracy and computational complexity.
Abstract: In the paper, a family of switching filters designed for impulsive noise removal in color images is analyzed. The framework of the proposed denoising techniques is based on the concept of cumulated distances between the processed pixel and its neighbors. To increase the filtering efficiency, a robust scheme, in which the sum of distances to only the most similar pixels of the neighborhood serves as a measure of impulsiveness, was developed. As this trimmed measure is dependent on the image local structure, an adaptive mechanism was also incorporated. Additionally, a very fast design, which enables image denoising in practical applications, is proposed and the choice of the filter output, which is used to replace the noisy pixels, is discussed. The described family of filters was evaluated on a large set of natural test images and compared with state-of-the-art restoration methods. The analysis of the achieved results shows that the novel filters outperform the existing techniques in terms of both denoising accuracy and computational complexity. In this way, the proposed techniques can be recommended for application in various image and video enhancement tasks.
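The cumulated-distance concept with trimming can be sketched in a few lines of NumPy: a pixel is declared impulsive when the sum of distances to even its k most similar 3x3 neighbors is large. The wrap-around at the borders from np.roll, as well as k and the threshold, are illustrative simplifications.

```python
import numpy as np

def impulse_map(img, k=4, thresh=60.0):
    """img: (h, w, 3) color image; detection over 3x3 neighborhoods."""
    f = img.astype(np.float32)
    dists = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == dx == 0:
                continue
            shifted = np.roll(np.roll(f, dy, axis=0), dx, axis=1)
            dists.append(np.linalg.norm(f - shifted, axis=2))
    D = np.sort(np.stack(dists), axis=0)   # per-pixel sorted color distances
    trimmed = D[:k].sum(axis=0)            # sum over the k most similar neighbors
    return trimmed > thresh                # True where an impulse is suspected
```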

Journal ArticleDOI
TL;DR: The key observation is that, in each block, the two prediction errors are modified independently, without exploiting the correlation between them, although they are closely correlated with each other; keeping the PVO of each block unchanged after embedding guarantees the reversibility of the PVO-based RDH method.
Abstract: Pixel-value-ordering (PVO) is an efficient technique of reversible data hiding (RDH). By PVO, the cover image is first divided into non-overlapping blocks of equal size. Then, the pixel values in each block are sorted in ascending order. Next, the second largest/smallest pixel value is taken as a prediction of the largest/smallest pixel value to derive two prediction errors. Finally, the data embedding is constructed by modifying the generated prediction errors of each block. After data embedding, the PVO of each block is unchanged, which guarantees the reversibility. Our key observation is that, in each block, the modification of the two prediction errors is independent, without exploiting the correlation between them, although they are closely correlated with each other. In light of this, an improved PVO-based RDH method is proposed in this work. The two prediction errors of each block are considered as a pair, and the pairs are modified for data embedding based on adaptive two-dimensional histogram modification. The proposed method is experimentally verified to be better than the original PVO-based method and some of its improvements.
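A minimal sketch of the PVO prediction step described above: sort the block, predict the largest value from the second largest (and the smallest from the second smallest), yielding the error pair that the proposed two-dimensional histogram modification operates on.

```python
import numpy as np

def pvo_errors(block):
    """block: small pixel block (e.g. 2x2); returns the (e_max, e_min) pair."""
    v = np.sort(np.asarray(block).ravel())
    e_max = int(v[-1]) - int(v[-2])   # largest predicted by second largest
    e_min = int(v[1]) - int(v[0])     # smallest predicted by second smallest
    return e_max, e_min
```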

Journal ArticleDOI
TL;DR: This paper presents a CUDA-accelerated implementation of the BM3D algorithm, which increases the image processing speed significantly and brings the BM3D method much closer to real applications.
Abstract: Denoising photographs and video recordings is an important task in the domain of image processing. In this paper, we focus on the block-matching and 3D filtering (BM3D) algorithm, which uses self-similarity of image blocks to improve the noise-filtering process. Even though this method has achieved quite impressive results in terms of denoising quality, it is not widely used. One of the reasons is the fact that the method is extremely computationally demanding. In this paper, we present a CUDA-accelerated implementation which increases the image processing speed significantly and brings the BM3D method much closer to real applications. The GPU implementation of the BM3D algorithm is not as straightforward as the implementation of simpler image processing methods, and we believe that some parts (especially the block-matching) can be utilized separately or provide guidelines for similar algorithms.

Journal ArticleDOI
TL;DR: A multi-FPGA architecture for real-time TFM imaging using the full matrix capture (FMC) is described; it performs real-time FMC-TFM imaging with good characterization of defects.
Abstract: Real-time imaging, using ultrasound techniques, is a complex task in non-destructive evaluation. In this context, fast and precise control systems require design of specialized parallel architectures. Total focusing method (TFM) for ultrasound imaging has many advantages in terms of flexibility and accuracy in comparison to traditional imaging techniques. However, one major drawback is the high number of data acquisitions and computing requirements for this imaging technique. Due to those constraints, the TFM algorithm was earlier classified in the field of post-processing tasks. This paper describes a multi-FPGA architecture for real-time TFM imaging using the full matrix capture (FMC). In the acquisition process, data are acquired using a phased array and processed with synthetic focusing techniques such as the TFM algorithm. The FMC-TFM architecture consists of a set of interconnected FPGAs integrated on an embedded system. Initially, this imaging system was dedicated to data acquisition using a phased array. The algorithm was reviewed and partitioned to parallelize processing tasks on FPGAs. The architecture was entirely described using VHDL language, synthesized and implemented on a V5FX70T Xilinx FPGA for the control and high-level processing tasks and four V5SX95T Xilinx FPGAs for the acquisition and low-level processing tasks. The designed architecture performs real-time FMC-TFM imaging with a good characterization of defects.
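The TFM itself is a delay-and-sum over the full matrix of transmit-receive pairs, as in this NumPy sketch; the element layout, sampling and the absence of sub-sample interpolation are simplifying assumptions relative to the hardware design.

```python
import numpy as np

def tfm_image(fmc, elem_x, grid_x, grid_z, c, fs):
    """fmc: (n_tx, n_rx, n_samples) A-scans; elem_x: element x-positions (m);
    c: sound speed (m/s); fs: sampling rate (Hz)."""
    n = len(elem_x)
    img = np.zeros((len(grid_z), len(grid_x)))
    for iz, z in enumerate(grid_z):
        for ix, x in enumerate(grid_x):
            d = np.sqrt((elem_x - x) ** 2 + z ** 2) / c   # one-way times of flight
            idx = np.rint((d[:, None] + d[None, :]) * fs).astype(int)
            idx = np.clip(idx, 0, fmc.shape[2] - 1)
            # sum every tx-rx trace at its round-trip delay for this pixel
            img[iz, ix] = np.abs(fmc[np.arange(n)[:, None],
                                     np.arange(n)[None, :], idx].sum())
    return img
```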

Journal ArticleDOI
TL;DR: In this paper, a real-time video stabilization algorithm for UAVs is proposed, which avoids the necessity of estimating the most general motion model, projective transformation, and considers simpler motion models such as rigid transformation and similarity transformation.
Abstract: This paper describes the development of a novel algorithm to tackle the problem of real-time video stabilization for unmanned aerial vehicles (UAVs). There are two main components in the algorithm: (1) By designing a suitable model for the global motion of UAV, the proposed algorithm avoids the necessity of estimating the most general motion model, projective transformation, and considers simpler motion models, such as rigid transformation and similarity transformation; (2) to achieve a high processing speed, optical flow-based tracking is employed in lieu of conventional tracking and matching methods used by state-of-the-art algorithms. These two new ideas resulted in a real-time stabilization algorithm, developed over two phases. Stage I considers processing the whole sequence of frames in the video while achieving an average processing speed of 50 fps on several publicly available benchmark videos. Next, Stage II undertakes the task of real-time video stabilization using a multi-threading implementation of the algorithm designed in Stage I.
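The two ideas combine into a very short OpenCV sketch: sparse Lucas-Kanade optical flow replaces descriptor matching, and a robustly fitted similarity (4-DoF) transform replaces the full projective model. Feature counts and RANSAC settings are assumptions.

```python
import cv2

def frame_motion(prev_gray, curr_gray):
    """Estimate inter-frame motion as a 2x3 similarity matrix."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=400,
                                  qualityLevel=0.01, minDistance=8)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    ok = status.ravel() == 1
    # robust 4-DoF fit (rotation, scale, translation) instead of a homography
    M, _ = cv2.estimateAffinePartial2D(pts[ok], nxt[ok], method=cv2.RANSAC)
    return M   # smooth the trajectory of these matrices, then warp the frames
```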

Journal ArticleDOI
TL;DR: Comparison results illustrate that the proposed RDH-EI algorithm with block histogram shifting can achieve efficient data embedding with high payload and correctly recover the image, outperforming some state-of-the-art algorithms in terms of embedding rate, visual quality and computational time.
Abstract: Reversible data hiding in encrypted image (RDH-EI) is a hot topic of data hiding in recent years. Most RDH-EI algorithms do not reach a desirable embedding rate, and their computational costs are not suitable for real-time applications. To address these problems, we propose an efficient RDH-EI algorithm that shifts the block histogram of pixel differences in the homomorphic encrypted domain. A key step of our RDH-EI algorithm is the block-based encryption scheme with additive homomorphism, which can preserve the spatial correlation of the plaintext image in the homomorphic encrypted domain. In addition, our proposed technique of shifting the block histogram can achieve efficient data embedding with high payload and correctly recover the image. Extensive experiments are conducted to validate the performance of our RDH-EI algorithm. Comparison results illustrate that our RDH-EI algorithm outperforms some state-of-the-art algorithms in terms of embedding rate, visual quality and computational time.

Journal ArticleDOI
TL;DR: This work proposes a hardware implementation of the HEVC ME block based on a new memory scan order and a new adder tree structure, which supports asymmetric partitioning modes in a fast and efficient way to reduce the overall video encoding time.
Abstract: High-Efficiency Video Coding (HEVC) was developed to improve upon its predecessor standard, H.264/AVC, by doubling its compression efficiency. As in previous standards, Motion Estimation (ME) is one of the encoder's critical blocks for achieving significant compression gains. However, it demands an overwhelming complexity cost to accurately remove video temporal redundancy, especially when encoding very high-resolution video sequences. To reduce the overall video encoding time, we propose a hardware implementation of the HEVC ME block. The proposed architecture is based on (a) a new memory scan order, and (b) a new adder tree structure, which supports asymmetric partitioning modes in a fast and efficient way. The proposed system has been designed in VHDL (VHSIC Hardware Description Language), synthesized and implemented on the Xilinx FPGA Virtex-7 XC7VX550T-3FFG1158. Our design achieves encoding frame rates of up to 116 and 30 fps for 2K and 4K video formats, respectively.

Journal ArticleDOI
TL;DR: An efficient block-based steganographic method for halftone images is proposed, based on an optimal dispersion degree (DD), which measures the complexity of the region texture; blocks with complex texture are chosen to reduce the visual distortion.
Abstract: Halftone images are usually used in facsimile, and halftone image steganography can be used for the facsimile channel. In recent years, real-time image processing has become more and more important. In this paper, an efficient block-based steganographic method for halftone images is proposed. This method is based on an optimal dispersion degree (DD), which can measure the complexity of the region texture. To reduce the visual distortion, blocks with complex texture are selected as carriers according to the dispersion degree. Finally, the secret messages are embedded by flipping the pixels that minimize the changes to the texture structure. The experiments demonstrate that the proposed scheme maintains good image visual quality and achieves acceptable statistical security with high capacity.

Journal ArticleDOI
TL;DR: An improved ORB (Oriented FAST and Rotated BRIEF)-based real-time image registration and target localization algorithm for high-resolution video images is proposed, focusing on the parallelization of the three most time-consuming parts: improved ORB feature extraction, feature matching based on Hamming distance for rough point matching, and the Random Sample Consensus algorithm for precise matching and obtaining the transformation model parameters.
Abstract: High-resolution video images contain a huge amount of data, so that real-time image registration and target localization are difficult to achieve when operated on central processing units (CPUs). In this paper, an improved ORB-based real-time image registration and target localization algorithm for high-resolution video images is proposed (ORB stands for Oriented FAST and Rotated BRIEF; FAST, “Features from Accelerated Segment Test”, is a corner detection method used for feature point extraction, and BRIEF, “Binary Robust Independent Elementary Features”, is a binary bit string used to describe features). We focus on the parallelization of the three most time-consuming parts: improved ORB feature extraction, feature matching based on Hamming distance for rough point matching, and the Random Sample Consensus algorithm for precise matching and obtaining the transformation model parameters. A Compute Unified Device Architecture (CUDA)-based real-time image registration and target localization parallel algorithm for high-resolution video images is realized. The experimental results show that, with similar registration and localization quality, the CUDA implementation is roughly 20 times faster than the CPU implementation, meeting the requirement of real-time processing.
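The three parallelized stages correspond to a standard OpenCV pipeline like the sketch below (ORB extraction, Hamming-distance matching, RANSAC model fitting). The paper's contribution is the CUDA parallelization of these stages, which this CPU sketch does not reproduce; feature counts and thresholds are assumptions.

```python
import cv2
import numpy as np

def register(ref, mov):
    orb = cv2.ORB_create(nfeatures=2000)
    k1, d1 = orb.detectAndCompute(ref, None)
    k2, d2 = orb.detectAndCompute(mov, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)   # rough matching
    matches = sorted(bf.match(d1, d2), key=lambda m: m.distance)[:500]
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)    # precise matching
    return H   # transformation model mapping ref coordinates into mov
```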

Journal ArticleDOI
TL;DR: Experimental results show that the proposed lane detection method provides shadow-, illumination- and road-defect-invariant performance compared to existing methods.
Abstract: In this work, a novel lane detection method using a single input image is presented. The proposed method adopts a color- and shadow-invariant preprocessing stage including a feature region detection method called maximally stable extremal regions. Next, candidate lane regions are examined according to their structural properties, such as width–height ratio and orientation. This stage is followed by a template matching-based approach to decide the final candidates for lane markings. At the final stage of the proposed method, outliers are eliminated using the random sample consensus approach. The proposed method is computationally lightweight, and thus it is possible to execute it in real time on consumer-grade mobile devices. Experimental results show that the proposed method provides shadow-, illumination- and road-defect-invariant performance compared to existing methods.
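The candidate stage can be illustrated with OpenCV's MSER detector followed by the structural ratio test; the direction of the ratio test and its threshold are assumptions made purely for illustration.

```python
import cv2

def lane_candidates(gray, min_ratio=2.0):
    """Return MSER bounding boxes that pass a width-height ratio test."""
    mser = cv2.MSER_create()
    regions, boxes = mser.detectRegions(gray)       # boxes: (x, y, w, h) rows
    return [b for b in boxes if b[2] / max(b[3], 1) >= min_ratio]
```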

Journal ArticleDOI
TL;DR: The research work presented here introduces a correction factor that, once multiplied with the output of the conventional normalization algorithm, will enhance only the feature region of the image while avoiding the background area entirely.
Abstract: Global techniques do not produce satisfying and definitive results for fingerprint image normalization due to the non-stationary nature of the image contents. Local normalization techniques are employed, which are a better alternative for dealing with local image statistics. Conventional local normalization techniques involve pixelwise division by the local variance and thus have the potential to amplify unwanted noise structures, especially in low-activity background regions. To counter the background noise amplification, the research work presented here introduces a correction factor that, once multiplied with the output of the conventional normalization algorithm, enhances only the feature region of the image while avoiding the background area entirely. In essence, its task is to perform foreground segmentation. A modified local normalization is proposed along with an efficient hardware structure. To achieve a real-time hardware implementation, certain computationally efficient approximations are deployed. Test results show an improved speed for the hardware architecture while sustaining reasonable enhancement benchmarks.
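A NumPy sketch of the modified local normalization idea: the conventional (x - mean)/std output is multiplied by a variance-driven correction factor that decays toward zero in low-activity background. The exact form of the factor here is an assumed illustration, not the paper's formula.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_normalize(img, win=15, k=0.5):
    f = img.astype(np.float64)
    mu = uniform_filter(f, win)                        # local mean
    var = np.maximum(uniform_filter(f * f, win) - mu ** 2, 0.0)
    norm = (f - mu) / (np.sqrt(var) + 1e-6)            # conventional normalization
    corr = var / (var + k * var.mean())                # ~0 in flat background (assumed form)
    return norm * corr                                 # foreground-only enhancement
```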

Journal ArticleDOI
TL;DR: This study presents and discusses an implementation of a highly efficient algorithm for image noise smoothing based on general purpose computing on graphics processing units techniques, which facilitates the quick and efficient smoothing of images corrupted by noise, even when performed on large-dimensional data sets.
Abstract: Medical imaging is fundamental for improvements in diagnostic accuracy. However, noise frequently corrupts the images acquired, and this can lead to erroneous diagnoses. Fortunately, image preprocessing algorithms can enhance corrupted images, particularly in noise smoothing and removal. In the medical field, time is always a very critical factor, and so there is a need for implementations which are fast and, if possible, in real time. This study presents and discusses an implementation of a highly efficient algorithm for image noise smoothing based on general purpose computing on graphics processing units techniques. The use of these techniques facilitates the quick and efficient smoothing of images corrupted by noise, even when performed on large-dimensional data sets. This is particularly relevant since GPU cards are becoming more affordable, powerful and common in medical environments.

Journal ArticleDOI
TL;DR: An efficient algorithm is proposed, which operates in two processing steps, the detection and the recognition of road signs while the vehicle is moving; it achieves real-time video processing while meeting timing constraints and providing high accuracy in terms of detection and recognition rates.
Abstract: This paper presents a design methodology for a real-time embedded system that performs the detection and recognition of road signs while the vehicle is moving. An efficient algorithm is proposed, which operates in two processing steps: detection and recognition. Regions of interest are extracted using the maximally stable extremal regions method. For the recognition phase, Oriented FAST and Rotated BRIEF features are used. A hardware system based on the Xilinx Zynq platform was developed. The designed system achieves real-time video processing while meeting timing constraints and providing high accuracy in terms of detection and recognition rates.

Journal ArticleDOI
TL;DR: The design, architecture and VLSI implementation of an image compression algorithm for high-frame-rate, multi-view wireless endoscopy is presented; by operating directly on the Bayer color filter array image, the algorithm achieves both high overall energy efficiency and low implementation cost.
Abstract: The design, architecture and VLSI implementation of an image compression algorithm for high-frame-rate, multi-view wireless endoscopy is presented. By operating directly on the Bayer color filter array image, the algorithm achieves both high overall energy efficiency and low implementation cost. It uses a two-dimensional discrete cosine transform to decorrelate image values in each 4 × 4 block. The resulting coefficients are encoded by a new low-complexity yet efficient entropy encoder. An adaptive deblocking filter on the decoder side removes blocking effects and tiling artifacts in very flat image regions, which enhances the final image quality. The proposed compressor, including a 4 KB FIFO, a parallel-to-serial converter and a forward error correction encoder, is implemented in a 180 nm CMOS process. It consumes 1.32 mW at 50 frames per second (fps) and only 0.68 mW at 25 fps at a 3 MHz clock. Low silicon area (1.1 mm × 1.1 mm), high energy efficiency (27 µJ/frame) and throughput offer excellent scalability to handle image processing tasks in new, emerging, multi-view, robotic capsules.
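The decorrelation stage amounts to an orthonormal 4 × 4 block DCT applied to a color-filter-array plane, as in this SciPy sketch; quantization and the entropy encoder are omitted, and the block traversal order is an assumption.

```python
import numpy as np
from scipy.fft import dctn

def block_dct(plane, bs=4):
    """plane: one Bayer CFA plane; returns per-block DCT coefficients."""
    h, w = plane.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h - h % bs, bs):
        for j in range(0, w - w % bs, bs):
            block = plane[i:i + bs, j:j + bs].astype(np.float64)
            out[i:i + bs, j:j + bs] = dctn(block, norm="ortho")  # 2-D DCT
    return out
```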

Journal ArticleDOI
TL;DR: An adaptive error prediction method based on the multiple linear regression (MLR) algorithm is proposed that can provide a comparatively sparse prediction-error image for data embedding, and thus can improve the performance of reversible data hiding.
Abstract: To improve prediction accuracy, this paper proposes an adaptive error prediction method based on the multiple linear regression (MLR) algorithm. The MLR matrix function that captures the inner correlations between pixels and their neighbors is established adaptively according to the consistency of pixels in a local area of a natural image, and thus the target pixel is predicted accurately with the obtained MLR function that encodes the consistency of the neighboring pixels. Compared with conventional methods that predict the target pixel with fixed predictors through a simple arithmetic combination of its surrounding pixels, the proposed method can provide a comparatively sparse prediction-error image for data embedding, and thus can improve the performance of reversible data hiding. Experimental results show that the proposed method outperforms most state-of-the-art error prediction algorithms.
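A simplified sketch of the MLR idea: least-squares weights are fitted over a local area from each pixel's neighbors, then applied to the neighbors of the target pixel. The neighborhood layout and the training region are simplified assumptions, not the paper's exact construction.

```python
import numpy as np

def mlr_predict(patch):
    """patch: (h, w) local area; predict the center pixel from 4 neighbors."""
    h, w = patch.shape
    A, y = [], []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # up, left, up-left, up-right neighbors (assumed layout)
            A.append([patch[i - 1, j], patch[i, j - 1],
                      patch[i - 1, j - 1], patch[i - 1, j + 1]])
            y.append(patch[i, j])
    coef, *_ = np.linalg.lstsq(np.asarray(A, float),
                               np.asarray(y, float), rcond=None)
    ci, cj = h // 2, w // 2
    neigh = [patch[ci - 1, cj], patch[ci, cj - 1],
             patch[ci - 1, cj - 1], patch[ci - 1, cj + 1]]
    return float(np.dot(coef, neigh))
```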

Journal ArticleDOI
TL;DR: It is concluded that randomized consensus-based methods are competitive compared to the alternative deterministic graph-based solutions, while offering the additional advantage to naturally extend to the cost-effective single-view scenario.
Abstract: This paper considers the detection of the ball in team sport scenes observed with still or motion-compensated calibrated cameras. Foreground masks do provide primary cues to identify circular moving objects in the scene, but are shown to be too noisy to achieve reliable detections of weakly contrasted balls, especially when a single viewpoint is available, as often desired for reduced deployment cost. In those cases, trajectory analysis has been shown to provide valuable complementary information to differentiate true and false positives among the candidates detected by the foreground mask(s). In this paper, we focus on the detection of ball trajectory segments, exclusively from visual cues, without considering semantic reasoning about team play to connect those segments into long trajectories. We revisit several recent works, and introduce a publicly available dataset to compare them. We conclude that randomized consensus-based methods are competitive compared to the alternative deterministic graph-based solutions, while offering the additional advantage to naturally extend to the cost-effective single-view scenario. As an original contribution, we also introduce a procedure to efficiently clean up the foreground mask in correlation-based methods and a nonlinear rank-order filter to merge the foreground cues from multiple viewpoints. We also derive recommendations regarding the camera positioning and the buffering needs of a real-time acquisition system.

Journal ArticleDOI
TL;DR: A real-time algorithm based on fuzzy morphological techniques is introduced to segment vessels in retinal images based on the fuzzy black top-hat transform, which proves to be a simple yet very effective technique.
Abstract: The detection of vessels is the first step towards an automatic diagnosis and in-depth study of retinal images to aid ophthalmologists. In this paper, a real-time algorithm based on fuzzy morphological techniques is introduced to segment vessels in retinal images. This framework provides a good trade-off between expressive power and computational requirements, since the information in the local neighbourhood is quickly processed by combining a series of fast procedures. Specifically, this method is based on the fuzzy black top-hat transform, which proves to be a simple yet very effective technique. The algorithm processes images of the DRIVE and STARE datasets, on average, in 37 and 57 ms, respectively. Thus, it can be employed while a patient is being examined, embedded into more complex systems or as a pre-screening method for large volumes of data. It stands out when compared with other state-of-the-art methodologies in terms of its real-time processing speed and its competitive performance.
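The core operation can be illustrated with a plain grayscale black top-hat on the green channel (where retinal vessels are darkest); the paper's method uses fuzzy morphological operators, for which ordinary morphology is substituted here purely as an illustration, with an assumed structuring element size and threshold.

```python
import cv2

def vessel_map(green_channel, selem_size=11, thresh=15):
    """green_channel: uint8 green plane of a retinal image."""
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (selem_size, selem_size))
    # black top-hat (closing minus image) highlights thin dark structures
    bth = cv2.morphologyEx(green_channel, cv2.MORPH_BLACKHAT, se)
    return (bth > thresh).astype("uint8")
```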

Journal ArticleDOI
TL;DR: An adaptive switching median-based (ASM) algorithm is used in this paper for noise suppression; it is modified to achieve a higher PSNR, especially at low noise densities, and improved to obtain a higher operating speed in hardware implementation for real-time applications.
Abstract: The conventional method for image impulse noise suppression is standard median filtering, which is satisfactory at low noise densities, but not at medium to high noise densities. Adding a noise detection step, as proposed in the literature, makes this algorithm suitable for higher noise densities, but may degrade the performance at low noise densities. An adaptive switching median-based (ASM) algorithm is used in this paper for noise suppression. First, the algorithm is modified to achieve a higher PSNR, especially at low noise densities. Then, the structure of the modified algorithm is improved to obtain a higher operating speed in hardware implementation, for real-time applications. The implemented algorithm works in two steps: detection and filtering. The noise detection method is enhanced by merging the memory used for the algorithm implementation. As a result, fewer hardware resources are required, while the chance of false noise detection is reduced, due to the improvement made in the algorithm. In the filtering step, an adaptive window size is used, based on the measured noise density. This improved algorithm is adopted for more efficient hardware implementation. In addition, high parallelism is utilized to boost the operating frequency, and meanwhile, clock gating is used to lower power consumption. This architecture has then been implemented physically on an FPGA, and an operating frequency of 93 MHz is achieved. The hardware requirement is approximately 10,000 4-input LUTs, and the processing time for a 512 × 512 pixel image is measured at 12 ms.
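A compact sketch of the two-step switching idea: flag pixels at the salt-and-pepper extremes, then replace only those with the median of uncorrupted neighbors, growing the window until enough clean samples are found. The window limit and the simple extreme-value detector are simplifying assumptions relative to the enhanced detection described above.

```python
import numpy as np

def asm_filter(img, max_win=7):
    """img: uint8 grayscale image; returns the filtered image."""
    noisy = (img == 0) | (img == 255)        # impulse candidates (assumed detector)
    out = img.astype(np.float32).copy()
    h, w = img.shape
    for i, j in zip(*np.nonzero(noisy)):
        for r in range(1, max_win // 2 + 1): # adaptive window growth
            y0, y1 = max(i - r, 0), min(i + r + 1, h)
            x0, x1 = max(j - r, 0), min(j + r + 1, w)
            good = img[y0:y1, x0:x1][~noisy[y0:y1, x0:x1]]
            if good.size:                    # enough clean neighbors found
                out[i, j] = np.median(good)
                break
    return out.astype(np.uint8)
```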