
Showing papers in "Journal of Real-Time Image Processing" in 2019


Journal ArticleDOI
TL;DR: The dynamic mode decomposition is a regression technique that integrates two of the leading data analysis methods in use today, Fourier transforms and singular value decomposition; the quality of the resulting background model is competitive, as quantified by the F-measure, recall and precision.
Abstract: We introduce the method of compressed dynamic mode decomposition (cDMD) for background modeling. The dynamic mode decomposition is a regression technique that integrates two of the leading data analysis methods in use today: Fourier transforms and singular value decomposition. Borrowing ideas from compressed sensing and matrix sketching, cDMD eases the computational workload of high-resolution video processing. The key principle of cDMD is to obtain the decomposition on a (small) compressed matrix representation of the video feed. Hence, the cDMD algorithm scales with the intrinsic rank of the matrix, rather than the size of the actual video (data) matrix. Selection of the optimal modes characterizing the background is formulated as a sparsity-constrained sparse coding problem. Our results show that the quality of the resulting background model is competitive, as quantified by the F-measure, recall and precision. A graphics processing unit accelerated implementation is also presented, which further boosts the computational performance of the algorithm.
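As a concrete illustration of the DMD-for-background idea, here is a minimal NumPy sketch of compressed DMD: the video matrix is sketched with a random projection, the low-rank operator is estimated on the compressed snapshots, and the mode with eigenvalue closest to 1 is taken as the background. Function names, the compressed dimension p and the rank threshold are illustrative assumptions, not the authors' code.

```python
import numpy as np

def cdmd_background(video, p=64, seed=0):
    """video: (n_pixels, n_frames) data matrix; p: compressed dimension."""
    rng = np.random.default_rng(seed)
    X, Y = video[:, :-1], video[:, 1:]            # time-shifted snapshot pairs
    C = rng.standard_normal((p, video.shape[0]))  # random sketching matrix
    Xc, Yc = C @ X, C @ Y                         # compressed snapshots
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    r = int((s > 1e-10 * s[0]).sum())             # effective rank (assumed cutoff)
    U, s, Vt = U[:, :r], s[:r], Vt[:r]
    A = (U.conj().T @ Yc) @ Vt.conj().T / s       # low-rank operator
    evals, W = np.linalg.eig(A)
    Phi = (Y @ Vt.conj().T / s) @ W               # modes lifted back to pixel space
    k = np.argmin(np.abs(evals - 1.0))            # background: eigenvalue near 1
    b = np.linalg.lstsq(Phi, video[:, 0].astype(complex), rcond=None)[0]
    return np.abs(Phi[:, k] * b[k])               # per-pixel background model
```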

103 citations


Journal ArticleDOI
TL;DR: A vehicle type classification system based on deep learning is proposed that uses Faster R-CNN to solve the task and is tested on an NVIDIA Jetson TK1 board with 192 CUDA cores.
Abstract: Vehicle type classification technology plays an important role in intelligent transport systems nowadays. With the development of image processing, pattern recognition and deep learning, vehicle type classification technology based on deep learning has attracted increasing attention. In the last few years, convolutional neural networks, especially Faster Region-based convolutional neural networks (Faster R-CNN), have shown great advantages in image classification and object detection, outperforming traditional machine learning methods by a large margin. In this paper, a vehicle type classification system based on deep learning is proposed. The system uses Faster R-CNN to solve the task. Experimental results show that the method is not only time-saving, but also more robust and more accurate. For cars and trucks, it reaches 90.65% and 90.51% accuracy, respectively. Finally, we test the system on an NVIDIA Jetson TK1 board with 192 CUDA cores, which is envisioned to be a forerunner computational brain for computer vision, robotics and self-driving cars. Experimental results show that it takes around 0.354 s to detect an image and maintains a high accuracy rate with the network embedded on the NVIDIA Jetson TK1.
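For readers who want to try a comparable pipeline, below is a minimal inference sketch using torchvision's pretrained Faster R-CNN (COCO weights, whose label set happens to include car, truck and bus). The paper trains its own network, so the model, weights and score threshold here are stand-in assumptions.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Pretrained COCO detector as a stand-in for the paper's trained model
model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

def detect_vehicles(path, score_thresh=0.5):
    img = to_tensor(Image.open(path).convert("RGB"))
    with torch.no_grad():
        out = model([img])[0]                 # dict of boxes/labels/scores
    keep = out["scores"] >= score_thresh      # illustrative threshold
    return out["boxes"][keep], out["labels"][keep], out["scores"][keep]
```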

67 citations


Journal ArticleDOI
TL;DR: The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations, and assesses the coding efficiency of each profile extensively through subjective assessments, using 24 naïve subjects to evaluate 20 images, and objective evaluations.
Abstract: Standards play an important role in providing a common set of specifications and allowing inter-operability between devices and systems. Until recently, no standard for high-dynamic-range (HDR) image coding had been adopted by the market, and HDR imaging relies on proprietary and vendor-specific formats which are unsuitable for storage or exchange of such images. To resolve this situation, the JPEG Committee is developing a new coding standard called JPEG XT that is backward compatible to the popular JPEG compression, allowing it to be implemented using standard 8-bit JPEG coding hardware or software. In this paper, we present design principles and technical details of JPEG XT. It is based on a two-layer design, a base layer containing a low-dynamic-range image accessible to legacy implementations, and an extension layer providing the full dynamic range. The paper introduces three of the currently defined profiles in JPEG XT, each constraining the common decoder architecture to a subset of allowable configurations. We assess the coding efficiency of each profile extensively through subjective assessments, using 24 naive subjects to evaluate 20 images, and objective evaluations, using 106 images with five different tone-mapping operators and at 100 different bit rates. The objective results (based on benchmarking with subjective scores) demonstrate that JPEG XT can encode HDR images at bit rates varying from 1.1 to 1.9 bit/pixel for estimated mean opinion score (MOS) values above 4.5 out of 5, which is considered fully transparent in many applications. This corresponds to a 23-fold bitstream reduction compared to lossless OpenEXR PIZ compression.
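To make the two-layer idea concrete, here is a purely conceptual NumPy sketch of a base-plus-ratio HDR coding scheme, loosely in the spirit of JPEG XT; the actual profiles differ substantially in how the extension layer is formed and coded, so treat this only as an illustration. `tonemap` is a caller-supplied operator.

```python
import numpy as np

def encode_two_layer(hdr, tonemap, eps=1e-6):
    """hdr: float array of linear radiance; tonemap: caller-supplied LDR operator."""
    base = np.clip(tonemap(hdr), 0.0, 1.0)     # base layer: legacy LDR image
    ratio = hdr / np.maximum(base, eps)        # extension layer restores the range
    return base, np.log2(ratio + eps)          # log-encode the ratio image

def decode_two_layer(base, log_ratio, eps=1e-6):
    # invert the encoder: ratio back from log domain, then rescale the base
    return np.maximum(base, eps) * (2.0 ** log_ratio - eps)
```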

65 citations


Journal ArticleDOI
TL;DR: An RGB-D single-object tracker is proposed, built upon the extremely fast RGB-only KCF tracker, which exploits depth information to handle scale changes, occlusions, and shape changes.
Abstract: We propose an RGB-D single-object tracker, built upon the extremely fast RGB-only KCF tracker, that is able to exploit depth information to handle scale changes, occlusions, and shape changes. Despite the computational demands of the extra functionalities, we still achieve real-time performance rates of 35–43 fps in MATLAB and 187 fps in our C++ implementation. Our proposed method includes fast depth-based target object segmentation that enables (1) efficient scale change handling within the KCF core functionality in the Fourier domain, (2) the detection of occlusions by temporal analysis of the target’s depth distribution, and (3) the estimation of a target’s change of shape through the temporal evolution of its segmented silhouette. Finally, we provide an in-depth analysis of the factors affecting the throughput and precision of our proposed tracker and perform extensive comparative analysis. Both the MATLAB and C++ versions of our software are available in the public domain.
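The occlusion cue described in (2) can be illustrated with a small sketch: compare the depth values inside the tracked box against the current target depth estimate and flag frames where a large fraction of the box is suddenly nearer. The margin and fraction thresholds are illustrative assumptions, not the paper's values.

```python
import numpy as np

def occlusion_flag(depth_roi, target_depth, near_margin=0.15, frac=0.4):
    """depth_roi: depth values (meters) inside the tracked box."""
    valid = depth_roi[np.isfinite(depth_roi) & (depth_roi > 0)]
    if valid.size == 0:
        return True                      # no depth support: treat as occluded
    nearer = (valid < target_depth * (1.0 - near_margin)).mean()
    return nearer > frac                 # a closer surface covers the target
```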

51 citations


Journal ArticleDOI
TL;DR: This work proposes a novel block-level Canny edge detection algorithm that detects edges without any loss; it uses the Sobel operator and approximation methods to compute gradient magnitude and orientation, replacing complex operations at reduced hardware cost.
Abstract: The Canny edge detection algorithm significantly outperforms existing edge detection techniques in many computer vision applications. However, it is a complex, time-consuming process with high hardware cost. To overcome these issues, a novel block-level Canny edge detection algorithm is proposed to detect edges without any loss. It uses the Sobel operator and approximation methods to compute gradient magnitude and orientation, replacing complex operations at reduced hardware cost, together with existing non-maximum suppression, block classification for adaptive thresholding and existing hysteresis thresholding. Pipelining is introduced to reduce latency. The proposed algorithm is implemented on a Xilinx Virtex-5 FPGA and provides better performance compared to the frame-level Canny edge detection algorithm. The synthesized architecture reduces execution time by 6.8% and utilizes fewer resources to detect edges of a 512 × 512 image compared to the existing distributed Canny edge detection algorithm.
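The approximations mentioned above can be sketched as follows: the magnitude |Gx| + |Gy| avoids the square root, and the Canny orientation bins are obtained with comparisons against tan(22.5°) instead of an explicit arctangent. The exact constants and bin logic in the paper's hardware may differ.

```python
import numpy as np
from scipy import ndimage

TAN22 = 0.4142  # tan(22.5 deg); comparisons replace an explicit arctan

def approx_gradient(img):
    gx = ndimage.sobel(img.astype(np.float32), axis=1)
    gy = ndimage.sobel(img.astype(np.float32), axis=0)
    mag = np.abs(gx) + np.abs(gy)            # L1 approximation of sqrt(gx^2+gy^2)
    ax, ay = np.abs(gx), np.abs(gy)
    sector = np.zeros(img.shape, np.uint8)   # 0: near-horizontal gradient
    sector[ay > TAN22 * ax] = 1              # diagonal band
    sector[ay > ax / TAN22] = 2              # near-vertical gradient
    # split the diagonal band by gradient sign to get the two diagonals
    diag = sector == 1
    sector[diag & (np.sign(gx) != np.sign(gy))] = 3
    return mag, sector
```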

49 citations


Journal ArticleDOI
TL;DR: This paper proposes the implementation in reconfigurable hardware of the principal component analysis (PCA) algorithm to carry out the dimensionality reduction in hyperspectral images, and demonstrates that the hardware version of the PCA algorithm significantly outperforms a commercial software version.
Abstract: Remotely sensed hyperspectral imaging is a very active research area, with numerous contributions in the recent scientific literature. The analysis of these images represents an extremely complex procedure from a computational point of view, mainly due to the high dimensionality of the data and the inherent complexity of the state-of-the-art algorithms for processing hyperspectral images. This computational cost represents a significant disadvantage in applications that require real-time response, such as fire tracing, prevention and monitoring of natural disasters, chemical spills, and other environmental pollution. Many of these algorithms consider, as one of their fundamental stages to fully process a hyperspectral image, a dimensionality reduction in order to remove noise and redundant information in the hyperspectral images under analysis. Therefore, it is possible to significantly reduce the size of the images, and hence, alleviate data storage requirements. However, this step is not exempt from computationally complex matrix operations, such as the computation of the eigenvalues and the eigenvectors of large and dense matrices. Hence, for the aforementioned applications in which prompt replies are mandatory, this dimensionality reduction must be considerably accelerated, typically through the utilization of high-performance computing platforms. For this purpose, reconfigurable hardware solutions such as field-programmable gate arrays have become consolidated in recent years as one of the standard choices for the fast processing of hyperspectral remotely sensed images due to their smaller size, weight and power consumption when compared with other high-performance computing systems. In this paper, we propose the implementation in reconfigurable hardware of the principal component analysis (PCA) algorithm to carry out the dimensionality reduction in hyperspectral images. Experimental results demonstrate that our hardware version of the PCA algorithm significantly outperforms a commercial software version, which makes our reconfigurable system appealing for onboard hyperspectral data processing. Furthermore, our implementation exhibits real-time performance with regard to the time that the targeted hyperspectral instrument takes to collect the image data.
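The PCA stage itself reduces to a covariance eigendecomposition and a projection, as in this compact NumPy sketch; the FPGA design implements the same mathematics with hardware arithmetic units, and the function and parameter names here are illustrative.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """cube: (rows, cols, bands) hyperspectral image."""
    r, c, b = cube.shape
    X = cube.reshape(-1, b).astype(np.float64)
    X -= X.mean(axis=0)                      # band-wise mean removal
    cov = X.T @ X / (X.shape[0] - 1)         # b x b covariance matrix
    evals, evecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    W = evecs[:, ::-1][:, :n_components]     # top principal directions
    return (X @ W).reshape(r, c, n_components)
```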

45 citations


Journal ArticleDOI
TL;DR: The proposed system ‘Deepgender’ registered 98% accuracy with the combined use of both databases and the specific preprocessing procedure, i.e., performing alignment before resizing; experiments suggest that accuracy is nearly 100% with frontal and non-blurred facial images.
Abstract: Face recognition, expression identification, age determination, racial binding and gender classification are common examples of image processing automation. Gender classification is straightforward for humans: we can tell whether a person is male or female by their hair, nose, eyes, mouth and skin with a relatively high degree of confidence and accuracy; however, can we program a computer to perform just as well at gender classification? This very problem is the main focus of this research. The conventional pipeline for recent real-time facial image processing consists of five steps: face detection, noise removal, face alignment, feature representation and classification. With the aim of human gender classification, the face alignment and feature vector extraction stages have been re-examined, keeping in view the application of the system on smartphones. Face alignment is performed by locating 83 facial landmarks and fitting a 3-D facial model in order to apply an affine transformation. Furthermore, the feature representation is obtained through a proposed modification of a multilayer deep neural network, hence named Deepgender. This convolutional deep neural network consists of locally connected hidden layers without the shared kernel weights of legacy layered architectures. This specific case study involves deep learning with four convolutional layers, three max-pool layers (for downsizing of unrelated data), two fully connected layers (connecting the outcome to all inputs) and a single layer of multinomial logistic regression. Training was performed using CAS-PEAL and FEI, which contain 99,594 face images of 1040 people and 2800 face images of 200 individuals, respectively. These images are either in different poses or taken under uncontrolled conditions, close to the real-world facial images a gender classification application receives as input. The proposed system Deepgender registered 98% accuracy with the combined use of both databases and the specific preprocessing procedure, i.e., performing alignment before resizing. Experiments suggest that accuracy is nearly 100% with frontal and non-blurred facial images. State-of-the-art steps have been taken to overcome memory and battery constraints on mobile devices.
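A rough PyTorch sketch of the quoted layer recipe (four convolutional layers, three max-pool layers, two fully connected layers feeding a softmax classifier) is given below. The paper's locally connected layers do not share kernel weights; ordinary Conv2d layers are used here as a stand-in, and all channel widths and the input size are assumptions.

```python
import torch
import torch.nn as nn

class GenderNet(nn.Module):
    def __init__(self, in_size=96):             # assumed input resolution
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 128, 3, padding=1), nn.ReLU(),
        )
        feat = 128 * (in_size // 8) ** 2         # after three 2x poolings
        self.classifier = nn.Sequential(
            nn.Flatten(), nn.Linear(feat, 256), nn.ReLU(),
            nn.Linear(256, 2),   # male/female logits; softmax lives in the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```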

38 citations


Journal ArticleDOI
TL;DR: A novel kernel-based method is proposed for fast, highly accurate and numerically stable computation of polar harmonic transforms (PHT) in polar coordinates, which results in highly improved image reconstruction capabilities.
Abstract: A novel kernel-based method is proposed for fast, highly accurate and numerically stable computation of polar harmonic transforms (PHT) in polar coordinates. Euler's formula is used to derive a novel trigonometric formula, which is then used in kernel generation. The simplified radial and angular kernels are used for efficient computation of PHTs. The proposed method removes the numerical approximation errors involved in conventional methods and provides highly accurate PHT coefficients, which results in highly improved image reconstruction capabilities. Numerical experiments are performed and the results are compared with those of recent existing methods. In addition to the tremendous reduction in computational time, the obtained results clearly show a significant improvement in rotational invariance.
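For concreteness, the following sketch computes PCET-style polar harmonic moments directly on a polar sampling grid, which is the setting the kernel-based method operates in. The grid, normalization and the specific PHT variant are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def pcet_moments(f_polar, r, theta, n_max, m_max):
    """f_polar: image resampled to (len(r), len(theta)) polar samples, r in [0,1]."""
    dr, dt = r[1] - r[0], theta[1] - theta[0]
    R, T = np.meshgrid(r, theta, indexing="ij")
    M = np.zeros((2 * n_max + 1, 2 * m_max + 1), complex)
    for n in range(-n_max, n_max + 1):
        radial = np.exp(-2j * np.pi * n * R ** 2)   # conjugate radial kernel
        for m in range(-m_max, m_max + 1):
            kern = radial * np.exp(-1j * m * T)     # conjugate angular kernel
            # inner product over the unit disk in polar coordinates
            M[n + n_max, m + m_max] = (f_polar * kern * R).sum() * dr * dt / np.pi
    return M
```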

37 citations


Journal ArticleDOI
TL;DR: A multi-purpose method based on densely connected convolutional neural networks is proposed for the simultaneous detection of 11 different types of image manipulations; it achieves better overall performance when tested on different databases, as well as better robustness against JPEG compression, even at low quality.
Abstract: Multi-purpose forensics is attracting increasing attention worldwide. In this paper, we propose a multi-purpose method based on densely connected convolutional neural networks (CNNs) for simultaneous detection of 11 different types of image manipulations. An efficient CNN structure has been specifically designed for forensics by considering vital architecture components, including the number of convolutional layers, the size of convolutional kernels, the nonlinear activations, and the type of pooling layer. The dense connectivity pattern, which has better parameter efficiency than the traditional pattern, is explored to strengthen the propagation of features related to image manipulation detection. When compared with four state-of-the-art methods, our experiments demonstrate that the proposed CNN architecture achieves better performance in detecting multiple operations at different image sizes, especially on small image patches. Consequently, the proposed method can accurately detect local image manipulations. It also achieves better overall performance when tested on different databases, as well as better robustness against JPEG compression, even at low quality factors.

34 citations


Journal ArticleDOI
TL;DR: A robust computer vision approach for identifying abnormal activity at ATM premises in real time is proposed, in which different window sizes are used to record the magnitude of pixel intensity using a root-of-sum-of-squares method.
Abstract: Automated Teller Machine (ATM) transactions are quick and convenient, but the machines and the areas surrounding them make people and ATMs vulnerable to felonious activities if not properly protected. Responsibility for providing security needs to be assigned; however, most machines have little or no security. It is imperative to develop a security framework that identifies events as they happen. In this paper we propose a robust computer vision approach for identifying abnormal activity at ATM premises in real time. For effective identification of activity, we propose a novel method in which different window sizes are used to record the magnitude of pixel intensity using a root-of-sum-of-squares method. To describe this pattern, a histogram of gradients is used. Further, a random forest is applied to infer the most likely class. The average accuracy of our security system is 93.1%. For validation of our approach, we have tested it on two standard datasets, HMDB and Caviar. Our approach achieved 52.12% accuracy on the HMDB dataset and 81.48% on the Caviar dataset.
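One plausible reading of this pipeline is sketched below with scikit-image and scikit-learn: a temporal window of frames is collapsed into a root-of-sum-of-squares motion magnitude map, described with a histogram of gradients, and classified with a random forest. The window handling, HOG settings and forest size are assumptions.

```python
import numpy as np
from skimage.feature import hog
from sklearn.ensemble import RandomForestClassifier

def motion_magnitude(frames):
    """frames: (t, h, w) grayscale window; root of sum of squared frame diffs."""
    d = np.diff(frames.astype(np.float32), axis=0)
    return np.sqrt((d ** 2).sum(axis=0))

def describe(frames):
    # HOG over the motion magnitude map; cell/block sizes are assumptions
    return hog(motion_magnitude(frames), orientations=9,
               pixels_per_cell=(16, 16), cells_per_block=(2, 2))

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# clf.fit(np.stack([describe(w) for w in train_windows]), train_labels)
```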

31 citations


Journal ArticleDOI
TL;DR: The current paper builds upon the characteristics of the L-SEABI SR method to introduce parallelization techniques for GPUs and FPGAs for super-resolution reconstruction, and confirms the benefits of the proposed acceleration techniques by employing them on a different category of image processing algorithms.
Abstract: Super-Resolution (SR) techniques constitute a key element in image applications, which need high-resolution reconstruction while, in the worst case, only a single low-resolution observation is available. SR techniques involve computationally demanding processes and thus researchers are currently focusing on SR performance acceleration. Aiming at improving the SR performance, the current paper builds upon the characteristics of the L-SEABI Super-Resolution (SR) method to introduce parallelization techniques for GPUs and FPGAs. The proposed techniques accelerate GPU reconstruction of Ultra-High Definition content, achieving performance three times (3x) faster than real time on mid-range and previous-generation devices and at least nine times (9x) faster than real time on high-end GPUs. The FPGA design leads to a scalable architecture performing four times (4x) faster than real time on low-end Xilinx Virtex 5 devices and sixty-nine times (69x) faster than real time on the Virtex 2000t. Moreover, we confirm the benefits of the proposed acceleration techniques by employing them on a different category of image-processing algorithms: window-based disparity functions, for which the proposed GPU technique shows an improvement over the CPU performance ranging from 14 times (14x) to 64 times (64x), while the proposed FPGA architecture provides 29x acceleration.

Journal ArticleDOI
TL;DR: This paper presents an approach for real-time raindrops detection which is based on cellular neural networks (CNN) and support vector machines (SVM) and proves the possibility of transforming the support vectors into CNN templates.
Abstract: A core aspect of advanced driver assistance systems (ADAS) is to support the driver with information about the current environmental situation of the vehicle. Bad weather conditions such as rain might occlude regions of the windshield or a camera lens and therefore affect the visual perception. Hence, the automated detection of raindrops has a significant importance for video-based ADAS. The detection of raindrops is highly time-critical since video pre-processing stages are required to improve the image quality and to provide their results in real time. This paper presents an approach for real-time raindrop detection which is based on cellular neural networks (CNN) and support vector machines (SVM). The major idea is to prove the possibility of transforming the support vectors into CNN templates. The advantage of CNN is its ultra-fast processing on embedded platforms such as FPGAs and GPUs. The proposed approach is capable of detecting raindrops that might negatively affect the vision of the driver. Different classification features were extracted to evaluate and compare the performance of the proposed approach against other approaches.

Journal ArticleDOI
TL;DR: A family of switching filters designed for impulsive noise removal in color images is analyzed; the results show that the novel filters outperform existing techniques in terms of both denoising accuracy and computational complexity.
Abstract: In the paper, a family of switching filters designed for impulsive noise removal in color images is analyzed. The framework of the proposed denoising techniques is based on the concept of cumulated distances between the processed pixel and its neighbors. To increase the filtering efficiency, a robust scheme, in which the sum of distances to only the most similar pixels of the neighborhood serves as a measure of impulsiveness, was developed. As this trimmed measure is dependent on the image local structure, an adaptive mechanism was also incorporated. Additionally, a very fast design, which enables image denoising in practical applications, is proposed and the choice of the filter output, which is used to replace the noisy pixels, is discussed. The described family of filters was evaluated on a large set of natural test images and compared with state-of-the-art restoration methods. The analysis of the achieved results shows that the novel filters outperform the existing techniques in terms of both denoising accuracy and computational complexity. In this way, the proposed techniques can be recommended for application in various image and video enhancement tasks.
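The cumulated-distance concept with trimming can be sketched in a few lines of NumPy: a pixel is declared impulsive when the sum of distances to even its k most similar 3x3 neighbors is large. The wrap-around at the borders from np.roll, as well as k and the threshold, are illustrative simplifications.

```python
import numpy as np

def impulse_map(img, k=4, thresh=60.0):
    """img: (h, w, 3) color image; detection over 3x3 neighborhoods."""
    f = img.astype(np.float32)
    dists = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == dx == 0:
                continue
            shifted = np.roll(np.roll(f, dy, axis=0), dx, axis=1)
            dists.append(np.linalg.norm(f - shifted, axis=2))
    D = np.sort(np.stack(dists), axis=0)   # per-pixel sorted color distances
    trimmed = D[:k].sum(axis=0)            # sum over the k most similar neighbors
    return trimmed > thresh                # True where an impulse is suspected
```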

Journal ArticleDOI
TL;DR: The key observation is that, in each block, the two prediction errors are modified independently, without exploiting the correlation between them, although they are closely correlated with each other; keeping the PVO of each block unchanged after embedding guarantees the reversibility of the PVO-based RDH method.
Abstract: Pixel-value-ordering (PVO) is an efficient technique of reversible data hiding (RDH). By PVO, the cover image is first divided into non-overlapping blocks of equal size. Then, the pixel values in each block are sorted in ascending order. Next, the second largest/smallest pixel value is taken as a prediction of the largest/smallest pixel value to derive two prediction errors. Finally, the data embedding is constructed by modifying the generated prediction errors of each block. After data embedding, the PVO of each block is unchanged, which guarantees the reversibility. Our key observation is that, in each block, the modification of the two prediction errors is independent, without exploiting the correlation between them, although they are closely correlated with each other. In light of this, an improved PVO-based RDH method is proposed in this work. The two prediction errors of each block are considered as a pair, and the pairs are modified for data embedding based on adaptive two-dimensional histogram modification. The proposed method is experimentally verified to be better than the original PVO-based method and some of its improvements.
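A minimal sketch of the PVO prediction step described above: sort the block, predict the largest value from the second largest (and the smallest from the second smallest), yielding the error pair that the proposed two-dimensional histogram modification operates on.

```python
import numpy as np

def pvo_errors(block):
    """block: small pixel block (e.g. 2x2); returns the (e_max, e_min) pair."""
    v = np.sort(np.asarray(block).ravel())
    e_max = int(v[-1]) - int(v[-2])   # largest predicted by second largest
    e_min = int(v[1]) - int(v[0])     # smallest predicted by second smallest
    return e_max, e_min
```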

Journal ArticleDOI
TL;DR: This paper presents a CUDA-accelerated implementation of the BM3D algorithm, which increases the image processing speed significantly and brings the BM3D method much closer to real applications.
Abstract: Denoising photographs and video recordings is an important task in the domain of image processing. In this paper, we focus on the block-matching and 3D filtering (BM3D) algorithm, which uses self-similarity of image blocks to improve the noise-filtering process. Even though this method has achieved quite impressive results in terms of denoising quality, it is not widely used. One of the reasons is the fact that the method is extremely computationally demanding. In this paper, we present a CUDA-accelerated implementation which increases the image processing speed significantly and brings the BM3D method much closer to real applications. The GPU implementation of the BM3D algorithm is not as straightforward as the implementation of simpler image processing methods, and we believe that some parts (especially the block-matching) can be utilized separately or provide guidelines for similar algorithms.

Journal ArticleDOI
TL;DR: A multi-FPGA architecture for real-time TFM imaging using the full matrix capture (FMC) is described; it performs real-time FMC-TFM imaging with good characterization of defects.
Abstract: Real-time imaging, using ultrasound techniques, is a complex task in non-destructive evaluation. In this context, fast and precise control systems require design of specialized parallel architectures. Total focusing method (TFM) for ultrasound imaging has many advantages in terms of flexibility and accuracy in comparison to traditional imaging techniques. However, one major drawback is the high number of data acquisitions and computing requirements for this imaging technique. Due to those constraints, the TFM algorithm was earlier classified in the field of post-processing tasks. This paper describes a multi-FPGA architecture for real-time TFM imaging using the full matrix capture (FMC). In the acquisition process, data are acquired using a phased array and processed with synthetic focusing techniques such as the TFM algorithm. The FMC-TFM architecture consists of a set of interconnected FPGAs integrated on an embedded system. Initially, this imaging system was dedicated to data acquisition using a phased array. The algorithm was reviewed and partitioned to parallelize processing tasks on FPGAs. The architecture was entirely described using VHDL language, synthesized and implemented on a V5FX70T Xilinx FPGA for the control and high-level processing tasks and four V5SX95T Xilinx FPGAs for the acquisition and low-level processing tasks. The designed architecture performs real-time FMC-TFM imaging with a good characterization of defects.
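The TFM itself is a delay-and-sum over the full matrix of transmit-receive pairs, as in this NumPy sketch; the element layout, sampling and the absence of sub-sample interpolation are simplifying assumptions relative to the hardware design.

```python
import numpy as np

def tfm_image(fmc, elem_x, grid_x, grid_z, c, fs):
    """fmc: (n_tx, n_rx, n_samples) A-scans; elem_x: element x-positions (m);
    c: sound speed (m/s); fs: sampling rate (Hz)."""
    n = len(elem_x)
    img = np.zeros((len(grid_z), len(grid_x)))
    for iz, z in enumerate(grid_z):
        for ix, x in enumerate(grid_x):
            d = np.sqrt((elem_x - x) ** 2 + z ** 2) / c   # one-way times of flight
            idx = np.rint((d[:, None] + d[None, :]) * fs).astype(int)
            idx = np.clip(idx, 0, fmc.shape[2] - 1)
            # sum every tx-rx trace at its round-trip delay for this pixel
            img[iz, ix] = np.abs(fmc[np.arange(n)[:, None],
                                     np.arange(n)[None, :], idx].sum())
    return img
```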

Journal ArticleDOI
TL;DR: In this paper, a real-time video stabilization algorithm for UAVs is proposed, which avoids the necessity of estimating the most general motion model, projective transformation, and considers simpler motion models such as rigid transformation and similarity transformation.
Abstract: This paper describes the development of a novel algorithm to tackle the problem of real-time video stabilization for unmanned aerial vehicles (UAVs). There are two main components in the algorithm: (1) By designing a suitable model for the global motion of UAV, the proposed algorithm avoids the necessity of estimating the most general motion model, projective transformation, and considers simpler motion models, such as rigid transformation and similarity transformation; (2) to achieve a high processing speed, optical flow-based tracking is employed in lieu of conventional tracking and matching methods used by state-of-the-art algorithms. These two new ideas resulted in a real-time stabilization algorithm, developed over two phases. Stage I considers processing the whole sequence of frames in the video while achieving an average processing speed of 50 fps on several publicly available benchmark videos. Next, Stage II undertakes the task of real-time video stabilization using a multi-threading implementation of the algorithm designed in Stage I.
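The two ideas combine into a very short OpenCV sketch: sparse Lucas-Kanade optical flow replaces descriptor matching, and a robustly fitted similarity (4-DoF) transform replaces the full projective model. Feature counts and RANSAC settings are assumptions.

```python
import cv2

def frame_motion(prev_gray, curr_gray):
    """Estimate inter-frame motion as a 2x3 similarity matrix."""
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=400,
                                  qualityLevel=0.01, minDistance=8)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
    ok = status.ravel() == 1
    # robust 4-DoF fit (rotation, scale, translation) instead of a homography
    M, _ = cv2.estimateAffinePartial2D(pts[ok], nxt[ok], method=cv2.RANSAC)
    return M   # smooth the trajectory of these matrices, then warp the frames
```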

Journal ArticleDOI
TL;DR: Comparison results illustrate that the proposed RDH-EI algorithm with block histogram shifting can achieve efficient data embedding with high payload and correctly recover the image, outperforming some state-of-the-art algorithms in terms of embedding rate, visual quality and computational time.
Abstract: Reversible data hiding in encrypted image (RDH-EI) is a hot topic of data hiding in recent years. Most RDH-EI algorithms do not reach a desirable embedding rate, and their computational costs are not suitable for real-time applications. To address these problems, we propose an efficient RDH-EI algorithm that shifts the block histogram of pixel differences in the homomorphic encrypted domain. A key step of our RDH-EI algorithm is the block-based encryption scheme with additive homomorphism, which can preserve the spatial correlation of the plaintext image in the homomorphic encrypted domain. In addition, our proposed technique of shifting the block histogram can achieve efficient data embedding with high payload and correctly recover the image. Extensive experiments are conducted to validate the performance of our RDH-EI algorithm. Comparison results illustrate that our RDH-EI algorithm outperforms some state-of-the-art algorithms in terms of embedding rate, visual quality and computational time.

Journal ArticleDOI
TL;DR: This work proposes a hardware implementation of the HEVC ME block based on a new memory scan order and a new adder tree structure, which supports asymmetric partitioning modes in a fast and efficient way to reduce the overall video encoding time.
Abstract: High-Efficiency Video Coding (HEVC) was developed to improve upon its predecessor standard, H.264/AVC, by doubling its compression efficiency. As in previous standards, Motion Estimation (ME) is one of the encoder's critical blocks for achieving significant compression gains. However, it demands an overwhelming complexity cost to accurately remove video temporal redundancy, especially when encoding very high-resolution video sequences. To reduce the overall video encoding time, we propose a hardware implementation of the HEVC ME block. The proposed architecture is based on (a) a new memory scan order, and (b) a new adder tree structure, which supports asymmetric partitioning modes in a fast and efficient way. The proposed system has been designed in VHDL (VHSIC Hardware Description Language), synthesized and implemented on the Xilinx FPGA Virtex-7 XC7VX550T-3FFG1158. Our design achieves encoding frame rates of up to 116 and 30 fps for 2K and 4K video formats, respectively.

Journal ArticleDOI
TL;DR: An efficient block-based steganographic method for halftone images is proposed, based on an optimal dispersion degree (DD), which measures the complexity of the region texture; blocks with complex texture are chosen to reduce the visual distortion.
Abstract: Halftone images are usually used in facsimile, and halftone image steganography can be used for the facsimile channel. In recent years, real-time image processing has become more and more important. In this paper, an efficient block-based steganographic method for halftone images is proposed. This method is based on an optimal dispersion degree (DD), which can measure the complexity of the region texture. To reduce the visual distortion, blocks with complex texture are selected as carriers according to the dispersion degree. Finally, the secret messages are embedded by flipping the pixels that minimize the changes to the texture structure. The experiments demonstrate that the proposed scheme maintains good image visual quality and achieves acceptable statistical security with high capacity.

Journal ArticleDOI
TL;DR: An improved ORB (Oriented FAST and Rotated BRIEF)-based real-time image registration and target localization algorithm for high-resolution video images is proposed, focusing on the parallelization of the three most time-consuming parts: improved ORB feature extraction, feature matching based on Hamming distance for rough point matching, and the Random Sample Consensus algorithm for precise matching and obtaining the transformation model parameters.
Abstract: High-resolution video images contain a huge amount of data, so that real-time image registration and target localization are difficult to achieve when operated on central processing units (CPUs). In this paper, an improved ORB-based real-time image registration and target localization algorithm for high-resolution video images is proposed (ORB stands for Oriented FAST and Rotated BRIEF; FAST, “Features from Accelerated Segment Test”, is a corner detection method used for feature point extraction, and BRIEF, “Binary Robust Independent Elementary Features”, is a binary bit string used to describe features). We focus on the parallelization of the three most time-consuming parts: improved ORB feature extraction, feature matching based on Hamming distance for rough point matching, and the Random Sample Consensus algorithm for precise matching and obtaining the transformation model parameters. A Compute Unified Device Architecture (CUDA)-based real-time image registration and target localization parallel algorithm for high-resolution video images is realized. The experimental results show that, with similar registration and localization quality, the CUDA implementation is roughly 20 times faster than the CPU implementation, meeting the requirement of real-time processing.
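The three parallelized stages correspond to a standard OpenCV pipeline like the sketch below (ORB extraction, Hamming-distance matching, RANSAC model fitting). The paper's contribution is the CUDA parallelization of these stages, which this CPU sketch does not reproduce; feature counts and thresholds are assumptions.

```python
import cv2
import numpy as np

def register(ref, mov):
    orb = cv2.ORB_create(nfeatures=2000)
    k1, d1 = orb.detectAndCompute(ref, None)
    k2, d2 = orb.detectAndCompute(mov, None)
    bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)   # rough matching
    matches = sorted(bf.match(d1, d2), key=lambda m: m.distance)[:500]
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)    # precise matching
    return H   # transformation model mapping ref coordinates into mov
```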

Journal ArticleDOI
TL;DR: Experimental results show that the proposed lane detection method provides shadow-, illumination- and road-defect-invariant performance compared to existing methods.
Abstract: In this work, a novel lane detection method using a single input image is presented. The proposed method adopts a color- and shadow-invariant preprocessing stage including a feature region detection method called maximally stable extremal regions. Next, candidate lane regions are examined according to their structural properties, such as width–height ratio and orientation. This stage is followed by a template matching-based approach to decide the final candidates for lane markings. At the final stage of the proposed method, outliers are eliminated using the random sample consensus approach. The proposed method is computationally lightweight, and thus it is possible to execute it in real time on consumer-grade mobile devices. Experimental results show that the proposed method provides shadow-, illumination- and road-defect-invariant performance compared to existing methods.
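The candidate stage can be illustrated with OpenCV's MSER detector followed by the structural ratio test; the direction of the ratio test and its threshold are assumptions made purely for illustration.

```python
import cv2

def lane_candidates(gray, min_ratio=2.0):
    """Return MSER bounding boxes that pass a width-height ratio test."""
    mser = cv2.MSER_create()
    regions, boxes = mser.detectRegions(gray)       # boxes: (x, y, w, h) rows
    return [b for b in boxes if b[2] / max(b[3], 1) >= min_ratio]
```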

Journal ArticleDOI
TL;DR: The research work presented here introduces a correction factor that, once multiplied with the output of the conventional normalization algorithm, will enhance only the feature region of the image while avoiding the background area entirely.
Abstract: Global techniques do not produce satisfying and definitive results for fingerprint image normalization due to the non-stationary nature of the image contents. Local normalization techniques are employed, which are a better alternative for dealing with local image statistics. Conventional local normalization techniques involve pixelwise division by the local variance and thus have the potential to amplify unwanted noise structures, especially in low-activity background regions. To counter the background noise amplification, the research work presented here introduces a correction factor that, once multiplied with the output of the conventional normalization algorithm, enhances only the feature region of the image while avoiding the background area entirely. In essence, its task is to perform foreground segmentation. A modified local normalization is proposed along with an efficient hardware structure. To achieve a real-time hardware implementation, certain computationally efficient approximations are deployed. Test results show an improved speed for the hardware architecture while sustaining reasonable enhancement benchmarks.
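A NumPy sketch of the modified local normalization idea: the conventional (x - mean)/std output is multiplied by a variance-driven correction factor that decays toward zero in low-activity background. The exact form of the factor here is an assumed illustration, not the paper's formula.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_normalize(img, win=15, k=0.5):
    f = img.astype(np.float64)
    mu = uniform_filter(f, win)                        # local mean
    var = np.maximum(uniform_filter(f * f, win) - mu ** 2, 0.0)
    norm = (f - mu) / (np.sqrt(var) + 1e-6)            # conventional normalization
    corr = var / (var + k * var.mean())                # ~0 in flat background (assumed form)
    return norm * corr                                 # foreground-only enhancement
```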

Journal ArticleDOI
TL;DR: This study presents and discusses an implementation of a highly efficient algorithm for image noise smoothing based on general purpose computing on graphics processing units techniques, which facilitates the quick and efficient smoothing of images corrupted by noise, even when performed on large-dimensional data sets.
Abstract: Medical imaging is fundamental for improvements in diagnostic accuracy. However, noise frequently corrupts the images acquired, and this can lead to erroneous diagnoses. Fortunately, image preprocessing algorithms can enhance corrupted images, particularly in noise smoothing and removal. In the medical field, time is always a very critical factor, and so there is a need for implementations which are fast and, if possible, in real time. This study presents and discusses an implementation of a highly efficient algorithm for image noise smoothing based on general purpose computing on graphics processing units techniques. The use of these techniques facilitates the quick and efficient smoothing of images corrupted by noise, even when performed on large-dimensional data sets. This is particularly relevant since GPU cards are becoming more affordable, powerful and common in medical environments.

Journal ArticleDOI
TL;DR: An efficient algorithm is proposed, which operates in two processing steps, the detection and the recognition of road signs while the vehicle is moving; it achieves real-time video processing while meeting timing constraints and providing high accuracy in terms of detection and recognition rates.
Abstract: This paper presents a design methodology for a real-time embedded system that performs the detection and recognition of road signs while the vehicle is moving. An efficient algorithm is proposed, which operates in two processing steps: detection and recognition. Regions of interest are extracted using the maximally stable extremal regions method. For the recognition phase, Oriented FAST and Rotated BRIEF features are used. A hardware system based on the Xilinx Zynq platform was developed. The designed system achieves real-time video processing while meeting timing constraints and providing high accuracy in terms of detection and recognition rates.

Journal ArticleDOI
TL;DR: The design, architecture and VLSI implementation of an image compression algorithm for high-frame-rate, multi-view wireless endoscopy is presented; by operating directly on the Bayer color filter array image, the algorithm achieves both high overall energy efficiency and low implementation cost.
Abstract: The design, architecture and VLSI implementation of an image compression algorithm for high-frame-rate, multi-view wireless endoscopy is presented. By operating directly on the Bayer color filter array image, the algorithm achieves both high overall energy efficiency and low implementation cost. It uses a two-dimensional discrete cosine transform to decorrelate image values in each 4 × 4 block. The resulting coefficients are encoded by a new low-complexity yet efficient entropy encoder. An adaptive deblocking filter on the decoder side removes blocking effects and tiling artifacts in very flat image regions, which enhances the final image quality. The proposed compressor, including a 4 KB FIFO, a parallel-to-serial converter and a forward error correction encoder, is implemented in a 180 nm CMOS process. It consumes 1.32 mW at 50 frames per second (fps) and only 0.68 mW at 25 fps at a 3 MHz clock. Low silicon area (1.1 mm × 1.1 mm), high energy efficiency (27 µJ/frame) and throughput offer excellent scalability to handle image processing tasks in new, emerging, multi-view, robotic capsules.
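The decorrelation stage amounts to an orthonormal 4 × 4 block DCT applied to a color-filter-array plane, as in this SciPy sketch; quantization and the entropy encoder are omitted, and the block traversal order is an assumption.

```python
import numpy as np
from scipy.fft import dctn

def block_dct(plane, bs=4):
    """plane: one Bayer CFA plane; returns per-block DCT coefficients."""
    h, w = plane.shape
    out = np.zeros((h, w), dtype=np.float64)
    for i in range(0, h - h % bs, bs):
        for j in range(0, w - w % bs, bs):
            block = plane[i:i + bs, j:j + bs].astype(np.float64)
            out[i:i + bs, j:j + bs] = dctn(block, norm="ortho")  # 2-D DCT
    return out
```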

Journal ArticleDOI
TL;DR: An adaptive error prediction method based on the multiple linear regression (MLR) algorithm is proposed that can provide a comparatively sparse prediction-error image for data embedding, and thus can improve the performance of reversible data hiding.
Abstract: To improve prediction accuracy, this paper proposes an adaptive error prediction method based on the multiple linear regression (MLR) algorithm. The MLR matrix function that captures the inner correlations between pixels and their neighbors is established adaptively according to the consistency of pixels in a local area of a natural image, and thus the target pixel is predicted accurately with the obtained MLR function that encodes the consistency of the neighboring pixels. Compared with conventional methods that predict the target pixel with fixed predictors through a simple arithmetic combination of its surrounding pixels, the proposed method can provide a comparatively sparse prediction-error image for data embedding, and thus can improve the performance of reversible data hiding. Experimental results show that the proposed method outperforms most state-of-the-art error prediction algorithms.
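A simplified sketch of the MLR idea: least-squares weights are fitted over a local area from each pixel's neighbors, then applied to the neighbors of the target pixel. The neighborhood layout and the training region are simplified assumptions, not the paper's exact construction.

```python
import numpy as np

def mlr_predict(patch):
    """patch: (h, w) local area; predict the center pixel from 4 neighbors."""
    h, w = patch.shape
    A, y = [], []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # up, left, up-left, up-right neighbors (assumed layout)
            A.append([patch[i - 1, j], patch[i, j - 1],
                      patch[i - 1, j - 1], patch[i - 1, j + 1]])
            y.append(patch[i, j])
    coef, *_ = np.linalg.lstsq(np.asarray(A, float),
                               np.asarray(y, float), rcond=None)
    ci, cj = h // 2, w // 2
    neigh = [patch[ci - 1, cj], patch[ci, cj - 1],
             patch[ci - 1, cj - 1], patch[ci - 1, cj + 1]]
    return float(np.dot(coef, neigh))
```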

Journal ArticleDOI
TL;DR: It is concluded that randomized consensus-based methods are competitive compared to the alternative deterministic graph-based solutions, while offering the additional advantage to naturally extend to the cost-effective single-view scenario.
Abstract: This paper considers the detection of the ball in team sport scenes observed with still or motion-compensated calibrated cameras. Foreground masks do provide primary cues to identify circular moving objects in the scene, but are shown to be too noisy to achieve reliable detections of weakly contrasted balls, especially when a single viewpoint is available, as often desired for reduced deployment cost. In those cases, trajectory analysis has been shown to provide valuable complementary information to differentiate true and false positives among the candidates detected by the foreground mask(s). In this paper, we focus on the detection of ball trajectory segments, exclusively from visual cues, without considering semantic reasoning about team play to connect those segments into long trajectories. We revisit several recent works, and introduce a publicly available dataset to compare them. We conclude that randomized consensus-based methods are competitive compared to the alternative deterministic graph-based solutions, while offering the additional advantage to naturally extend to the cost-effective single-view scenario. As an original contribution, we also introduce a procedure to efficiently clean up the foreground mask in correlation-based methods and a nonlinear rank-order filter to merge the foreground cues from multiple viewpoints. We also derive recommendations regarding the camera positioning and the buffering needs of a real-time acquisition system.

Journal ArticleDOI
TL;DR: A real-time algorithm based on fuzzy morphological techniques is introduced to segment vessels in retinal images based on the fuzzy black top-hat transform, which proves to be a simple yet very effective technique.
Abstract: The detection of vessels is the first step towards an automatic diagnosis and in-depth study of retinal images to aid ophthalmologists. In this paper, a real-time algorithm based on fuzzy morphological techniques is introduced to segment vessels in retinal images. This framework provides a good trade-off between expressive power and computational requirements, since the information in the local neighbourhood is quickly processed by combining a series of fast procedures. Specifically, this method is based on the fuzzy black top-hat transform, which proves to be a simple yet very effective technique. The algorithm processes images of the DRIVE and STARE datasets, on average, in 37 and 57 ms, respectively. Thus, it can be employed while a patient is being examined, embedded into more complex systems or as a pre-screening method for large volumes of data. It stands out when compared with other state-of-the-art methodologies in terms of its real-time processing speed and its competitive performance.
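The core operation can be illustrated with a plain grayscale black top-hat on the green channel (where retinal vessels are darkest); the paper's method uses fuzzy morphological operators, for which ordinary morphology is substituted here purely as an illustration, with an assumed structuring element size and threshold.

```python
import cv2

def vessel_map(green_channel, selem_size=11, thresh=15):
    """green_channel: uint8 green plane of a retinal image."""
    se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (selem_size, selem_size))
    # black top-hat (closing minus image) highlights thin dark structures
    bth = cv2.morphologyEx(green_channel, cv2.MORPH_BLACKHAT, se)
    return (bth > thresh).astype("uint8")
```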

Journal ArticleDOI
TL;DR: An adaptive switching median-based (ASM) algorithm is used in this paper for noise suppression; it is modified to achieve a higher PSNR, especially at low noise densities, and improved to obtain a higher operating speed in hardware implementation for real-time applications.
Abstract: The conventional method for image impulse noise suppression is standard median filtering, which is satisfactory at low noise densities, but not at medium to high noise densities. Adding a noise detection step, as proposed in the literature, makes this algorithm suitable for higher noise densities, but may degrade the performance at low noise densities. An adaptive switching median-based (ASM) algorithm is used in this paper for noise suppression. First, the algorithm is modified to achieve a higher PSNR, especially at low noise densities. Then, the structure of the modified algorithm is improved to obtain a higher operating speed in hardware implementation, for real-time applications. The implemented algorithm works in two steps: detection and filtering. The noise detection method is enhanced by merging the memory used for the algorithm implementation. As a result, fewer hardware resources are required, while the chance of false noise detection is reduced, due to the improvement made in the algorithm. In the filtering step, an adaptive window size is used, based on the measured noise density. This improved algorithm is adopted for more efficient hardware implementation. In addition, high parallelism is utilized to boost the operating frequency, and meanwhile, clock gating is used to lower power consumption. This architecture has then been implemented physically on an FPGA, and an operating frequency of 93 MHz is achieved. The hardware requirement is approximately 10,000 4-input LUTs, and the processing time for a 512 × 512 pixel image is measured at 12 ms.
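A compact sketch of the two-step switching idea: flag pixels at the salt-and-pepper extremes, then replace only those with the median of uncorrupted neighbors, growing the window until enough clean samples are found. The window limit and the simple extreme-value detector are simplifying assumptions relative to the enhanced detection described above.

```python
import numpy as np

def asm_filter(img, max_win=7):
    """img: uint8 grayscale image; returns the filtered image."""
    noisy = (img == 0) | (img == 255)        # impulse candidates (assumed detector)
    out = img.astype(np.float32).copy()
    h, w = img.shape
    for i, j in zip(*np.nonzero(noisy)):
        for r in range(1, max_win // 2 + 1): # adaptive window growth
            y0, y1 = max(i - r, 0), min(i + r + 1, h)
            x0, x1 = max(j - r, 0), min(j + r + 1, w)
            good = img[y0:y1, x0:x1][~noisy[y0:y1, x0:x1]]
            if good.size:                    # enough clean neighbors found
                out[i, j] = np.median(good)
                break
    return out.astype(np.uint8)
```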