
Showing papers in "Eurasip Journal on Image and Video Processing in 2012"


Journal ArticleDOI
TL;DR: This article uses the optic discs of the first four retinal images in the DRIVE dataset to extract the histograms of each color component and averages them to form a template for localizing the center of the optic disc.
Abstract: In this article, we propose a new method for localizing the optic disc in retinal images. Localizing the optic disc and its center is the first step of most vessel segmentation, disease diagnosis, and retinal recognition algorithms. We use the optic discs of the first four retinal images in the DRIVE dataset to extract the histograms of each color component. We then calculate the average histogram for each color and use it as a template for localizing the center of the optic disc. The DRIVE and STARE datasets and a local dataset of 273 retinal images are used to evaluate the proposed algorithm. The success rates were 100%, 91.36%, and 98.9%, respectively.
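A minimal sketch of the core idea, not the authors' implementation: per-channel color histograms averaged over a few optic-disc patches serve as a template, and a sliding window whose histograms best match that template gives the disc center. The window size, stride, bin count, and L2 distance below are illustrative assumptions.

```python
import numpy as np

def channel_histograms(patch, bins=64):
    """Concatenated, normalized histograms of the three color channels."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(patch[:, :, c], bins=bins, range=(0, 256), density=True)
        hists.append(h)
    return np.concatenate(hists)

def build_template(disc_patches, bins=64):
    """Average the histograms of manually cropped optic-disc patches."""
    return np.mean([channel_histograms(p, bins) for p in disc_patches], axis=0)

def localize_optic_disc(image, template, window=80, stride=16, bins=64):
    """Return the center of the window whose histograms are closest to the template."""
    best, center = np.inf, None
    H, W = image.shape[:2]
    for y in range(0, H - window, stride):
        for x in range(0, W - window, stride):
            h = channel_histograms(image[y:y + window, x:x + window], bins)
            d = np.linalg.norm(h - template)        # simple L2 distance
            if d < best:
                best, center = d, (x + window // 2, y + window // 2)
    return center
```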

116 citations


Journal ArticleDOI
TL;DR: This article presents a fusion-based contrast-enhancement technique that integrates information from several enhancement algorithms to overcome their individual limitations; the results show that the method enhances details without affecting the colour balance or introducing saturation artefacts.
Abstract: The goal of contrast enhancement is to improve the visibility of image details without introducing unrealistic visual appearances and/or unwanted artefacts. While global contrast-enhancement techniques enhance the overall contrast, their dependence on the global content of the image limits their ability to enhance local details; they can also significantly change image brightness and introduce saturation artefacts. Local enhancement methods, on the other hand, improve image details but can produce block discontinuities, noise amplification, and unnatural image modifications. To remedy these shortcomings, this article presents a fusion-based contrast-enhancement technique which integrates information to overcome the limitations of different contrast-enhancement algorithms. The proposed method balances the requirements of local and global contrast enhancement and a faithful representation of the original image appearance, an objective that is difficult to achieve using traditional enhancement methods. Fusion is performed in a multi-resolution fashion using Laplacian pyramid decomposition to account for the multi-channel properties of the human visual system. For this purpose, metrics are defined for contrast, image brightness, and saturation. The performance of the proposed method is evaluated using visual assessment and quantitative measures for contrast, luminance, and saturation. The results show the efficiency of the method in enhancing details without affecting the colour balance or introducing saturation artefacts, and illustrate the usefulness of fusion techniques for image enhancement applications.
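A hedged sketch of the multi-resolution fusion step only; the article's contrast, brightness, and saturation metrics that produce the weight maps are not reproduced here. Laplacian pyramids of differently enhanced versions of the image are blended with Gaussian pyramids of per-pixel weight maps and collapsed. The level count and the 0-255 uint8 range are assumptions.

```python
import cv2
import numpy as np

def gaussian_pyramid(img, levels):
    pyr = [img.astype(np.float32)]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr

def laplacian_pyramid(img, levels):
    gp = gaussian_pyramid(img, levels)
    lp = [gp[i] - cv2.pyrUp(gp[i + 1], dstsize=(gp[i].shape[1], gp[i].shape[0]))
          for i in range(levels - 1)]
    lp.append(gp[-1])                       # keep the coarsest level as-is
    return lp

def fuse(enhanced_versions, weight_maps, levels=5):
    """Blend Laplacian pyramids of several enhanced images using Gaussian
    pyramids of per-pixel weight maps, then collapse the fused pyramid."""
    norm = np.sum(weight_maps, axis=0) + 1e-12
    weight_maps = [w / norm for w in weight_maps]   # weights sum to 1 per pixel
    fused = None
    for img, w in zip(enhanced_versions, weight_maps):
        lp = laplacian_pyramid(img, levels)
        wp = gaussian_pyramid(w, levels)
        contrib = [l * (g[..., None] if l.ndim == 3 else g) for l, g in zip(lp, wp)]
        fused = contrib if fused is None else [f + c for f, c in zip(fused, contrib)]
    out = fused[-1]
    for lev in reversed(fused[:-1]):        # collapse from coarse to fine
        out = cv2.pyrUp(out, dstsize=(lev.shape[1], lev.shape[0])) + lev
    return np.clip(out, 0, 255).astype(np.uint8)
```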

104 citations


Journal ArticleDOI
TL;DR: A new method is presented that utilizes both the texture and geometric information of facial fiducial points; Gauss–Laguerre wavelets, which have rich frequency-extraction capabilities, are investigated to extract texture information from various facial expressions.
Abstract: Facial expressions are a valuable source of information that accompanies facial biometrics. Early detection of physiological and psycho-emotional data from facial expressions is linked to the situational awareness module of any advanced biometric system for personal state re/identification. In this article, a new method that utilizes both the texture and geometric information of facial fiducial points is presented. We investigate Gauss–Laguerre wavelets, which have rich frequency-extraction capabilities, to extract texture information from various facial expressions. The rotation invariance and multiscale nature of these wavelets make the feature extraction robust. Moreover, the geometric positions of the fiducial points provide valuable information for upper/lower face action units. The combination of these two types of features is used for facial expression classification. The performance of this system has been validated on three public databases: JAFFE, Cohn-Kanade, and MMI.

91 citations


Journal ArticleDOI
TL;DR: This study presents two real-time architectures, using resource-constrained FPGA and GPU devices, for computing a new algorithm which performs tone mapping, contrast enhancement, and glare mitigation.
Abstract: Low-level computer vision algorithms have high computational requirements. In this study, we present two real-time architectures using resource-constrained FPGA and GPU devices for the computation of a new algorithm which performs tone mapping, contrast enhancement, and glare mitigation. Our goal is to implement this operator in a portable, battery-operated device, in order to obtain a low-vision aid aimed at visually impaired people who struggle to manage in environments where illumination is not uniform or changes rapidly. This aid processes the input of a camera in real time, with minimum latency, and shows the enhanced image on a head-mounted display (HMD). The proposed operator has therefore been implemented on two battery-operated platforms, one based on the NVIDIA ION2 GPU and another on the Spartan III FPGA, which perform at 30 and 60 frames per second, respectively, when working with VGA resolution images (640 × 480).

85 citations


Journal ArticleDOI
TL;DR: A new color image segmentation method based on multilevel thresholding and data-fusion techniques, which combines different data sources associated with the same color image in order to increase information quality and obtain a more reliable and accurate segmentation result.
Abstract: In this article, we present a new color image segmentation method based on multilevel thresholding and data-fusion techniques, which combines different data sources associated with the same color image in order to increase the information quality and obtain a more reliable and accurate segmentation result. The proposed approach is conceptually different and explores a new strategy: instead of considering only one image for each application, our technique combines many realizations of the same image in order to increase the information quality and obtain an optimal segmented image. Segmentation proceeds in two steps. In the first step, we identify the most significant peaks of the histogram; for this purpose, optimal multilevel thresholding based on the two-stage Otsu optimization approach is used. In the second step, evidence theory is employed to merge several representations of the image in different color spaces in order to obtain a final, reliable, and accurate segmentation result. The mass functions of Dempster-Shafer (DS) evidence theory are linked to the Gaussian distribution, and the final segmentation is achieved on the input image, expressed in different color spaces, by using the DS combination rule and decision. The algorithm is demonstrated on the segmentation of medical color images. The classification accuracy of the proposed method is evaluated and a comparative study versus existing techniques is presented. The experiments were conducted on an extensive set of color images, and satisfactory segmentation results were obtained, showing the effectiveness and superiority of the proposed method.
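A small sketch of the two ingredients the abstract names, under assumptions that differ from the article: scikit-image's multi-Otsu stands in for the two-stage Otsu optimization, and the Gaussian "scores" below are a toy per-pixel normalization rather than proper Dempster-Shafer mass functions with a combination rule.

```python
import numpy as np
from skimage.filters import threshold_multiotsu

def multiotsu_labels(channel, classes=3):
    """Multilevel thresholding of a single channel into `classes` regions."""
    thresholds = threshold_multiotsu(channel, classes=classes)
    return np.digitize(channel, bins=thresholds)

def gaussian_class_scores(channel, labels):
    """Per-class Gaussian likelihoods, normalized per pixel; a simplified
    stand-in for the Gaussian-based DS mass functions used in the article."""
    scores = []
    for k in np.unique(labels):
        vals = channel[labels == k].astype(np.float64)
        mu, sigma = vals.mean(), vals.std() + 1e-6
        scores.append(np.exp(-0.5 * ((channel - mu) / sigma) ** 2) / sigma)
    scores = np.stack(scores, axis=0)
    return scores / scores.sum(axis=0, keepdims=True)

# One label map and score stack would be computed per color space (e.g. RGB,
# HSV, YCbCr) of the same image, then merged with the DS combination rule.
```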

55 citations


Journal ArticleDOI
TL;DR: A region-level semantic mining approach in which images are segmented into several parts using an improved segmentation algorithm, each with homogeneous spectral and textural characteristics, and a uniform region-based representation is then built for each image.
Abstract: As satellite images have come to be used in a large number of applications in recent years, content-based image retrieval techniques have become important tools for image exploration and information mining; however, their performance is limited by the semantic gap between low-level features and high-level concepts. To narrow this semantic gap, a region-level semantic mining approach is proposed in this article. Because it is easier for users to understand image content by region, images are segmented into several parts using an improved segmentation algorithm, each with homogeneous spectral and textural characteristics, and a uniform region-based representation is then built for each image. Once the probabilistic relationship among image, region, and hidden semantic is constructed, the Expectation Maximization method can be applied to mine the hidden semantics. We implement this approach on a dataset of thousands of satellite images and obtain high retrieval precision, as demonstrated through experiments.

35 citations


Journal ArticleDOI
TL;DR: The noise present in automotive applications can degrade the depth information output by the more complex algorithms to the point that, in practice, their disparity maps are comparable with those of simpler approaches such as Block Matching and Semi-Global Matching, which empirically perform better on the automotive environment test sequences.
Abstract: In this work we evaluate several real-time dense stereo algorithms as a passive 3D sensing technology for potential use in driver assistance systems or autonomous vehicle guidance. A key limitation of prior work in this area is that, although significant comparative work has been done on dense stereo algorithms using de facto laboratory test sets, only limited work has been done on evaluation in real-world environments such as those found in automotive usage. This comparative study aims to provide an empirical comparison using automotive environment video imagery, compare this against dense stereo results on standard test sequences, and consider computational requirements against real-time performance. We evaluate five chosen algorithms: Block Matching, Semi-Global Matching, No-Maximal Disparity, Cross-Based Local Approach, and Adaptive Aggregation with Dynamic Programming. Our comparison shows a contrast between the results obtained on standard test sequences and those for automotive application imagery, where a Semi-Global Matching approach gave the best empirical performance. From our study we conclude that the noise present in automotive applications can degrade the quality of the depth information output by the more complex algorithms (No-Maximal Disparity, Cross-Based Local Approach, Adaptive Aggregation with Dynamic Programming), so that in practice their disparity maps are comparable with those of simpler approaches such as Block Matching and Semi-Global Matching, which empirically perform better on the automotive environment test sequences. This empirical result on automotive environment data contradicts the comparative result found on standard dense stereo test sequences using a statistical comparison methodology, leading to interesting observations regarding current relative evaluation approaches.
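Two of the five evaluated approaches, Block Matching and Semi-Global Matching, have readily available OpenCV counterparts; the sketch below shows how baseline disparity maps of that kind can be produced. The file names and parameter values are illustrative placeholders, not the article's configuration.

```python
import cv2

# Rectified grayscale stereo pair (placeholder file names)
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Block Matching: simple local correlation over fixed windows
bm = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disp_bm = bm.compute(left, right).astype(float) / 16.0     # fixed-point output

# Semi-Global Matching: adds smoothness penalties aggregated along scanlines
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5,
                             P1=8 * 5 * 5, P2=32 * 5 * 5)
disp_sgbm = sgbm.compute(left, right).astype(float) / 16.0
```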

33 citations


Journal ArticleDOI
TL;DR: On the basis of the Euclidean distance measure, an effective image resizing algorithm combining Seam Carving with Scaling is proposed; experiments show that the algorithm avoids damage and distortion of image content and gracefully preserves both the local structure and the global visual effect of the image.
Abstract: On the basis of the Scale Invariant Feature Transform (SIFT) feature, we investigate a distance measure for the image resizing process. We extract SIFT features from the original image and the resized one, match the SIFT features between the two images, and calculate the distance between SIFT feature vectors to evaluate the degree of similarity between the original and the resized image. On the basis of this Euclidean distance measure, an effective image resizing algorithm combining Seam Carving with Scaling is proposed. We first resize an image using Seam Carving and calculate the similarity distance between the original image and the resized one. Before the salient objects and content are visibly damaged, we stop Seam Carving and hand the remaining resizing over to Scaling. Experiments show that our algorithm avoids damage and distortion of image content and gracefully preserves both the local structure and the global visual effect of the image.
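A hedged sketch of the similarity measure the abstract describes, using OpenCV's SIFT and a brute-force matcher on grayscale images; the aggregation of descriptor distances and the stopping rule in the comment are assumptions, not the article's exact definitions.

```python
import cv2
import numpy as np

def sift_similarity_distance(original_gray, resized_gray):
    """Mean L2 distance between matched SIFT descriptors of the original image
    and a resized candidate; a rising distance signals content damage."""
    sift = cv2.SIFT_create()
    _, d1 = sift.detectAndCompute(original_gray, None)
    _, d2 = sift.detectAndCompute(resized_gray, None)
    if d1 is None or d2 is None:
        return np.inf
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(d1, d2)
    if not matches:
        return np.inf
    return float(np.mean([m.distance for m in matches]))

# Hypothetical stopping rule: remove seams while the distance stays below a
# chosen threshold, then switch to plain scaling for the remaining reduction.
```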

32 citations


Journal ArticleDOI
TL;DR: This study identifies screenshots using the characteristics of combing artifacts, which appear as horizontal jagged noise around edges, and proposes a screenshot identification scheme based on this trace of screen capture.
Abstract: As screenshots of copyrighted video content spread through the Internet without any regulation, cases of copyright infringement have been observed. Furthermore, it is difficult to use existing forensic techniques to determine whether or not a given image was captured from a screen. Thus, we propose a screenshot identification scheme based on the trace left by screen capture. Since most television systems and camcorders use interlaced scanning, many screenshots are taken from interlaced videos; consequently, these screenshots contain the trace of interlaced video, namely combing artifacts. In this study, we identify a screenshot using the characteristics of combing artifacts, which appear as horizontal jagged noise and can be found around edges. To identify a screenshot, the edge areas are extracted using the gray-level co-occurrence matrix (GLCM). Then, the amount of combing artifacts in the extracted edge areas is calculated using the similarity ratio (SR), the ratio of the horizontal noise to the vertical noise. By analyzing the directional inequality of the noise components, the proposed scheme identifies the source of an input image. In the experiments conducted, the identification accuracy is measured in various environments, and the results show that the proposed identification scheme is stable and performs well.
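A simplified reading of the SR measure, offered only as a sketch: combing shows up as differences between adjacent scan lines, so the ratio of inter-row to inter-column intensity differences inside edge areas should rise for interlaced screenshots. The GLCM-based edge extraction is assumed to have already produced `edge_mask`, and the exact mapping of "horizontal" versus "vertical" noise onto row/column differences is an assumption.

```python
import numpy as np

def similarity_ratio(gray, edge_mask):
    """Ratio of noise between adjacent rows to noise between adjacent columns,
    restricted to edge areas; a large ratio suggests combing artifacts."""
    g = gray.astype(np.float32)
    row_noise = np.abs(np.diff(g, axis=0))[:, :-1]   # between adjacent scan lines
    col_noise = np.abs(np.diff(g, axis=1))[:-1, :]   # between adjacent columns
    m = edge_mask[:-1, :-1].astype(bool)             # crop mask to match shapes
    return float(row_noise[m].sum() / (col_noise[m].sum() + 1e-6))
```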

18 citations


Journal ArticleDOI
TL;DR: Experimental results indicate that the proposed shadow removal algorithm, based on a background-difference method, is easy to implement and can determine the direction of the shadow adaptively, then eliminate the shadow and extract the whole moving object accurately, especially when the chrominance invariance principle is ineffective.
Abstract: This article presents a shadow removal algorithm based on a background-difference method and on shadow position and edge attributes. First, a novel background subtraction method is proposed to obtain moving objects. This method has three parts: detecting the moving regions approximately by calculating the inter-frame differences of symmetrical frames and counting a static index for each probable moving point; modeling the background from brightness statistics and updating this model using motion templates; and extracting the moving objects and their edges. Second, based on the above processing, shadows are first suppressed in the HSV color space; the direction of the shadow is then determined from shadow edges and positions combined with the horizontal and vertical projections of the edge image, the position of the shadow is located accurately through a proportion method, and the shadow is finally removed. Experimental results indicate that the proposed method is easy to implement and can determine the direction of the shadow adaptively, then eliminate the shadow and extract the whole moving object accurately, especially when the chrominance invariance principle is ineffective.
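The "suppress shadows in the HSV color space" step is commonly done with a value-ratio plus hue/saturation test against the background model; the sketch below is that generic test, not the article's full method (which also uses shadow position, edges, and projections), and all thresholds are illustrative.

```python
import cv2
import numpy as np

def shadow_mask_hsv(frame_bgr, background_bgr, alpha=0.4, beta=0.9,
                    tau_s=40, tau_h=50):
    """Mark pixels whose brightness dropped moderately while hue and saturation
    stayed close to the background: candidate cast-shadow pixels."""
    f = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
    b = cv2.cvtColor(background_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
    ratio_v = f[:, :, 2] / (b[:, :, 2] + 1e-6)        # value (brightness) ratio
    hue_diff = np.abs(f[:, :, 0] - b[:, :, 0])
    hue_diff = np.minimum(hue_diff, 180 - hue_diff)   # OpenCV hue wraps at 180
    sat_diff = np.abs(f[:, :, 1] - b[:, :, 1])
    return ((ratio_v > alpha) & (ratio_v < beta) &
            (sat_diff < tau_s) & (hue_diff < tau_h))
```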

13 citations


Journal ArticleDOI
TL;DR: A different energy functional and PS function are introduced to search for the optimal PS approximation of the original image; the proposed algorithm is shown to be robust to initialization or even free of manual initialization.
Abstract: We propose a novel image segmentation algorithm using a piecewise smooth (PS) approximation to the image. The proposed algorithm is inspired by four well-known active contour models, i.e., Chan and Vese's piecewise constant (PC)/smooth models, the region-scalable fitting model, and the local image fitting model. The four models share the same algorithmic structure for finding a PC/smooth approximation to the original image; the main difference lies in how the energy functional to be minimized and the PC/smooth function are defined. In this article, pursuing the same idea, we introduce a different energy functional and PS function to search for the optimal PS approximation of the original image. The initial function of our model can be chosen as a constant function, which implies that the proposed algorithm is robust to initialization or even free of manual initialization. Experiments show that the proposed algorithm is appropriate for a wide range of images, including images with intensity inhomogeneity and infrared ship images with low contrast and complex backgrounds.

Journal ArticleDOI
TL;DR: A framework for objective image quality metrics applied to natural images captured by digital cameras is proposed and the mean performance for predicting subjective sharpness was clearly higher than that of the state-of-the-art algorithm and test-target sharpness metrics.
Abstract: Image quality is a vital criterion that guides the technical development of digital cameras. Traditionally, the image quality of digital cameras has been measured using test-targets and/or subjective tests. Subjective tests should be performed using natural images. It is difficult to establish the relationship between the results of artificial test targets and subjective data, however, because of the different test image types. We propose a framework for objective image quality metrics applied to natural images captured by digital cameras. The framework uses reference images captured by a high-quality reference camera to find image areas with appropriate structural energy for the quality attribute. In this study, the framework was set to measure sharpness. Based on the results, the mean performance for predicting subjective sharpness was clearly higher than that of the state-of-the-art algorithm and test-target sharpness metrics.

Journal ArticleDOI
TL;DR: A prediction-error preprocessor based on the just noticeable distortion (JND) for a color image compression scheme is presented; simulation results show that, with the preprocessor, the compression scheme requires a lower bit rate at high visual quality of the reconstructed color image.
Abstract: In this article, a prediction-error preprocessor based on the just noticeable distortion (JND) for a color image compression scheme is presented. The more the dynamic range of the prediction-error signals can be reduced, the lower the bit rate that can be obtained for the reconstructed image at high visual quality. We propose a color JND estimator that is incorporated into the design of the preprocessor in the compression scheme. The color JND estimation is carried out in the wavelet domain to provide good estimates of the available amount of masking. The estimated JND is used to preprocess the signal and is also incorporated into the design of the quantization stage of the compression scheme for higher performance. Simulation results show that the bit rate required by the compression scheme with the preprocessor is lower at high visual quality of the reconstructed color image. The preprocessor is further applied to the input color image of the JPEG and JPEG2000 coders for better performance.

Journal ArticleDOI
TL;DR: The connected components' relation tree is proposed to find the spatiotemporal relationship between connected components in consecutive frames for suitable feature extraction; the results reveal that the proposed algorithm increases the recognition rate by more than 9.34% in comparison with existing methods.
Abstract: In this article, a new method for the recognition of obscene video contents is presented. In the proposed algorithm, different episodes of a video file, starting at key frames, are classified independently using the proposed features. We present three novel sets of features for the classification of video episodes: (1) features based on the information of single video frames, (2) features based on the 3D spatiotemporal volume (STV), and (3) features based on motion and periodicity characteristics. Furthermore, we propose the connected components' relation tree to find the spatiotemporal relationship between connected components in consecutive frames for suitable feature extraction. To divide an input video into episodes, a new key frame extraction algorithm is utilized, which combines the color histograms of the frames with the entropy of motion vectors. We compare the results of the proposed algorithm with those of other methods. The results reveal that the proposed algorithm increases the recognition rate by more than 9.34% in comparison with existing methods.
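A sketch of a key-frame detector in the spirit of the one described, built on OpenCV; only the color-histogram change between consecutive frames is used here, whereas the article additionally combines it with the entropy of block motion vectors. The threshold and bin counts are illustrative.

```python
import cv2

def keyframe_indices(video_path, hist_thresh=0.4):
    """Return indices of frames whose color histogram differs strongly from the
    previous frame (Bhattacharyya distance above a threshold)."""
    cap = cv2.VideoCapture(video_path)
    keys, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                            [0, 256, 0, 256, 0, 256])
        hist = cv2.normalize(hist, None, alpha=1.0,
                             norm_type=cv2.NORM_L1).flatten()
        if prev_hist is None or cv2.compareHist(
                prev_hist, hist, cv2.HISTCMP_BHATTACHARYYA) > hist_thresh:
            keys.append(idx)                 # candidate episode boundary
        prev_hist, idx = hist, idx + 1
    cap.release()
    return keys
```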

Journal ArticleDOI
TL;DR: The proposed NNGVF snake expresses the gradient vector flow as a convolution with a neighborhood-extending Laplacian operator augmented by a noise-smoothing mask to provide better segmentation and an enlarged capture range.
Abstract: We propose a novel external force for active contours, which we call neighborhood-extending and noise-smoothing gradient vector flow (NNGVF). The proposed NNGVF snake expresses the gradient vector flow (GVF) as a convolution with a neighborhood-extending Laplacian operator augmented by a noise-smoothing mask. We find that the NNGVF snake provides better segmentation than the GVF snake in terms of noise resistance, weak edge preservation, and an enlarged capture range. The NNGVF snake accomplishes this at a reduced computational cost while maintaining other desirable properties of the GVF snake, such as initialization insensitivity and good convergence at concavities. We demonstrate the advantages of NNGVF on synthetic and real images.

Journal ArticleDOI
TL;DR: Experiments indicate better speckle reduction and more effective preservation of edges and local details than second-order diffusion-based methods.
Abstract: This article proposes a technique for speckle reduction in medical ultrasound (US) imaging which preserves point and linear features, with the added advantage of an energy-condensation regularizer. Whatever the post-processing task on a US image, the image should first undergo a preprocessing step called despeckling. Although US machines are now available with built-in speckle reduction, they suffer from many practical limitations, such as the limited dynamic range of the display, the limited number of unique directions a US beam can follow to average an image, and the limited size of the transducer. The proposed diffusion model can be used as a visual enhancement tool for interpretation as well as a preprocessing step for further diagnosis. The method incorporates two terms: diffusion and regularization. The anisotropic diffusion preserves and enhances edges and local details, while the regularization corrects the feature-broadening distortion that is a common problem in second-order diffusion-based methods. In this scheme, the diffusion matrix is designed using a local coordinate transformation and the feature-broadening correction term is derived from an energy function. The performance of the proposed method is illustrated using synthetic and real US data. Experiments indicate better speckle reduction and effective preservation of edges and local details.
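To make the "diffusion term" concrete, here is a generic second-order anisotropic diffusion (Perona-Malik) baseline, not the article's model: the article replaces the scalar conductance with a diffusion matrix in local coordinates and adds an energy-based feature-broadening correction, neither of which is reproduced here. Parameters are illustrative.

```python
import numpy as np

def perona_malik(img, n_iter=30, kappa=20.0, lam=0.2):
    """Generic Perona-Malik anisotropic diffusion: smooth flat regions while
    attenuating diffusion across strong gradients (edges). Uses periodic
    boundary handling (np.roll) for brevity."""
    u = img.astype(np.float32).copy()
    for _ in range(n_iter):
        dn = np.roll(u, -1, axis=0) - u      # differences to the four neighbours
        ds = np.roll(u, 1, axis=0) - u
        de = np.roll(u, -1, axis=1) - u
        dw = np.roll(u, 1, axis=1) - u
        cn, cs = np.exp(-(dn / kappa) ** 2), np.exp(-(ds / kappa) ** 2)
        ce, cw = np.exp(-(de / kappa) ** 2), np.exp(-(dw / kappa) ** 2)
        u += lam * (cn * dn + cs * ds + ce * de + cw * dw)
    return u
```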

Journal ArticleDOI
TL;DR: A novel DVS algorithm is proposed that compensates for camera jitter by applying an adaptive fuzzy filter to the global motion of video frames.
Abstract: Digital video stabilization (DVS) removes unwanted camera movements so that video sequences can be acquired without disturbing jerkiness. A good DVS should remove the unwanted camera movements while maintaining the intentional ones. In this article, we propose a novel DVS algorithm that compensates for camera jitter by applying an adaptive fuzzy filter to the global motion of video frames. The adaptive fuzzy filter is a simple infinite impulse response filter that is tuned adaptively by a fuzzy system according to the camera motion characteristics. The fuzzy system is itself tuned during operation according to the amount of camera jitter, and uses two inputs that are quantitative representations of the unwanted and the intentional camera movements. The global motion of the video frames is estimated from the block motion vectors produced by the video encoder during motion estimation. Experimental results indicate good performance for the proposed algorithm.
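To make the filtering step concrete, here is a minimal first-order IIR smoothing of the estimated global-motion trajectory. In the article the filter coefficient is tuned per frame by a fuzzy system from measures of intentional versus unwanted motion; that tuning is replaced below by an assumed fixed mapping from a scalar jitter estimate, so this is only a sketch.

```python
import numpy as np

def stabilize_trajectory(trajectory, jitter_level):
    """Smooth an accumulated global-motion trajectory (e.g. cumulative x-shift
    per frame) with a first-order IIR filter and return the per-frame
    correction shift. `jitter_level` in [0, 1] stands in for the fuzzy
    system's output: more jitter -> heavier smoothing (assumption)."""
    alpha = float(np.clip(0.5 + 0.45 * jitter_level, 0.5, 0.95))
    smoothed, s = [], float(trajectory[0])
    for m in trajectory:
        s = alpha * s + (1 - alpha) * m      # IIR low-pass of the trajectory
        smoothed.append(s)
    # shift to apply to each frame so it follows the smoothed path
    return np.asarray(smoothed) - np.asarray(trajectory, dtype=float)
```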

Journal ArticleDOI
TL;DR: A robust approach to tracking multiple vehicles that integrates multiple visual features, with the appearance model embedded in a particle filter tracking framework, together with a new PF-based model-updating algorithm.
Abstract: This article presents a robust approach to tracking multiple vehicles by integrating multiple visual features. The observation is modeled by a democratic integration strategy that adjusts the weights of the visual features according to the reliability of their current information. The appearance model is embedded in a particle filter (PF) tracking framework. Furthermore, we propose a new model-updating algorithm based on the PF. In order to avoid incorrect results caused by "model drift" in the observation model, model updating is controlled in a reliable manner, with the updating rate based on reliability. Experiments on a real video sequence verify the proposed method.

Journal ArticleDOI
TL;DR: A real-time facial point tracking method is presented, using a modified particle filter based on Harris corner samples, optimized and combined with an Active Appearance Model (AAM) approach; a combination of a rule-based scheme with Probabilistic Actively Learned Support Vector Machines then classifies the features calculated from the tracked facial points.
Abstract: Facial expressions (FE) are one of the important cognitive load markers in the context of car driving. Any muscular activity can be coded as an action unit (AU), and AUs are the building blocks of FE. Precise facial point tracking is crucial, since it is a necessary step for AU detection. Here, we present our progress in FE analysis based on AU detection in facial infrared videos in the context of a car driving simulator. First, we propose a real-time facial point tracking method (HCPF-AAM) using a modified particle filter (PF) based on Harris corner samples, which is optimized and combined with an Active Appearance Model (AAM) approach. The robustness of the PF, the precision of the Harris corner-based samples, and the optimization of the AAM result in powerful facial point tracking on very low-contrast images acquired under near-infrared (NIR) illumination. Second, detection of the most common AUs in the context of car driving, identified by a certified Facial Action Coding System coder, is presented. For the detection of each specified AU, a spatio-temporal analysis of the related tracked facial points is performed. Then, a combination of a rule-based scheme with Probabilistic Actively Learned Support Vector Machines is developed to classify the features calculated from the related tracked facial points. Results show that with such a scheme we obtain more than 91% precision in the detection of the five most common AUs on low-contrast NIR images and 90% precision on the MMI dataset.

Journal ArticleDOI
TL;DR: A novel wavelet design scheme is proposed to reduce the effect of aliasing terms as much as possible within the general framework of the DWT; the designed wavelets prove efficient in terms of shift insensitivity and nonredundancy.
Abstract: It is well known that the discrete wavelet transform (DWT) is sensitive to shift, which means a slight shift of a feature in the original signal may cause unpredictable changes in the analysis subbands. Some modified versions of the DWT can reduce the shift sensitivity; however, they are all redundant. In this article, we show that the shift sensitivity is caused by the aliasing terms formed in the downsampling operation during the analysis process. A novel scheme for wavelet design is proposed to reduce the effect of the aliasing terms as much as possible within the general framework of the DWT. Several biorthogonal wavelets have been designed and applied in simulation examples. The results demonstrate the efficiency of the designed wavelets in terms of shift insensitivity and nonredundancy.
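A small demonstration of the shift sensitivity being addressed, using PyWavelets with a standard 'db4' wavelet (purely illustrative; the article designs new biorthogonal wavelets rather than using an off-the-shelf one): shifting the input by one sample noticeably changes the detail-band energy of an ordinary single-level DWT.

```python
import numpy as np
import pywt

x = np.zeros(256)
x[100:108] = 1.0                       # a small localized feature
x_shifted = np.roll(x, 1)              # the same feature shifted by one sample

for sig, name in [(x, "original"), (x_shifted, "shifted by 1")]:
    _, d1 = pywt.dwt(sig, "db4")       # single-level DWT, detail coefficients
    print(name, "detail-band energy:", float(np.sum(d1 ** 2)))
```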

Journal ArticleDOI
TL;DR: Numerical simulations show that the frame layer optimal encoding procedure brings advantages in terms of several characteristics of the streamed video, encompassing enhanced rate-distortion, reduced transmission buffer occupancy, equalization of the transmission delays, and more efficient switching.
Abstract: Mobile video streaming services are challenging, as they obey several system constraints, such as random access facilities, efficient server storage, and flexible rate adaptation. Rate adaptation can be performed by means of seamless switching among different encoded bitstreams. The H.264 video coding standard explicitly supports bitstream switching using specific frame coding modes, namely switching pictures (SP). Locations of SP frames affect the overall bit rate and quality of streamed video. In this study, we address the issue of optimal joint selection of the SP frames locations and bit budget allocation at frame layer. The optimization is carried out via a game theoretic approach under assigned system constraints on the overall streaming rate and the maximum random access delay. Numerical simulations show that our frame layer optimal encoding procedure brings advantages in terms of several characteristics of the streamed video, encompassing enhanced rate-distortion, reduced transmission buffer occupancy, equalization of the transmission delays, and more efficient switching.

Journal ArticleDOI
Abstract: No abstract is indexed for this record; see doi:10.1186/1687-5281-2012-5 (Web of Science record created 2012-06-08).