
Showing papers presented at "International Symposium on Image and Signal Processing and Analysis in 2011"


Proceedings Article
Fengqing Zhu1, Marc Bosch1, Nitin Khanna1, Carol J. Boushey1, Edward J. Delp1 
04 Sep 2011
TL;DR: It is demonstrated that the proposed method can be used as part of a new dietary assessment tool to automatically identify and locate the foods in a variety of food images captured during different user studies.
Abstract: Given a dataset of images, we seek to automatically identify and locate perceptually similar objects. We combine two ideas to achieve this: a set of segmented objects can be partitioned into perceptually similar object classes based on global and local features; and perceptually similar object classes can be used to assess the accuracy of image segmentation. These ideas are implemented by generating multiple segmentations of each image and then learning the object classes by combining the different segmentations to generate an optimal segmentation. We demonstrate that the proposed method can be used as part of a new dietary assessment tool to automatically identify and locate the foods in a variety of food images captured during different user studies.
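The combination of multiple segmentations with class-driven selection is specific to the paper; the sketch below only illustrates the first half of that idea, generating several segmentations of one image at different scales and clustering the resulting segments by simple colour features. The graph-based segmenter, the scales and the feature choice are assumptions, not the authors' pipeline.

```python
import numpy as np
from skimage import data, segmentation
from sklearn.cluster import KMeans

img = data.astronaut()                       # placeholder RGB image

# Several segmentations of the same image at different granularities.
seg_maps = [segmentation.felzenszwalb(img, scale=s, sigma=0.8, min_size=50)
            for s in (100, 300, 900)]

# One mean-colour feature vector per segment, pooled over all segmentations.
features = []
for seg in seg_maps:
    for label in np.unique(seg):
        features.append(img[seg == label].mean(axis=0))
features = np.array(features)

# Group perceptually similar segments into candidate object classes.
classes = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(features)
print(features.shape[0], "segments grouped into", len(np.unique(classes)), "classes")
```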

38 citations


Proceedings Article
17 Oct 2011
TL;DR: A color correction method based on subtraction is proposed and tested, on differently illuminated images, on the CoLIP approach and on the Logarithmic hUe eXtension (LUX) approach, which is also based on the LIP theory.
Abstract: The Logarithmic Image Processing (LIP) approach is a mathematical framework developed for the representation and processing of images valued in a bounded intensity range. The LIP theory is physically and psychophysically well justified since it is consistent with several laws of human brightness perception and with the multiplicative image formation model. In this paper, the so-called Color Logarithmic Image Processing (CoLIP) framework is introduced. This novel framework expands the LIP theory to color images in the context of human color visual perception. Color images are represented by their color tone functions, which can be combined by means of the basic operations of addition, scalar multiplication and subtraction, opening new pathways for color image processing. In order to highlight the relevance of CoLIP to color constancy, a color correction method based on subtraction is proposed and tested on the CoLIP approach and on the Logarithmic hUe eXtension (LUX) approach, also based on the LIP theory, using differently illuminated images: underwater images with a blue illuminant, and indoor images with a yellow illuminant.
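The classical scalar LIP operations that CoLIP extends to colour tone functions have simple closed forms. The sketch below is not the paper's CoLIP definition, only a minimal illustration of LIP addition, scalar multiplication and subtraction on gray tone functions, assuming an intensity bound M = 256 and channel-wise application.

```python
import numpy as np

M = 256.0  # upper bound of the intensity range (assumption)

def lip_add(f, g):
    """LIP addition of two gray tone functions valued in [0, M)."""
    return f + g - (f * g) / M

def lip_scalar(lam, f):
    """LIP multiplication of a gray tone function by a real scalar."""
    return M - M * (1.0 - f / M) ** lam

def lip_sub(f, g):
    """LIP subtraction, defined where g < M (the operation used for the color correction tests)."""
    return M * (f - g) / (M - g)

# Toy usage on two synthetic tone images.
f = np.random.uniform(0, 255, (4, 4))
g = np.random.uniform(0, 255, (4, 4))
print(np.allclose(lip_sub(lip_add(f, g), g), f))  # subtraction exactly inverts addition
```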

27 citations


Proceedings Article
17 Oct 2011
TL;DR: An investigation of the use of camera distance in famous movie scenes, highlighting the relations between the employed shot types and the affective responses of a large audience, suggests that patterns of shot types constitute a key element in inducing affective reactions in the audience.
Abstract: In film-making, the distance from the camera to the subject greatly affects the narrative power of a shot. By the alternate use of long shots, medium shots and close-ups, the director is able to provide emphasis on key passages of the filmed scene, thus boosting the process of identification of viewers with the film characters. On this basis, we here investigate the use of camera distance in famous movie scenes, highlighting the relations between the employed shot types and the affective responses of a large audience. Results obtained by using statistical classifiers suggest that patterns of shot types constitute a key element in inducing affective reactions in the audience, with strong evidence especially on the arousal dimension. Findings are applicable to support systems for media affective analysis, and to better define emotional models for video content understanding.

27 citations


Proceedings Article
17 Oct 2011
TL;DR: The processing pipeline gives promising voxel-wise GFR estimates, but there are challenges related to numerical stability, calculation of concentration curves from signal intensities, and validation issues.
Abstract: In vivo estimation of glomerular filtration rate (GFR) in the kidney is a valuable tool for the clinician for diagnostics, treatment planning and follow up. In this work we present a pipeline for voxelwise computation of in vivo GFR using DCE-MRI. We apply fully automated elastic image registration with normalized gradient fields for motion correction. Kidney segmentation is performed using semi-automated nearest-neighbor segmentation. GFR is computed from pharmacokinetic modeling using two different models, and the obtained values are compared across time points and subjects, and between left and right kidney. The processing pipeline gives promising voxel-wise GFR estimates, but there are challenges related to numerical stability, calculation of concentration curves from signal intensities, and validation issues.
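The two pharmacokinetic models used in the paper are not named in the abstract. As a hedged illustration of how a voxel-wise GFR-related parameter can be fitted from concentration curves, the sketch below uses a Patlak-style linear model, a common choice for renal DCE-MRI, with plain least squares; the model choice and all variable names are assumptions, not the authors' pipeline.

```python
import numpy as np

def patlak_fit(c_tissue, c_plasma, t):
    """Patlak-style fit: C_t(t)/C_p(t) = Ktrans * int_0^t C_p ds / C_p(t) + v0.

    Returns (Ktrans, v0); the slope Ktrans is often used as a per-voxel GFR surrogate.
    """
    cp = np.clip(c_plasma, 1e-6, None)          # avoid division by zero
    x = np.array([np.trapz(c_plasma[: i + 1], t[: i + 1]) for i in range(len(t))]) / cp
    y = c_tissue / cp
    A = np.column_stack([x, np.ones_like(x)])
    (ktrans, v0), *_ = np.linalg.lstsq(A, y, rcond=None)
    return ktrans, v0

# Toy curves: a synthetic plasma input and a tissue response built from it.
t = np.linspace(0, 300, 60)                      # seconds
cp = (t / 30.0) * np.exp(-t / 120.0) + 0.05      # synthetic plasma concentration
ct = 0.02 * np.array([np.trapz(cp[: i + 1], t[: i + 1]) for i in range(len(t))]) + 0.1 * cp
print(patlak_fit(ct, cp, t))                     # slope close to 0.02
```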

27 citations


Proceedings Article
17 Oct 2011
TL;DR: This work proposes a new method for the calibration of the extrinsic and intrinsic parameters of the projector camera pair that requires only a two-sided plane having a checkerboard at one of the sides, and does not depend on the SL algorithm used for the 3D reconstruction.
Abstract: Structured light (SL) systems make use of a projector and a camera pair to reconstruct the 3D shape of an object. To this end a prior geometric calibration is required. Many camera and projector calibration algorithms have been proposed over the decades. However, the need for an easy-to-use system that is not tied to the SL algorithm being developed still remains unsolved. This work proposes a new method for the calibration of the extrinsic and intrinsic parameters of the projector-camera pair. The algorithm requires only a two-sided plane having a checkerboard on one of the sides, and does not depend on the SL algorithm used for the 3D reconstruction. Linear and non-linear distortion are considered in the calibration of both devices, thus obtaining good calibration results, as shown in the experimental results.
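The paper's contribution is the projector-camera calibration procedure itself, which is not reproduced here. As context, the sketch below shows only the standard camera-side checkerboard calibration with OpenCV that such a pipeline typically starts from; the board geometry, square size and image file names are placeholder assumptions.

```python
import cv2
import numpy as np

# Assumed checkerboard geometry (inner corners) and square size in mm.
PATTERN = (9, 6)
SQUARE_MM = 25.0

# 3D object points of the board corners in the board's own frame.
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE_MM

obj_points, img_points = [], []
for path in ["board_01.png", "board_02.png", "board_03.png"]:   # hypothetical board images
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:
        corners = cv2.cornerSubPix(
            gray, corners, (11, 11), (-1, -1),
            (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
        obj_points.append(objp)
        img_points.append(corners)

# Intrinsics K, distortion coefficients, and per-view extrinsics (rvecs, tvecs).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("reprojection RMS:", rms)
```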

24 citations


Proceedings Article
17 Oct 2011
TL;DR: In this survey, it is explained how the Canonical Polyadic Decomposition of higher-order tensors is connected to different types of Factor Analysis and Blind Source Separation.
Abstract: In this survey we explain how the Canonical Polyadic Decomposition of higher-order tensors is connected to different types of Factor Analysis and Blind Source Separation.
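The survey's own derivations are not reproduced here. As a small numerical illustration of what a Canonical Polyadic Decomposition computes, the sketch below implements the standard alternating-least-squares updates for a 3-way array in plain NumPy; the rank, iteration count and toy data are assumptions.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Kronecker product; row (k, j) holds C[k, :] * B[j, :]."""
    K, R = C.shape
    J = B.shape[0]
    return np.einsum('kr,jr->kjr', C, B).reshape(K * J, R)

def cp_als(X, rank, n_iter=200, seed=0):
    """Rank-R Canonical Polyadic Decomposition of a 3-way array by alternating least squares."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
    # Mode-n unfoldings (Kolda & Bader convention).
    X1 = X.transpose(0, 2, 1).reshape(I, K * J)
    X2 = X.transpose(1, 2, 0).reshape(J, K * I)
    X3 = X.transpose(2, 1, 0).reshape(K, J * I)
    for _ in range(n_iter):
        A = X1 @ khatri_rao(C, B) @ np.linalg.pinv((C.T @ C) * (B.T @ B))
        B = X2 @ khatri_rao(C, A) @ np.linalg.pinv((C.T @ C) * (A.T @ A))
        C = X3 @ khatri_rao(B, A) @ np.linalg.pinv((B.T @ B) * (A.T @ A))
    return A, B, C

# Build a noiseless rank-3 tensor and check that CP-ALS recovers it.
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((n, 3)) for n in (10, 8, 6))
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, rank=3)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print("relative error:", np.linalg.norm(X_hat - X) / np.linalg.norm(X))  # should be tiny
```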

23 citations


Proceedings Article
17 Oct 2011
TL;DR: A novel method, PRW-based Gait Energy Image (PRWGEI), is proposed to reduce the effect of covariate factors in gait feature representation; the features are projected into a low-dimensional space by Linear Discriminant Analysis to improve their discriminative power.
Abstract: Recently, gait recognition has received increased attention from biometrics researchers. Most of the literature shows, however, that existing appearance-based gait feature representation methods suffer from clothing and carried-object covariate factors. Some new gait feature representations have been proposed to overcome the issue of clothing and carrying covariate factors, e.g. the Gait Entropy Image (GEnI). Even though these methods provide a good recognition rate for clothing and carrying covariate gait sequences, there is still the possibility of obtaining a better recognition rate by using better appearance-based gait feature representations. To the best of our knowledge, a Poisson Random Walk (PRW) approach has not been considered for overcoming the effect of clothing and carrying covariate factors in gait feature representations. In this paper, we propose a novel method, PRW-based Gait Energy Image (PRWGEI), to reduce the effect of covariate factors in gait feature representation. These PRWGEI features are projected into a low-dimensional space by Linear Discriminant Analysis (LDA) to improve the discriminative power of the extracted features. Experimental results on the CASIA gait database (dataset B) show that our proposed method achieves a better recognition rate than other methods in the literature for clothing and carrying covariate factors.
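PRWGEI itself is specific to the paper, but the LDA projection step it feeds into is standard. The sketch below assumes pre-computed gait-energy-image-style feature vectors and simply shows the LDA-then-nearest-neighbour pattern with scikit-learn; the data shapes are placeholders.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Placeholder gallery: 50 subjects x 4 sequences, each a flattened 64x64 energy image.
n_subjects, n_seqs, dim = 50, 4, 64 * 64
X = rng.random((n_subjects * n_seqs, dim))
y = np.repeat(np.arange(n_subjects), n_seqs)

# LDA yields at most (n_classes - 1) discriminative dimensions.
lda = LinearDiscriminantAnalysis(n_components=n_subjects - 1)
X_low = lda.fit_transform(X, y)

# Nearest-neighbour matching in the reduced space.
clf = KNeighborsClassifier(n_neighbors=1).fit(X_low, y)
probe = lda.transform(rng.random((1, dim)))
print("predicted subject:", clf.predict(probe))
```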

20 citations


Proceedings Article
17 Oct 2011
TL;DR: This work evaluates how well feature extraction methods used in past work perform with a one-class SVM trained on celiac data only, using images exhibiting common endoscopic image degradations such as blur, noise, light reflections and bubbles.
Abstract: The prevalence data of celiac disease have been continuously corrected upwards in the last years. An automated decision support system could improve the diagnosis and safety of the endoscopic procedure. An approach towards such a system is based on a one-class classifier (such as SVM) trained on celiac data only. By doing so, no special treatment of distorted image areas is needed. However, the performance of such a system is highly dependent on the discriminative power of the extracted features within an unconstrained environment such as the human bowel. Towards such a system we evaluate how well methods used in past work perform using a one-class SVM with images exhibiting common endoscopic image degradations such as blur, noise, light reflections and bubbles.
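A minimal sketch of the one-class-SVM pattern described above, assuming feature vectors have already been extracted from celiac-only training images; the RBF kernel and the nu/gamma values are placeholder assumptions rather than the paper's settings.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)

# Placeholder feature vectors extracted from celiac training images only.
X_train = rng.normal(loc=0.0, scale=1.0, size=(300, 32))

# nu bounds the fraction of training outliers; gamma controls the RBF width (assumed values).
clf = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(X_train)

# New endoscopic frames: +1 means "looks like the training (celiac) class", -1 means outlier.
X_test = np.vstack([rng.normal(0.0, 1.0, (5, 32)), rng.normal(4.0, 1.0, (5, 32))])
print(clf.predict(X_test))
```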

19 citations



Proceedings Article
17 Oct 2011
TL;DR: The experiments have shown that the proposed similarity measure is capable of describing complex relationships between image intensity values while offering a favorable speed-performance trade-off compared to other known similarity measures.
Abstract: The image similarity measure is very important for determining the correspondence between images in order to quantify the accuracy of image registration. The selection of the image similarity measure requires a trade-off between speed and performance. In the current state of the art, fast similarity measures are unable to cope with complex relationships between image intensity values. Currently, the most popular image similarity measures are based on information theory because of their ability to find predictable relationships between image intensity values. In this paper, we present a novel image similarity measure and compare it to others such as mean square difference, correlation, correlation coefficient, joint entropy, mutual information, and normalized mutual information. The experiments have shown that the proposed similarity measure is capable of describing complex relationships between image intensity values while offering a favorable speed-performance trade-off compared to other known similarity measures.
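The proposed measure itself is not reproduced here. As a reference point, the sketch below computes two of the baseline measures listed in the abstract, the correlation coefficient and a histogram-based mutual information estimate, for a pair of images; the bin count is an assumption.

```python
import numpy as np

def correlation_coefficient(a, b):
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

def mutual_information(a, b, bins=64):
    """Histogram estimate of I(A;B) in nats from the joint intensity histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    nz = p_ab > 0
    return float(np.sum(p_ab[nz] * np.log(p_ab[nz] / (p_a @ p_b)[nz])))

img = np.random.rand(128, 128)
warped = np.roll(img, 3, axis=1) + 0.05 * np.random.rand(128, 128)
print(correlation_coefficient(img, warped), mutual_information(img, warped))
```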

18 citations


Proceedings Article
17 Oct 2011
TL;DR: A modification of log-Gabor wavelets is presented that eliminates the DC component, yields a fairly uniform coverage of the frequency domain in an octave-scale scheme, and preserves redundancy at the same time.
Abstract: Texture analysis is a significant and still unsolved challenge for computer vision. Since image textures possess spatial continuity at both local and global scales, they are widely used to perform tasks such as object segmentation or surface analysis. Based on the fact that the human visual system (HVS) can segment textures robustly, many texture segmentation schemes use biological models. Modern theories about the functioning of the HVS lead us to think that the visual process takes advantage of image redundancy at different scales and orientations, and therefore it can be modeled by a bank of Gabor wavelets. Despite the fact that Gabor wavelets optimize the theoretical limit of joint resolution between the space and frequency domains, they do not have zero mean, which induces a DC component in the coefficients of every frequency band. In addition, they do not provide a uniform coverage of the frequency domain. These drawbacks may cause errors in the extraction of the appropriate texture features. In this paper, we present a modification of log-Gabor wavelets that eliminates the DC component, yields a fairly uniform coverage of the frequency domain in an octave-scale scheme and preserves redundancy at the same time. We analyzed the performance of both Gabor and log-Gabor wavelets using a modification of Jain's unsupervised texture segmentation algorithm [1] and compared the results using confusion matrices.
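The standard radial log-Gabor transfer function has no DC component by construction, since log f diverges at f = 0. The sketch below builds that standard filter on an FFT frequency grid and applies it to an image; the centre frequency and bandwidth ratio are assumed values, not the paper's modified design.

```python
import numpy as np

def log_gabor_radial(shape, f0=0.1, sigma_ratio=0.65):
    """Radial log-Gabor: G(f) = exp(-(log(f/f0))^2 / (2 * log(sigma_ratio)^2))."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0                      # avoid log(0); the DC gain is set explicitly below
    G = np.exp(-(np.log(f / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
    G[0, 0] = 0.0                      # zero DC gain: the filter has zero mean
    return G

img = np.random.rand(256, 256)
G = log_gabor_radial(img.shape, f0=1.0 / 8.0)   # one octave-scale band (assumed parameters)
response = np.fft.ifft2(np.fft.fft2(img) * G).real
print(abs(response.mean()) < 1e-6)              # the filtered response has zero mean
```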

Proceedings Article
17 Oct 2011
TL;DR: An ensemble-based system is proposed to improve the detection of exudates in diabetic retinopathy; an optimal combination of preprocessing methods and exudate candidate extractors is found and organized into a voting system for this purpose.
Abstract: Diabetic retinopathy causes blindness for millions of people worldwide. Exudates are early lesions of this disease, so their automatic detection is very important to slow down the progression of retinopathy. In this paper, an ensemble-based system is proposed to improve the detection. An optimal combination of preprocessing methods and exudate candidate extractors is found and organized into a voting system for this purpose. Our results show that in this way we outperform the individual exudate detector algorithms.
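The specific preprocessing methods and candidate extractors are not listed in the abstract. The sketch below only illustrates the voting idea, combining binary candidate masks from hypothetical detectors by majority vote; the detector outputs here are random placeholders.

```python
import numpy as np

def majority_vote(masks, min_votes=None):
    """Combine binary candidate masks (N x H x W); keep pixels flagged by enough detectors."""
    masks = np.asarray(masks, dtype=bool)
    if min_votes is None:
        min_votes = masks.shape[0] // 2 + 1      # strict majority by default
    return masks.sum(axis=0) >= min_votes

# Three hypothetical exudate-candidate masks for the same fundus image.
rng = np.random.default_rng(2)
masks = rng.random((3, 64, 64)) > 0.7
ensemble = majority_vote(masks)
print(ensemble.sum(), "pixels kept by the ensemble")
```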

Proceedings Article
17 Oct 2011
TL;DR: The image processing aspects of the popular edge-based approaches to crater detection in spacecraft navigation are reviewed and their performance and inherent shortcomings are discussed and an alternative, edge-free approach is suggested.
Abstract: In this paper we review the image processing aspects of the popular edge-based approaches to crater detection in spacecraft navigation and discuss their performance and inherent shortcomings and the constraints these impose on mission parameters. Key weaknesses are identified and an alternative, edge-free approach is suggested. No comprehensive evaluation of detection rates and ratio of false positives to correct identifications is provided, as this is a work in progress, but example images for Lunar, asteroid and artificial settings are included to give an idea of the method's performance. Finally, the laboratory facility TRON (Testbed for Robotic Optical Navigation) used for testing and evaluating our algorithms is introduced.

Proceedings Article
17 Oct 2011
TL;DR: This work proposes a method of human identification based on reduced kinematic gait data, using preprocessing filters to detect the main double step and to scale the time domain to a given number of motion frames.
Abstract: We propose a method of human identification based on reduced kinematic gait data. In the first stage the pose descriptions of the given skeleton model are reduced by linear principal component analysis, yielding n-dimensional motion trajectories of principal components. Afterwards, we use two approaches: feature extraction and dynamic time warping. In the feature extraction, the Fourier transform with low-pass filtering is applied. To capture the gait dynamics, Fourier components of the velocities and accelerations are also calculated. Such processing transforms the gait data into a vector feature space, in which supervised learning is used to identify humans. To discover the most valuable features (principal and Fourier components, PCA values, velocities and accelerations) and to improve the classification, we prepare feature selection scenarios and observe the identification efficiency. To evaluate the proposed method we have collected a gait database in a motion capture laboratory consisting of 353 motions of 25 different people. We use preprocessing filters to detect the main double step and to scale the time domain to a given number of motion frames. We have obtained satisfactory results with a classification accuracy above 98%.
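A hedged sketch of the feature-extraction branch described above: PCA reduces per-frame pose vectors to a few trajectories, and low-frequency Fourier magnitudes of those trajectories and of their finite-difference velocities form the feature vector. The dimensions, the number of retained components and the per-cycle PCA fit are assumptions, not the authors' exact procedure.

```python
import numpy as np
from sklearn.decomposition import PCA

def gait_features(motion, n_components=4, n_freqs=8):
    """motion: (n_frames, n_joint_dofs) time-normalised gait cycle -> 1-D feature vector."""
    traj = PCA(n_components=n_components).fit_transform(motion)   # (n_frames, n_components)
    vel = np.diff(traj, axis=0)                                   # finite-difference velocities
    feats = []
    for signal in (traj, vel):
        spectrum = np.abs(np.fft.rfft(signal, axis=0))[:n_freqs]  # low-pass: keep the first bins
        feats.append(spectrum.ravel())
    return np.concatenate(feats)

# Hypothetical cycle: 100 frames of a 60-DOF skeleton.
cycle = np.sin(np.linspace(0, 4 * np.pi, 100))[:, None] * np.random.rand(1, 60)
print(gait_features(cycle).shape)
```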


Proceedings Article
17 Oct 2011
TL;DR: The paper presents a subjective viewing study in which the proposed compression method is compared with two other coding techniques, full-resolution symmetric and mixed-resolution stereoscopic video coding, and shows that the average subjective viewing experience ratings of the proposed method are higher than those of the other tested methods.
Abstract: A novel asymmetric stereoscopic video coding method is presented in this paper. The proposed coding method is based on uneven sample-domain quantization for the different views and is typically applied together with a reduction of the spatial resolution of one of the views. Any transform-based video compression, such as the Advanced Video Coding (H.264/AVC) standard, can be used with the proposed method. We investigate whether binocular vision masks the different types of degradations that the proposed method introduces into the coded views. The paper presents a subjective viewing study in which the proposed compression method is compared with two other coding techniques: full-resolution symmetric and mixed-resolution stereoscopic video coding. We show that the average subjective viewing experience ratings of the proposed method are higher than those of the other tested methods in six out of eight test cases.
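The exact quantization law is not given in the abstract. The sketch below only illustrates the general idea of asymmetric, sample-domain (pixel-value) quantization, re-quantizing one view to fewer levels while leaving the other view untouched; the number of levels is an assumption.

```python
import numpy as np

def requantize_view(view, levels=32):
    """Uniformly re-quantize 8-bit samples of one view to a coarser set of levels."""
    step = 256.0 / levels
    return (np.floor(view / step) * step + step / 2).astype(np.uint8)

left = np.random.randint(0, 256, (4, 8), dtype=np.uint8)    # kept at full sample precision
right = requantize_view(np.random.randint(0, 256, (4, 8), dtype=np.uint8), levels=32)
print(np.unique(right).size, "distinct sample values in the coarsely quantized view")
```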

Proceedings Article
17 Oct 2011
TL;DR: This work addresses the problem of interactive video segmentation and introduces a two-step segmentation scheme that is able to deal with multiple objects, is robust to errors introduced by the automatic segmentation step, and does not require the whole segmentation process to be repeated each time the user provides feedback.
Abstract: Video data is continuously increasing in personal databases and Web repositories. To exploit such data, a prior segmentation is often needed in order to obtain the objects-of-interest to be further processed. However, the segmentation of a given video is often not unique and indeed depends on user needs. Personalized segmentation may be achieved using interactive methods, but only if their computational cost stays reasonable enough to enable user interactivity. We address here the problem of interactive video segmentation and introduce a two-step segmentation scheme: 1) offline processing to automatically extract quasi-flat zones from the video data, and 2) online processing to interactively gather quasi-flat zones and build objects-of-interest. Our approach is able to deal with multiple objects, is robust to errors introduced by the automatic segmentation step, and does not require the whole segmentation process to be repeated each time the user provides some feedback.

Proceedings Article
17 Oct 2011
TL;DR: A new method is presented for the indoor localization of moving objects, based on RFID localization improved by motion segmentation in video sequences captured with a static or PTZ video camera; the localization accuracy is increased significantly.
Abstract: A new method for the indoor localization of moving objects is presented, based on RFID localization improved by motion segmentation in a video sequence captured with a static or PTZ video camera. Data acquired by an RFID reader are used for region-of-interest extraction. Motion vectors are estimated for blocks within the region of interest and used for moving-object segmentation. The segmentation is improved by morphological post-processing. The centroid of the segmented moving object is calculated and used for position estimation. The conducted experiments show that the localization accuracy is increased significantly using the proposed method.
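A minimal sketch of the last two steps named above, morphological post-processing of a motion mask and centroid extraction, using SciPy; the structuring-element size and the toy mask are assumptions.

```python
import numpy as np
from scipy import ndimage

def object_centroid(motion_mask, opening_size=3):
    """Clean a binary motion mask and return the centroid (row, col) of the largest blob."""
    cleaned = ndimage.binary_opening(motion_mask, structure=np.ones((opening_size,) * 2))
    labels, n = ndimage.label(cleaned)
    if n == 0:
        return None
    sizes = ndimage.sum(cleaned, labels, index=np.arange(1, n + 1))
    largest = int(np.argmax(sizes)) + 1
    return ndimage.center_of_mass(cleaned, labels, largest)

mask = np.zeros((60, 80), dtype=bool)
mask[20:40, 30:50] = True                 # moving object
mask[5, 5] = True                         # isolated noise pixel, removed by the opening
print(object_centroid(mask))              # approximately (29.5, 39.5)
```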

Proceedings Article
14 Sep 2011
TL;DR: This paper presents and evaluates different techniques for wildfire smoke detection based on visible-spectrum video analysis, and argues that terrestrial video-based wildfire detection systems are the most important from a practical point of view.
Abstract: Wildfires represent a constant threat to ecological systems, infrastructure and human lives. Since they can do so much damage and have devastating consequences, great efforts have been put into the development of systems for their early detection, because fast and appropriate intervention is of primary importance for wildfire damage minimization. A traditional method for early wildfire detection is human-based wildfire surveillance, but modern ICT-based technologies are becoming more and more important. In the last 20 years various wildfire detection systems have been proposed, but from the practical point of view the most important are terrestrial video-based wildfire detection systems. The focus of this paper is to present and evaluate different techniques for wildfire smoke detection based on visible-spectrum video analysis.

Proceedings Article
17 Oct 2011
TL;DR: A memetic algorithm is presented for the reconstruction of binary images represented on a triangular grid, which uses switch and compactness operators on the triangular grid to improve the quality of the image in each generation.
Abstract: In this paper we present a memetic algorithm for the reconstruction of binary images represented on a triangular grid. We define crossover and mutation evolutionary-algorithm operators on this grid. We propose a discrete tomography algorithm which uses switch and compactness operators for the triangular grid to improve the quality of the image in each generation. The effectiveness of the algorithm was tested on several images.

Proceedings Article
04 Sep 2011
TL;DR: In this paper, a new edge detector based on anisotropic linear filtering, local maximization and gamma correction is proposed, which is based on the use of two elongated and oriented filters in two different directions.
Abstract: In this paper we propose a new edge detector based on anisotropic linear filtering, local maximization and gamma correction. The novelty of this approach resides in the mixing of ideas coming both from directional recursive linear filtering and from gamma correction. A peculiarity of our anisotropic edge detector is that it is based on the use of two elongated and oriented filters in two different directions. We show in this paper that, unlike in classical edge detection methods, gamma correction does not perturb the edge detection but clearly enhances the resulting contours, especially in over-exposed or under-exposed areas of the image. Consequently, we propose a new edge operator enabling very precise detection of edge points involved in large structures. This detector has been tested successfully on various image types presenting difficult problems for classical edge detection methods.
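The directional recursive filters themselves are not specified in the abstract. The sketch below only illustrates the gamma-correction step that the detector combines with filtering, applying a power-law mapping before a simple gradient-magnitude computation; the gamma value and the Sobel-style gradient are assumptions, not the paper's oriented filters.

```python
import numpy as np
from scipy import ndimage

def gamma_correct(img, gamma=0.5):
    """Power-law mapping on a [0, 1] image; gamma < 1 brightens under-exposed regions."""
    return np.clip(img, 0.0, 1.0) ** gamma

def gradient_magnitude(img):
    gx = ndimage.sobel(img, axis=1)
    gy = ndimage.sobel(img, axis=0)
    return np.hypot(gx, gy)

img = np.random.rand(128, 128) * 0.2           # simulated under-exposed image
edges_plain = gradient_magnitude(img)
edges_gamma = gradient_magnitude(gamma_correct(img, gamma=0.4))
print(edges_plain.mean(), edges_gamma.mean())  # contours are amplified after correction
```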

Proceedings Article
17 Oct 2011
TL;DR: This paper presents a framework for the reconstruction of digital elevation maps (DEM) from hyperspectral imagery and preexisting elevation data of lower lateral resolution, which consists of a combined photoclinometry and shape from shading scheme, which is extended towards photometric stereo using multiple images.
Abstract: In lunar remote sensing, the analysis of the spectral reflectance of iron-bearing minerals is a widely used method. For an accurate calibration, elevation information about the surface has to be known. In this paper we present a framework for the reconstruction of digital elevation maps (DEM) from hyperspectral imagery and preexisting elevation data of lower lateral resolution. The reconstruction algorithm consists of a combined photoclinometry and shape from shading scheme, which is extended towards photometric stereo using multiple images. In order to register images of the same surface region acquired under different illumination conditions, an image registration approach is derived from the DEM construction methods. Image registration is performed in the illumination-invariant surface gradient space and is therefore able to cope with images taken under different illumination conditions. Finally, the calibration of hyperspectral imagery based on the DEM is demonstrated by the analysis of extracted spectral features.

Proceedings Article
17 Oct 2011
TL;DR: Constructing the watermark from the image itself dispenses with the need to embed an extraneous watermark that must be made known to the user separately, which is a significant contribution of the present work to image quality assessment.
Abstract: The provision of blind quality assessment is an important requirement for modern multimedia and communication systems. Fragile watermarking techniques have been proposed earlier for this purpose. This paper proposes a novel approach which makes use of both fragile and robust watermarking techniques. The embedded fragile watermark is used to assess the degradation undergone by the transmitted images. Robust image features, on the other hand, are used to construct the reference watermark from the received image, for assessing the amount of degradation of the fragile watermark. Constructing the watermark from the image itself dispenses with the need to embed an extraneous watermark that must be made known to the user separately, which is a significant contribution of the present work to image quality assessment. Another contribution is the use of Singular Value Decomposition (SVD) for the extraction of the fragile watermark. The validity of the proposed approach is verified through extensive simulations using different kinds of gray-scale and color images.
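The paper's embedding and extraction schemes are not detailed in the abstract. The sketch below only shows the kind of block-wise SVD manipulation that fragile watermarks of this type typically rely on, hiding one bit per block by quantizing the largest singular value; the block size and quantization step are assumptions.

```python
import numpy as np

STEP = 16.0  # quantization step for the leading singular value (assumption)

def embed_bit(block, bit):
    """Embed one watermark bit in an image block by quantizing its largest singular value."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    s[0] = (np.floor(s[0] / STEP) * STEP) + (STEP * 0.75 if bit else STEP * 0.25)
    return U @ np.diag(s) @ Vt

def extract_bit(block):
    s = np.linalg.svd(block, compute_uv=False)
    return int((s[0] % STEP) > STEP / 2)

block = np.random.rand(8, 8) * 255.0
marked = embed_bit(block, 1)
print(extract_bit(marked))                                    # 1 while the block is intact
print(extract_bit(marked + np.random.normal(0, 25, (8, 8))))  # may flip after degradation
```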

Proceedings Article
17 Oct 2011
TL;DR: A method is presented to compensate for image stabilisation when extracting the noise pattern from digital video clips; a better result is achieved with the proposed method.
Abstract: Comparing the noise patterns of two digital video clips is a way to investigate whether the same camcorder has been used to shoot both clips. Many camcorders have functions to compensate for shaky movements, and some of these functions affect the possibility to extract the noise pattern. This paper presents a method to compensate for the stabilisation when extracting the noise pattern. Eight clips with different characteristics were recorded by the same camcorder. When the noise patterns extracted from these clips were compared to three clips from different camcorders, a better result was achieved with the proposed method.
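The stabilisation compensation is the paper's contribution and is not reproduced here. The sketch below only shows the common baseline it builds on: extracting a noise residual per frame with a denoising filter and correlating the averaged residuals of two clips. The Gaussian denoiser and the synthetic data are assumptions.

```python
import numpy as np
from scipy import ndimage

def noise_residual(frame, sigma=1.5):
    """Noise residual = frame minus a denoised version of itself (assumed Gaussian denoiser)."""
    return frame - ndimage.gaussian_filter(frame, sigma)

def clip_pattern(frames):
    """Average the residuals of all frames of a clip into one noise pattern."""
    return np.mean([noise_residual(f) for f in frames], axis=0)

def pattern_correlation(p1, p2):
    a, b = p1 - p1.mean(), p2 - p2.mean()
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(3)
sensor_noise = rng.normal(0, 2, (120, 160))                  # shared camcorder fingerprint
clip_a = [rng.random((120, 160)) * 50 + sensor_noise for _ in range(20)]
clip_b = [rng.random((120, 160)) * 50 + sensor_noise for _ in range(20)]
print(pattern_correlation(clip_pattern(clip_a), clip_pattern(clip_b)))  # clearly above chance
```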

Proceedings Article
17 Oct 2011
TL;DR: Evaluation on a set of ecological videos indicates that the proposed approach is faster and more flexible in adapting to changes in domain descriptions than specialized components written from scratch by image processing experts.
Abstract: This paper outlines the automatic construction of video processing solutions using multiple software components, as opposed to traditional monolithic approaches undertaken by image processing experts. A combined top-down and bottom-up methodology was adopted for the derivation of a suitable level of granularity for a subset of image processing components that implement video classification, object detection, counting and tracking tasks. 90% of these components are generic and could be applied to any video processing task, indicating a high level of reusability for a spectrum of video analyses. Domain-specific video analysis approaches (that exploit combinations of the above components) are built by using an automatic workflow composition module that relies on decomposition-based planning and ontologies. Evaluation on a set of ecological videos indicates that the proposed approach is faster and more flexible in adapting to changes in domain descriptions than specialized components written from scratch by image processing experts.

Proceedings Article
17 Oct 2011
TL;DR: A robust method for vehicle categorization in aerial images that relies on a multiple-classifier system that merges the answers of classifiers applied at various camera angle incidences is presented.
Abstract: We present a robust method for vehicle categorization in aerial images. This approach relies on a multiple-classifier system that merges the answers of classifiers applied at various camera angle incidences. The single classifiers are built by matching 3D templates to the vehicle silhouettes with a local projection model that is compatible with the limited knowledge we have of the viewing-condition parameters. We assess the validity of our approach on a challenging dataset of images captured in real-world conditions.

Proceedings Article
17 Oct 2011
TL;DR: Adopting a high-resolution, optical, non-invasive measurement device from the area of surface quality measurement and combining it with pattern-recognition techniques might finally allow this important research challenge to be solved.
Abstract: Determining the age of a latent fingerprint trace found at a crime scene has been an unresolved research issue for decades. Adopting a high-resolution, optical, non-invasive measurement device (FRT-MicroProf 200 CWL 600) from the area of surface quality measurement and combining it with pattern-recognition techniques might allow us to finally solve this important research challenge.


Proceedings Article
17 Oct 2011
TL;DR: Gesture-based Human-Computer Interfaces constitute an interesting option for allowing surgeons to control such equipment without breaking asepsis-preservation rules.
Abstract: Asepsis preservation in operating rooms is essential for limiting patient infection by hospital-acquired diseases. For health reasons, surgeons may not be directly in contact with sterile equipment surrounding them, and must instead rely on assistants to interact with these in their place. Gesture-based Human-Computer Interfaces constitute an interesting option for allowing surgeons to control such equipment without breaking asepsis-preservation rules.

Proceedings Article
17 Oct 2011
TL;DR: In comparison with other advanced segmentation methods such as level sets and active contours, the proposed double thresholding was found to be the simplest strategy, with the shortest processing time as well as the highest accuracy.
Abstract: Image analysis of cancer cells is important for cancer diagnosis and therapy, because it is recognized as the most efficient and effective way to observe their proliferation. For adaptive and accurate cancer cell image segmentation, a double-threshold segmentation method is proposed in this paper. Based on a single gray-value histogram derived from the RGB color space, the double threshold, the key parameter of threshold segmentation, is fixed from a fitted curve of the RGB component histogram. Once reasonable thresholds are confirmed, binary segmentation based on the two thresholds is carried out, producing a binary image. With mathematical-morphology post-processing and division of the whole image, a better segmentation result is finally achieved. In comparison with other advanced segmentation methods such as level sets and active contours, the proposed double thresholding was found to be the simplest strategy, with the shortest processing time as well as the highest accuracy. The proposed method can be effectively used in the detection and recognition of cancer stem cells in images.
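A minimal sketch of the double-threshold-plus-morphology idea described above, assuming the two thresholds have already been fixed from the histogram fit; the threshold values, the structuring element and the synthetic image are placeholders, not the paper's fitted parameters.

```python
import numpy as np
from scipy import ndimage

def double_threshold_segment(gray, t_low, t_high, opening_size=3):
    """Keep pixels whose gray value lies between the two thresholds, then clean with morphology."""
    mask = (gray >= t_low) & (gray <= t_high)
    mask = ndimage.binary_opening(mask, structure=np.ones((opening_size,) * 2))
    mask = ndimage.binary_fill_holes(mask)
    return mask

# Synthetic "cell" image: a bright mid-range-intensity blob on a dark background.
rng = np.random.default_rng(4)
img = rng.normal(40, 10, (128, 128))
img[40:70, 50:90] = rng.normal(140, 10, (30, 40))
cells = double_threshold_segment(img, t_low=100, t_high=180)   # thresholds are assumptions
print(cells.sum(), "pixels labelled as cell")
```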