Showing papers in "International Journal of Pattern Recognition and Artificial Intelligence in 2009"


Journal ArticleDOI
TL;DR: This paper provides a review of the classification of imbalanced data regarding the application domains, the nature of the problem, the learning difficulties with standard classifier learning algorithms; the learning objectives and evaluation measures; the reported research solutions; and the class imbalance problem in the presence of multiple classes.
Abstract: Classification of data with an imbalanced class distribution suffers a significant drop in the performance attainable by most standard classifier learning algorithms, which assume a relatively balanced class distribution and equal misclassification costs. This paper provides a review of the classification of imbalanced data regarding: the application domains; the nature of the problem; the learning difficulties with standard classifier learning algorithms; the learning objectives and evaluation measures; the reported research solutions; and the class imbalance problem in the presence of multiple classes.

1,268 citations
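The point about evaluation measures can be illustrated with a minimal sketch. The data below is hypothetical (990 majority-class and 10 minority-class labels) and the function names are illustrative, not taken from the paper. It shows why plain accuracy can mislead on imbalanced data: a classifier that always predicts the majority class scores high accuracy while detecting no minority examples.

```python
def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def minority_recall(y_true, y_pred, positive=1):
    """Fraction of true minority (positive) examples that were detected."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical imbalanced data set: 990 majority (0) and 10 minority (1) labels.
y_true = [0] * 990 + [1] * 10
# A trivial classifier that always predicts the majority class.
y_pred = [0] * 1000

print("accuracy:", accuracy(y_true, y_pred))
print("minority recall:", minority_recall(y_true, y_pred))
```

Here the accuracy is high because the majority class dominates the count, while the minority recall is zero, which is one reason class-sensitive measures are usually preferred for such problems.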


Journal ArticleDOI
TL;DR: A novel object tracking algorithm is presented in this paper by using the joint color-texture histogram to represent a target and then applying it to the mean shift framework, which greatly improves the tracking accuracy and efficiency with fewer mean shift iterations than standard mean shift tracking.
Abstract: A novel object tracking algorithm is presented in this paper by using the joint color-texture histogram to represent a target and then applying it to the mean shift framework. Apart from the conventional color histogram features, the texture features of the object are also extracted by using the local binary pattern (LBP) technique to represent the object. The major uniform LBP patterns are exploited to form a mask for joint color-texture feature selection. Compared with the traditional color histogram based algorithms that use the whole target region for tracking, the proposed algorithm effectively extracts the edge and corner features in the target region, which characterize the target better and represent it more robustly. The experimental results validate that the proposed method greatly improves the tracking accuracy and efficiency, with fewer mean shift iterations than standard mean shift tracking. It can robustly track the target under complex scenes, such as similar target and background appearance, o...

207 citations
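As a rough illustration of the LBP technique mentioned in the abstract above, the sketch below computes the basic 8-neighbour LBP code of a pixel and checks whether the code is "uniform" (at most two 0/1 transitions around the circle). This is the textbook operator only; the paper's mask construction and joint color-texture histogram are not reproduced, and the function names and the tiny test image are assumptions made for the example.

```python
# Neighbour offsets (row, col), clockwise starting from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """Basic 8-neighbour LBP code of the interior pixel at (r, c)."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def is_uniform(code):
    """True if the circular 8-bit pattern has at most two 0/1 transitions."""
    transitions = 0
    for bit in range(8):
        current = (code >> bit) & 1
        following = (code >> ((bit + 1) % 8)) & 1
        if current != following:
            transitions += 1
    return transitions <= 2


# Hypothetical 3x3 grayscale patch with a bright top row (an edge-like structure).
patch = [
    [9, 9, 9],
    [1, 5, 1],
    [1, 1, 1],
]
code = lbp_code(patch, 1, 1)
print(code, is_uniform(code))
```

Edge-like and corner-like neighbourhoods tend to produce uniform codes, which is consistent with the abstract's remark that the selected patterns emphasize edge and corner features.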


Journal ArticleDOI
TL;DR: Texture is an important visual attribute used to describe the pixel organization in an image that is easily identified by humans, and its analysis process demands a high level of sophistic...
Abstract: Texture is an important visual attribute used to describe the pixel organization in an image. As well as being easily identified by humans, its analysis process demands a high level of sophistic...

149 citations


Journal ArticleDOI
TL;DR: Graphs provide us with a powerful and flexible representation formalism for pattern classification, and many classification algorithms have been proposed in the literature.
Abstract: Graphs provide us with a powerful and flexible representation formalism for pattern classification. Many classification algorithms have been proposed in the literature. However, the vast majority o...

94 citations


Journal ArticleDOI
TL;DR: A real, unique and pioneering video surveillance system for boat traffic monitoring, ARGOS, which has run continuously 24 hours a day, 7 days a week, in the city of Venice (Italy) since 2007 and is able to build a reliable background model of the water channel and to track the boats navigating the channel with good accuracy in real time.
Abstract: Visual surveillance in dynamic scenes is currently one of the most active research topics in computer vision, and many existing applications are available. However, difficulties in realizing effective video surveillance systems that are robust to the many different conditions that arise in real environments make the actual deployment of such systems very challenging. In this article, we present a real, unique and pioneering video surveillance system for boat traffic monitoring, ARGOS. The system has run continuously 24 hours a day, 7 days a week, in the city of Venice (Italy) since 2007, and it is able to build a reliable background model of the water channel and to track the boats navigating the channel with good accuracy in real time. A significant experimental evaluation, reported in this article, has been performed in order to assess the real performance of the system.

71 citations


Journal ArticleDOI
TL;DR: The literature on the topic has shown a strong correlation between the degree of precision of face localization and the face recognition performance.
Abstract: The literature on the topic has shown a strong correlation between the degree of precision of face localization and the face recognition performance. Hence, there is a need for precise facial featu...

53 citations


Journal ArticleDOI
TL;DR: This work proposes a novel approach for driver fatigue detection from facial image sequences, which is based on multiscale dynamic features.
Abstract: Driver fatigue is a significant factor in many traffic accidents. We propose a novel approach for driver fatigue detection from facial image sequences, which is based on multiscale dynamic features...

52 citations


Journal ArticleDOI
TL;DR: This paper describes the approach and presents a thorough empirical examination of the parameters needed to achieve practical levels of performance, including the size of the training database, size of the detector's receptive fields and methods for information integration.
Abstract: Localizing facial features is a critical component in many computer vision applications such as expression recognition, face recognition, face tracking, animation, and red-eye correction. Practical applications require detectors that operate reliably under a wide range of conditions, including variations in illumination, pose, ethnicity, gender and age. One challenge for the development of such detectors is the inherent trade-off between robustness and precision. Robust detectors tend to provide poor localization and detectors sensitive to small changes in local structure, which are needed for precise localization, generate a large number of false alarms. Here we present an approach to this trade-off based on context dependent inference. First, robust detectors are used to detect contexts in which target features occur, then precise detectors are trained to localize the features given the detected context. This paper describes the approach and presents a thorough empirical examination of the parameters needed to achieve practical levels of performance, including the size of the training database, size of the detector's receptive fields and methods for information integration. The approach operates in real time and achieves, to our knowledge, the most accurate localization performance to date.

51 citations


Journal ArticleDOI
TL;DR: In this article, a local dependency descriptor, called local dependency histogram (LDH), is proposed to model the spatial dependencies between a pixel and its neighboring pixels, and an effective approach to dynamic background subtraction is proposed in which each pixel is modeled as a group of weighted LDHs.
Abstract: Traditional background subtraction methods perform poorly when scenes contain dynamic backgrounds such as waving trees, spouting fountains, illumination changes, camera jitter, etc. In this paper, a novel and effective dynamic background subtraction method is presented with three contributions. First, we present a novel local dependency descriptor, called local dependency histogram (LDH), to effectively model the spatial dependencies between a pixel and its neighboring pixels. The spatial dependencies contain substantial evidence for dynamic background subtraction. Second, based on the proposed LDH, an effective approach to dynamic background subtraction is proposed, in which each pixel is modeled as a group of weighted LDHs. Labeling the pixel as foreground or background is done by comparing the new LDH computed in the current frame against its model LDHs. The model LDHs are adaptively updated by the new LDH. Finally, unlike traditional approaches which use a fixed threshold to define whether a pixel matches its model, an adaptive thresholding technique is also proposed. Experimental results on a diverse set of dynamic scenes validate that the proposed method significantly outperforms traditional methods for dynamic background subtraction.

47 citations


Journal ArticleDOI
TL;DR: A two-stage artificial neural network based general scheme to recognize pin-code numbers written in any of the two scripts and a water reservoir concept based feature to identify the script by which a word/city name is written is proposed.
Abstract: In this paper, we present a system towards Indian postal automation based on pin-code and city name recognition. Here, at first, using Run Length Smoothing Approach (RLSA), non-text blocks (postal stamp, postal seal, etc.) are detected and using positional information, Destination Address Block (DAB) is identified from postal documents. Next, lines and words of the DAB are segmented. In India, the address part of a postal document may be written by a combination of two scripts: Latin (English) and a local (State/region) script. It is very difficult to identify the script by which pin-code part is written. To overcome this problem on pin-code part, we have used a two-stage artificial neural network based general scheme to recognize pin-code numbers written in any of the two scripts. To identify the script by which a word/city name is written, we propose a water reservoir concept based feature. For recognition of city names, we propose an NSHP-HMM (Non-Symmetric Half Plane-Hidden Markov Model) based technique. At present, the accuracy of the proposed digit numeral recognition module is 93.14% while that of city name recognition scheme is 86.44%.

45 citations


Journal ArticleDOI
TL;DR: This paper proposes a missing-allowable (k, n) scheme which is fast and has a reasonable pixel expansion rate (per); it generates n extremely noisy shadow images for the given secret color image A, and any k out of these n shadows can recover A losslessly.
Abstract: In secret image sharing, a polynomial interpolation technique incurs a heavy computation load when the secret image is retrieved later. In contrast, fast approaches often need larger storage space due to the pixel expansion property. This paper proposes a missing-allowable (k, n) scheme which is fast and has a reasonable pixel expansion rate (per). The scheme generates n extremely noisy shadow images for the given secret color image A, and any k out of these n shadows can recover A losslessly. On average, to decode a color pixel of A, the retrieval uses only three exclusive-OR operations among 24-bit numbers. Hence, the new method has very fast decoding speed, and its pixel expansion rate is always acceptable (0 < per < 2).
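To give a feel for why XOR-based decoding on 24-bit numbers is cheap, here is a minimal sketch of a much simpler technique: plain (n, n) XOR sharing of a single 24-bit color pixel, where all n shares are needed. It is not the paper's missing-allowable (k, n) scheme, which the abstract does not specify in detail; the function names are hypothetical.

```python
import secrets


def make_shares(pixel, n):
    """Split a 24-bit pixel into n shares whose XOR equals the pixel.

    Simple (n, n) sharing: every share is required for recovery.
    """
    if n < 2:
        raise ValueError("need at least two shares")
    if not 0 <= pixel < 1 << 24:
        raise ValueError("pixel must be a 24-bit value")
    # The first n - 1 shares are uniformly random 24-bit numbers.
    shares = [secrets.randbits(24) for _ in range(n - 1)]
    # The last share is chosen so that the XOR of all shares is the pixel.
    last = pixel
    for share in shares:
        last ^= share
    shares.append(last)
    return shares


def recover(shares):
    """Recover the pixel by XOR-ing all shares together."""
    value = 0
    for share in shares:
        value ^= share
    return value


pixel = 0xAB12CD  # hypothetical RGB value packed into 24 bits
shares = make_shares(pixel, 4)
print(hex(recover(shares)))
```

Recovery costs n - 1 XOR operations per pixel in this sketch, compared with the polynomial interpolation that the abstract describes as computationally heavy.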

Journal ArticleDOI
TL;DR: A new facial biometric scheme is proposed in this paper and three steps are included, including a new nontensor product bivariate wavelet utilized to get different facial frequency components.
Abstract: A new facial biometric scheme is proposed in this paper. Three steps are included. First, a new nontensor product bivariate wavelet is utilized to get different facial frequency components. Then a ...

Journal ArticleDOI
TL;DR: In this paper, feature selection experiments for online handwriting recognition were described, and a set of 25 online and pseudo-offline features were investigated to find out which features are important.
Abstract: In this paper, we describe feature selection experiments for online handwriting recognition. We investigated a set of 25 online and pseudo-offline features to find out which features are important ...

Journal ArticleDOI
TL;DR: Intelligent surveillance refers to using Artificial Intelligence techniques in order to improve surveillance and deal with semantic information obtained from low-level security devices.
Abstract: Intelligent surveillance refers to using Artificial Intelligence techniques in order to improve surveillance and deal with semantic information obtained from low-level security devices. In this con...

Journal ArticleDOI
TL;DR: The MOST (Multiple Operators using Statistical Tests) framework that incrementally modifies the structure and checks for improvement using cross-validation is proposed and shows that MOST variants generally find simpler networks having lower or comparable error rates than DNC and CC.
Abstract: We define the problem of optimizing the architecture of a multilayer perceptron (MLP) as a state space search and propose the MOST (Multiple Operators using Statistical Tests) framework that incrementally modifies the structure and checks for improvement using cross-validation. We consider five variants that implement forward/backward search, using single/multiple operators and searching depth-first/breadth-first. On 44 classification and 30 regression datasets, we exhaustively search for the optimal and evaluate the goodness based on: (1) Order, the accuracy with respect to the optimal and (2) Rank, the computational complexity. We check for the effect of two resampling methods (5 × 2, ten-fold cv), four statistical tests (5 × 2 cv t, ten-fold cv t, Wilcoxon, sign) and two corrections for multiple comparisons (Bonferroni, Holm). We also compare with Dynamic Node Creation (DNC) and Cascade Correlation (CC). Our results show that: (1) On most datasets, networks with few hidden units are optimal, (2) forward searching finds simpler architectures, (3) variants using single node additions (deletions) generally stop early and get stuck in simple (complex) networks, (4) choosing the best of multiple operators finds networks closer to the optimal, (5) MOST variants generally find simpler networks having lower or comparable error rates than DNC and CC.

Journal ArticleDOI
TL;DR: This paper examines the classification capability of different Gabor representations for human face recognition by analyzing Gabor filter responses for eight orientations and five scales for each orie...
Abstract: This paper examines the classification capability of different Gabor representations for human face recognition. Usually, Gabor filter responses for eight orientations and five scales for each orie...

Journal ArticleDOI
TL;DR: This paper analyzes the behavior of the evolutionary prototype selection strategy, considering a complexity measure for classification problems based on overlapping, and analyzed different k values for the nearest neighbour classifier in this domain of study to see its influence on the results of PS methods.
Abstract: Evolutionary prototype selection has shown its effectiveness in the past in the prototype selection domain. It improves in most of the cases the results offered by classical prototype selection algorithms but its computational cost is expensive. In this paper, we analyze the behavior of the evolutionary prototype selection strategy, considering a complexity measure for classification problems based on overlapping. In addition, we have analyzed different k values for the nearest neighbour classifier in this domain of study to see its influence on the results of PS methods. The objective consists of predicting when the evolutionary prototype selection is effective for a particular problem, based on this overlapping measure.

Journal ArticleDOI
TL;DR: The proposed combined viewpoint selection and viewpoint fusion approach is able to significantly improve the recognition rates compared to passive object recognition with randomly chosen views.
Abstract: Object recognition problems in computer vision are often based on single image data processing. In various applications this processing can be extended to a complete sequence of images, usually received passively. In contrast, we propose a method for active object recognition, where a camera is selectively moved around a considered object. Doing so, we aim at reliable classification results with a clearly reduced amount of necessary views by optimizing the camera movement for the access of new viewpoints (viewpoint selection). Therefore, the optimization criterion is the gain of class discriminative information when observing the appropriate next image. We show how to apply an unsupervised reinforcement learning algorithm to that problem. Specifically, we focus on the modeling of continuous states, continuous actions and supporting rewards for an optimized recognition. We also present an algorithm for the sequential fusion of gathered image information and we combine all these components into a single framework. The experimental evaluations are split into results for synthetic and real objects with one- or two-dimensional camera actions, respectively. This allows the systematic evaluation of the theoretical correctness as well as the practical applicability of the proposed method. Our experiments showed that the proposed combined viewpoint selection and viewpoint fusion approach is able to significantly improve the recognition rates compared to passive object recognition with randomly chosen views.

Journal ArticleDOI
TL;DR: A new data model containing two main types of extracted video contents: physical objects and events is proposed, and a new rich and flexible query language is presented that works at different abstraction levels, provides both exact and approximate matching and takes into account users' interest.
Abstract: In this paper, we propose an approach for surveillance video indexing and retrieval. The objective of this approach is to answer five main challenges we have met in this domain: (1) the lack of means for finding data from the indexed databases, (2) the lack of approaches working at different abstraction levels, (3) imprecise indexing, (4) incomplete indexing, (5) the lack of user-centered search. We propose a new data model containing two main types of extracted video contents: physical objects and events. Based on this data model, we present a new rich and flexible query language. This language works at different abstraction levels, provides both exact and approximate matching and takes into account users' interest. In order to work with the imprecise indexing, two new methods respectively for object representation and object matching are proposed. Videos from two projects which have been partially indexed are used to validate the proposed approach. We have analyzed both query language usage and retrieval results. The obtained retrieval results analyzed by the average normalized ranks are promising. The retrieval results at the object level are compared with another state of the art approach.

Journal ArticleDOI
TL;DR: This work proposes compound diversity functions which combine the diversities with the performance of each individual classifier, and shows that there is a strong correlation between the proposed functions and ensemble accuracy.
Abstract: An effective way to improve a classification method's performance is to create ensembles of classifiers. Two elements are believed to be important in constructing an ensemble: (a) the performance of each individual classifier; and (b) diversity among the classifiers. Nevertheless, most works based on diversity suggest that there exists only weak correlation between classifier performance and ensemble accuracy. We propose compound diversity functions which combine the diversities with the performance of each individual classifier, and show that there is a strong correlation between the proposed functions and ensemble accuracy. Calculation of the correlations with different ensemble creation methods, different problems and different classification algorithms on 0.624 million ensembles suggests that most compound diversity functions are better than traditional diversity measures. The population-based Genetic Algorithm was used to search for the best ensembles on a handwritten numerals recognition problem and to evaluate 4224 million ensembles. The statistical results indicate that compound diversity functions perform better than traditional diversity measures, and are helpful in selecting the best ensembles.

Journal ArticleDOI
TL;DR: A feature-based scheme for detecting different genres of video shot transitions based on spatio-temporal analysis and model parameter estimation is proposed.
Abstract: In this paper, we propose a feature-based scheme for detecting different genres of video shot transitions based on spatio-temporal analysis and model parameter estimation. In feature extraction, th...

Journal ArticleDOI
TL;DR: A unified mathematical model of ICP based on Lie group representation is established and a fast algorithm by solving an iterative linear system is designed for the optimization problem on Lie groups.
Abstract: The iterative closest point (ICP) method is a dominant method for data registration that has attracted extensive attention. In this paper, a unified mathematical model of ICP based on Lie group representation is established. Under the framework, the registration problem is formulated into an optimization problem over a certain Lie group. In order to simplify the model and to reduce the dimension of parameter space, the translation part of geometric transformation is eliminated by calibrating the centers of two data sets under registration. As a result, a fast algorithm by solving an iterative linear system is designed for the optimization problem on Lie groups. Moreover, PCA and ICA methods are jointly applied to estimate the initial registration to achieve the global minimum. Finally, several illustrations and comparison experiments are presented to test the performance of the proposed algorithm.

Journal ArticleDOI
TL;DR: An integrated face recognition system that combines a Maximum Likelihood estimator with Gabor graphs for face detection under varying scale and in-plane rotation and matching as wel...
Abstract: We present an integrated face recognition system that combines a Maximum Likelihood (ML) estimator with Gabor graphs for face detection under varying scale and in-plane rotation and matching as wel...

Journal ArticleDOI
TL;DR: This paper proposes a new methodology to extract biometric features of plant leaf structures by combining computer vision techniques and plant taxonomy protocols.
Abstract: This paper proposes a new methodology to extract biometric features of plant leaf structures. Combining computer vision techniques and plant taxonomy protocols, these methods are capable of identif...

Journal ArticleDOI
TL;DR: This paper addresses the problem of dynamic time warping (DTW) causing unintended matching correspondences when it is employed for online two-dimensional handwriting signals, and proposes a solution to this problem.
Abstract: This paper addresses the problem of dynamic time warping (DTW) causing unintended matching correspondences when it is employed for online two-dimensional (2D) handwriting signals, and proposes the ...
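For context, the standard DTW recurrence that the abstract refers to can be sketched as follows for sequences of 2D pen positions. This is the classic unconstrained algorithm, not the modification the paper proposes (the abstract is truncated before describing it), and the sample strokes are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two sequences of 2D points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical strokes: the second repeats one sample of the first.
stroke_a = [(0, 0), (1, 1), (2, 2)]
stroke_b = [(0, 0), (1, 1), (1, 1), (2, 2)]
print(dtw_distance(stroke_a, stroke_b))
```

Because each point may be matched to several points of the other sequence, the warping can produce correspondences that are cheap in cost but not meaningful for handwriting, which is the kind of issue the abstract raises.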

Journal ArticleDOI
TL;DR: In the field of handwriting recognition, classifier combination received much more interest than the study of powerful individual classifiers.
Abstract: In the field of handwriting recognition, classifier combination received much more interest than the study of powerful individual classifiers. This is mainly due to the enormous variability among t...

Journal ArticleDOI
TL;DR: Blood cell classification is widely used in biomedical research, and appropriate separation between touching and overlapping blood cells is of great importance for successful classification.
Abstract: Blood cell classification is widely used in the biomedical research. Appropriate separation between touching and overlapping blood cells is of great importance for the successful classification. In...

Journal ArticleDOI
TL;DR: A metric which induces a conceptually correct topology for periodic attributes is embedded into the K-means algorithm, which requires solving a non-convex minimization problem in the maximization step.
Abstract: The K-means algorithm is very popular in the machine learning community due to its inherent simplicity. However, in its basic form, it is not suitable for use in problems which contain periodic attributes, such as oscillator phase, hour of day or directional heading. A commonly used technique of trigonometrically encoding periodic input attributes to artificially generate the required topology introduces a systematic error. In this paper, a metric which induces a conceptually correct topology for periodic attributes is embedded into the K-means algorithm. This requires solving a non-convex minimization problem in the maximization step. Results of numerical experiments comparing the proposed algorithm to K-means with trigonometric encoding on synthetically generated data are reported. The advantage of using the proposed K-means algorithm is also shown on a real example using gas load data to build simple predictive models.
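The difficulty with periodic attributes can be illustrated with a small sketch of a wrap-around distance. This is a generic circular distance chosen for illustration; the abstract does not give the paper's metric, so the function name and the examples are assumptions.

```python
def circular_distance(a, b, period):
    """Shortest distance between two values of an attribute with the given period."""
    d = abs(a - b) % period
    return min(d, period - d)


# Hour of day: 23:00 and 01:00 are two hours apart, not twenty-two.
print(circular_distance(23, 1, 24))
# Directional heading in degrees: 350 and 10 are twenty degrees apart.
print(circular_distance(350, 10, 360))
```

With a plain absolute difference, 23:00 and 01:00 would appear far apart, so a K-means variant for periodic data needs both a distance of this kind and a matching way to compute cluster centers on the circle.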

Journal ArticleDOI
TL;DR: Automatic face recognition is a challenging problem in the biometrics area, where the dimension of the sample space is typically larger than the number of samples in the training set.
Abstract: Automatic face recognition is a challenging problem in the biometrics area, where the dimension of the sample space is typically larger than the number of samples in the training set and consequent...

Journal ArticleDOI
TL;DR: The paper presents mathematical underpinnings of the locally linear embedding technique for data dimensionality reduction and shows that a cogent framework for describing the method is that of optimization on a Grassmann manifold.
Abstract: The paper presents mathematical underpinnings of the locally linear embedding technique for data dimensionality reduction. It is shown that a cogent framework for describing the method is that of optimization on a Grassmann manifold. The solution delivered by the algorithm is characterized as a constrained minimizer for a problem in which the cost function and all the constraints are defined on such a manifold. The role of the internal gauge symmetry in solving the underlying optimization problem is illuminated.