
Showing papers in "Pattern Recognition Letters in 2001"


Journal ArticleDOI
TL;DR: The advantages and shortcomings of the performance measures currently used in CBIR are discussed, and proposals for a standard test suite, similar to that used in IR at the annual Text REtrieval Conference (TREC), are presented.
Abstract: Evaluation of retrieval performance is a crucial problem in content-based image retrieval (CBIR). Many different methods for measuring the performance of a system have been created and used by researchers. This article discusses the advantages and shortcomings of the performance measures currently used. Problems such as defining a common image database for performance comparisons and a means of getting relevance judgments (or ground truth) for queries are explained. The relationship between CBIR and information retrieval (IR) is made clear, since IR researchers have decades of experience with the evaluation problem. Many of their solutions can be used for CBIR, despite the differences between the fields. Several methods used in text retrieval are explained. Proposals for performance measures and means of developing a standard test suite for CBIR, similar to that used in IR at the annual Text REtrieval Conference (TREC), are presented.

598 citations
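
The precision and recall measures borrowed from IR, which the article discusses for CBIR, are easy to state concretely. Below is a minimal sketch, assuming hypothetical inputs `ranked_ids` (retrieved images, best first) and `relevant_ids` (the ground-truth relevance judgments for the query); neither name comes from the article.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Return (precision@k, recall@k) for a single query."""
    retrieved = ranked_ids[:k]
    hits = sum(1 for img in retrieved if img in relevant_ids)
    return hits / k, hits / len(relevant_ids)

# Example: 3 of the top-5 results are relevant, out of 4 relevant images total.
p, r = precision_recall_at_k(["a", "b", "c", "d", "e"], {"a", "c", "e", "z"}, k=5)
print(p, r)  # 0.6 0.75
```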


Journal ArticleDOI
TL;DR: The effects of five feature normalization methods on retrieval performance are discussed and two likelihood ratio-based similarity measures that perform significantly better than the commonly used geometric approaches like the Lp metrics are described.
Abstract: Distance measures like the Euclidean distance are used to measure similarity between images in content-based image retrieval. Such geometric measures implicitly assign more weighting to features with large ranges than those with small ranges. This paper discusses the effects of five feature normalization methods on retrieval performance. We also describe two likelihood ratio-based similarity measures that perform significantly better than the commonly used geometric approaches like the Lp metrics.

450 citations
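
The abstract does not list the five normalization methods, but two of the most common ones illustrate why normalization matters for geometric distance measures: without it, a feature spanning [0, 500] dominates one spanning [1, 3]. A minimal sketch (these two methods are standard choices, not necessarily the paper's five):

```python
import numpy as np

def min_max_normalize(X):
    """Scale each feature (column) to the range [0, 1]."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    return (X - mn) / (mx - mn)

def z_score_normalize(X):
    """Shift each feature to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

X = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 500.0]])
print(min_max_normalize(X))  # both columns now span [0, 1]
print(z_score_normalize(X))  # both columns now have unit variance
```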


Journal ArticleDOI
TL;DR: A two-phase clustering algorithm for outlier detection is proposed, which first modifies the traditional k-means algorithm in Phase 1 with the heuristic “if a new input pattern is far enough away from all cluster centers, assign it as a new cluster center”.
Abstract: In this paper, a two-phase clustering algorithm for outlier detection is proposed. In Phase 1, we modify the traditional k-means algorithm with the heuristic “if a new input pattern is far enough away from all cluster centers, assign it as a new cluster center”. As a result, the points in the same cluster are most likely either all outliers or all non-outliers. In Phase 2, we construct a minimum spanning tree (MST) and remove its longest edge. The small clusters, i.e., the subtrees with fewer nodes, are selected and regarded as outliers. Experimental results show that the process works well.

345 citations
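
A minimal sketch of the Phase 1 heuristic, under assumed parameters: a point farther than `threshold` (a hypothetical parameter) from every existing cluster center starts a new cluster; otherwise it joins, and updates, the nearest one. Phase 2 (building an MST, e.g. over the cluster centers, and cutting the longest edge to isolate the small clusters as outliers) is omitted for brevity.

```python
import numpy as np

def heuristic_kmeans(points, threshold):
    centers, members = [], []
    for p in points:
        if not centers:
            centers.append(p.astype(float)); members.append([p]); continue
        dists = [np.linalg.norm(p - c) for c in centers]
        i = int(np.argmin(dists))
        if dists[i] > threshold:      # far from all centers: start a new cluster
            centers.append(p.astype(float)); members.append([p])
        else:                         # join the nearest cluster and update its mean
            members[i].append(p)
            centers[i] = np.mean(members[i], axis=0)
    return centers, members

pts = np.array([[0, 0], [0.1, 0.2], [0.2, 0.1], [5, 5]])
centers, clusters = heuristic_kmeans(pts, threshold=1.0)
print(len(clusters))  # 2: one dense cluster and one isolated outlier candidate
```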


Journal ArticleDOI
TL;DR: This work describes a scheme that is able to classify audio segments into seven categories consisting of silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise, and shows that cepstral-based features such as the Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC) provide better classification accuracy than temporal and spectral features.
Abstract: In this paper, we address the problem of classifying continuous general audio data (GAD) for content-based retrieval, and describe a scheme that is able to classify audio segments into seven categories: silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise. We studied a total of 143 classification features for their discrimination capability. Our study shows that cepstral-based features such as the Mel-frequency cepstral coefficients (MFCC) and linear prediction coefficients (LPC) provide better classification accuracy than temporal and spectral features. To minimize classification errors near the boundaries between audio segments of different types, a segmentation–pooling scheme is also proposed; it yields classification results that are consistent with human perception. Our classification system provides over 90% accuracy at a processing speed dozens of times faster than the playing rate.

315 citations
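
The abstract does not spell out the segmentation–pooling scheme, but a plausible reading is that frame-level decisions are smoothed so that isolated misclassifications near segment boundaries are suppressed. A minimal sketch of that idea via a sliding-window majority vote (an illustrative simplification, not necessarily the paper's procedure):

```python
from collections import Counter

def majority_pool(labels, window=5):
    """Replace each frame label by the majority label in a centered window."""
    half = window // 2
    return [Counter(labels[max(0, i - half): i + half + 1]).most_common(1)[0][0]
            for i in range(len(labels))]

frames = ["music"] * 6 + ["speech"] + ["music"] * 3 + ["speech"] * 8
print(majority_pool(frames))  # the isolated "speech" frame is smoothed away
```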


Journal ArticleDOI
TL;DR: A new standard is currently being developed, the JPEG2000, which is not only intended to provide rate-distortion and subjective image quality performance superior to existing standards, but also to provide functionality that current standards can either not address efficiently or not address at all.
Abstract: With the increasing use of multimedia technologies, image compression requires higher performance as well as new features. To address this need in the specific area of still image encoding, a new standard is currently being developed, the JPEG2000. It is not only intended to provide rate-distortion and subjective image quality performance superior to existing standards, but also to provide functionality that current standards can either not address efficiently or not address at all.

269 citations


Journal ArticleDOI
TL;DR: Compared with linear classification methods, the HMM-based classification improves the online experiment and the temporal determination of the minimal classification error.
Abstract: Hidden Markov models (HMMs) are presented for the online classification of single-trial EEG data during imagination of a left or right hand movement. Compared with linear classification methods, this classification improves the online experiment and the temporal determination of the minimal classification error.

251 citations


Journal ArticleDOI
TL;DR: Experimental results indicate that the incorporation of colour information enhances the performance of the texture analysis techniques examined, and the classification accuracy is determined using a neural network classifier based on Learning Vector Quantization.
Abstract: In this paper we focus on the classification of colour texture images. The main objective is to determine the contribution of colour information to the overall classification performance. Three relevant approaches to grey-scale texture analysis, namely local linear transforms, Gabor filtering and the co-occurrence approach, are extended to colour images. They are evaluated in a quantitative manner by means of a comparative experiment on a set of colour images. We also investigate the effect of using different colour spaces and the contribution of colour and texture features separately and collectively. The evaluation criterion is the classification accuracy achieved with a neural network classifier based on Learning Vector Quantization. Experimental results indicate that the incorporation of colour information enhances the performance of the texture analysis techniques examined.

230 citations


Journal ArticleDOI
TL;DR: A new graph distance metric is proposed for measuring similarities between objects represented by attributed relational graphs that can be computed by a straightforward extension of any algorithm that implements error-correcting graph matching, when run under an appropriate cost function, and the extension only takes time linear in the size of the graphs.
Abstract: The relationship between two important problems in pattern recognition using attributed relational graphs, the maximum common subgraph and the minimum common supergraph of two graphs, is established by means of simple constructions, which make it possible to obtain the maximum common subgraph from the minimum common supergraph, and vice versa. On this basis, a new graph distance metric is proposed for measuring similarities between objects represented by attributed relational graphs. The proposed metric can be computed by a straightforward extension of any algorithm that implements error-correcting graph matching, when run under an appropriate cost function, and the extension only takes time linear in the size of the graphs.

214 citations
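
The constructions mentioned in the abstract relate the sizes of the two graphs involved: overlaying two graphs on their maximum common subgraph yields a minimum common supergraph, giving |MCS(g1, g2)| = |g1| + |g2| - |mcs(g1, g2)|. A minimal numeric sketch of this identity, assuming graph size is measured by node count (computing the mcs itself is the expensive, error-correcting-matching part):

```python
def mcs_size_from_supergraph(size_g1, size_g2, size_MCS):
    """|mcs(g1, g2)| = |g1| + |g2| - |MCS(g1, g2)|, and vice versa."""
    return size_g1 + size_g2 - size_MCS

# e.g. graphs with 5 and 7 nodes whose minimum common supergraph has 9 nodes
# share a maximum common subgraph of 5 + 7 - 9 = 3 nodes.
print(mcs_size_from_supergraph(5, 7, 9))  # 3
```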


Journal ArticleDOI
TL;DR: Experiments show that the factors of PTF are easier to interpret than those produced by methods based on the singular value decomposition, which might contain negative values.
Abstract: A novel fixed point algorithm for positive tensor factorization (PTF) is introduced. The update rules efficiently minimize the reconstruction error of a positive tensor over positive factors. Tensors of arbitrary order can be factorized, which extends earlier results in the literature. Experiments show that the factors of PTF are easier to interpret than those produced by methods based on the singular value decomposition, which might contain negative values. We also illustrate the tendency of PTF to generate sparsely distributed codes.

191 citations
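
The abstract does not give the update rules, but the second-order (matrix) special case of positive factorization is well known: multiplicative updates that reduce the reconstruction error ||V - WH||² while keeping both factors positive. A minimal sketch of that special case for intuition (the paper's fixed-point rules extend this to tensors of arbitrary order):

```python
import numpy as np

def positive_factorization(V, r, iters=500, eps=1e-9):
    """Factor a nonnegative matrix V into positive factors W (n x r), H (r x m)."""
    n, m = V.shape
    rng = np.random.default_rng(0)
    W, H = rng.random((n, r)) + eps, rng.random((r, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # multiplicative updates keep the
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # factors nonnegative throughout
    return W, H

V = np.array([[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [0.0, 0.0, 3.0]])  # rank 2
W, H = positive_factorization(V, r=2)
print(np.abs(V - W @ H).max())  # reconstruction error close to zero
```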


Journal ArticleDOI
TL;DR: It is proved that these combination rules are equivalent when using two classifiers and the sum of the estimates of the a posteriori probabilities is equal to one.
Abstract: This paper presents a comparative study of the performance of arithmetic and geometric means as rules to combine multiple classifiers. For problems with two classes, we prove that these combination rules are equivalent when using two classifiers and the sum of the estimates of the a posteriori probabilities is equal to one. We also prove that the case of a two class problem and a combination of two classifiers is the only one where such equivalence occurs. We present experiments illustrating the equivalence of the rules under the above mentioned assumptions.

173 citations
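
The claimed equivalence can be checked directly. With class-1 estimates p and q from the two classifiers (and 1-p, 1-q for class 2), the arithmetic mean favors class 1 iff (p+q)/2 > ((1-p)+(1-q))/2, i.e. p+q > 1, while the geometric mean favors class 1 iff pq > (1-p)(1-q), which expands to 0 > 1 - p - q, the same condition. A small numeric demonstration:

```python
def arithmetic_rule(p, q):
    return 1 if (p + q) / 2 > ((1 - p) + (1 - q)) / 2 else 2

def geometric_rule(p, q):
    return 1 if p * q > (1 - p) * (1 - q) else 2

# The two rules agree for any pair of class-1 posterior estimates:
for p, q in [(0.9, 0.2), (0.6, 0.55), (0.3, 0.4), (0.8, 0.1)]:
    assert arithmetic_rule(p, q) == geometric_rule(p, q)
print("rules agree on all test pairs")
```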


Journal ArticleDOI
TL;DR: The acquisition system and the 3D and grey-level comparison algorithms were designed to be integrated in security applications in which individuals cooperate, and the residual error after 3D matching is used as a first similarity measure.
Abstract: In this paper we address automatic face verification from 3D facial surface and grey-level analysis. 3D acquisition is performed by a structured light system adapted to face capture and allowing grey-level acquisition in alignment with the 3D data. The 3D facial shapes are compared, and the residual error after 3D matching is used as a first similarity measure. A second similarity measure is derived from grey-level comparison. As expected, fusing 3D and intensity information improves verification performance. The acquisition system and the 3D and grey-level comparison algorithms were designed to be integrated in security applications in which individuals cooperate.

Journal ArticleDOI
TL;DR: This paper investigates a method for two-dimensional image fusion based on a novel multi-resolution transform called steerable pyramids, which combines the multi-scale decomposition with differential measurements, which is very useful for feature extraction.
Abstract: This paper investigates a method for two-dimensional image fusion based on a novel multi-resolution transform called steerable pyramids. Such a transform combines the multi-scale decomposition with differential measurements, which is very useful for feature extraction. An iterative fusion scheme is introduced.

Journal ArticleDOI
TL;DR: An existing graph distance metric based on the maximum common subgraph is extended by defining the problem size in terms of the union of the two graphs being measured, rather than the larger of the two graphs as in the existing metric.
Abstract: An existing graph distance metric based on the maximum common subgraph has been extended by defining the problem size in terms of the union of the two graphs being measured, rather than the larger of the two graphs used in the existing metric. For some applications the graph distance measure is more appropriate when the graph union approach is used. This graph distance measure is shown to be a metric.
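
A minimal sketch of the extended measure, assuming graph size is measured by node count and the maximum common subgraph size is already known (computing it is the expensive part); since |g1 ∪ g2| = |g1| + |g2| - |mcs|, the change amounts to replacing the denominator of the existing metric:

```python
def union_graph_distance(size_g1, size_g2, size_mcs):
    # 1 - |mcs| / |g1 ∪ g2| instead of 1 - |mcs| / max(|g1|, |g2|)
    return 1.0 - size_mcs / (size_g1 + size_g2 - size_mcs)

print(union_graph_distance(5, 7, 3))  # 1 - 3/9 ≈ 0.667
print(union_graph_distance(5, 5, 5))  # identical graphs: 0.0
```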

Journal ArticleDOI
TL;DR: A novel singular value decomposition (SVD)- and vector quantization (VQ)-based image hiding scheme is presented, achieving a good compression ratio and satisfactory image quality.
Abstract: This paper presents a novel singular value decomposition (SVD)- and vector quantization (VQ)-based scheme for hiding image data. By plugging the VQ technique into the SVD-based compression method, the proposed scheme achieves a good compression ratio and satisfactory image quality. Experimental results show that the stego-image is visually indistinguishable from the original image.

Journal ArticleDOI
TL;DR: A comparison between the new deslanting technique and the method proposed by Bozinovic and Srihari was made by measuring the performance of both methods within a word recognition system tested on different databases.
Abstract: This paper presents new techniques for slant and slope removal in cursive handwritten words. Both methods require neither heuristics nor parameter tuning, which avoids the heavy experimental effort of finding the optimal configuration of a parameter set through a long exploration of the parameter space. A comparison between the new deslanting technique and the method proposed by Bozinovic and Srihari was made by measuring the performance of both methods within a word recognition system tested on different databases. The proposed technique is shown to improve the recognition rate by 10.8% relative to traditional normalization methods.

Journal ArticleDOI
TL;DR: Experiments show that the new features proposed can catch salient edge/structure information and improve the retrieval performance and are more generally applicable than texture or shape features.
Abstract: This paper proposes structural features for content-based image retrieval (CBIR), especially edge/structure features extracted from edge maps. The feature vector is computed through a “water-filling algorithm” applied on the edge map of the original image. The purpose of this algorithm is to efficiently extract information embedded in the edges. The new features are more generally applicable than texture or shape features. Experiments show that the new features can catch salient edge/structure information and improve the retrieval performance.

Journal ArticleDOI
TL;DR: A lower bound on the box size is derived, along with the reason for it; the study indicates the need to limit box sizes within certain bounds.
Abstract: Fractal geometry has gradually established its importance in the study of image characteristics. There are many techniques to estimate the dimensions of fractal surfaces. A famous technique for calculating fractal dimension is the grid dimension method, popularly known as the box-counting method. In this paper, we derive a lower bound on the box size and provide the reason for it. The study indicates the need to limit box sizes within certain bounds.
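
For concreteness, here is a minimal box-counting sketch for a binary image: count the occupied boxes N(s) at several box sizes s and estimate the fractal dimension as the slope of log N(s) versus log(1/s). The list of sizes below is an illustrative choice; the paper's contribution is precisely a lower bound on which sizes are usable.

```python
import numpy as np

def box_count(img, s):
    """Number of s-by-s boxes containing at least one foreground pixel."""
    h, w = img.shape
    return sum(img[i:i + s, j:j + s].any()
               for i in range(0, h, s) for j in range(0, w, s))

img = np.zeros((64, 64), dtype=bool)
img[np.arange(64), np.arange(64)] = True        # a diagonal line, dimension ~1
sizes = [2, 4, 8, 16]
counts = [box_count(img, s) for s in sizes]
slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
print(counts, round(slope, 2))                  # slope close to 1
```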

Journal ArticleDOI
TL;DR: This paper presents a novel, information-theoretic algorithm for feature selection, which finds an optimal set of attributes by removing both irrelevant and redundant features and is applicable to datasets of a mixed nature.
Abstract: Feature selection is used to improve the efficiency of learning algorithms by finding an optimal subset of features. However, most feature selection techniques can handle only certain types of data. Additional limitations of existing methods include intensive computational requirements and inability to identify redundant variables. In this paper, we present a novel, information-theoretic algorithm for feature selection, which finds an optimal set of attributes by removing both irrelevant and redundant features. The algorithm has polynomial computational complexity and is applicable to datasets of a mixed nature. The method's performance is evaluated on several benchmark datasets using a standard classifier (C4.5).
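
The abstract does not give the algorithm's details, but the information-theoretic ingredient such methods build on is mutual information between a feature and the class: irrelevant features score near zero, and redundancy can be tested with conditional variants of the same quantity. A minimal sketch for discrete data (illustrative only, not the paper's algorithm):

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """I(X;Y) in nats for two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * np.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

y  = [0, 0, 0, 0, 1, 1, 1, 1]
x1 = [0, 0, 0, 0, 1, 1, 1, 1]     # perfectly informative feature
x2 = [0, 1, 0, 1, 0, 1, 0, 1]     # irrelevant feature
print(mutual_information(x1, y))  # ~0.693 (= ln 2)
print(mutual_information(x2, y))  # 0.0
```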

Journal ArticleDOI
TL;DR: A mixture-of-Gaussians modeling of the color space provides a robust representation that can accommodate large color variations, as well as highlights and shadows, in face-color modeling and segmentation.
Abstract: In this paper, we propose a general methodology for face-color modeling and segmentation. One of the major difficulties in face detection and retrieval is partial face extraction due to highlights, shadows and lighting variations. We show that a mixture-of-Gaussians modeling of the color space provides a robust representation that can accommodate large color variations, as well as highlights and shadows. Our method makes it possible to segment within-face regions and associate semantic meaning with them, and it provides statistical analysis and evaluation of the dominant variability within a given archive.
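
A minimal sketch of the modeling idea using scikit-learn's GaussianMixture, with hypothetical toy data standing in for color samples from face regions (the paper's color space, component count, and training data are not specified in the abstract):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
skin_pixels = np.vstack([                            # toy stand-in data
    rng.normal([190, 140, 120], 10, size=(500, 3)),  # "lit skin" mode
    rng.normal([120, 80, 70], 10, size=(500, 3)),    # "shadowed skin" mode
])

# Fit a mixture of Gaussians to the color distribution of face pixels.
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(skin_pixels)

# Score new pixels; high log-likelihood = face-colored. Threshold is illustrative.
test = np.array([[185.0, 138.0, 118.0], [30.0, 200.0, 40.0]])  # skin-like, green
log_lik = gmm.score_samples(test)
print(log_lik)           # the skin-like pixel scores far higher
print(log_lik > -15.0)   # thresholding yields a segmentation mask
```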

Journal ArticleDOI
TL;DR: Comparisons with other text location methods indicate that the proposed system achieves better accuracy.
Abstract: This paper proposes a neural network-based method for locating text in complex color images. Texture information extracted from several color bands using neural networks is combined, and corresponding text location algorithms are then developed. Text extraction filters can be constructed automatically using neural networks. Comparisons with other text location methods indicate that the proposed system achieves better accuracy.

Journal ArticleDOI
TL;DR: It is argued that the use of predictive accuracy for basic probability assignments can improve the overall system performance when compared to "traditional" mass assignment techniques.
Abstract: This paper is concerned with the use of Dempster–Shafer theory in "fusion" classifiers. We argue that the use of predictive accuracy for basic probability assignments can improve the overall system performance when compared to "traditional" mass assignment techniques. We demonstrate the effectiveness of this approach in a case study involving the detection of static thermostatic valve faults in a diesel engine cooling system.

Journal ArticleDOI
TL;DR: The proposed online system distinguishes crop from weeds based on multi-spectral reflectance gathered with an imaging spectrograph; under field conditions, the achieved recognition rates would allow herbicide reductions of up to 90%.
Abstract: The proposed online system distinguishes crop from weeds based on multi-spectral reflectance gathered with an imaging spectrograph. Under field conditions, up to 86% of the vegetation samples (80% of crop, 91% of weed) were recognized, allowing herbicide reductions of up to 90%.

Journal ArticleDOI
TL;DR: This work addresses the same problem using the framework of heuristic search strategies to find the shortest path in a graph, and shows that the complexity of the algorithm is close to O(P²).
Abstract: In 1994, Perez and Vidal proposed an optimal algorithm for the polygonal approximation of digitized curves. The complexity of their algorithm is O(P²S), where P is the number of points and S is the number of segments. We address the same problem using the framework of heuristic search strategies to find the shortest path in a graph, and show that the complexity of our algorithm is close to O(P²).

Journal ArticleDOI
Heung-Soo Kim, Jong-Hwan Kim
TL;DR: The experimental results demonstrate that the proposed two-step circle detection algorithm using pairs of chords can detect the circles effectively.
Abstract: This paper proposes a two-step circle detection algorithm using pairs of chords. It is shown how a pair of two intersecting chords locates the center of the circle. Based on this idea, in the first step, a 2D Hough transform (HT) method is employed to find the centers of the circles in the image. In the second step, a 1D radius histogram is used to compute the radii. The experimental results demonstrate that the proposed method can detect the circles effectively.

Journal ArticleDOI
TL;DR: A fast and novel technique for color quantization using reduction of color space dimensionality and a fast pixel mapping algorithm based on the proposed data clustering algorithm are presented.
Abstract: This paper describes a fast and novel technique for color quantization using reduction of color space dimensionality. The color histogram is repeatedly sub-divided into smaller and smaller classes. The colors of each class are projected on a carefully selected line, chosen so that the color dissimilarities are preserved. Instead of using the principal axis of each class, the line is defined by the mean color vector and the color at the largest distance from the mean color. The vector composed of the projection values of each class is then used to cluster the colors into two representative palette colors. As a result, the computation in the quantization process is fast. A fast pixel mapping algorithm based on the proposed data clustering algorithm is also presented in this paper. Experimental results show that the proposed algorithms efficiently quantize images with high image quality.
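
A minimal sketch of the projection step described above: project the colors of a class onto the line through the class mean and the color farthest from that mean, then split on the projection values (here simply at the mean; the paper's rule for forming the two representative colors may differ). The recursive subdivision and the fast pixel mapping are omitted.

```python
import numpy as np

def split_class(colors):
    """Split an (N, 3) array of colors into two groups along the chosen line."""
    mean = colors.mean(axis=0)
    far = colors[np.argmax(np.linalg.norm(colors - mean, axis=1))]
    axis = (far - mean) / np.linalg.norm(far - mean)
    proj = (colors - mean) @ axis                  # scalar projection per color
    return colors[proj <= 0], colors[proj > 0]

colors = np.array([[250, 10, 10], [240, 20, 15],
                   [10, 10, 250], [20, 5, 240]], dtype=float)
reds, blues = split_class(colors)
print(reds.mean(axis=0), blues.mean(axis=0))       # two representative colors
```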

Journal ArticleDOI
TL;DR: This paper presents a complete system for content-based retrieval and browsing of news reports; the annotation of the video stream is fully automated and is based both on visual features extracted from video shots and on textual strings extracted from captions and audio tracks.
Abstract: Effective retrieval and browsing of videos by content relies on the association of high-level information with the visual data. Automatic extraction of high-level content descriptors requires exploiting the technical characteristics of each video type. This paper presents a complete system for content-based retrieval and browsing of news reports; the annotation of the video stream is fully automated and is based both on visual features extracted from video shots and on textual strings extracted from captions and audio tracks.

Journal ArticleDOI
TL;DR: The scale-space image of the distance accumulation shows that its zero crossings are quite stable, and the analysis of its relation to planar curvature matches experimental results very well.
Abstract: In this paper we present a method of calculating a property – which can be regarded as a discrete curvature – of planar digital boundaries. Chord-to-point distance accumulation is computed by accumulating the distance from a point on the boundary to a chord specified by moving end points. Depending on the shape of the boundary, positive or negative distances are obtained, and the values are accumulated as the chord is moved. Compared with planar curvature, the distance accumulation is robust with respect to changes of chord length. The scale-space image of the distance accumulation shows that its zero crossings are quite stable. Experimental results with simulated and real images demonstrate its robustness, and the analysis of its relation to planar curvature matches the experimental results very well.
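
A minimal sketch of the accumulation itself, under one concrete reading of the abstract: for each boundary point, sum the signed perpendicular distances from that point to every chord of fixed length `L` (an assumed parameter) that spans it; the sign convention depends on the boundary orientation. On a circle the result is roughly constant, matching constant curvature.

```python
import numpy as np

def distance_accumulation(boundary, L=10):
    """Chord-to-point distance accumulation for a closed (N, 2) boundary."""
    n = len(boundary)
    acc = np.zeros(n)
    for i in range(n):
        p = boundary[i]
        for j in range(i - L + 1, i):              # all length-L chords spanning i
            a, b = boundary[j % n], boundary[(j + L) % n]
            d, v = b - a, p - a
            # signed perpendicular distance via the 2D cross product
            acc[i] += (d[0] * v[1] - d[1] * v[0]) / np.linalg.norm(d)
    return acc

t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.column_stack([np.cos(t), np.sin(t)])
print(distance_accumulation(circle, L=10).round(3)[:5])  # ~constant values
```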

Journal ArticleDOI
TL;DR: This paper presents a machine-printed and hand-written text classification scheme for Bangla and Devnagari, the two most popular Indian scripts, which has an accuracy of 98.6%.
Abstract: There are many types of documents in which machine-printed and hand-written texts appear intermixed. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, to achieve optimal performance it is necessary to separate these two types of texts before feeding them to their respective OCR systems. In this paper, we present a machine-printed and hand-written text classification scheme for Bangla and Devnagari, the two most popular Indian scripts. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of 98.6%.

Journal ArticleDOI
TL;DR: A new vector median filter suitable for colour image processing is presented, based on a new ordering of vectors in the HSV colour space, which shows promising results in terms of colour image restoration.
Abstract: In this paper a new vector median filter suitable for colour image processing is presented. It is based on a new ordering of vectors in the HSV colour space. Illustrative and comparative examples of degraded colour image restoration are also provided.

Journal ArticleDOI
TL;DR: This paper proposes a novel method to detect scene cuts adaptively using a difference metric based on the color histograms of successive frames of a video sequence, and applies a refinement procedure to remove false detections.
Abstract: In many applications such as video browsing, indexing of relevant scenes in a video sequence is important for their efficient retrieval. Such indexing is most commonly done by identifying scene cuts, which represent the boundaries between video shots. Scene cut detection involves identifying frames at which the content of the scene differs significantly from that of the previously retained frames. This requires computing an appropriate metric that characterizes the change in video content between two frames, and a threshold to determine whether the change is large enough for the frame to be declared a scene cut. In this paper, we propose a novel method to detect scene cuts adaptively using a difference metric based on the color histograms of successive frames of a video sequence. An entropic thresholding method is used to obtain the threshold automatically for identifying scene cuts. We further apply a refinement procedure to remove false detections. Experimental results are presented to illustrate the good performance of the method.
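
A minimal sketch of the frame-difference metric such a detector builds on, using grayscale histograms for brevity where the paper uses color histograms, and a fixed illustrative threshold where the paper derives one automatically by entropic thresholding:

```python
import numpy as np

def histogram_difference(frame_a, frame_b, bins=16):
    """Normalized L1 distance between frame histograms, in [0, 1]."""
    ha, _ = np.histogram(frame_a, bins=bins, range=(0, 256))
    hb, _ = np.histogram(frame_b, bins=bins, range=(0, 256))
    return np.abs(ha / ha.sum() - hb / hb.sum()).sum() / 2.0

rng = np.random.default_rng(1)
shot1 = rng.integers(0, 100, size=(10, 240, 320))    # dark toy shot
shot2 = rng.integers(150, 256, size=(5, 240, 320))   # bright toy shot
video = np.concatenate([shot1, shot2])
diffs = [histogram_difference(video[i], video[i + 1])
         for i in range(len(video) - 1)]
print([i + 1 for i, d in enumerate(diffs) if d > 0.5])  # [10]: the shot boundary
```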