
Showing papers on "k-nearest neighbors algorithm published in 1994"


ReportDOI
01 Nov 1994
TL;DR: This paper describes the problem of selecting relevant features for use in machine learning in terms of heuristic search through a space of feature sets, and identifies four dimensions along which approaches to the problem can vary.

Abstract: In this paper, we review the problem of selecting relevant features for use in machine learning. We describe this problem in terms of heuristic search through a space of feature sets, and we identify four dimensions along which approaches to the problem can vary. We consider recent work on feature selection in terms of this framework, then close with some challenges for future work in the area.

1. The Problem of Irrelevant Features. […] accuracy) to grow slowly with the number of irrelevant attributes. Theoretical results for algorithms that search restricted hypothesis spaces are encouraging. For instance, the worst-case number of errors made by Littlestone's (1987) WINNOW method grows only logarithmically with the number of irrelevant features. Pazzani and Sarrett's (1992) average-case analysis for WHOLIST, a simple conjunctive algorithm, and Langley and Iba's (1993) treatment of the naive Bayesian classifier, suggest that their sample complexities grow at most linearly with the number of irrelevant features. However, the theoretical results are less optimistic for induction methods that search a larger space of concept descriptions. For example, Langley and Iba's (1993) average-case analysis of simple nearest neighbor indicates that its sample complexity grows exponentially with the number of irrelevant attributes, even for conjunctive target concepts. Experimental studies of nearest neighbor are consistent with this conclusion, and other experiments suggest that similar results hold even for induction algorithms that explicitly select features. For example, the sample complexity of decision-tree methods appears to grow linearly with the number of irrelevant features for conjunctive concepts, but exponentially for parity concepts, since the evaluation metric cannot distinguish relevant from irrelevant features in the latter situation (Langley & Sage, in press). Results of this sort have encouraged machine learning researchers to explore more sophisticated methods for selecting relevant features. In the sections that follow, we present a general framework for this task, and then consider some recent examples of work on this important problem.

735 citations
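The paper's search-space framing lends itself to a compact illustration. Below is a minimal sketch (the function names and the leave-one-out 1-NN evaluator are my choices, not the paper's) of greedy forward selection, one simple point in the space of approaches it surveys:

```python
import numpy as np

def loo_accuracy_1nn(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    n = len(y)
    correct = 0
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                      # exclude the query point itself
        correct += y[d.argmin()] == y[i]
    return correct / n

def forward_selection(X, y):
    """Greedy forward search through the space of feature subsets."""
    remaining, selected, best = set(range(X.shape[1])), [], 0.0
    while remaining:
        scored = [(loo_accuracy_1nn(X[:, selected + [f]], y), f)
                  for f in remaining]
        acc, f = max(scored)
        if acc <= best:                    # no feature improves the score: stop
            break
        best, selected = acc, selected + [f]
        remaining.remove(f)
    return selected, best
```

Each of the framework's dimensions shows up here as a design choice: the starting point (the empty set), the organization of the search (greedy hill climbing), the evaluation function (leave-one-out accuracy), and the halting criterion (stop when no feature improves the score).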


Proceedings ArticleDOI
23 Jan 1994
TL;DR: It is shown that it is possible to preprocess a set S of data points in real d-dimensional space in O(kd) time and additional space, so that given a query point q, the closest point of S to q can be reported quickly.

610 citations
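As a rough illustration of the preprocess-then-query pattern (using SciPy's k-d tree, not the paper's own data structure):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
S = rng.random((10_000, 8))        # 10,000 points in 8-dimensional space
tree = cKDTree(S)                  # one-time preprocessing

q = rng.random(8)                  # query point
dist, idx = tree.query(q)          # nearest point of S to q
print(idx, dist)
```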


Book ChapterDOI
10 Jul 1994
TL;DR: On four datasets, it is shown that only three or four prototypes sufficed to give predictive accuracy equal or superior to a basic nearest neighbor algorithm whose run-time storage costs were approximately 10 to 200 times greater.
Abstract: With the goal of reducing computational costs without sacrificing accuracy, we describe two algorithms to find sets of prototypes for nearest neighbor classification. Here, the term “prototypes” refers to the reference instances used in a nearest neighbor computation — the instances with respect to which similarity is assessed in order to assign a class to a new data item. Both algorithms rely on stochastic techniques to search the space of sets of prototypes and are simple to implement. The first is a Monte Carlo sampling algorithm; the second applies random mutation hill climbing. On four datasets we show that only three or four prototypes sufficed to give predictive accuracy equal or superior to a basic nearest neighbor algorithm whose run-time storage costs were approximately 10 to 200 times greater. We briefly investigate how random mutation hill climbing may be applied to select features and prototypes simultaneously. Finally, we explain the performance of the sampling algorithm on these datasets in terms of a statistical measure of the extent of clustering displayed by the target classes.

510 citations
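A minimal sketch of the second algorithm, random mutation hill climbing over fixed-size prototype sets; the fitness function, parameter names, and plateau handling below are my assumptions:

```python
import numpy as np

def rmhc_prototypes(X, y, m=4, iters=1000, seed=0):
    """Random mutation hill climbing over sets of m prototype indices.

    Fitness is training accuracy of 1-NN classification against the
    chosen prototypes (a sketch, not the paper's exact procedure)."""
    rng = np.random.default_rng(seed)

    def accuracy(idx):
        d = np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2)
        return np.mean(y[idx][d.argmin(axis=1)] == y)

    current = rng.choice(len(X), size=m, replace=False)
    best = accuracy(current)
    for _ in range(iters):
        candidate = current.copy()
        slot = rng.integers(m)                 # mutate one prototype slot
        candidate[slot] = rng.integers(len(X))
        acc = accuracy(candidate)
        if acc >= best:                        # accept ties: walk on plateaus
            current, best = candidate, acc
    return current, best
```

Replacing the mutation loop with uniform resampling of whole index sets gives the flavor of the first, Monte Carlo sampling algorithm.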


Journal ArticleDOI
TL;DR: It is shown that all modes of convergence in $L_1$ are equivalent if the regression variable is bounded, and that under the additional condition $k/\log n \rightarrow \infty$ the strong universal consistency of the estimate is obtained.
Abstract: Two results are presented concerning the consistency of the $k$-nearest neighbor regression estimate. We show that all modes of convergence in $L_1$ (in probability, almost sure, complete) are equivalent if the regression variable is bounded. Under the additional condition $k/\log n \rightarrow \infty$ we also obtain the strong universal consistency of the estimate.

286 citations
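In notation standard for this literature (my reconstruction, not quoted from the paper), the estimate averages the responses of the k nearest neighbors of x, and the conditions read:

```latex
m_n(x) = \frac{1}{k}\sum_{i=1}^{k} Y_{(i)}(x),
\qquad \frac{k}{n} \to 0, \qquad \frac{k}{\log n} \to \infty ,
```

where $Y_{(i)}(x)$ denotes the response attached to the $i$-th nearest neighbor of $x$ among $X_1, \dots, X_n$.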


Journal ArticleDOI
TL;DR: An analysis of the propagation of an infectious disease and the eventual extinction of the host population in a lattice-structured population shows that spatially structured population models may give qualitatively different results from conventional population models, such as Lotka-Volterra ones, without spatial structure.
Abstract: We examined the propagation of an infectious disease and the eventual extinction of the host population in a lattice-structured population. Both the host colonization and pathogen transmission processes are assumed to be restricted to act between nearest-neighbor sites. The model is analyzed by an improved version of pair approximation (IPA). Pair approximation is a technique to trace the dynamics of the number of nearest-neighbor pairs having particular states, and IPA takes account of the clustering property of lattice models more precisely. The results are checked by computer simulations. The analysis shows: (i) in a one-dimensional lattice population, a pathogen cannot invade a host population no matter how large the transmission rate is; (ii) in a two-dimensional lattice population, pathogens will drive the host to extinction if the transmission rate is larger than a threshold. These results indicate that spatially structured population models may give qualitatively different results from conventional population models, such as Lotka-Volterra ones, without spatial structure.

243 citations
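A toy asynchronous simulation of the kind of nearest-neighbor lattice dynamics being approximated (the state coding, rates, and names are mine; this is a plain contact-process sketch, not the paper's improved pair approximation):

```python
import numpy as np

def step(grid, birth, beta, death, rng):
    """One sweep of a toy lattice host-pathogen model.

    States: 0 = empty site, 1 = healthy host, 2 = infected host."""
    n = grid.shape[0]
    for _ in range(n * n):                     # asynchronous random updates
        i, j = rng.integers(n), rng.integers(n)
        if rng.random() < 0.5:                 # pick one of the 4 nearest neighbors
            ni, nj = (i + rng.choice((-1, 1))) % n, j
        else:
            ni, nj = i, (j + rng.choice((-1, 1))) % n
        if grid[i, j] == 1 and grid[ni, nj] == 0 and rng.random() < birth:
            grid[ni, nj] = 1                   # host colonizes an empty neighbor
        elif grid[i, j] == 2 and grid[ni, nj] == 1 and rng.random() < beta:
            grid[ni, nj] = 2                   # pathogen transmits to a healthy neighbor
        elif grid[i, j] == 2 and rng.random() < death:
            grid[i, j] = 0                     # infected host dies
    return grid
```

Sweeping `beta` on a two-dimensional grid and tracking the host density is a simple way to explore the threshold behavior the analysis predicts.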


Journal ArticleDOI
01 Mar 1994
TL;DR: These results show that the nearest neighbor decision system performance suffers little degradation when the given large training set is replaced by its much smaller MCS in the operational phase of testing with an independent test set.
Abstract: A new approach is presented in this study for tackling the problem of high computational demands of nearest neighbor (NN) based decision systems. The approach, based on the concept of optimal subset selection from a given training data set, derives a consistent subset which is aimed to be minimal in size. This minimal consistent subset (MCS) selection, in contrast to most other previous attempts of this nature, leads to a unique solution irrespective of the initial order of presentation of the data. Further, the consistency property is assured at every iteration. Also, unlike under most prior approaches, the samples are selected here in order of the significance of their contribution to enabling the consistency property. This provides insight into the relative significance of the samples in the training set. Experimental results based on a number of independent training and test data sets are presented and discussed to illustrate the methodology and bring its benefits into focus. These results show that the nearest neighbor decision system performance suffers little degradation when the given large training set is replaced by its much smaller MCS in the operational phase of testing with an independent test set. A direct experimental comparison with a prior approach is also furnished to further strengthen the case for the new methodology.

197 citations
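For contrast, the classic Hart-style condensing procedure below also yields a consistent subset, but it is order-dependent and claims neither minimality nor uniqueness, precisely the gaps the MCS method closes (a sketch; X and y are assumed to be NumPy arrays):

```python
import numpy as np

def condensed_subset(X, y):
    """Hart-style condensing: grow a subset consistent with the training set.

    Unlike the paper's MCS, the result depends on the order of presentation,
    the very drawback the minimal consistent subset method removes."""
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            d = np.linalg.norm(X[keep] - X[i], axis=1)
            if y[keep][d.argmin()] != y[i]:    # misclassified by the subset,
                keep.append(i)                 # so add the offending sample
                changed = True
    return np.array(keep)
```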


Journal ArticleDOI
TL;DR: The classification accuracy of four statistical and three neural network classifiers is evaluated for two image-based pattern classification problems: optical character recognition (OCR) for isolated handprinted digits, and fingerprint classification.

153 citations


Dissertation
01 Jan 1994
TL;DR: It is shown that the k-nearest neighbor algorithm (kNN) outperforms the first nearest neighbor algorithm only under certain conditions; methods for choosing the value of k for kNN are investigated, and two methods for learning feature weights for a weighted Euclidean distance metric are proposed.
Abstract: Distance-based algorithms are machine learning algorithms that classify queries by computing distances between these queries and a number of internally stored exemplars. Exemplars that are closest to the query have the largest influence on the classification assigned to the query. Two specific distance-based algorithms, the nearest neighbor algorithm and the nearest-hyperrectangle algorithm, are studied in detail. It is shown that the k-nearest neighbor algorithm (kNN) outperforms the first nearest neighbor algorithm only under certain conditions. Data sets must contain moderate amounts of noise. Training examples from the different classes must belong to clusters that allow an increase in the value of k without reaching into clusters of other classes. Methods for choosing the value of k for kNN are investigated. It is shown that one-fold cross-validation on a restricted number of values for k suffices for best performance. It is also shown that for best performance the votes of the k nearest neighbors of a query should be weighted in inverse proportion to their distances from the query. Principal component analysis is shown to reduce the number of relevant dimensions substantially in several domains. Two methods for learning feature weights for a weighted Euclidean distance metric are proposed. These methods improve the performance of kNN and NN in a variety of domains. The nearest-hyperrectangle algorithm (NGE) is found to give predictions that are substantially inferior to those given by kNN in a variety of domains. Experiments performed to understand this inferior performance led to the discovery of several improvements to NGE. Foremost of these is BNGE, a batch algorithm that avoids construction of overlapping hyperrectangles from different classes. Although it is generally superior to NGE, BNGE is still significantly inferior to kNN in a variety of domains. Hence, a hybrid algorithm (KBNGE), which uses BNGE in parts of the input space that can be represented by a single hyperrectangle and kNN otherwise, is introduced. The primary contributions of this dissertation are (a) several improvements to existing distance-based algorithms, (b) several new distance-based algorithms, and (c) an experimentally supported understanding of the conditions under which various distance-based algorithms are likely to give good performance.

139 citations
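A sketch of the inverse-distance-weighted vote the dissertation recommends (the parameter names and the tie-avoiding epsilon are my choices):

```python
import numpy as np
from collections import defaultdict

def weighted_knn_predict(X, y, q, k=5, eps=1e-12):
    """kNN vote with weights inversely proportional to distance."""
    d = np.linalg.norm(X - q, axis=1)
    nn = np.argsort(d)[:k]
    votes = defaultdict(float)
    for i in nn:
        votes[y[i]] += 1.0 / (d[i] + eps)      # inverse-distance weight
    return max(votes, key=votes.get)
```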


Journal ArticleDOI
01 Oct 1994-Ecology
TL;DR: A new 2-df chi-square test of spatial segregation is proposed, illustrated with three examples: Pielou's Douglas-fir/ponderosa pine data, a realization of a mother-daughter process, and the locations of male and female water tupelo trees.
Abstract: Segregation of species occurs when a species tends to be found near conspecifics. This is frequently investigated using a contingency table, classifying each point by its species and the species of its nearest neighbor. Pielou proposed using a 1-df chi-square test of independence as a test of segregation. This test is inappropriate if all locations within a study area are mapped. For completely mapped data, I derive the expectations and variances of the cell counts in the nearest-neighbor contingency table under the null hypothesis that species labels are randomly assigned to points. The properties of the cell counts suggest a new 2-df chi-square test of spatial segregation, a pair of species-specific tests, and a pair of species-specific measures of segregation. In small samples, the proposed tests have the appropriate size, unlike the Pielou test. The new test is illustrated with three examples: Pielou's Douglas-fir/ponderosa pine data, a realization of a mother-daughter process, and the locations of male and female water tupelo trees.

108 citations
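The table the tests act on is straightforward to build; a sketch for two species coded as labels 0 and 1 (the array layout and names are mine):

```python
import numpy as np

def nn_contingency(points, labels):
    """2x2 nearest-neighbor contingency table for two species.

    Cell [a, b] counts points of species a whose nearest neighbor
    is of species b (the table the segregation tests act on)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                # a point is not its own neighbor
    nn_label = labels[d.argmin(axis=1)]
    table = np.zeros((2, 2), dtype=int)
    for a, b in zip(labels, nn_label):
        table[a, b] += 1
    return table
```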


Journal ArticleDOI
TL;DR: It is shown that systems built on a simple statistical technique and a large training database can be automatically optimized to produce classification accuracies of 99% in the domain of handwritten digits.
Abstract: Shows that systems built on a simple statistical technique and a large training database can be automatically optimized to produce classification accuracies of 99% in the domain of handwritten digits. It is also shown that the performance of these systems scales consistently with the size of the training database, where the error rate is cut by more than half for every tenfold increase in the size of the training set from 10 to 100,000 examples. Three distance metrics for the standard nearest neighbor classification system are investigated: a simple Hamming distance metric, a pixel distance metric, and a metric based on the extraction of penstroke features. Systems employing these metrics were trained and tested on a standard, publicly available, database of nearly 225,000 digits provided by the National Institute of Standards and Technology. Additionally, a confidence metric is both introduced by the authors and also discovered and optimized by the system. The new confidence measure proves to be superior to the commonly used nearest neighbor distance.

103 citations
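The simplest of the three metrics is almost a one-liner; a sketch assuming binarized images stacked in an (n, H, W) NumPy array:

```python
import numpy as np

def hamming_nn(train_imgs, train_labels, query_img):
    """1-NN under the simple Hamming metric the paper starts from.

    Images are binary arrays; the distance is the count of differing pixels."""
    diffs = (train_imgs != query_img).sum(axis=(1, 2))
    return train_labels[diffs.argmin()]
```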


Proceedings ArticleDOI
27 Jun 1994
TL;DR: Two algorithms for the construction of pattern classifier neural architectures are proposed, a comparison with other similar architectures is given, and simulation results are presented.
Abstract: In this paper two algorithms for the construction of pattern classifier neural architectures are proposed. A comparison with other known similar architectures is given and simulation results are presented.

Journal ArticleDOI
TL;DR: Simple data structures are introduced which reduce the time complexity of the Northby algorithm for lattice search from O(n^{5/3}) per move to O(n^{2/3}) per move for an n-atom cluster with the full Lennard-Jones potential function, and it is shown that in some cases the relaxation of a lattice local minimizer with a worse potential function value may lead to a local minimizer with a better potential function value.
Abstract: In 1987, Northby presented an efficient lattice-based search and optimization procedure to compute ground states of n-atom Lennard-Jones clusters and reported putative global minima for 13 ≤ n ≤ 150. In this paper, we introduce simple data structures which reduce the time complexity of the Northby algorithm for lattice search from O(n^{5/3}) per move to O(n^{2/3}) per move for an n-atom cluster involving the full Lennard-Jones potential function. If the nearest-neighbor potential function is used, the time complexity can be further reduced to O(log n) per move for an n-atom cluster. The lattice local minimizers with the lowest potential function values are relaxed by a powerful truncated Newton algorithm. We are able to reproduce the minima reported by Northby. The improved algorithm is so efficient that less than 3 minutes of CPU time on the Cray X-MP is required for each cluster size in the above range. We then further improve the Northby algorithm by relaxing every lattice local minimizer found in the process. This certainly requires more time. However, lower-energy configurations were found with this improved algorithm for n = 65, 66, 75, 76, 77 and 134. These findings also show that in some cases, the relaxation of a lattice local minimizer with a worse potential function value may lead to a local minimizer with a better potential function value.
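For reference, a direct evaluation of the Lennard-Jones objective being minimized; the optional cutoff mimics the nearest-neighbor restriction under which the paper reaches O(log n) per move (reduced units and all names are mine):

```python
import numpy as np

def lj_energy(coords, sigma=1.0, eps=1.0, cutoff=None):
    """Pairwise Lennard-Jones energy of an n-atom cluster:
    sum over pairs of 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    delta = coords[:, None, :] - coords[None, :, :]
    r = np.sqrt((delta ** 2).sum(axis=2))
    iu = np.triu_indices(len(coords), k=1)     # count each pair once
    r = r[iu]
    if cutoff is not None:
        r = r[r < cutoff]                      # keep only near pairs
    sr6 = (sigma / r) ** 6
    return float(np.sum(4.0 * eps * (sr6 ** 2 - sr6)))
```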

Book ChapterDOI
01 May 1994
TL;DR: In this article, a hybrid method that combines BNGE and the k-Nearest Neighbor algorithm, called KBNGE, is introduced; it achieves generalization accuracies similar to the k-Nearest Neighbor algorithm at improved classification speed.
Abstract: Algorithms based on Nested Generalized Exemplar (NGE) theory [10] classify new data points by computing their distance to the nearest "generalized exemplar" (i.e. an axis-parallel multidimensional rectangle). An improved version of NGE, called BNGE, was previously shown to perform comparably to the Nearest Neighbor algorithm. Advantages of the NGE approach include compact representation of the training data and fast training and classification. A hybrid method that combines BNGE and the k-Nearest Neighbor algorithm, called KBNGE, is introduced for improved classification accuracy. Results from eleven domains show that KBNGE achieves generalization accuracies similar to the k-Nearest Neighbor algorithm at improved classification speed. KBNGE is a fast and easy to use inductive learning algorithm that gives very accurate predictions in a variety of domains and represents the learned knowledge in a manner that can be easily interpreted by the user.
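The distance computation at the heart of NGE-style classifiers is short; a sketch with my own names:

```python
import numpy as np

def rect_distance(q, lower, upper):
    """Distance from query q to an axis-parallel hyperrectangle
    (the 'generalized exemplar' of NGE); zero when q lies inside."""
    gap = np.maximum(lower - q, 0) + np.maximum(q - upper, 0)
    return np.linalg.norm(gap)
```

A query inside the rectangle is at distance zero, which is what makes regions representable by a single hyperrectangle cheap for KBNGE to classify.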

Journal ArticleDOI
TL;DR: This paper describes how a time series may exhibit chaotic behavior, presents a multivariate nearest neighbor method capable of representing such behavior, and provides an empirical demonstration using store scanner data for a consumer packaged good.
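A generic univariate sketch of nearest-neighbor forecasting in delay-coordinate space (the paper's method is multivariate; the window length m, neighbor count k, and names are mine):

```python
import numpy as np

def nn_forecast(series, m=3, k=5):
    """Forecast the next value: average the successors of the k past
    delay vectors nearest to the most recent window."""
    x = np.asarray(series, dtype=float)
    hist = np.array([x[i:i + m] for i in range(len(x) - m)])  # past windows
    succ = x[m:]                                  # value that followed each window
    d = np.linalg.norm(hist - x[-m:], axis=1)     # distance to the current window
    nn = np.argsort(d)[:k]
    return succ[nn].mean()
```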

Journal ArticleDOI
TL;DR: Werther et al. used a library of 90,000 spectra and applied four complementary classification methods: k-nearest neighbor, linear discriminant analysis, SIMCA, and a neural network.

Journal ArticleDOI
TL;DR: In this article, the possible ground states of a ternary fcc lattice model with nearest-and next-nearest-neighbor pair interactions are investigated by constructing an eight-dimensional configuration polytope and enumerating its vertices.
Abstract: The possible ground states of a ternary fcc lattice model with nearest- and next-nearest-neighbor pair interactions are investigated by constructing an eight-dimensional configuration polytope and enumerating its vertices. Although a structure could not be constructed for most of the vertices, 31 ternary ground states are found, some of which correspond to structures that have been observed experimentally.

Journal ArticleDOI
TL;DR: A new, parallel, nearest-neighbor (NN) pattern classifier, based on a 2D Cellular Automaton (CA) architecture, is presented, which produces piece-wise linear discriminant curves between clusters of points of complex shape (nonlinearly separable).
Abstract: A new, parallel, nearest-neighbor (NN) pattern classifier, based on a 2D Cellular Automaton (CA) architecture, is presented in this paper. The proposed classifier is both time- and space-efficient compared with existing NN classifiers, since it does not require complex distance calculations or ordering of distances, and storage requirements are kept minimal since each cell stores information only about its nearest neighborhood. The proposed classifier produces piecewise-linear discriminant curves between clusters of points of complex shape (nonlinearly separable) using the computational geometry concept known as the Voronoi diagram, which is established through CA evolution. These curves are established during an "off-line" operation and, thus, the subsequent classification of unknown patterns is achieved very fast. The VLSI design and implementation of a nearest-neighborhood processor of the proposed 2D CA architecture is also presented in this paper.
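A serial stand-in for the CA evolution: class labels grow outward one cell per step from each stored prototype, and where the wavefronts meet is the discrete Voronoi boundary (under the 4-neighbor metric; this sketch and its names are mine, not the paper's VLSI design):

```python
from collections import deque

def ca_voronoi(h, w, seeds):
    """Grow labels outward one cell per step from labeled seed points.

    seeds: {(row, col): class_label}.  Each unlabeled cell ends up with
    the label of its nearest seed under the 4-neighbor (Manhattan) metric."""
    grid = [[None] * w for _ in range(h)]
    frontier = deque()
    for (r, c), lab in seeds.items():
        grid[r][c] = lab
        frontier.append((r, c))
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and grid[nr][nc] is None:
                grid[nr][nc] = grid[r][c]      # the nearest seed's label wins
                frontier.append((nr, nc))
    return grid
```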

Journal ArticleDOI
TL;DR: In this article, four molecular similarity measures have been used to select the nearest neighbor of chemicals in two data sets of 139 hydrocarbons and 15 nitrosamines, respectively, based on calculated graph invariants which include atom pairs, connectivity indices and information theoretic topological indices.
Abstract: Four molecular similarity measures have been used to select the nearest neighbor of chemicals in two data sets of 139 hydrocarbons and 15 nitrosamines, respectively. The similarity methods are based on calculated graph invariants which include atom pairs, connectivity indices and information theoretic topological indices. The property of the selected nearest neighbor by each method was taken as the estimate of the property under investigation. The results show that for these data sets, all four methods give reasonable estimates of the properties studied.

Proceedings ArticleDOI
13 May 1994
TL;DR: In this paper, a rule-based compensation for mask feature dimensions is proposed based on the post-process measurement of select pattern configurations, such as isolated lines, isolated spaces, and lines and spaces ranging from 5 micrometers to below 0.5 micrometer on a test vehicle mask.
Abstract: A rule-based compensation for mask feature dimensions is proposed. This technique is based on the post-process measurement of select pattern configurations. These include isolated lines, isolated spaces, and lines and spaces ranging from 5 micrometers to below 0.5 micrometers on a test vehicle mask. The difference between the coded size and the measured size is then plotted as a function of the target dimension. Using these data in conjunction with some choice '2D'-type patterns, a size correction is made for each distinct feature based upon the feature dimensions and its distance to the nearest neighbor. Given an effective rule representation, pattern corrections can be viably implemented on a chip scale by an automated feature compensation CAD system. The potential causes of proximity effects in our phase-shifting mask fabrication process and the effectiveness of the proposed correction technique are also investigated.

Journal ArticleDOI
TL;DR: The more accurate of the two new algorithms is about eight times faster than nearest neighbor interpolation, and its subjective image quality falls between nearest neighbor and bilinear interpolation.

Journal ArticleDOI
TL;DR: Experimental results show that the partition generated by the proposed method is more reasonable than that of the well-known c-means algorithm for many complicated object distributions.

Book ChapterDOI
01 Jan 1994
TL;DR: This paper examines the hypothesis that locally weighted variants of k-nearest neighbor algorithms can support dynamic control tasks; it hypothesizes that the non-linearities of the ball-catching task studied are the cause of the local regression algorithms' difficulties, and that these algorithms may need to be modified to work well under similar conditions.
Abstract: This paper examines the hypothesis that locally weighted variants of k-nearest neighbor algorithms can support dynamic control tasks. We evaluated several k-nearest neighbor (k-NN) algorithms on the simulated learning task of catching a flying ball. Previously, local regression algorithms have been advocated for this class of problems. These algorithms, which are variants of k-NN, base their predictions on a (possibly weighted) regression computed from the k nearest neighbors. While they outperform simpler k-NN algorithms on many tasks, they have trouble on this ball-catching task. We hypothesize that the non-linearities in this task are the cause of this behavior, and that local regression algorithms may need to be modified to work well under similar conditions.
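A sketch of the local regression variant under discussion: fit a distance-weighted linear model on the k nearest neighbors and evaluate it at the query (the weighting scheme and names are my assumptions):

```python
import numpy as np

def lwr_predict(X, y, q, k=10):
    """Locally weighted linear regression: fit a distance-weighted affine
    model on the k nearest neighbors of q and evaluate it at q."""
    d = np.linalg.norm(X - q, axis=1)
    nn = np.argsort(d)[:k]
    w = 1.0 / (d[nn] + 1e-12)                  # closer neighbors weigh more
    A = np.hstack([X[nn], np.ones((k, 1))])    # affine design matrix
    sw = np.sqrt(w)[:, None]                   # weighted least squares scaling
    beta, *_ = np.linalg.lstsq(A * sw, y[nn] * sw.ravel(), rcond=None)
    return float(np.append(q, 1.0) @ beta)
```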

Journal ArticleDOI
TL;DR: It is shown that the convergence rate of the posterior risk of the fuzzy generalized NNR is exponentially fast; the fuzzy generalized k-NN algorithm is a unified approach to a variety of fuzzy k-NNR's.
Abstract: The fuzzy k-nearest neighbor rule (k-NNR) has been applied in a variety of substantive areas. Yang and Chen [1] described a fuzzy generalized k-NN algorithm which is a unified approach to a variety of fuzzy k-NNR's. They established the strong consistency of the posterior risk of the fuzzy generalized NNR. In this paper, we give its convergence rate; that is, we show that the convergence rate of the posterior risk of the fuzzy generalized NNR is exponentially fast.

Proceedings ArticleDOI
18 Dec 1994
TL;DR: The paper presents a method for realizing large-neighborhood templates in a nearest-neighbor connected discrete-time CNN Universal Machine by decomposing an objective template into a sum of two-dimensional 3×3 template correlations.
Abstract: The paper presents a method for realizing large-neighborhood templates (i.e., templates with r > 1) in a nearest-neighbor connected (i.e., r = 1) discrete-time CNN Universal Machine. This is accomplished by decomposing an objective template into a sum of two-dimensional 3×3 template correlations. An appropriate procedure which ensures the desired circuit operation is given in algorithmic form.
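One building block of such a decomposition can be checked numerically: applying two 3×3 template correlations in sequence reaches a 5×5 neighborhood, and the equivalent large template is the convolution of the two small ones (an illustration of the principle, not the paper's decomposition procedure):

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

rng = np.random.default_rng(1)
img = rng.random((16, 16))
k1, k2 = rng.random((3, 3)), rng.random((3, 3))

big = convolve2d(k1, k2)                       # equivalent 5x5 template
two_step = correlate2d(correlate2d(img, k1, mode="same"), k2, mode="same")
one_step = correlate2d(img, big, mode="same")
print(np.allclose(two_step[2:-2, 2:-2], one_step[2:-2, 2:-2]))  # True (interior)
```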

Proceedings ArticleDOI
02 Mar 1994
TL;DR: This paper concentrates on two special characteristics of this novel pattern recognition system: automatic feature extraction and automatic feature competition.
Abstract: When the dimension N of the input vector is much larger than the number M of different training patterns to be learned, a one-layered, hard-limited perceptron with N input nodes and P neurons (P ≥ log₂ M) is generally sufficient to accomplish the learning-recognition task. The recognition should be very robust and very fast if an optimum noniterative learning scheme is applied to the perceptron learning process. This paper concentrates on two special characteristics of this novel pattern recognition system: automatic feature extraction and automatic feature competition. An unedited video recorded during a series of learning-recognition experiments demonstrates these characteristics of the novel system in real time.
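The P ≥ log₂ M bound is just enough output neurons to give each of the M patterns a distinct binary code; a quick check:

```python
import math

# Minimum number of hard-limited output neurons P needed to assign each
# of M training patterns a distinct binary output code: P >= log2(M).
for M in (10, 100, 1000, 50_000):
    print(M, math.ceil(math.log2(M)))   # e.g. M=1000 -> P=10
```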

Proceedings ArticleDOI
06 Jul 1994
TL;DR: This paper presents a technique for predicting the performance of the nearest neighbor filter without recourse to expensive Monte Carlo simulations; the technique can quantify the dynamic process of tracking divergence as well as the steady-state performance.
Abstract: The measurement that is 'closest' to the predicted target measurement is known as the 'nearest neighbor' measurement in target tracking. A common method currently in wide use for tracking in clutter is the so-called nearest neighbor filter, which uses only the nearest neighbor measurement as if it were the true one. This paper presents a technique for predicting, without recourse to expensive Monte Carlo simulations, the performance of the nearest neighbor filter. This technique can quantify the dynamic process of tracking divergence as well as the steady-state performance. The technique is based on a general approach to the performance prediction of algorithms with both continuous and discrete uncertainties developed recently by the authors.
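The association step of the nearest neighbor filter, sketched with a squared Mahalanobis distance under an innovation covariance S (the names are mine, and the usual validation gate is omitted):

```python
import numpy as np

def nearest_neighbor_update(z_pred, S, measurements):
    """Return the measurement closest to the predicted measurement z_pred,
    using the squared Mahalanobis distance with innovation covariance S."""
    z = np.atleast_2d(measurements)
    innov = z - z_pred
    d2 = np.einsum('ij,jk,ik->i', innov, np.linalg.inv(S), innov)
    return z[np.argmin(d2)]
```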

Journal ArticleDOI
TL;DR: An improvement for computing two nearest-neighbor problems of images on a reconfigurable array of processors (RAP) is obtained by increasing the bus width between processors, making the execution time of the proposed algorithms tunable via the bus width.

Book ChapterDOI
26 May 1994
TL;DR: Feature selection is an important phase in pattern recognition; it also plays an important role in neural network methods, even though it is often neglected.
Abstract: Feature selection is an important phase in pattern recognition. It also plays an important role in neural network methods, even though it is often neglected. The fundamental function of feature selection is to find a set of features that represents the pattern vector in an optimal way. Only information that is either redundant or irrelevant to the classification task is removed from the pattern vectors. The dimension of the feature vector is usually smaller than the dimension of the pattern vector, so the computation time and the memory requirements are greatly reduced.

Journal ArticleDOI
TL;DR: This paper investigates a typical nearest-neighbor balancing strategy, called LAL (Local Average Load), in which the workload of a processor is averaged among its nearest neighbors at discrete time steps, and derives a simple closed-form formula for the variance upper bound as a function of system size and dimension.
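A one-line synchronous sketch of the LAL step on a ring (the topology choice is mine; the paper treats general dimensions):

```python
import numpy as np

def lal_step(load):
    """One synchronous Local Average Load step on a ring: each processor
    replaces its load with the average of itself and its two nearest neighbors."""
    left, right = np.roll(load, 1), np.roll(load, -1)
    return (left + load + right) / 3.0
```

Total load is conserved while the variance across processors shrinks at every step; that variance is the quantity whose upper bound the paper derives in closed form.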

Book ChapterDOI
01 Jan 1994
TL;DR: This paper presents first results from a comparison of case-based and symbolic learning systems in Nearest-Neighbor Classification.
Abstract: The Nearest-Neighbor Classification has a long tradition in the area of pattern recognition while knowledge-based systems apply mainly symbolic learning algorithms. There is a strong relationship between Nearest-Neighbor Classification and learning. The increasing number of cases and the adaptation of the similarity measure are used to improve the classification ability. Nowadays, Nearest-Neighbor Classification is applied in knowledge-based systems by a technique called case-based reasoning. In this paper we present first results from a comparison of case-based and symbolic learning systems.