
Showing papers presented at "The European Symposium on Artificial Neural Networks in 1999"


Proceedings Article
01 Jan 1999
TL;DR: A formulation of the SVM is proposed that enables a multi-class pattern recognition problem to be solved in a single optimisation and a similar generalization of linear programming machines is proposed.
Abstract: The solution of binary classification problems using support vector machines (SVMs) is well developed, but multi-class problems with more than two classes have typically been solved by combining independently produced binary classifiers. We propose a formulation of the SVM that enables a multi-class pattern recognition problem to be solved in a single optimisation. We also propose a similar generalization of linear programming machines. We report experiments using benchmark datasets in which these two methods achieve a reduction in the number of support vectors and kernel calculations needed. 1. k-Class Pattern Recognition. The k-class pattern recognition problem is to construct a decision function given ℓ i.i.d. (independent and identically distributed) samples (points) of an unknown function, typically with noise: (x1, y1), ..., (xℓ, yℓ) (1), where each xi, i = 1, ..., ℓ, is a vector of length d and yi ∈ {1, ..., k} represents the class of the sample. A natural loss function is the number of mistakes made. 2. Solving k-Class Problems with Binary SVMs. For the binary pattern recognition problem (the case k = 2), the support vector approach has been well developed [3, 5]. The classical approach to solving k-class pattern recognition problems is to consider the problem as a collection of binary classification problems. In the one-versus-rest method one constructs k classifiers, one for each class. The n-th classifier constructs a hyperplane between class n and the k − 1 other classes. A particular point is assigned to the class for which the distance from the margin, in the positive direction (i.e. in the direction in which class "one" lies rather than class "rest"), is maximal. This method has been used widely.
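As an illustration of the one-versus-rest decision rule described above (not the paper's single-optimisation formulation), here is a minimal Python sketch; the function name and the example scores are hypothetical and stand in for k independently trained binary SVMs.

```python
import numpy as np

def one_vs_rest_predict(scores):
    """scores: array of shape (n_samples, k) holding the output of the n-th
    binary classifier for each point; pick the class with the largest score."""
    return np.argmax(scores, axis=1) + 1   # classes labelled 1..k as in the paper

# Example with k = 3 classes and two points (made-up scores):
scores = np.array([[0.7, -0.2, 0.1],
                   [-1.3, 0.4, 0.9]])
print(one_vs_rest_predict(scores))  # -> [1 3]
```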

873 citations


Proceedings Article
01 Jan 1999
TL;DR: This paper introduces a new method for data domain description, inspired by the Support Vector Machine by V. Vapnik, which computes a sphere-shaped decision boundary with minimal volume around a set of objects and contains support vectors describing the sphere boundary.
Abstract: This paper introduces a new method for data domain description, inspired by the Support Vector Machine by V. Vapnik, called the Support Vector Domain Description (SVDD). This method computes a sphere-shaped decision boundary with minimal volume around a set of objects. This data description can be used for novelty or outlier detection. It contains support vectors describing the sphere boundary and it has the possibility of obtaining higher-order boundary descriptions without much extra computational cost. By using different kernels this SVDD can obtain more flexible and more accurate data descriptions. The error of the first kind, the fraction of the training objects which will be rejected, can be estimated immediately from the description.
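For intuition, the following is a minimal sketch (not code from the paper) of how an SVDD-style decision could be evaluated once the multipliers alpha and the squared radius R2 have been obtained from the underlying quadratic program, which is not shown here; the RBF kernel and its gamma value are assumptions.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    return np.exp(-gamma * np.sum((x - y) ** 2))

def svdd_distance2(z, X_sv, alpha, kernel=rbf):
    """Squared kernel distance of z to the sphere centre a = sum_i alpha_i phi(x_i)."""
    k_zz = kernel(z, z)
    k_zx = np.array([kernel(z, x) for x in X_sv])
    K = np.array([[kernel(xi, xj) for xj in X_sv] for xi in X_sv])
    return k_zz - 2 * alpha @ k_zx + alpha @ K @ alpha

def is_outlier(z, X_sv, alpha, R2, kernel=rbf):
    """A point is rejected (novelty/outlier) if it falls outside the sphere."""
    return svdd_distance2(z, X_sv, alpha, kernel) > R2
```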

315 citations


Proceedings Article
01 Jan 1999
TL;DR: This work proposes an iterative algorithm that allows the classical SV method to be generalized to a generic choice of the basis functions, and shows that all these problems can be solved within a unique approach if the method is equipped with a robust method for finding a sparse solution of a linear system.
Abstract: Most Support Vector (SV) methods proposed in the recent literature can be viewed in a unified framework with great flexibility in terms of the choice of the basis functions. We show that all these problems can be solved within a unique approach if we are equipped with a robust method for finding a sparse solution of a linear system. Moreover, for such a purpose, we propose an iterative algorithm that can be simply implemented. This allows us to generalize the classical SV method to a generic choice of the basis functions.
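The abstract does not spell out the iterative algorithm, so the following is only a generic sketch of one well-known way to obtain a sparse approximate solution of a linear system (greedy orthogonal matching pursuit); it is not claimed to be the authors' method, and the default n_nonzero and tol values are assumptions.

```python
import numpy as np

def sparse_solve(A, b, n_nonzero=10, tol=1e-6):
    """Greedy sparse approximation of A @ x ~= b (orthogonal matching pursuit)."""
    m, n = A.shape
    x = np.zeros(n)
    residual = b.astype(float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(A.T @ residual)))   # column most correlated with residual
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(A[:, support], b, rcond=None)
        residual = b - A[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    x[support] = coef
    return x
```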

95 citations


Proceedings Article
01 Jan 1999

45 citations


Proceedings Article
01 Jan 1999
TL;DR: An algorithm is proposed that estimates the set of weights and number of hidden units each network in a bagged ensemble should have so that the generalization performance of the ensemble is optimized.
Abstract: In this paper we propose an algorithm we call "NeuralBAG" that estimates the set of weights and number of hidden units each network in a bagged ensemble should have so that the generalization performance of the ensemble is optimized. Experiments performed on noisy synthetic data demonstrate the potential of the algorithm. On average, ensembles trained using NeuralBAG out-perform bagged networks trained using cross-validation by 53% and individual networks trained using "cheating"
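NeuralBAG's specific procedure for estimating per-network weights and hidden-unit counts is not given in the abstract; the sketch below only shows plain bagging of small MLPs with a fixed, assumed hidden-layer size, using scikit-learn class names as a convenience.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.utils import resample

def train_bagged_mlps(X, y, n_members=10, hidden=8, seed=0):
    """Train an ensemble of MLPs on bootstrap resamples of (X, y)."""
    rng = np.random.RandomState(seed)
    members = []
    for _ in range(n_members):
        Xb, yb = resample(X, y, random_state=rng)            # bootstrap sample
        net = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=2000)
        members.append(net.fit(Xb, yb))
    return members

def ensemble_predict(members, X):
    """Average the member predictions (the usual bagging combination)."""
    return np.mean([m.predict(X) for m in members], axis=0)
```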

38 citations


Proceedings Article
01 Jan 1999
TL;DR: The kernel-Adatron is developed as a fast, simple, and robust alternative to SVM classifiers, producing arbitrary large-margin discriminant functions iteratively and so avoiding the intensive QP computations of the SVM; kernels are then used to build a non-linear version of the Adaline, yielding a general non-linear adaptive mapping device via an algorithm with well-documented properties.
Abstract: By expanding a function in series form it can be represented to an arbitrary degree of accuracy by taking enough terms. It is therefore possible, in principle, to conduct a linear regression on a new set of variables, transformed by a fixed mapping. However, this leads to a large computational burden and to the need for an infeasible amount of data from which the coefficients must be estimated, so it is not generally practical for function approximation. The algorithm studied in [1] is a linear Perceptron, which is computed implicitly in an infinite-dimensional space (the linearisation space) using potential (kernel) functions. These have been further exploited in the Support Vector Machine (SVM) [2], principal component analysis [3], linear programming machines [4] and clustering [5]. The kernel-Adatron [4, 6, 7] provides a fast, simple, and robust alternative to SVM classifiers, providing arbitrary, large-margin discriminant functions iteratively and so avoiding the intensive QP computations of the SVM. We now use kernels to develop a non-linear version of the Adaline [8], yielding a general, non-linear adaptive mapping device via an algorithm with well-documented properties. Selecting an appropriate kernel and its parameter specifies the mapping to the linearisation space, which can be done empirically, via cross-validation.
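As a concrete illustration of the kernel-Adatron idea mentioned above (iterative large-margin training without a QP solver), here is a minimal sketch under simplifying assumptions: no bias term, a fixed learning rate and epoch count, and a precomputed kernel matrix. It is a generic version of the algorithm, not the authors' code, and the paper's kernel-Adaline variant is not reproduced here.

```python
import numpy as np

def kernel_adatron(K, y, eta=0.1, n_epochs=100):
    """Kernel-Adatron style multiplier updates for a binary problem.
    K: precomputed kernel matrix (n x n); y: labels in {-1, +1}."""
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(n_epochs):
        for i in range(n):
            margin = y[i] * np.sum(alpha * y * K[i])          # current margin of point i
            alpha[i] = max(0.0, alpha[i] + eta * (1.0 - margin))
    return alpha

def predict(K_test, alpha, y):
    """K_test: kernel values between test points and training points."""
    return np.sign(K_test @ (alpha * y))
```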

23 citations


Proceedings Article
01 Jan 1999
TL;DR: The rule-based and neural network architectures developed as a number of different fraud detection tools for GSM networks are integrated into a hybrid detection tool, and the performance of the hybrid system is optimized in terms of the number of subscribers raising alarms.
Abstract: During the course of the European project "Advanced Security for Personal Communication Technologies" (ASPeCT), we have developed some rule-based and neural network architectures as a number of different fraud detection tools for GSM networks. We have now integrated these different techniques into a hybrid detection tool. We optimized the performance of the hybrid system in terms of the number of subscribers raising alarms. More precisely, we optimized performance curves showing the trade-off between the percentage of correctly identified fraudsters versus the percentage of new subscribers raising alarms. We report here on a common suite of experiments we performed on these different systems.

23 citations


Proceedings Article
01 Jan 1999
TL;DR: This paper applies ID OTPM, a new algorithm for ID estimation based on Optimally Topology Preserving Maps to image sequences and suggests that the inter-band dimension db of the AVIRIS data set is between one and two, whereas the spectral dimension ds is about four.
Abstract: Estimating the intrinsic dimensionality (ID) of an intrinsically low (d-) dimensional data set embedded in a high (n-) dimensional input space by conventional Principal Component Analysis (PCA) is computationally hard because PCA scales cubically (O(n³)) with the input dimension [11]. Besides this computational drawback, global PCA will overestimate the ID if the data manifold is curved. In this paper we apply ID OTPM [1], a new algorithm for ID estimation based on Optimally Topology Preserving Maps [7], to image sequences. In particular, we utilize ID OTPM for ID estimation of an AVIRIS data set, a hyperspectral remote sensing image cube, with input dimension of the individual image planes n = 257880. Most interestingly, our experiments suggest that the inter-band dimension db of the AVIRIS data set is between one and two, whereas the spectral dimension ds is about four. These results provide important clues for compression, visualization and classification of the AVIRIS data set.

19 citations


Proceedings Article
01 Jan 1999
TL;DR: A hybrid model consisting of a hidden Markov chain and MLPs to model piecewise stationary series is presented, and it is shown that, at least on the classical laser time series, the model is more parsimonious and gives a better segmentation of the series.
Abstract: We present a hybrid model consisting of a hidden Markov chain and MLPs to model piecewise stationary series. We compare our results with the gating-networks model (A.S. Weigend et al. [6]) and we show that, at least on the classical laser time series, our model is more parsimonious and gives a better segmentation of the series.

18 citations


Proceedings Article
01 Jan 1999
TL;DR: Several approximation results are shown for folding networks, a generalization of partial recurrent neural networks in which not only time sequences but arbitrary trees can serve as input: any measurable function can be approximated in probability.
Abstract: In this paper we show several approximation results for folding networks, a generalization of partial recurrent neural networks in which not only time sequences but arbitrary trees can serve as input: any measurable function can be approximated in probability. Any continuous function can be approximated in the maximum norm on inputs with restricted height, but the resources necessarily increase at least exponentially in the input height. In general, approximation on arbitrary inputs is not possible in the maximum norm.

18 citations


Proceedings Article
01 Jan 1999
TL;DR: Airborne and satellite-borne spectral imaging has become one of the most advanced tools for collecting vital information about the surface covers of Earth and other planets and ANNs hold the promise to revolutionize this area by overcoming many of the mathematical obstacles that traditional techniques fail at.
Abstract: Utilization of remote sensing multi- and hyperspectral imagery has been rapidly increasing in numerous areas of economic and scientific significance. Hyperspectral sensors, in particular, provide the detailed information that is known from laboratory measurements to characterize and identify minerals, soils, rocks, plants, water bodies, and other surface materials. This opens up tremendous possibilities for resource exploration and management, environmental monitoring, natural hazard prediction, and more. However, exploitation of the wealth of information in spectral images has yet to match up to the sensors' capabilities, as conventional methods often prove inadequate. ANNs hold the promise to revolutionize this area by overcoming many of the mathematical obstacles that traditional techniques fail at. By providing high speed when implemented in parallel hardware, (near-)real-time processing of the extremely high data volumes typical in remote sensing spectral imaging will also be possible. 1. Challenges in remote spectral image analyses. Airborne and satellite-borne spectral imaging has become one of the most advanced tools for collecting vital information about the surface covers of Earth and other planets. The utilization of these data includes areas such as mineral exploration, land use, forestry, natural hazard assessments, water resources, environmental contamination, ecosystem management, biomass and productivity assessment, and many other activities of economic significance, as well as prime scientific pursuits such as looking for possible sources of past or present life on other planets. The number of applications has dramatically increased in the past ten years with the advent of imaging spectrometers, which greatly surpass traditional multi-spectral imagers (e.g., Landsat Thematic Mapper) in that they can resolve the detailed spectral features that are known, from laboratory measurements, to characterize minerals, soils, rocks, and vegetation. While a multispectral sensor samples the given wavelength window (typically the 0.4 to 2.5 µm range in the case of Visible and Near-Infrared surface reflectance imaging) with several broad bandpasses, leaving large gaps between the bands, spectral ...

Proceedings Article
01 Jan 1999
TL;DR: A new simple algorithm is used to completely define the structure of the RBF classifier; it has the major advantage of requiring only the training set (no learning step, threshold or other parameters as in other methods).
Abstract: This paper describes a global approach to the construction of a Radial Basis Function (RBF) neural net classifier. We used a new simple algorithm to completely define the structure of the RBF classifier. This algorithm has the major advantage of requiring only the training set (no learning step, threshold or other parameters as in other methods). Tests on several benchmark datasets showed that, despite its simplicity, this algorithm provides a robust and efficient classifier. The results of this built RBF classifier are compared to those obtained with three other classifiers: a classic one and two neural ones. The robustness and efficiency of this kind of RBF classifier make the proposed algorithm very attractive.
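The abstract does not detail the construction itself, so the following Python sketch only illustrates one parameter-free way to build an RBF classifier from the training set alone (every training point becomes a centre, widths come from nearest-neighbour distances, output weights are fit by least squares); the width_scale default is an assumption and this is not claimed to be the paper's algorithm.

```python
import numpy as np

def pairwise_sq_dists(A, B):
    return np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)

def build_rbf_classifier(X, y, width_scale=1.0):
    """Every training point becomes a centre; each width is the squared distance to
    its nearest neighbour; output weights are fit by least squares on one-hot targets."""
    classes = np.unique(y)
    D = pairwise_sq_dists(X, X)
    nn = np.where(D > 0, D, np.inf).min(axis=1)       # nearest-neighbour squared distance
    sigma2 = width_scale * nn                          # per-centre width (assumption)
    Phi = np.exp(-D / sigma2[None, :])                 # design matrix, one basis per point
    T = (y[:, None] == classes[None, :]).astype(float)
    W, *_ = np.linalg.lstsq(Phi, T, rcond=None)
    return X, sigma2, W, classes

def rbf_predict(model, X_new):
    centres, sigma2, W, classes = model
    Phi = np.exp(-pairwise_sq_dists(X_new, centres) / sigma2[None, :])
    return classes[np.argmax(Phi @ W, axis=1)]
```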

Proceedings Article
01 Jan 1999
TL;DR: An AI-based retrieval system inspired by the WEBSOM algorithm is proposed, in which each document is characterised by comparing the concepts found in it to those present in the concept space.
Abstract: An AI-based retrieval system inspired by the WEBSOM algorithm is proposed. Contrary to the WEBSOM, however, we introduce a system using only the index of every document. The knowledge extraction process results in a so-called Associative Conceptual Space where the words as found in the documents are organised using a Hebbian type of (un)learning. Next, 'concepts' (i.e. word clusters) are identified using the SOM algorithm. Thereupon, each document is characterised by comparing the concepts found in it to those present in the concept space. Applying the characterisations, all documents can be clustered such that semantically similar documents lie close together on a Self-Organising Map.

Proceedings Article
01 Jan 1999
TL;DR: Evaluating the performance of Support Vector Machines and Multi-Layer Perceptrons on two problems of Particle Identification in High Energy Physics experiments indicates that SVMs and MLPs tend to perform very similarly.
Abstract: In this paper we evaluate the performance of Support Vector Machines (SVMs) and Multi-Layer Perceptrons (MLPs) on two different problems of Particle Identification in High Energy Physics experiments. The obtained results indicate that SVMs and MLPs tend to perform very similarly.

Proceedings Article
01 Jan 1999
TL;DR: SVMs are benchmarked on a face identification problem and two approaches incorporating SV classifiers are proposed, one of which achieves the best result known on the ORL database.
Abstract: The Support Vector Machine (SVM) is a statistical learning technique proposed by Vapnik and his research group [8]. In this paper, we benchmark SVMs on a face identification problem and propose two approaches incorporating SV classifiers. The first approach maps the images into a low-dimensional feature vector via a local Principal Component Analysis (PCA); the feature vectors are then used as the inputs of an SVM. The second algorithm is a direct SV classifier with invariances. Both approaches are tested on the freely available ORL database. The SV classifier with invariances achieves an error of 1.5%, which is the best result known on the ORL database.
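A rough sketch of the first approach (dimensionality reduction followed by an SV classifier), written with scikit-learn for brevity; note that it uses ordinary global PCA rather than the local PCA described in the abstract, and the component count and C value are arbitrary assumptions.

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def build_face_classifier(n_components=40, C=10.0):
    """Global PCA feature extraction followed by an RBF-kernel SVM classifier."""
    return make_pipeline(PCA(n_components=n_components, whiten=True),
                         SVC(kernel="rbf", C=C))

# Usage on flattened face images X (n_samples, n_pixels) with identity labels y:
# clf = build_face_classifier().fit(X_train, y_train)
# print(clf.score(X_test, y_test))
```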


Proceedings Article
01 Jan 1999
TL;DR: It is shown that, even if these approaches are used only for quantization, the SOM algorithm can be successfully used to accelerate by a very large proportion the speed of convergence of the classical Simple Competitive Learning algorithm (SCL).
Abstract: In a previous paper ([1], ESANN'97), we compared the Kohonen algorithm (SOM) to the Simple Competitive Learning algorithm (SCL) when the goal is to reconstruct an unknown density. We showed that for that purpose, the SOM algorithm quickly provides an excellent approximation of the initial density, when the frequencies of each class are taken into account to weight the quantifiers of the classes. Another important property of the SOM is the well-known topology conservation, which implies that neighbouring data are classified into the same class (as usual) or into neighbouring classes. In this paper, we study another interesting property of the SOM algorithm, which holds for any fixed number of quantifiers. We show that even if we use those approaches only for quantization, the SOM algorithm can be successfully used to accelerate by a very large proportion the speed of convergence of the classical Simple Competitive Learning algorithm (SCL).
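To make the acceleration idea concrete, here is a small Python sketch, not taken from the paper, of running a 1-D SOM phase with a shrinking neighbourhood and then continuing with plain winner-take-all SCL updates from the SOM's codebook; the unit count, step counts and fixed learning rate are all assumptions.

```python
import numpy as np

def som_then_scl(X, n_units=20, som_steps=2000, scl_steps=2000, lr=0.1, seed=0):
    """1-D SOM phase with shrinking neighbourhood, then plain competitive learning."""
    rng = np.random.RandomState(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].astype(float)   # initial codebook
    for t in range(som_steps + scl_steps):
        x = X[rng.randint(len(X))]
        win = np.argmin(np.sum((W - x) ** 2, axis=1))                 # best-matching unit
        if t < som_steps:                                             # SOM phase: neighbours move too
            radius = max(1, int(n_units // 4 * (1 - t / som_steps)))
            neigh = np.arange(max(0, win - radius), min(n_units, win + radius + 1))
        else:                                                         # SCL phase: winner only
            neigh = np.array([win])
        W[neigh] += lr * (x - W[neigh])
    return W
```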

Proceedings Article
01 Jan 1999
TL;DR: An LVQ neural network is used for the clustering and classification of marble slabs according to their texture based on the Sum and Difference Histograms, a faster version of the Co-occurrence Matrices.
Abstract: This article describes the use of an LVQ neural network for the clustering and classification of marble slabs according to their texture. The method used for the recognition of textures is based on the Sum and Difference Histograms, a faster version of the Co-occurrence Matrices. The input of the network is a vector of statistical parameters which characterize the pattern shown to the net, and the desired output is the class to which the pattern belongs (supervised learning). The samples chosen for testing the algorithms have been marble slabs of type “Crema Marfil Sierra de la Puerta”. The neural network has been implemented using MATLAB.
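The sum and difference histograms mentioned above are straightforward to compute; the sketch below follows the standard Unser-style construction for a single displacement and derives a few of the usual statistics, but the exact feature set and displacements used in the paper are not specified, so these choices are assumptions.

```python
import numpy as np

def sum_diff_histograms(img, dx=1, dy=0, levels=256):
    """Sum and difference histograms of a grey-level image for one displacement (dx, dy)."""
    a = img[:img.shape[0] - dy, :img.shape[1] - dx].astype(int)
    b = img[dy:, dx:].astype(int)
    s = (a + b).ravel()                                   # values in [0, 2*(levels-1)]
    d = (a - b).ravel()                                   # values in [-(levels-1), levels-1]
    hs = np.bincount(s, minlength=2 * levels - 1) / s.size
    hd = np.bincount(d + levels - 1, minlength=2 * levels - 1) / d.size
    return hs, hd

def texture_features(hs, hd, levels=256):
    """A few of the usual statistics derived from the two histograms."""
    i_s = np.arange(hs.size)
    j_d = np.arange(hd.size) - (levels - 1)
    mean = (i_s * hs).sum() / 2.0                         # grey-level mean estimate
    contrast = (j_d ** 2 * hd).sum()
    energy = (hs ** 2).sum() * (hd ** 2).sum()
    return np.array([mean, contrast, energy])
```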

Proceedings Article
01 Jan 1999
TL;DR: The advantages and limits of the SOM for applications in satellite remote sensing processing systems are considered, and some variants to improve the performance of the SOM are given, emphasizing the aspect of a continuous mapping, i.e. the control of topology preservation.
Abstract: The process of satellite remote sensing is usually characterised by very large data sets, high-dimensional data spaces and correlated and noisy data. These facts favour the application of neural maps for investigations, whereby they may be used as preprocessing tools as well as final applications [1, 10]. Self-organizing maps (SOMs) [8], as a special kind of neural map, project data from some (possibly high-dimensional) input space V onto a position in some output space, such that a continuous change of a parameter of the input data should lead to a continuous change of the position of a localized excitation in the neural map. This property of neighbourhood preservation depends on an important feature of the SOM, its output space topology, which has to be specified prior to learning. Usually the output space A of the SOM is a DA-dimensional rectangular grid (hypercube). If the topology, i.e. the dimensionality and edge length ratios, of A does not match that of the data shape, neighbourhood violations are inevitable [12]. A higher degree of topology preservation, in general, improves the accuracy of the map [3]. In the present paper we consider the advantages and limits of the SOM for applications in satellite remote sensing processing systems and give some variants to improve the performance of the SOM. Thereby we emphasize the aspect of a continuous mapping, i.e. the control of the topology preservation.


Proceedings Article
01 Jan 1999
TL;DR: A new artificial neural network architecture for learning and classifying multivalued input patterns has been introduced, called Supervised ART-II, which represents a new supervision approach for ART modules.
Abstract: A new artificial neural network (ANN) architecture for learning and classifying multivalued input patterns has been introduced, called Supervised ART-II. It represents a new supervision approach for ART modules. It is quicker in learning than Supervised ART-I when the number of category nodes is large, and it requires less memory. The architecture, learning, and testing of the newly developed ANN have been discussed.

Proceedings Article
01 Jan 1999
TL;DR: Using a large set of 156 features, the GA is able to select a set of 6 features that give 100% recognition accuracy, allowing the creation of compact, highly accurate networks that require comparatively little preprocessing.
Abstract: Artificial Neural Networks (ANNs) can be used successfully to detect faults in rotating machinery, using statistical estimates of the vibration signal as input features. One of the main problems facing the use of ANNs is the selection of the best inputs to the ANN, allowing the creation of compact, highly accurate networks that require comparatively little preprocessing. This paper examines the use of a Genetic Algorithm (GA) to select the most significant input features from a large set of possible features in machine condition monitoring contexts. Using a large set of 156 different features, the GA is able to select a set of 6 features that give 100% recognition accuracy.
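As a sketch of GA-based feature selection in the same spirit (not the paper's implementation), the toy example below evolves fixed-size feature subsets whose fitness is the cross-validated accuracy of a small MLP; the population size, generation count, subset size and the scikit-learn classifier are all assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

def ga_select_features(X, y, n_select=6, pop_size=12, n_gen=10, seed=0):
    """Toy GA: each individual is a fixed-size subset of feature indices and its
    fitness is the cross-validated accuracy of a small MLP on that subset."""
    rng = np.random.RandomState(seed)
    n_feat = X.shape[1]

    def fitness(idx):
        clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500)
        return cross_val_score(clf, X[:, idx], y, cv=3).mean()

    pop = [rng.choice(n_feat, n_select, replace=False) for _ in range(pop_size)]
    for _ in range(n_gen):
        scores = [fitness(ind) for ind in pop]
        order = np.argsort(scores)[::-1]
        parents = [pop[i] for i in order[:pop_size // 2]]   # truncation selection
        children = []
        for p in parents:                                    # one point mutation per child
            child = p.copy()
            new = rng.randint(n_feat)
            if new not in child:
                child[rng.randint(n_select)] = new
            children.append(child)
        pop = parents + children
    return np.sort(max(pop, key=fitness))
```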

Proceedings Article
01 Jan 1999
TL;DR: The stable attractors of the CLM provide consistent and unambiguous labelings in the sense of RL and an efficient stochastic simulation procedure is given for their identification.
Abstract: We discuss the relation of the Competitive Layer Model (CLM) to Relaxation Labeling (RL) with regard to feature binding and labeling problems. The CLM uses cooperative and competitive interactions to partition a set of input features into groups by energy minimization. As we show, the stable attractors of the CLM provide consistent and unambiguous labelings in the sense of RL, and we give an efficient stochastic simulation procedure for their identification. In addition to binding, the CLM exhibits contextual activity modulation to represent stimulus salience. We incorporate deterministic annealing for avoidance of local minima and show how figure-ground segmentation and grouping can be combined for the CLM application of contour grouping on a real image.

Proceedings Article
01 Jan 1999
TL;DR: An efficient procedure is proposed for initializing two-layer perceptrons and for determining the optimal number of hidden neurons, based on the Orthogonal Least Squares method, which is typical of RBF as well as Wavelet networks.
Abstract: An efficient procedure is proposed for initializing two-layer perceptrons and for determining the optimal number of hidden neurons. This is based on the Orthogonal Least Squares method, which is typical of RBF as well as Wavelet networks. Some experiments are discussed, in which the proposed method is coupled with standard backpropagation training and compared with random initialization.
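For reference, a generic Orthogonal Least Squares selection step can be sketched as below: candidate hidden-unit outputs form the columns of a design matrix, and columns are picked greedily by their error-reduction ratio. This is the standard OLS scheme rather than the exact procedure of the paper, and the max_units and tol defaults are assumptions.

```python
import numpy as np

def ols_select(Phi, y, max_units=10, tol=1e-3):
    """Greedy Orthogonal Least Squares: choose columns of Phi (candidate hidden-unit
    outputs on the training set) that explain the most residual variance of y."""
    Q = Phi.astype(float).copy()                          # working copy, deflated as units are chosen
    yy = float(y @ y)
    selected, err_ratios = [], []
    for _ in range(max_units):
        num = (Q.T @ y) ** 2                              # (q_j . y)^2 for every candidate
        den = np.einsum('ij,ij->j', Q, Q) * yy            # |q_j|^2 * |y|^2
        ratio = np.where(den > 1e-12, num / den, 0.0)     # error-reduction ratios
        ratio[selected] = 0.0
        j = int(np.argmax(ratio))
        if ratio[j] < tol:
            break
        selected.append(j)
        err_ratios.append(ratio[j])
        qj = Q[:, j] / np.linalg.norm(Q[:, j])
        Q -= np.outer(qj, qj @ Q)                         # remove chosen direction from all candidates
    return selected, err_ratios
```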

Proceedings Article
01 Jan 1999
TL;DR: An algorithm based on deterministic annealing is described, which is able to cluster various types of data, and is applied to instances of three types of MLP's, trained to predict the time of death of ovarian cancer patients.
Abstract: Although training an ensemble of neural network solutions increases the amount of information obtained from a system, large ensembles may be hard to analyze. Since data clustering is a good method to summarize large bodies of data, we will show in this paper how to use clustering on instances of neural networks. We will describe an algorithm based on deterministic annealing, which is able to cluster various types of data. As an example, we will apply the algorithm to instances of three different types of MLPs, trained to predict the time of death of ovarian cancer patients.

Proceedings Article
01 Jan 1999
TL;DR: The application of neural network techniques to the paper-making industry, particularly for the prediction of paper "curl", are described, and are widely applicable to industry.
Abstract: This paper describes the application of neural network techniques to the paper-making industry, particularly for the prediction of paper "curl". Paper curl is a common problem and can only be measured reliably off-line, after manufacture. Model development is carried out using imperfect data, typical of that collected in many manufacturing environments, and addresses issues pertinent to real-world use. Predictions then are presented in terms that are relevant to the machine operator, as a measure of paper acceptability, a direct prediction of the quality measure, and always with a measure of prediction confidence. Therefore, the techniques described in this paper are widely applicable to industry.

Proceedings Article
01 Jan 1999
TL;DR: Through numerical simulations and computational complexity evaluations, it is shown that the ψ-APEX algorithms exhibit superior capability and interesting features.
Abstract: We present a comparison of three neural PCA techniques: the GHA by Sanger, the APEX by Kung and Diamantaras, and the ψ-APEX first proposed by the present authors. Through numerical simulations and computational complexity evaluations we show that the ψ-APEX algorithms exhibit superior capability and interesting features.
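Of the three techniques, the GHA is the simplest to write down; below is a compact sketch of Sanger's rule on zero-mean data (the learning rate, epoch count and initial scale are arbitrary assumptions), included only to make the kind of algorithm being compared concrete.

```python
import numpy as np

def gha(X, n_components=3, lr=0.001, n_epochs=50, seed=0):
    """Sanger's Generalized Hebbian Algorithm: the rows of W converge towards the
    leading principal directions of the zero-mean data X."""
    rng = np.random.RandomState(seed)
    d = X.shape[1]
    W = 0.01 * rng.randn(n_components, d)
    for _ in range(n_epochs):
        for x in X:
            y = W @ x
            # Hebbian term minus lower-triangular decorrelation term (Sanger's rule)
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```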

Proceedings Article
01 Jan 1999
TL;DR: A novel approach to human posture analysis and recognition using standard image processing techniques as well as hybrid neural information processing is presented, and a reliable and robust person localization module is developed via a combination of oriented filters and three-dimensional dynamic neural fields.
Abstract: This paper describes the preliminary results of the research work currently ongoing at our department and carried out as part of a project funded by the Commission of the European Union. In this paper a novel approach to human posture analysis and recognition using standard image processing techniques as well as hybrid neural information processing is presented. We first develop a reliable and robust person localization module via a combination of oriented filters and three-dimensional dynamic neural fields. Then we focus on the view-based recognition of the user's static gestural instructions from a predefined vocabulary based on both a skin color model and statistical normalized moment invariants. The segmentation of the postures occurs by means of the skin color model based on the Mahalanobis metric. From the resulting binary image, containing only regions which have been classified as skin candidates, we extract translation and scale invariant moments. They are used as input for two different neural classifiers whose results are then compared. To train and test the neural classifiers we gathered data from five people performing 18 repetitions of each of five postures (our vocabulary): stop, go left, go right, hello left and hello right. The system is currently under development with constant updates and new developments. It uses input from a color video camera and is user-independent. The aim is to build a real-time system able to deal with dynamic gestures.
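The Mahalanobis-based skin segmentation step described above can be sketched in a few lines; this is a generic version with an assumed distance threshold and an assumed colour-space preprocessing, not the paper's exact model.

```python
import numpy as np

def fit_skin_model(skin_pixels):
    """Fit mean and inverse covariance of skin-coloured pixels
    (e.g. pixels already converted to a chromaticity or colour space of choice)."""
    mu = skin_pixels.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(skin_pixels, rowvar=False))
    return mu, cov_inv

def skin_mask(image_pixels, mu, cov_inv, threshold=3.0):
    """Binary mask: keep pixels whose Mahalanobis distance to the skin model is small."""
    diff = image_pixels - mu
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)   # squared Mahalanobis distance
    return d2 < threshold ** 2
```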

Proceedings Article
01 Jan 1999
TL;DR: Folding architecture networks and the closely related concept of recursive neural networks are applied to the problem of learning search-control heuristics for automated deduction systems and show a considerable performance improvement.
Abstract: During the last years, folding architecture networks and the closely related concept of recursive neural networks have been developed for solving supervised learning tasks on data structures. In this paper, these networks are applied to the problem of learning search-control heuristics for automated deduction systems. Experimental results with the automated deduction system Setheo in an algebraic domain show a considerable performance improvement. Controlled by heuristics which had been learned from simple problems in this domain the system is able to solve several problems from the same domain which had been out of reach for the original system.

Proceedings Article
01 Apr 1999
TL;DR: The principle of learning by specialization within a cortically-inspired framework is presented and Adaptations will be discussed, in light of experiments with the cortical model addressing causality learning from perceptive sequences.
Abstract: In this paper we present the principle of learning by specialization within a cortically-inspired framework. Specialization of neurons in the cortex has been observed, and many models are using such "cortical-like" learning mechanisms, adapted for computational efficiency. Adaptations will be discussed, in light of experiments with our cortical model addressing causality learning from perceptive sequences.