
Showing papers on "Deep learning" published in 2002


Journal ArticleDOI
TL;DR: The approach is compared with five other feature selection methods, each of which banks on a different concept, and the algorithm developed outperformed the other methods by achieving higher classification accuracy on all the problems tested.

325 citations


Book
11 Nov 2002
TL;DR: This second edition takes account of important new developments in the field of machine learning and deals extensively with the theory of learning control systems, now comparably mature to learning of neural networks.
Abstract: How does a machine learn a new concept on the basis of examples? This second edition takes account of important new developments in the field. It also deals extensively with the theory of learning control systems, now comparably mature to the learning of neural networks.

190 citations


Proceedings ArticleDOI
07 Aug 2002
TL;DR: This paper studies the interpretation of cost estimation models based on a backpropagation three-layer perceptron network, using the COCOMO'81 dataset, and proposes a method that maps this neural network to a fuzzy rule-based system.
Abstract: Software development effort estimation with the aid of neural networks has generally been viewed with skepticism by a majority of the software cost estimation community. Although neural networks have shown their strengths in solving complex problems, their shortcoming of being 'black box' models has prevented them from being accepted as a common practice for cost estimation. In this paper, we study the interpretation of cost estimation models based on a backpropagation three-layer perceptron network. Our proposed idea consists mainly of using a method that maps this neural network to a fuzzy rule-based system. Consequently, if the obtained fuzzy rules are easily interpreted, the neural network will also be easy to interpret. Our case study is based on the COCOMO'81 dataset.

156 citations
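
The entry above does not reproduce the mapping itself. One well-known identity that underlies such neural-to-fuzzy mappings (not necessarily this paper's exact method) is that a sigmoid of a sum equals the "interactive-or" of per-input sigmoids, so each hidden unit can be read as one fuzzy rule over the inputs. A minimal Python sketch of that identity, with illustrative weights:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def i_or(a, b):
        # Interactive-or: the fuzzy connective under which a sigmoid
        # of a sum factors into per-input memberships.
        return a * b / (a * b + (1 - a) * (1 - b))

    # A hidden neuron sigmoid(w1*x1 + w2*x2 + b) can be read as the i-or
    # of per-input fuzzy memberships sigmoid(w_i*x_i + b/n):
    w, b = np.array([0.8, -1.5]), 0.3
    x = np.array([1.2, 0.7])

    neuron_out = sigmoid(w @ x + b)
    memberships = sigmoid(w * x + b / len(x))
    assert np.isclose(neuron_out, i_or(memberships[0], memberships[1]))

If the per-input memberships are interpretable, each hidden unit becomes one fuzzy rule, which is the sense in which such a network stops being a black box.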


01 Jan 2002
TL;DR: A popular neural network algorithm is adapted for multi-instance learning through employing a specific error function, and experiments show that the adapted algorithm achieves good results on the drug activity prediction data.
Abstract: Multi-instance learning originates from the investigation of drug activity prediction, where the task is to predict whether an unseen molecule could be used to make some drug. Such a problem is difficult because a molecule may have many alternative low-energy shapes, yet only one of those shapes may be responsible for the qualification of the molecule to make the drug. Because of its unique characteristics and extensive existence, multi-instance learning is regarded as a new machine learning framework parallel to supervised learning, unsupervised learning, and reinforcement learning. In this paper, an open problem of this area is addressed: a popular neural network algorithm is adapted for multi-instance learning through employing a specific error function. Experiments show that the adapted algorithm achieves good results on the drug activity prediction data.

138 citations
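
The abstract names "a specific error function" without stating it. A minimal sketch of the standard bag-level idea in multi-instance learning, assuming (as is common in this setting) that a bag is positive when its most-positive instance is positive; `bag_loss` is illustrative, not the paper's exact function:

    import numpy as np

    def bag_loss(instance_outputs, bag_label):
        # instance_outputs: per-instance network outputs in (0, 1).
        # The bag's output is the maximum over its instances, so the
        # error (and hence the backpropagated gradient) is driven by
        # the single most-positive instance in each bag.
        bag_output = np.max(instance_outputs)
        return 0.5 * (bag_output - bag_label) ** 2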


Dissertation
19 Jun 2002
TL;DR: The continuous TD(λ) algorithm is refined to handle situations with discontinuous states and controls, and the vario-eta algorithm is proposed as a simple but efficient method to perform gradient descent.
Abstract: This thesis is a study of practical methods to estimate value functions with feedforward neural networks in model-based reinforcement learning. Focus is placed on problems in continuous time and space, such as motor-control tasks. In this work, the continuous TD(λ) algorithm is refined to handle situations with discontinuous states and controls, and the vario-eta algorithm is proposed as a simple but efficient method to perform gradient descent. The main contributions of this thesis are experimental successes that clearly indicate the potential of feedforward neural networks to estimate high-dimensional value functions. Linear function approximators have often been preferred in reinforcement learning, but successful value function estimations in previous works are restricted to mechanical systems with very few degrees of freedom. The method presented in this thesis was tested successfully on an original task of learning to swim by a simulated articulated robot, with 4 control variables and 12 independent state variables, which is significantly more complex than problems that have been solved with linear function approximators so far.

108 citations
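
The vario-eta idea, as commonly described, scales each weight's step by the inverse of the variability of that weight's gradient. The sketch below approximates that variability with a running second moment (RMSProp-style), which may differ in detail from the thesis' exact formulation:

    import numpy as np

    def vario_eta_step(w, grad, grad_sq_avg, eta=0.05, beta=0.9, eps=1e-8):
        # Per-weight step size normalized by a running estimate of the
        # gradient's magnitude, so weights with large or noisy gradients
        # take proportionally smaller steps.
        grad_sq_avg = beta * grad_sq_avg + (1 - beta) * grad ** 2
        w = w - eta * grad / (np.sqrt(grad_sq_avg) + eps)
        return w, grad_sq_avg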


Proceedings ArticleDOI
07 Aug 2002
TL;DR: A new method is proposed for detecting novel behavior through the use of autoassociative neural network encoders, which can be shown to implicitly learn the nature of the underlying "normal" system behavior.
Abstract: When only "normal" behavior is known about a system, it is desirable to develop, based solely on that behavior, a detector that enables the user to determine when the system's behavior falls outside the range of normality. A new method is proposed for detecting such novel behavior through the use of autoassociative neural network encoders, which can be shown to implicitly learn the nature of the underlying "normal" system behavior.

79 citations
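
A minimal sketch of the scheme described above: train an autoassociative (reconstruction) network on normal data only, then score new inputs by reconstruction error against a threshold fixed on the training data. The `model.predict` interface and the quantile threshold are assumptions for illustration, not the paper's specification:

    import numpy as np

    def novelty_scores(model, X):
        # model.predict(X) is assumed to return the network's
        # reconstruction of X; larger errors indicate more novel inputs.
        X_hat = model.predict(X)
        return np.mean((X - X_hat) ** 2, axis=1)

    def fit_threshold(model, X_normal, quantile=0.99):
        # Threshold chosen so 99% of known-normal data scores below it.
        return np.quantile(novelty_scores(model, X_normal), quantile)

    # is_novel = novelty_scores(model, X_test) > fit_threshold(model, X_train)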


Journal ArticleDOI
TL;DR: This article considers unsupervised learning from the point of view of applying neural computation to signal and data analysis problems, treating the problem in the framework of probabilistic generative models.

68 citations


Proceedings ArticleDOI
18 Nov 2002
TL;DR: This paper investigates existence conditions of energy functions for a class of fully connected complex-valued neural networks and proposes an energy function, analogous to those of real-valued Hopfield-type neural networks, and shows that, similar to the real- valued ones, the energy function enables us to analyze qualitative behaviors of the complex- valued neural networks.
Abstract: Recently, models of neural networks that can deal with complex numbers (complex-valued neural networks) have been proposed, and several studies of their information-processing abilities have been carried out. In this paper we investigate existence conditions of energy functions for a class of fully connected complex-valued neural networks and propose an energy function, analogous to those of real-valued Hopfield-type neural networks. It is also shown that, similar to the real-valued case, the energy function enables us to analyze qualitative behaviors of the complex-valued neural networks. We present dynamic properties of the complex-valued neural networks obtained by qualitative analysis using the energy function. A synthesis method for complex-valued associative memories utilizing the analysis results is also discussed.

53 citations
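
By analogy with the real-valued Hopfield energy E = -(1/2) Σ_ij w_ij s_i s_j, a natural complex-valued candidate for the energy function of the entry above (an assumption for illustration; the abstract does not state the actual function) is, in LaTeX notation:

    E(z) = -\tfrac{1}{2}\, z^{H} W z, \qquad W = W^{H}

where z is the complex state vector and z^H its conjugate transpose. A Hermitian weight matrix makes the quadratic form real-valued, playing the role that symmetric weights play in the real-valued Hopfield case.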


Proceedings ArticleDOI
18 Nov 2002
TL;DR: The properties of the critical points caused by the hierarchical structure of complex-valued neural networks are investigated and it is shown that if the loss function used is not regular as a complex function, the critical Points are all saddle points.
Abstract: The properties of the critical points caused by the hierarchical structure of complex-valued neural networks are investigated. If the loss function used is not regular as a complex function, the critical points caused by the hierarchical structure are all saddle points.

49 citations


Journal ArticleDOI
TL;DR: It is shown that the recurrent network of Xia et al. (1996) contains some unnecessary circuits, which can moreover fail to provide the correct value of one of the SVM parameters, and a way to avoid these drawbacks is suggested.
Abstract: The recurrent network of Xia et al. (1996) was proposed for solving quadratic programming problems and was recently adapted to support vector machine (SVM) learning by Tan et al. (2000). We show that this formulation contains some unnecessary circuits which, furthermore, can fail to provide the correct value of one of the SVM parameters and suggest how to avoid these drawbacks.

45 citations


Journal ArticleDOI
TL;DR: Two different models are developed, one extracting nine scale parameters with image processing, and the other using an unsupervised artificial neural network to extract features automatically, which are determined in accordance with the complexity of the scale structure and the accuracy of the model.
Abstract: Artificial neural networks (ANN) are increasingly used to solve many problems related to pattern recognition and object classification. In this paper, we report on a study using artificial neural networks to classify two kinds of animal fibers: merino and mohair. We have developed two different models, one extracting nine scale parameters with image processing, and the other using an unsupervised artificial neural network to extract features automatically, which are determined in accordance with the complexity of the scale structure and the accuracy of the model. Although the first model can achieve higher accuracy, it requires more effort for image processing and more prior knowledge, since the accuracy of the ANN largely depends on the parameters selected. The second model is more robust than the first, since only raw images are used. Because only ordinary optical images taken with a microscope are employed, we can use the approach for many textile applications without expensive equipment such as scanning electron microscopy.

Journal ArticleDOI
TL;DR: It is argued that firms can be viewed as learning algorithms, and model the firm as a type of artificial neural network (ANN), to show which types of networks maximize the net return to computation given different environments.
Abstract: This paper proposes using computational learning theory (CLT) as a framework for analyzing the information processing behavior of firms; we argue that firms can be viewed as learning algorithms. The costs and benefits of processing information are linked to the structure of the firm and its relationship with the environment. We model the firm as a type of artificial neural network (ANN). By a simulation experiment, we show which types of networks maximize the net return to computation given different environments.

Journal ArticleDOI
TL;DR: It has been demonstrated through several experiments that very promising results are obtained as compared to presently available techniques in the literature.
Abstract: The objective of the paper is the application of adaptive constructive one-hidden-layer feedforward neural networks (OHL-FNNs) to image compression. Comparisons with fixed-structure neural networks are performed to demonstrate and illustrate the training and generalization capabilities of the proposed adaptive constructive networks. The influence of quantization effects as well as a comparison with the baseline JPEG scheme are also investigated. It has been demonstrated through several experiments that very promising results are obtained as compared to presently available techniques in the literature.

Journal ArticleDOI
TL;DR: It is proposed that the main reason that recurrent neural networks have not worked well in engineering applications is that they implicitly rely on a very simplistic likelihood model, and the diffusion network approach proposed here is much richer and may open new avenues for applications of recurrent Neural networks.
Abstract: We present a Monte Carlo approach for training partially observable diffusion processes. We apply the approach to diffusion networks, a stochastic version of continuous recurrent neural networks. The approach is aimed at learning probability distributions of continuous paths, not just expected values. Interestingly, the relevant activation statistics used by the learning rule presented here are inner products in the Hilbert space of square integrable functions. These inner products can be computed using Hebbian operations and do not require backpropagation of error signals. Moreover, standard kernel methods could potentially be applied to compute such inner products. We propose that the main reason that recurrent neural networks have not worked well in engineering applications (e.g., speech recognition) is that they implicitly rely on a very simplistic likelihood model. The diffusion network approach proposed here is much richer and may open new avenues for applications of recurrent neural networks. We present some analysis and simulations to support this view. Very encouraging results were obtained on a visual speech recognition task in which neural networks outperformed hidden Markov models.


Patent
15 Nov 2002
TL;DR: A plausible neural network (PLANN) as discussed by the authors is an artificial neural network with weight connections given by mutual information, which has the capability of inference and learning, yet retains many characteristics of a biological neural network.
Abstract: A plausible neural network (PLANN) is an artificial neural network with weight connections given by mutual information, which has the capability of inference and learning, and yet retains many characteristics of a biological neural network. The learning algorithm (300, 301, 302, 304, 306, 308) is based on statistical estimation, which is faster than the gradient descent approach currently used. The network after training becomes a fuzzy/belief network; the inference and weights are exchangeable, and as a result, knowledge extraction becomes simple. PLANN performs associative memory, supervised, semi-supervised and unsupervised learning, and function/relation approximation in a single network architecture. This network architecture can easily be implemented by analog VLSI circuit design.
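
The abstract says the weights are given by mutual information without giving the estimator. A toy sketch under the assumption that this means pointwise mutual information of binary co-activations, estimated by counting:

    import numpy as np

    def pmi_weights(X, eps=1e-6):
        # X: (n_samples, n_units) binary activation matrix.
        p = X.mean(axis=0)                # P(unit i active)
        joint = (X.T @ X) / len(X)        # P(units i and j both active)
        # log P(i, j) / (P(i) P(j)): positive when two units co-activate
        # more often than chance, negative when they exclude each other.
        return np.log((joint + eps) / (np.outer(p, p) + eps))

Counting-based estimation like this is at least consistent with the abstract's claim that learning is statistical estimation rather than gradient descent.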

Journal Article
TL;DR: This work attempts to determine the important input nodes with the proposed network pruning and rule extraction algorithms, and compares the results with those from See5 to demonstrate the effectiveness of the proposed algorithms.
Abstract: Despite their diverse applications in many domains, neural networks are difficult to interpret owing to the lack of mathematical models to express the training result. While adopting the rule extract...

Proceedings ArticleDOI
26 Aug 2002
TL;DR: QNN combines the advantages of neural modeling and fuzzy-theoretic principles; applied to recognition of continuous digits, it achieves more than 15% error reduction on a speaker-independent continuous digit recognition task compared with the backpropagation (BP) network.
Abstract: This paper describes a new kind of neural network, the quantum neural network (QNN), and its application to recognition of continuous digits. QNN combines the advantages of neural modeling and fuzzy-theoretic principles. Experimental results show that more than 15% error reduction is achieved on a speaker-independent continuous digit recognition task compared with the backpropagation (BP) network.

Proceedings ArticleDOI
18 Nov 2002
TL;DR: This paper presents a new algorithm for the construction and training of an RBF neural network with unbalanced data to improve the classification accuracy of minority classes while maintaining the overall classification performance.
Abstract: This paper presents a new algorithm for the construction and training of an RBF neural network with unbalanced data. In applications, minority classes with far fewer samples are often present in data sets. The learning process of a neural network is usually biased towards classes with majority populations. Our study focused on improving the classification accuracy of minority classes while maintaining the overall classification performance.
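
The construction algorithm itself is not given in the abstract. One common way to bias training back toward minority classes, shown here as an illustrative sketch (inverse-frequency sample weights on the squared error, not necessarily this paper's mechanism):

    import numpy as np

    def class_weights(y):
        # Inverse-frequency weights: each class contributes equally to
        # the cost regardless of how many samples it has.
        classes, counts = np.unique(y, return_counts=True)
        w = {c: len(y) / (len(classes) * n) for c, n in zip(classes, counts)}
        return np.array([w[c] for c in y])

    def weighted_sse(pred, target, y_labels):
        # pred, target: per-sample outputs; y_labels: class of each sample.
        return np.sum(class_weights(y_labels) * (pred - target) ** 2)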

Proceedings ArticleDOI
04 Nov 2002
TL;DR: The NeuroWeb project is an Internet-based framework for the simulation of neural networks that aims to use the Internet as a transparent environment allowing users to exchange information and to exploit available computing resources for neural network specific tasks.
Abstract: The NeuroWeb project is an Internet-based framework for the simulation of neural networks. It aims to use the Internet as a transparent environment that allows users to exchange information (neural network objects, neural network paradigms) and to exploit available computing resources for neural network specific tasks (specifically, the training of neural networks). NeuroWeb's design principles are acceptance, homogeneity, and efficiency.

Proceedings ArticleDOI
18 Nov 2002
TL;DR: It is found that complex values and quantum states share a natural representation suitable for parallel computation.
Abstract: The paper presents the approach of the quantum complex-valued backpropagation neural network, or QCBPN. The goal of our research is to develop a quantum neural network that uses a complex-valued backpropagation learning algorithm to solve classification problems. The concept of QCBPN emerged from quantum circuit neural network research and the complex-valued backpropagation algorithm. We found that complex values and quantum states share a natural representation suitable for parallel computation. The quantum circuit neural network provides a qubit-like neuron model based on quantum mechanics with a quantum backpropagation learning rule, while the complex-valued backpropagation algorithm modifies the standard backpropagation algorithm to learn complex-number patterns in a natural way. The quantum complex-valued neuron model and the QCBPN learning algorithm are described. Finally, the realization of the QCBPN is demonstrated on a simple pattern recognition problem.
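
The exact activation used by QCBPN is not given in the abstract. A common "split" complex-valued neuron used in complex backpropagation (an assumption here, not the paper's stated model) squashes the real and imaginary parts of the net input separately:

    import numpy as np

    def complex_neuron(z_in, w, b):
        # z_in, w: complex vectors; b: complex bias. Splitting the
        # activation keeps it bounded (a fully complex tanh has poles).
        net = np.dot(w, z_in) + b
        return np.tanh(net.real) + 1j * np.tanh(net.imag)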

Journal ArticleDOI
TL;DR: An optimization-based learning algorithm for feedforward neural networks is presented, in which the network weights are determined by minimizing a sliding-window cost and an analysis of its convergence and robustness properties is made.
Abstract: An optimization-based learning algorithm for feedforward neural networks is presented, in which the network weights are determined by minimizing a sliding-window cost. The algorithm is particularly well suited for batch learning and allows one to deal with large data sets in a computationally efficient way. An analysis of its convergence and robustness properties is made. Simulation results confirm the effectiveness of the algorithm and its advantages over learning based on backpropagation and extended Kalman filter.
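
A minimal sketch of the sliding-window idea described above: at each step the weights are adjusted against the cost of only the last L samples, so per-step memory and computation stay bounded on large data sets. The `grad_fn` interface and plain gradient step are illustrative; the paper's optimizer may differ:

    import numpy as np

    def sliding_window_fit(grad_fn, w, X, Y, L=100, eta=0.01):
        # grad_fn(w, Xw, Yw): gradient of the cost over the current window.
        for t in range(L, len(X) + 1):
            Xw, Yw = X[t - L:t], Y[t - L:t]   # most recent L samples
            w = w - eta * grad_fn(w, Xw, Yw)
        return w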

Journal ArticleDOI
TL;DR: The relationships between artificial neural networks and graph theory are considered in detail, and graph theory is used to study the pattern classification problem on discrete-type feedforward neural networks and the stability analysis of feedback artificial neural networks.
Abstract: The relationships between artificial neural networks and graph theory are considered in detail. The applications of artificial neural networks to many difficult problems of graph theory, especially NP-complete problems, and the applications of graph theory to artificial neural networks are discussed. For example, graph theory is used to study the pattern classification problem on discrete-type feedforward neural networks and the stability analysis of feedback artificial neural networks.

Journal ArticleDOI
TL;DR: In the prediction of the Mackey-Glass chaotic time series, the networks designed by the proposed approach prove to be competitive with, or even superior to, traditional learning algorithms for multilayer perceptron networks and radial basis function networks.
Abstract: In this paper, we propose a genetic algorithm based design procedure for a multi-layer feed-forward neural network. A hierarchical genetic algorithm is used to evolve both the neural network's topo...

Proceedings ArticleDOI
28 Oct 2002
TL;DR: The proposed adaptive activation function for multilayer feedforward neural networks is based upon the backpropagation (BP) algorithm, and its learning speed is much faster than that of traditional networks with a fixed activation function.
Abstract: The aim of this paper is to propose a new adaptive activation function for multilayer feedforward neural networks. Based upon the backpropagation (BP) algorithm, an effective learning method is derived to adjust the free parameters in the activation function as well as the connection weights between neurons. Its performance is demonstrated on the N-parity and two-spiral problems. The simulation results showed that the proposed method is well suited to pattern classification problems and its learning speed is much faster than that of traditional networks with a fixed activation function.
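
One plausible form of an adaptive activation (an illustrative assumption; the paper's exact parameterization is not given in the abstract) is a sigmoid with a trainable slope a, updated by backpropagation alongside the weights:

    import numpy as np

    def act(x, a):
        # Sigmoid with trainable slope a: a steeper slope sharpens the
        # unit's decision boundary, a flatter one smooths it.
        return 1.0 / (1.0 + np.exp(-a * x))

    def act_grads(x, a):
        # Backprop needs both d(act)/dx (for the weights below) and
        # d(act)/da (to update the slope parameter itself).
        y = act(x, a)
        return a * y * (1 - y), x * y * (1 - y)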

01 Nov 2002
TL;DR: The elements of a feedforward-backpropagation neural network that has been trained to detect edges in images are described in terms of differential operators of various orders and with various angles of operation.
Abstract: This paper illustrates a novel method to analyze artificial neural networks so as to gain insight into their internal functionality. To this purpose, the elements of a feedforward-backpropagation neural network that has been trained to detect edges in images are described in terms of differential operators of various orders and with various angles of operation.

Journal Article
TL;DR: This article demonstrates that feedforward neural networks can be applied successfully to high-dimensional problems; the main difficulties of using backpropagation networks in reinforcement learning are reviewed, and a simple method to perform gradient descent efficiently is proposed.
Abstract: Local linear function approximators are often preferred to feedforward neural networks for estimating value functions in reinforcement learning. Still, the motor tasks usually solved by this kind of method have a low-dimensional state space. This article demonstrates that feedforward neural networks can be applied successfully to high-dimensional problems. The main difficulties of using backpropagation networks in reinforcement learning are reviewed, and a simple method to perform gradient descent efficiently is proposed. It was tested successfully on an original task of learning to swim by a complex simulated articulated robot, with 4 control variables and 12 independent state variables.

Journal ArticleDOI
TL;DR: A complete implementation for the classification and learning algorithms is given in terms of unitary quantum gates and can be used to perform complex classification tasks or to solve the general problem of binary mapping.
Abstract: We present the algorithms necessary for the implementation of a quantum neural network with learning and classification tasks. A complete implementation for the classification and learning algorithms is given in terms of unitary quantum gates. Such a quantum neural network can be used to perform complex classification tasks or to solve the general problem of binary mapping.

Proceedings ArticleDOI
04 Nov 2002
TL;DR: The research establishes a neural network credit-risk evaluation model using the back-propagation algorithm that achieves higher classification accuracy than the traditional parametric statistical approach, that is, linear discriminant analysis.
Abstract: The research establishes a neural network credit-risk evaluation model using the back-propagation algorithm. The model is evaluated on the credit data of 120 applicants. The 120 data points are separated into three groups: a "good credit" group, a "middle credit" group and a "bad credit" group. The simulation shows that the neural network credit-risk evaluation model has higher classification accuracy than the traditional parametric statistical approach, that is, linear discriminant analysis. We also give a learning algorithm and a corresponding algorithm of the model.

Journal ArticleDOI
TL;DR: This study proposes to use information obtained from a first-principles model to impart a sense of "direction" to the neural network model estimate, by modifying the objective function to include an additional term: the difference between the time derivative of the outputs as predicted by the neural network and that of the outputs of the first-principles model during the training phase.