
Showing papers on "Softmax function published in 2004"


Journal ArticleDOI
Ian T. Nabney
TL;DR: This work shows how RBFs with logistic and softmax outputs can be trained efficiently using the Fisher scoring algorithm, and compares this approach with standard non-linear optimisation algorithms on a number of datasets.
Abstract: Radial Basis Function networks with linear outputs are often used in regression problems because they can be substantially faster to train than Multi-layer Perceptrons. For classification problems, the use of linear outputs is less appropriate as the outputs are not guaranteed to represent probabilities. We show how RBFs with logistic and softmax outputs can be trained efficiently using the Fisher scoring algorithm. This approach can be used with any model which consists of a generalised linear output function applied to a model which is linear in its parameters. We compare this approach with standard non-linear optimisation algorithms on a number of datasets.
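For context, the softmax output stage referred to above maps the basis-function activations, which are linear in the output weights, onto class probabilities. Below is a minimal sketch of such a forward pass, assuming Gaussian basis functions with a shared width; the function and parameter names are illustrative and not taken from the paper, and the Fisher-scoring training step itself is omitted.

```python
import numpy as np

def rbf_softmax_forward(X, centers, width, W):
    """Forward pass of an RBF network with softmax outputs (illustrative sketch).

    X       : (n_samples, n_features) inputs
    centers : (n_basis, n_features) basis-function centres
    width   : shared Gaussian width
    W       : (n_basis + 1, n_classes) output weights (last row is the bias)
    """
    # Gaussian basis activations
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2.0 * width ** 2))
    # Append a bias column; the model is linear in W
    Phi = np.hstack([Phi, np.ones((X.shape[0], 1))])
    # Generalised linear (softmax) output gives class probabilities
    A = Phi @ W
    A -= A.max(axis=1, keepdims=True)   # numerical stability
    P = np.exp(A)
    return P / P.sum(axis=1, keepdims=True)
```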

58 citations


Proceedings ArticleDOI
25 Jul 2004
TL;DR: Softprop is a novel learning approach presented here that is reminiscent of the softmax explore-exploit Q-learning search heuristic and fits the problem while delaying settling into error minima to achieve better generalization and more robust learning.
Abstract: Multi-layer backpropagation, like many learning algorithms that can create complex decision surfaces, is prone to overfitting. Softprop is a novel learning approach presented here that is reminiscent of the softmax explore-exploit Q-learning search heuristic. It fits the problem while delaying settling into error minima to achieve better generalization and more robust learning. This is accomplished by blending standard SSE optimization with lazy training, a new objective function well suited to learning classification tasks, to form a more stable learning model. Over several machine learning data sets, softprop reduces classification error by 17.1 percent and the variance in results by 38.6 percent over standard SSE minimization.

13 citations


Book ChapterDOI
04 Dec 2004
TL;DR: It is demonstrated that the AE-GSBFN is capable of providing better performance than the existing method and overcomes the curse of dimensionality and avoids a fall into local minima through the allocation and elimination processes.
Abstract: In this paper, we propose a dynamic allocation method of basis functions, the Allocation/Elimination Gaussian Softmax Basis Function Network (AE-GSBFN), for use in reinforcement learning. AE-GSBFN is a kind of actor-critic method that uses basis functions. This method can treat continuous high-dimensional state spaces, because only the basis functions required for learning are dynamically allocated, and if an allocated basis function is identified as redundant, it is eliminated. Through these allocation and elimination processes, the method overcomes the curse of dimensionality and avoids falling into local minima. To confirm the effectiveness of our method, we used a maze task to compare it with an existing method that has only an allocation process. Moreover, to demonstrate learning in continuous high-dimensional state spaces, our method was applied to motion control of a humanoid robot. We demonstrate that the AE-GSBFN is capable of providing better performance than the existing method.
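As a rough illustration of the kind of basis used here: a Gaussian softmax basis function network normalises Gaussian activations over the current set of bases, so the activations form a probability-like weighting of the state. The sketch below assumes one width per basis and an illustrative allocation test; the paper's actual allocation and elimination criteria are not reproduced, and all names are hypothetical.

```python
import numpy as np

def gaussian_softmax_basis(state, centers, widths):
    """Normalised (softmax) Gaussian basis activations for a continuous state.

    state   : (n_dims,) current state
    centers : (n_basis, n_dims) basis centres
    widths  : (n_basis,) Gaussian widths
    """
    d2 = ((state - centers) ** 2).sum(axis=1)
    log_act = -d2 / (2.0 * widths ** 2)
    log_act -= log_act.max()            # numerical stability
    act = np.exp(log_act)
    return act / act.sum()              # activations sum to one

def maybe_allocate(phi, threshold=0.3):
    """Illustrative allocation test: if no existing basis responds strongly
    enough to the current state, a new basis centred there would be added."""
    return phi.max() < threshold
```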

12 citations


Journal ArticleDOI
TL;DR: This paper studies the application of Q-learning to Process Control problems, more precisely to Neutralization Processes, and results show that the controllers are able to learn how to adequately control the process.

11 citations


Proceedings Article
01 Jan 2004
TL;DR: The application shows the ability of the proposed Reinforcement Learning approach to control chemical processes with difficult, unknown, or time-varying dynamics.
Abstract: This paper presents the application of Reinforcement Learning to nonlinear process control. Reinforcement Learning is a model-free technique based on online learning without supervision, with the objective of optimizing a cumulative future reward by resorting to experimentation with the system. The one-step-ahead Q-learning look-up-table method is applied to a model of a pH neutralization process. Control actions are selected using the ε-greedy and softmax policies. The application shows the ability of the proposed method to control chemical processes with difficult, unknown, or time-varying dynamics.
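As a sketch of the two exploration policies mentioned above, here is ε-greedy and softmax (Boltzmann) action selection over a tabular Q-function, together with the one-step Q-learning update. Nothing here is taken from the paper's pH-neutralization model; the temperature, ε, and learning-rate values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(q_row, epsilon=0.1):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_row)))
    return int(np.argmax(q_row))

def softmax_action(q_row, temperature=1.0):
    """Boltzmann (softmax) exploration: P(a) proportional to exp(Q(s,a)/T)."""
    z = np.asarray(q_row, dtype=float) / temperature
    z -= z.max()                        # numerical stability
    p = np.exp(z)
    p /= p.sum()
    return int(rng.choice(len(q_row), p=p))

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One-step Q-learning update on a look-up table Q (2-D array: state x action)."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```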

2 citations


01 Jan 2004
TL;DR: Experimental results show how the ability of the Boltzmann Softmax action selection function to differentiate between suboptimal actions can be made to work on an inter-robot level, as a mechanism for allocating high-performance robots to high-value tasks without communication.
Abstract: We present an extension to our adaptive multi-robot task allocation algorithm based on vacancy chains, a resource distribution process common in animal and human societies. The algorithm uses individual reinforcement learning of task utilities and relies on the specializing abilities of the members of the group to promote dedicated optimal allocation patterns. Using realistic simulation experiments, we evaluate the approach by comparing greedy and softmax action selection functions for task allocation. We conclude that using softmax functions makes the vacancy chain algorithm responsive to different levels of ability in a group of heterogeneous robots as well as to the effects of the underlying group dynamics such as interference and synergy.

I. INTRODUCTION

Existing multi-robot task allocation (MRTA) algorithms [1], [2], [3], [4] are typically not sensitive to the complex effects of group dynamics, such as interference and synergy. For a cooperative task such as transportation or foraging, the average completion time may depend on the number of robots that are allocated to the same task. Allocating a robot to a task may have either a positive or negative effect on a group's performance according to how much that robot contributes positively, in accomplishing tasks, or negatively, in increasing interference. Such dynamics are difficult to model.

As a way of circumventing the difficulties related to modeling group dynamics, our past work [5] presented the vacancy chain (VC) algorithm. This algorithm is inspired by the VC distribution process as found in animal and human societies [6]. Each robot in a group following this algorithm uses local reinforcement learning (RL) to estimate the utilities of a set of tasks. From the local utilities and the robots' action selection functions emerges the allocation pattern. Experiments in simulation have shown that for groups of homogeneous robots, the VC algorithm promotes optimal system states as defined by the VC framework.

The VC algorithm relies on stigmergy [7], unintentional communication between the robots through their effects on the environment, to produce specialized individuals for optimal allocation. In this paper we present experimental results showing how the ability of the Boltzmann Softmax action selection function to differentiate between suboptimal actions can be made to work on an inter-robot level, as a mechanism for allocating high-performance robots to high-value tasks without communication. The VC algorithm can thus be extended to work for groups of heterogeneous robots. Extending the VC algorithm to cover groups of heterogeneous robots increases its applicability. This is important because the VC algorithm, unlike existing MRTA algorithms, is sensitive to the effects of group dynamics and provides a way of optimizing the performance of groups of cooperative robots in domains where these effects are significant.
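The point about differentiating suboptimal actions can be illustrated with Boltzmann softmax selection over learned task utilities: greedy selection sends every robot to its single highest-utility task regardless of the margin, while softmax spreads each robot's choices in proportion to its utility gaps, so robots with large gaps concentrate on high-value tasks and robots with small gaps keep covering the rest. A small illustrative sketch with made-up utility numbers, not the paper's data:

```python
import numpy as np

def softmax_selection_probs(utilities, temperature=1.0):
    """Boltzmann softmax over task utilities: P(task) proportional to exp(U/T)."""
    z = np.asarray(utilities, dtype=float) / temperature
    z -= z.max()                        # numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical learned utilities for two tasks (high-value, low-value):
fast_robot = [5.0, 3.0]   # large gap: strongly prefers the high-value task
slow_robot = [3.2, 3.0]   # small gap: still serves the low-value task often
print(softmax_selection_probs(fast_robot))   # ~[0.88, 0.12]
print(softmax_selection_probs(slow_robot))   # ~[0.55, 0.45]
```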

1 citation


Posted Content
TL;DR: A method is presented which assigns a neural temperature to each layer of a multilayer neural network whose dynamics are governed by a noisy winner-take-all mechanism; after a transient, the neural temperature decreases in each layer according to a power law, indicating a self-organized annealing behavior induced by the learning rule itself.
Abstract: In this paper we present a method which assigns to each layer of a multilayer neural network, whose network dynamics is governed by a noisy winner-take-all mechanism, a neural temperature. This neural temperature is obtained by a least mean square fit of the probability distribution of the noisy winner-take-all mechanism to the distribution of a softmax mechanism, which has a well-defined temperature as a free parameter. We call this approximated temperature, resulting from the optimization step, the neural temperature. We apply this method to a multilayer neural network while it learns the XOR problem with a Hebb-like learning rule, and show that after a transient the neural temperature decreases in each layer according to a power law. This indicates a self-organized annealing behavior induced by the learning rule itself, instead of an external adjustment of a control parameter as in physically motivated optimization methods, e.g. simulated annealing.
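A rough sketch of the temperature-fitting step described above: the empirical choice distribution of the noisy winner-take-all layer is matched, in the least-squares sense, by a softmax distribution whose temperature is the free parameter. The sampling of the winner-take-all mechanism and the Hebb-like learning rule are not reproduced; names and the choice of optimiser are illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def softmax(h, T):
    """Softmax distribution over activations h at temperature T."""
    z = np.asarray(h, dtype=float) / T
    z -= z.max()                        # numerical stability
    p = np.exp(z)
    return p / p.sum()

def neural_temperature(h, p_wta):
    """Least-squares fit of the softmax temperature T so that softmax(h, T)
    approximates p_wta, the empirical winner-take-all choice distribution."""
    def sq_err(T):
        return float(((softmax(h, T) - p_wta) ** 2).sum())
    res = minimize_scalar(sq_err, bounds=(1e-3, 1e3), method="bounded")
    return res.x
```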

1 citation


Proceedings ArticleDOI
20 Sep 2004
TL;DR: Two algorithms for the optimization of time series forecasting by combination of models are proposed and evaluated and both are able to improve the forecasting of different time series, reducing the forecast error (RMSE) and increasing the modeling capability expressed by a reduction of the bias error (BE).
Abstract: Two algorithms for optimizing time series forecasting by combining models are proposed and evaluated. The first, named GABoost, exploits a genetic-algorithm heuristic to search for the optimal weights for mixing the forecasting models. The second, named CombFEC, extracts the information provided by the forecast errors (RMSE, BE and MAE) of each model to be combined, and by the correlation between each model and the forecasted time series, to build an error-correlation function (FEC) used to calculate the weights with a SOFTMAX function. The results show that both algorithms are able to improve the forecasting of different time series, reducing the forecast error (RMSE) and increasing the modeling capability, expressed by a reduction of the bias error (BE).
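The final softmax weighting step is the standard trick of turning per-model scores into positive mixing weights that sum to one. A minimal sketch, assuming a generic per-model score where higher means better; the paper's specific error-correlation function built from RMSE, BE, MAE and the correlations is not reproduced here, and all names are illustrative.

```python
import numpy as np

def combination_weights(scores, temperature=1.0):
    """Turn per-model scores (higher = better) into mixing weights with a
    softmax, so every weight is positive and the weights sum to one."""
    z = np.asarray(scores, dtype=float) / temperature
    z -= z.max()                        # numerical stability
    w = np.exp(z)
    return w / w.sum()

def combine_forecasts(forecasts, weights):
    """Weighted combination of individual forecasts.
    forecasts: (n_models, horizon) array, weights: (n_models,)."""
    return np.asarray(weights) @ np.asarray(forecasts)
```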