
Showing papers by "Andreas Knoblauch" published in 2022


Journal ArticleDOI
TL;DR: In this paper, a generalized power error loss function is proposed that adapts to more realistic error distributions by fitting the exponent q of the power function used to initialize the backpropagation learning algorithm.
Abstract: Supervised learning in neural nets means optimizing synaptic weights W such that the outputs $y(\mathbf{x}; \mathbf{W})$ for inputs $\mathbf{x}$ match as closely as possible the corresponding targets t from the training data set. This optimization means minimizing a loss function ${\mathscr{L}}(\mathbf{W})$ that is usually motivated by maximum-likelihood principles, silently making some prior assumptions on the distribution of the output errors $y - t$. While classical crossentropy loss assumes triangular error distributions, it has recently been shown that generalized power error loss functions can be adapted to more realistic error distributions by fitting the exponent q of a power function used for initializing the backpropagation learning algorithm. This approach can significantly improve performance, but computing the loss function requires the antiderivative of the function $f(y) := y^{q-1}/(1-y)$, which has previously been determined only for natural $q \in \mathbb{N}$. In this work I extend this approach to rational $q = n/2^m$ where the denominator is a power of 2. I give closed-form expressions for the antiderivative $\int f(y)\, dy$ and the corresponding loss function. The benefits of such an approach are demonstrated by experiments showing that optimal exponents q are often non-natural, and that the error exponents q best fitting the output error distributions vary continuously during learning, typically decreasing from large q > 1 to small q < 1 as learning converges. These results suggest new adaptive learning methods in which the loss function is continuously adapted to the output error distribution during learning.

5 citations
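
To make the construction in the abstract above concrete, here is a minimal numerical sketch of the power error loss for a single sigmoid output. It assumes, hypothetically, that the loss for a zero target is the antiderivative of $f(y) = y^{q-1}/(1-y)$ evaluated at the output y, with the unit-target case obtained by the symmetry y -> 1 - y; the closed-form antiderivatives for rational $q = n/2^m$ derived in the paper are not reproduced here, so SciPy quadrature stands in for them.

```python
from scipy.integrate import quad

def f(y, q):
    """Integrand f(y) = y**(q - 1) / (1 - y) from the abstract above."""
    return y ** (q - 1) / (1.0 - y)

def power_error_loss(y, t, q):
    """Per-output power error loss, approximated by numerical quadrature.

    Hypothetical convention: the loss for target t = 0 is the antiderivative
    of f evaluated at the output y; the target t = 1 case is obtained by the
    symmetry y -> 1 - y.  The paper's closed-form antiderivatives for
    q = n / 2**m are not reproduced here.
    """
    z = y if t == 0 else 1.0 - y           # mirror the output for t = 1
    value, _ = quad(f, 0.0, z, args=(q,))  # integrate f from 0 to z
    return value

# Compare a natural exponent (q = 2) with a rational one (q = 1.5 = 3/2),
# the kind of denominator-power-of-2 exponent covered by the paper.
for q in (1.0, 1.5, 2.0):
    print(f"q = {q}: loss for y = 0.3, t = 1 is {power_error_loss(0.3, 1, q):.4f}")
```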


Book ChapterDOI
01 Jan 2022
TL;DR: In this article, it is shown that the dominant mechanism of PEL is its better adaptation to output error distributions rather than an implicit manipulation of the learning rate, and that PEL clearly remains superior to BCE/CCE if q is properly decreased during learning.
Abstract: Power error loss (PEL) has recently been suggested as a more efficient generalization of binary or categorical cross entropy (BCE/CCE). However, as PEL requires adapting the exponent q of a power function to the training data and the learning progress, it has been argued that the observed improvements may be due to implicitly optimizing the learning rate. Here we invalidate this argument by optimizing the learning rate in each training step. We find that PEL clearly remains superior to BCE/CCE if q is properly decreased during learning. This proves that the dominant mechanism of PEL is its better adaptation to output error distributions, rather than an implicit manipulation of the learning rate.
Keywords: Cross entropy, Power error loss, Learning rate, Learning schedule, Random grid path search
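
The argument in this abstract can be illustrated with a toy training loop: a small logistic model trained with a power-error output delta, where q is annealed from a large to a small exponent while the learning rate is re-optimized at every step over a small random grid. The delta form sign(y - t)|y - t|**q, the q schedule, the surrogate monitoring loss, and the grid search below are all illustrative assumptions, not the exact procedure of the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data (synthetic, for illustration only).
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
t = (X @ w_true + 0.1 * rng.normal(size=200) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pel_delta(y, t, q):
    """Hypothetical power-error output delta: sign(y - t) * |y - t|**q.

    For q = 1 this reduces to the usual BCE/sigmoid delta (y - t).
    """
    return np.sign(y - t) * np.abs(y - t) ** q

def monitoring_loss(w, q):
    # Surrogate loss used only to rank candidate learning rates; it is not
    # the closed-form PEL from the papers.
    y = sigmoid(X @ w)
    return np.mean(np.abs(y - t) ** (q + 1))

w = np.zeros(5)
n_steps = 100
for step in range(n_steps):
    # Anneal q from a large (> 1) to a small (< 1) exponent, mimicking the
    # decreasing-q schedule reported as beneficial in the chapter.
    frac = step / n_steps
    q = 4.0 * (1.0 - frac) + 0.5 * frac

    y = sigmoid(X @ w)
    grad = X.T @ pel_delta(y, t, q) / len(t)

    # Per-step learning-rate search over a small random grid: a crude
    # stand-in for the "random grid path search" named in the keywords.
    candidates = 10 ** rng.uniform(-3, 0, size=5)
    best = min(candidates, key=lambda lr: monitoring_loss(w - lr * grad, q))
    w = w - best * grad

print("final monitoring loss:", monitoring_loss(w, q=1.0))
```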