International Conference on Artificial Neural Networks
About: The International Conference on Artificial Neural Networks is an academic conference that publishes mainly in the areas of artificial neural networks and computer science. Over its lifetime, the conference has published 6,053 papers, which have received 69,089 citations.
Topics: Artificial neural network, Computer science, Recurrent neural network, Deep learning, Time delay neural network
[Chart: papers published per year]
Papers
01 Jan 1999
TL;DR: This work identifies a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends and proposes an adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources.
Abstract: Long short-term memory (LSTM) can solve many tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams without explicitly marked sequence ends. Without resets, the internal state values may grow indefinitely and eventually cause the network to break down. Our remedy is an adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review an illustrative benchmark problem on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve a continual version of that problem. LSTM with forget gates, however, easily solves it in an elegant way.
2,961 citations
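For intuition, here is a minimal NumPy sketch of a single LSTM step with the forget gate described above. The stacked weight layout and the variable names are illustrative assumptions, not the paper's notation.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # One LSTM step with a forget gate. W, U, b stack the weights for
    # the input gate, forget gate, output gate, and cell candidate.
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:n])          # input gate
    f = sigmoid(z[n:2*n])        # forget gate: learns when to reset the cell
    o = sigmoid(z[2*n:3*n])      # output gate
    g = np.tanh(z[3*n:4*n])      # cell candidate
    c = f * c_prev + i * g       # f near 0 releases internal resources
    h = o * np.tanh(c)
    return h, c

# Toy usage on a continual input stream, with random weights.
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for _ in range(100):
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)

Without the forget gate, c would accumulate over the whole stream; the f term is what lets the cell state decay or reset.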
08 Oct 1997
TL;DR: A new method for performing a nonlinear form of Principal Component Analysis by the use of integral operator kernel functions is proposed, and experimental results on polynomial feature extraction for pattern recognition are presented.
Abstract: A new method for performing a nonlinear form of Principal Component Analysis is proposed. By the use of integral operator kernel functions, one can efficiently compute principal components in high-dimensional feature spaces related to input space by some nonlinear map; for instance, the space of all possible d-pixel products in images. We give the derivation of the method and present experimental results on polynomial feature extraction for pattern recognition.
2,223 citations
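As a quick illustration (using scikit-learn's KernelPCA rather than any code from the paper), a polynomial kernel computes principal components in the implicit space of monomial products, playing the role of the d-pixel product space mentioned in the abstract, without ever constructing that space explicitly:

from sklearn.datasets import make_moons
from sklearn.decomposition import KernelPCA

# Two interleaving half-moons: not linearly separable in input space.
X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)

# Kernel PCA with a degree-3 polynomial kernel; the feature space is
# never formed, only the kernel matrix between the 200 samples.
kpca = KernelPCA(n_components=2, kernel="poly", degree=3)
X_kpca = kpca.fit_transform(X)
print(X_kpca.shape)  # (200, 2)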
14 Jun 2011
TL;DR: A novel convolutional auto-encoder (CAE) for unsupervised feature learning is presented; initializing a CNN with filters from a trained CAE stack yields superior performance on a digit and an object recognition benchmark.
Abstract: We present a novel convolutional auto-encoder (CAE) for unsupervised feature learning. A stack of CAEs forms a convolutional neural network (CNN). Each CAE is trained using conventional on-line gradient descent without additional regularization terms. A max-pooling layer is essential to learn biologically plausible features consistent with those found by previous approaches. Initializing a CNN with filters of a trained CAE stack yields superior performance on a digit (MNIST) and an object recognition (CIFAR10) benchmark.
1,832 citations
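A minimal PyTorch sketch of one CAE layer in this spirit (the paper predates PyTorch; the filter counts, kernel sizes, and the upsampling decoder here are illustrative assumptions, not the authors' architecture):

import torch
import torch.nn as nn
import torch.nn.functional as F

class CAE(nn.Module):
    # One convolutional auto-encoder layer with max pooling.
    def __init__(self):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),  # the max-pooling layer the paper finds essential
        )
        self.decode = nn.Sequential(
            nn.Upsample(scale_factor=2),
            nn.Conv2d(8, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.decode(self.encode(x))

# Train to reconstruct the input with plain gradient descent,
# no additional regularization terms.
model = CAE()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.rand(16, 1, 28, 28)  # stand-in for MNIST digits
opt.zero_grad()
loss = F.mse_loss(model(x), x)
loss.backward()
opt.step()

Stacking several such layers and copying the trained encoder filters into a CNN is the initialization scheme the abstract describes.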
04 Oct 2018
TL;DR: Deep transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates researchers to use transfer learning to solve the problem of insufficient training data.
Abstract: As a new classification platform, deep learning has recently received increasing attention from researchers and has been successfully applied to many domains. In some domains, like bioinformatics and robotics, it is very difficult to construct a large-scale, well-annotated dataset due to the expense of data acquisition and costly annotation, which limits its development. Transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates the use of transfer learning to solve the problem of insufficient training data. This survey focuses on reviewing current research on transfer learning using deep neural networks and its applications. We define deep transfer learning, categorize it, and review recent work based on the techniques used.
1,543 citations
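A common network-based transfer-learning recipe of the kind this survey covers, sketched with torchvision (the model choice, the frozen layers, and the 10-class head are assumptions for illustration; the weights enum requires a recent torchvision):

import torch.nn as nn
from torchvision import models

# Reuse ImageNet features; retrain only the classifier head on the
# small target dataset, sidestepping the i.i.d. assumption between
# the large source dataset and the small target dataset.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False  # freeze the transferred layers
model.fc = nn.Linear(model.fc.in_features, 10)  # hypothetical 10-class task

# Only the new head is trainable:
trainable = [p for p in model.parameters() if p.requires_grad]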
15 Sep 2010
TL;DR: The aim is to gain insight into different aggregation functions by directly comparing them on a fixed architecture across several common object recognition tasks; empirical results show that a maximum pooling operation significantly outperforms subsampling operations.
Abstract: A common practice to gain invariant features in object recognition models is to aggregate multiple low-level features over a small neighborhood. However, the differences between those models make a comparison of the properties of different aggregation functions hard. Our aim is to gain insight into different functions by directly comparing them on a fixed architecture for several common object recognition tasks. Empirical results show that a maximum pooling operation significantly outperforms subsampling operations. Despite their shift-invariant properties, overlapping pooling windows are no significant improvement over non-overlapping pooling windows. By applying this knowledge, we achieve state-of-the-art error rates of 4.57% on the NORB normalized-uniform dataset and 5.6% on the NORB jittered-cluttered dataset.
1,409 citations
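The operations compared in this paper are easy to reproduce; this PyTorch sketch (shapes and window sizes are arbitrary) contrasts max pooling, average-pooling-style subsampling, and overlapping windows:

import torch
import torch.nn.functional as F

x = torch.rand(1, 1, 8, 8)  # a toy single-channel feature map

# Max pooling keeps the strongest activation in each 2x2 window...
max_pooled = F.max_pool2d(x, kernel_size=2, stride=2)

# ...while average pooling, a subsampling-style operation, blurs it.
avg_pooled = F.avg_pool2d(x, kernel_size=2, stride=2)

# Overlapping pooling: window (3x3) larger than the stride (2).
overlapped = F.max_pool2d(x, kernel_size=3, stride=2, padding=1)

print(max_pooled.shape, avg_pooled.shape, overlapped.shape)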