
Giancarlo Kerg

Researcher at Université de Montréal

Publications -  13
Citations -  118

Giancarlo Kerg is an academic researcher from Université de Montréal. The author has contributed to research in the topics of Recurrent neural network and Vanishing gradient problem. The author has an h-index of 5 and has co-authored 8 publications receiving 83 citations.

Papers
Proceedings Article

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

TL;DR: In this paper, the authors propose a connectivity structure based on the Schur decomposition, which allows parametrizing matrices with unit-norm eigenspectra without imposing orthogonality constraints on their eigenbases.
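A minimal sketch of such a Schur-form parametrization, assuming a PyTorch setting; the function and parameter names are illustrative, not the authors' released code:

```python
import torch

def nnrnn_recurrent_matrix(skew, thetas, gammas, upper):
    """Sketch of a Schur-form recurrent matrix W = P (D + U) P^T (even n assumed).

    P : orthogonal Schur basis, here the matrix exponential of a
        skew-symmetric parameter (one standard orthogonal parametrization).
    D : 'normal' part, 2x2 rotation blocks with angles `thetas` and moduli
        `gammas` (gammas = 1 gives a unit-norm eigenspectrum).
    U : strictly upper-triangular 'non-normal' part; it drives transient
        dynamics and makes the eigenbasis of W non-orthogonal, without
        changing the eigenvalues set by D.
    """
    n = skew.shape[0]
    P = torch.matrix_exp(skew - skew.T)       # expm(skew-symmetric) is orthogonal
    D = torch.zeros(n, n)
    U = torch.triu(upper, diagonal=1)
    for k in range(n // 2):
        i = 2 * k
        c = gammas[k] * torch.cos(thetas[k])
        s = gammas[k] * torch.sin(thetas[k])
        D[i, i], D[i, i + 1] = c, -s
        D[i + 1, i], D[i + 1, i + 1] = s, c
        U[i, i + 1] = 0.0                     # keep each 2x2 Schur block intact
    return P @ (D + U) @ P.T

# Example: a 4x4 recurrent matrix with a unit-norm eigenspectrum.
n = 4
W = nnrnn_recurrent_matrix(torch.randn(n, n), torch.randn(n // 2),
                           torch.ones(n // 2), 0.1 * torch.randn(n, n))
```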
Proceedings Article

h-detach: Modifying the LSTM Gradient Towards Better Optimization

TL;DR: In this paper, a stochastic algorithm called h-detach was proposed for the vanishing gradient problem in LSTMs: it stochastically blocks gradients flowing through the hidden state, preventing them from suppressing the gradient components that travel through the linear path (the cell state) in the computational graph, whose suppression can prevent LSTMs from capturing long-term dependencies.
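The idea lends itself to a short sketch, assuming a PyTorch nn.LSTMCell; the detach probability and sizes below are illustrative placeholders, not the paper's code:

```python
import torch
import torch.nn as nn

def lstm_step_h_detach(cell, x_t, h, c, p_detach=0.25, training=True):
    """One LSTM step with h-detach (a sketch of the idea).

    With probability p_detach, stop the gradient flowing back through the
    hidden state h, so the h-state gradient paths cannot suppress the
    gradient component travelling along the linear cell-state path.
    """
    if training and torch.rand(()).item() < p_detach:
        h = h.detach()                # block gradient through the h-state path
    return cell(x_t, (h, c))          # nn.LSTMCell returns (h_next, c_next)

# Usage over a sequence of shape (batch, seq_len, input_size):
cell = nn.LSTMCell(input_size=8, hidden_size=16)
x = torch.randn(4, 10, 8)
h, c = torch.zeros(4, 16), torch.zeros(4, 16)
for t in range(x.shape[1]):
    h, c = lstm_step_h_detach(cell, x[:, t], h, c)
```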
Posted Content

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

TL;DR: This work proposes a novel connectivity structure based on the Schur decomposition and a splitting of the Schur form into normal and non-normal parts that retains the stability advantages and training speed of orthogonal RNNs while enhancing expressivity, especially on tasks that require computations over ongoing input sequences.
Posted Content

h-detach: Modifying the LSTM Gradient Towards Better Optimization

TL;DR: A simple stochastic algorithm is introduced that prevents the gradients flowing through the cell-state path from getting suppressed, allowing the LSTM to capture long-term dependencies better; it shows significant improvements over vanilla LSTM gradient-based training in terms of convergence speed, robustness to seed and learning rate, and generalization.
Proceedings Article

Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization

TL;DR: This paper showed that the early value of the trace of the Fisher Information Matrix (FIM) correlates strongly with final generalization, and that in the absence of implicit or explicit regularization the trace can grow to a large value early in training, a phenomenon the authors call catastrophic Fisher explosion.
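A minimal sketch of how one might monitor this trace for a PyTorch classifier, with labels sampled from the model's own predictive distribution; the estimator details here are an assumption, not necessarily the paper's exact procedure:

```python
import torch
import torch.nn.functional as F

def fisher_trace(model, inputs):
    """Estimate Tr(F), the trace of the Fisher Information Matrix (a sketch).

    Tr(F) = E_x E_{y ~ p_model(y|x)} || grad_theta log p_model(y|x) ||^2,
    estimated per example with y sampled from the model's predictions.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    total = 0.0
    for x in inputs:                                  # one example at a time
        logits = model(x.unsqueeze(0))                # shape (1, num_classes)
        y = torch.multinomial(F.softmax(logits, dim=-1), 1).squeeze(1)
        nll = F.cross_entropy(logits, y)              # -log p_model(y|x)
        grads = torch.autograd.grad(nll, params)
        total += sum((g ** 2).sum().item() for g in grads)
    return total / len(inputs)
```

Tracking this quantity on a small held-out batch during the first few epochs is one way to observe the early-phase growth the paper describes.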