Yoshua Bengio

Posted Content

Three Factors Influencing Minima in SGD

- 13 Nov 2017 -

TL;DR: This analysis finds that three factors (learning rate, batch size, and the variance of the loss gradients) control the trade-off between the depth and width of the minima found by SGD, with wider minima favoured by a higher ratio of learning rate to batch size.

Posted Content

Deep Complex Networks

Chiheb Trabelsi, +9 more

- 27 May 2017 -

arXiv: Neural and Evolutionary Computing

TL;DR: This work relies on complex convolutions and presents algorithms for complex batch normalization and complex weight initialization for complex-valued neural nets, uses them in experiments with end-to-end training schemes, and demonstrates that such complex-valued models are competitive with their real-valued counterparts.

Proceedings ArticleDOI

Advances in optimizing recurrent networks

Yoshua Bengio, +2 more

TL;DR: The authors evaluate several techniques for optimizing recurrent networks: gradient clipping, spanning longer time ranges with leaky integration, advanced momentum techniques, more powerful output probability models, and encouraging sparser gradients to help symmetry breaking and credit assignment.

Proceedings Article

Hierarchical Recurrent Neural Networks for Long-Term Dependencies

Salah El Hihi, +1 more

TL;DR: This paper proposes to use a more general type of a-priori knowledge, namely that the temporal dependencies are structured hierarchically, which implies that long-term dependencies are represented by variables with a long time scale.

Journal ArticleDOI

EmoNets: Multimodal deep learning approaches for emotion recognition in video

Samira Ebrahimi Kahou, +17 more

- 01 Jun 2016 -

Journal on Multimodal User Interfaces

TL;DR: The authors present an approach that learns several specialist models using deep learning techniques, each focusing on one modality; the models include a CNN, a deep belief net, a K-means based bag-of-mouths model, and a relational autoencoder.
