Yoshua Bengio

Researcher at Université de Montréal

Publications -  1146
Citations -  534376

Yoshua Bengio is an academic researcher from Université de Montréal. The author has contributed to research in topics: Artificial neural network & Deep learning. The author has an h-index of 202 and has co-authored 1033 publications receiving 420313 citations. Previous affiliations of Yoshua Bengio include McGill University & Centre de Recherches Mathématiques.

Papers
Proceedings Article

A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion

TL;DR: This work presents a novel hierarchical recurrent encoder-decoder architecture that makes it possible to account for sequences of previous queries of arbitrary length and is sensitive to the order of queries in the context while avoiding data sparsity.
Proceedings Article

Zero-data learning of new tasks

TL;DR: The main contributions of this work lie in the presentation of a general formalization of zero-data learning, in an experimental analysis of its properties and in empirical evidence showing that generalization is possible and significant in this context.
Proceedings Article

Pointing the unknown words

TL;DR: This work uses two softmax layers to predict the next word in conditional language models: one predicts the location of a word in the source sentence, and the other predicts a word from the shortlist vocabulary.
Posted Content

Manifold Mixup: Better Representations by Interpolating Hidden States

TL;DR: Manifold Mixup, a simple regularizer that encourages neural networks to predict less confidently on interpolations of hidden representations, improves strong baselines in supervised learning, robustness to single-step adversarial attacks, and test log-likelihood.
Posted Content

Hierarchical Multiscale Recurrent Neural Networks

TL;DR: This work proposes a hierarchical multiscale recurrent neural network (HM-RNN) that captures the latent hierarchical structure in a sequence by encoding temporal dependencies at different timescales using a novel update mechanism.