Yoshua Bengio

Researcher at Université de Montréal

Publications - 1146
Citations - 534376

Yoshua Bengio is an academic researcher at Université de Montréal. He has contributed to research on topics including artificial neural networks and deep learning, has an h-index of 202, and has co-authored 1033 publications receiving 420313 citations. His previous affiliations include McGill University and the Centre de Recherches Mathématiques.

Papers
Posted Content

BabyAI 1.1.

TL;DR: BabyAI 1.1 improves the agent's architecture in three minor ways, increasing reinforcement learning sample efficiency by up to 3 times and improving imitation learning performance on the hardest level from 77% to 90.4%.
Proceedings Article

Regularized Auto-Encoders Estimate Local Statistics

TL;DR: In this article, it was shown that minimizing a particular form of regularized reconstruction error yields a reconstruction function that locally characterizes the shape of the data-generating density, a result confirmed in sampling experiments.
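Stated as an equation (a sketch of the relation in the denoising formulation, with corruption noise level σ, as written in the journal follow-up by Alain and Bengio), the optimal reconstruction function r satisfies

\[
r^{*}_{\sigma}(x) = x + \sigma^{2}\,\frac{\partial \log p(x)}{\partial x} + o(\sigma^{2}) \quad \text{as } \sigma \to 0,
\]

so the reconstruction residual r(x) - x estimates the score of the data-generating density p, and iterating reconstruction plus noise walks samples toward high-density regions, which is what the sampling experiments exploit.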
Posted Content

Experiments on the Application of IOHMMs to Model Financial Returns Series

Abstract: Input/Output Hidden Markov Models (IOHMMs) are conditional hidden Markov models in which the emission (and possibly the transition) probabilities can be conditioned on an input sequence. For example, these conditional distributions can be linear, logistic, or non-linear (using, for example, multi-layer neural networks). We compare the generalization performance of several models which are special cases of IOHMMs on financial time-series prediction tasks: an unconditional Gaussian, a conditional linear Gaussian, a mixture of Gaussians, a mixture of conditional linear Gaussians, a hidden Markov model, and various IOHMMs. The experiments compare these models on predicting the conditional density of returns of market and sector indices. Note that the unconditional Gaussian estimates the first moment with the historical average. The results show that, although the historical average gives the best results for the first moment, for the higher moments the IOHMMs yielded significantly better performance, as estimated by the out-of-sample likelihood.
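The core mechanism, an input-conditioned forward recursion, is easy to sketch. Below is a minimal NumPy illustration of the IOHMM likelihood computation with input-conditioned transitions and Gaussian emissions; all names and the toy parameterization are illustrative assumptions, not the paper's implementation.

import numpy as np
from scipy.stats import norm

def iohmm_log_likelihood(x, y, trans_fn, mean_fn, sigma, pi):
    """Forward pass of an IOHMM: returns log p(y_1..T | x_1..T).

    x: (T, d) input sequence; y: (T,) observed returns.
    trans_fn(x_t) -> (K, K) input-conditioned transition matrix (rows sum to 1).
    mean_fn(x_t)  -> (K,) input-conditioned emission means.
    sigma: (K,) emission standard deviations; pi: (K,) initial state probabilities.
    """
    alpha = pi * norm.pdf(y[0], mean_fn(x[0]), sigma)   # p(y_1, s_1 | x_1)
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()                                # rescale to avoid underflow
    for t in range(1, len(y)):
        A = trans_fn(x[t])                              # p(s_t = j | s_{t-1} = i, x_t)
        alpha = (alpha @ A) * norm.pdf(y[t], mean_fn(x[t]), sigma)
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

# Toy usage with a softmax-linear transition model (hypothetical parameterization).
K, d = 3, 2
rng = np.random.default_rng(0)
W = rng.normal(size=(K, K, d))

def trans_fn(x_t):
    logits = W @ x_t
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

x = rng.normal(size=(50, d))
y = rng.normal(scale=0.02, size=50)
print(iohmm_log_likelihood(x, y, trans_fn, lambda x_t: 0.01 * x_t.sum() * np.ones(K),
                           0.02 * np.ones(K), np.ones(K) / K))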
Posted Content

On Random Weights for Texture Generation in One Layer Neural Networks

TL;DR: It is theoretically shown that one-layer convolutional architectures (without a non-linearity), paired with an energy function used in previous literature, can in fact preserve and modulate frequency coefficients such that random weights and pretrained weights generate the same type of images.
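The energy function in question is the Gram-matrix texture loss popularized by Gatys et al.; the observation is that, for a single linear convolutional layer, this energy mostly measures frequency content, so random filters constrain it about as well as pretrained ones. A minimal sketch of the energy computation follows (shapes and names are illustrative assumptions, not the paper's code):

import numpy as np
from scipy.signal import correlate2d

def gram_energy(image, filters):
    """Gram matrix of one linear convolutional layer's feature maps.

    image:   (H, W) grayscale texture.
    filters: (K, k, k) filter bank, random or pretrained.
    Returns G with G[i, j] = <f_i * image, f_j * image> / (H' * W').
    """
    fmaps = np.stack([correlate2d(image, f, mode='valid') for f in filters])
    F = fmaps.reshape(len(filters), -1)     # (K, H'*W') flattened responses
    return F @ F.T / F.shape[1]

# Texture synthesis would then minimize ||gram_energy(synth, filters) -
# gram_energy(target, filters)||^2 over the pixels of `synth`; the claim is
# that random and pretrained `filters` yield the same type of images.
rng = np.random.default_rng(0)
filters = rng.normal(size=(32, 11, 11))    # random weights, no non-linearity
target = rng.normal(size=(64, 64))         # stand-in for a real texture image
G_target = gram_energy(target, filters)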
Posted Content

Learning to rank for censored survival data.

TL;DR: This work studies how three categories of loss functions can take advantage of right-censored examples: partial likelihood methods, ranking methods, and a classification method that uses a Wasserstein metric together with the non-parametric Kaplan-Meier estimate of the probability density to impute the labels of censored examples.
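For the ranking category, the basic building block is a pairwise loss over comparable pairs: if example i's event was observed at time t_i and example j was still event-free at t_i, the model should assign i a higher risk score. A minimal sketch under that assumption (names are illustrative; the paper's actual losses differ in detail):

import numpy as np

def pairwise_rank_loss(risk, time, event):
    """Logistic ranking loss averaged over comparable right-censored pairs.

    risk:  (N,) predicted risk scores (higher = earlier predicted event).
    time:  (N,) observed times (event time if event == 1, censoring time if 0).
    event: (N,) event indicators; censored examples can still anchor pairs,
           but only as the longer-surviving member.
    """
    losses = []
    for i in np.flatnonzero(event):           # i must have an observed event
        comparable = time > time[i]           # j known to be event-free at t_i
        margins = risk[i] - risk[comparable]
        losses.extend(np.log1p(np.exp(-margins)))  # softplus(-margin)
    return float(np.mean(losses))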