Institution

Université de Montréal

Education · Montreal, Quebec, Canada
About: Université de Montréal is an education organization based in Montreal, Quebec, Canada. It is known for research contributions in the topics of Population & Context (language use). The organization has 45,641 authors who have published 100,476 publications receiving 4,004,007 citations. It is also known as the University of Montreal and UdeM.


Papers
Proceedings ArticleDOI
03 Sep 2014
TL;DR: Analyzes neural machine translation with two models, an RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network: both perform well on short sentences without unknown words, but performance degrades rapidly as sentence length and the number of unknown words increase, and the gated recursive convolutional network is found to learn the grammatical structure of a sentence automatically.
Abstract: Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of the neural machine translation using two models: the RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network. We show that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns a grammatical structure of a sentence automatically.
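As a loose illustration of the encoder-decoder pipeline the abstract describes, the sketch below wires a GRU encoder to a GRU decoder in PyTorch. All names and sizes (VOCAB, EMB, HIDDEN) are toy assumptions for illustration, not the paper's configuration.

```python
# Minimal encoder-decoder sketch: the encoder compresses a variable-length
# source sentence into a fixed-length hidden state; the decoder generates
# the target sentence from that state. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, EMB, HIDDEN = 1000, 64, 128  # assumed toy sizes

class EncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(VOCAB, EMB)
        self.tgt_emb = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HIDDEN, batch_first=True)
        self.decoder = nn.GRU(EMB, HIDDEN, batch_first=True)
        self.out = nn.Linear(HIDDEN, VOCAB)

    def forward(self, src, tgt):
        _, h = self.encoder(self.src_emb(src))    # fixed-length summary of src
        dec_out, _ = self.decoder(self.tgt_emb(tgt), h)
        return self.out(dec_out)                  # logits per target position

src = torch.randint(0, VOCAB, (2, 7))  # batch of 2 source sentences
tgt = torch.randint(0, VOCAB, (2, 5))  # shifted target tokens
logits = EncoderDecoder()(src, tgt)    # shape: (2, 5, VOCAB)
```

The single fixed-length state h is the bottleneck: everything the decoder knows about the source must fit in it, which is consistent with the degradation the paper reports on long sentences.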

4,702 citations

Posted Content
TL;DR: This paper quantifies the generality versus specificity of neurons in each layer of a deep convolutional neural network and reports a few surprising results, including that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
Abstract: Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively. In this paper we experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected. In an example network trained on ImageNet, we demonstrate that either of these two issues may dominate, depending on whether features are transferred from the bottom, middle, or top of the network. We also document that the transferability of features decreases as the distance between the base task and target task increases, but that transferring features even from distant tasks can be better than using random features. A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
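To make the transfer setup concrete, here is a minimal PyTorch sketch of copying the first n layers of a base network into a target network and freezing them before training on the target task. The tiny three-layer MLP and the n = 2 split are invented for illustration; the paper itself transfers layers of an ImageNet-scale convolutional network.

```python
# Sketch of layer transfer: copy the leading layers of a trained base
# network into a target network, freeze them, and train the rest.
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(
        nn.Linear(32, 64), nn.ReLU(),   # layer 1 (general features)
        nn.Linear(64, 64), nn.ReLU(),   # layer 2
        nn.Linear(64, 10),              # layer 3 (task-specific)
    )

base = make_net()     # pretend this was trained on the base task
target = make_net()   # network for the target task

n_transfer = 2  # how many leading Linear layers to transfer
linear_pairs = [
    (b, t) for b, t in zip(base, target) if isinstance(b, nn.Linear)
][:n_transfer]
for b, t in linear_pairs:
    t.load_state_dict(b.state_dict())   # copy transferred weights
    for p in t.parameters():
        p.requires_grad = False         # freeze; set True to fine-tune

# Only the remaining (unfrozen) layers receive gradient updates:
opt = torch.optim.SGD(
    [p for p in target.parameters() if p.requires_grad], lr=0.01)
```

Flipping requires_grad back to True corresponds to the fine-tuning condition in which the paper finds the generalization boost lingers.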

4,663 citations

Journal ArticleDOI
18 Oct 2007-Nature
TL;DR: Describes the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed, and demonstrates increased differentiation at non-synonymous compared with synonymous SNPs.
Abstract: We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
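For readers unfamiliar with the r² statistic cited throughout the abstract, the short NumPy sketch below computes pairwise linkage-disequilibrium r² between two biallelic SNPs from 0/1 haplotype vectors; the eight haplotypes are invented toy data, not HapMap genotypes.

```python
# r^2 linkage disequilibrium: how well one SNP "tags" another.
import numpy as np

def ld_r2(a, b):
    """Squared correlation between two biallelic SNPs over haplotypes."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    p_a, p_b = a.mean(), b.mean()        # allele frequencies
    d = (a * b).mean() - p_a * p_b       # disequilibrium coefficient D
    return d**2 / (p_a * (1 - p_a) * p_b * (1 - p_b))

# Two SNPs observed on eight haplotypes (toy data):
snp1 = [0, 0, 1, 1, 0, 1, 0, 1]
snp2 = [0, 0, 1, 1, 0, 1, 1, 1]
print(ld_r2(snp1, snp2))  # 0.6: snp1 only partially tags snp2
```

An "average maximum r²" of 0.9+ means that for a typical untyped variant, the best typed proxy SNP correlates with it almost perfectly.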

4,565 citations

Journal ArticleDOI
Theo Vos, Ryan M. Barber, Brad Bell, Amelia Bertozzi-Villa, +686 more authors (287 institutions)
TL;DR: The Global Burden of Disease Study 2013 (GBD 2013) estimates incidence, prevalence, and years lived with disability for acute and chronic diseases and injuries in 188 countries between 1990 and 2013.

4,510 citations

Proceedings Article
04 Dec 2006
TL;DR: These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
Abstract: Complexity theory of circuits strongly suggests that deep architectures can be much more efficient (sometimes exponentially) than shallow architectures, in terms of computational elements required to represent some functions. Deep multi-layer neural networks have many levels of non-linearities allowing them to compactly represent highly non-linear and highly-varying functions. However, until recently it was not clear how to train such deep networks, since gradient-based optimization starting from random initialization appears to often get stuck in poor solutions. Hinton et al. recently introduced a greedy layer-wise unsupervised learning algorithm for Deep Belief Networks (DBN), a generative model with many layers of hidden causal variables. In the context of the above optimization problem, we study this algorithm empirically and explore variants to better understand its success and extend it to cases where the inputs are continuous or where the structure of the input distribution is not revealing enough about the variable to be predicted in a supervised task. Our experiments also confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
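The greedy layer-wise strategy is easy to state in code. The sketch below pretrains a stack one layer at a time, using small autoencoders rather than the RBMs of a DBN for brevity; sizes, data, and epoch counts are illustrative assumptions.

```python
# Greedy layer-wise unsupervised pretraining: train each layer to
# reconstruct its input, freeze it, and feed its codes to the next layer.
import torch
import torch.nn as nn

x = torch.rand(256, 32)                 # toy unlabeled inputs
sizes = [32, 64, 64]                    # layer widths, input first
layers = []

inp = x
for d_in, d_out in zip(sizes, sizes[1:]):
    enc = nn.Sequential(nn.Linear(d_in, d_out), nn.Sigmoid())
    dec = nn.Linear(d_out, d_in)        # throwaway decoder for this layer
    opt = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)
    for _ in range(100):                # train this layer to reconstruct
        opt.zero_grad()
        loss = ((dec(enc(inp)) - inp) ** 2).mean()
        loss.backward()
        opt.step()
    layers.append(enc)
    inp = enc(inp).detach()             # pass codes up to the next layer

# Stack the pretrained encoders, add a supervised head, then fine-tune
# the whole network with labels; pretraining supplies the initialization
# near a good solution, per the hypothesis above.
net = nn.Sequential(*layers, nn.Linear(sizes[-1], 10))
```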

4,385 citations


Authors


Name | H-index | Papers | Citations
Yoshua Bengio | 202 | 1,033 | 420,313
Alan C. Evans | 183 | 866 | 134,642
Richard H. Friend | 169 | 1,182 | 140,032
Anders Björklund | 165 | 769 | 84,268
Charles N. Serhan | 158 | 728 | 84,810
Fernando Rivadeneira | 146 | 628 | 86,582
C. Dallapiccola | 136 | 1,717 | 101,947
Michael J. Meaney | 136 | 604 | 81,128
Claude Leroy | 135 | 1,170 | 88,604
Georges Azuelos | 134 | 1,294 | 90,690
Phillip Gutierrez | 133 | 1,391 | 96,205
Danny Miller | 133 | 512 | 71,238
Henry T. Lynch | 133 | 925 | 86,270
Stanley Nattel | 132 | 778 | 65,700
Lucie Gauthier | 132 | 679 | 64,794
Network Information
Related Institutions (5)
University of Toronto: 294.9K papers, 13.5M citations, 96% related
University of Pennsylvania: 257.6K papers, 14.1M citations, 93% related
University of Wisconsin-Madison: 237.5K papers, 11.8M citations, 92% related
University of Minnesota: 257.9K papers, 11.9M citations, 92% related
Harvard University: 530.3K papers, 38.1M citations, 92% related

Performance Metrics
No. of papers from the Institution in previous years
Year | Papers
2023 | 118
2022 | 485
2021 | 6,077
2020 | 5,753
2019 | 5,212
2018 | 4,696