
Showing papers from "Université de Montréal" published in 2014


Journal ArticleDOI
08 Dec 2014
TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to ½ everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.

38,211 citations
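The minimax structure described in the abstract can be checked numerically. Below is a minimal sketch (not the paper's training procedure): for discrete toy distributions, it computes the optimal discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x)) and confirms that when the generator matches the data distribution, D* equals 1/2 everywhere and the game value is −log 4.

```python
import numpy as np

# Discrete toy distributions over 4 outcomes.
p_data = np.array([0.1, 0.4, 0.3, 0.2])
p_g    = np.array([0.1, 0.4, 0.3, 0.2])  # generator has matched the data

# For fixed G, the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)).
d_star = p_data / (p_data + p_g)

# Value of the game at the optimum: V(D*, G) = E_data[log D*] + E_g[log(1 - D*)].
value = np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1.0 - d_star))

print(d_star)           # [0.5 0.5 0.5 0.5] when p_g = p_data
print(round(value, 4))  # -1.3863, i.e. -log 4, the global optimum
```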


Proceedings ArticleDOI
01 Jan 2014
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Abstract: In this paper, we propose a novel neural network model called RNN Encoder‐Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder‐Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.

19,998 citations
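The encode-into-a-fixed-vector, decode-from-it structure the abstract describes can be sketched in a few lines of numpy. This is a toy with random untrained weights and hypothetical layer sizes; the actual model uses gated hidden units and is trained end to end on parallel text.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, hidden = 5, 8  # illustrative sizes, not the paper's

E   = rng.normal(size=(vocab, hidden))        # input embeddings
W_e = rng.normal(size=(hidden, hidden)) * 0.1  # encoder recurrence
W_d = rng.normal(size=(hidden, hidden)) * 0.1  # decoder recurrence
W_o = rng.normal(size=(hidden, vocab)) * 0.1   # output projection

def encode(src):
    """Fold a variable-length symbol sequence into one fixed-length vector."""
    h = np.zeros(hidden)
    for sym in src:
        h = np.tanh(E[sym] + W_e @ h)
    return h  # the fixed-length summary c

def decode_step(h, c):
    """One decoder step conditioned on the summary c; returns next-symbol probs."""
    h = np.tanh(W_d @ h + c)
    logits = W_o.T @ h
    p = np.exp(logits - logits.max())
    return h, p / p.sum()

c = encode([0, 3, 1, 4])                 # any input length, fixed-size output
h, p = decode_step(np.zeros(hidden), c)  # p is a distribution over the vocab
print(c.shape, p.shape, round(p.sum(), 6))
```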


Posted Content
TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Abstract: Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to maximize the translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.

14,077 citations
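The (soft-)search can be sketched as additive attention: score each source annotation against the current decoder state, normalize the scores with a softmax to get alignment weights, and take the expected annotation as the context vector. A minimal sketch with random untrained parameters (all shapes here are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
src_len, dim = 6, 8

H = rng.normal(size=(src_len, dim))  # encoder annotations h_1..h_T
s = rng.normal(size=dim)             # current decoder state

# Additive scoring: e_t = v · tanh(Wa s + Ua h_t)  (toy parameters).
Wa = rng.normal(size=(dim, dim)) * 0.1
Ua = rng.normal(size=(dim, dim)) * 0.1
v  = rng.normal(size=dim)

scores = np.array([v @ np.tanh(Wa @ s + Ua @ h) for h in H])
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()     # soft alignment over source positions
context = alpha @ H      # expected annotation, fed to the decoder

print(alpha.shape, round(alpha.sum(), 6), context.shape)
```

Because alpha is a distribution rather than a hard choice of position, the whole computation stays differentiable, which is what lets the alignment be learned jointly with translation.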


Proceedings Article
01 Jan 2014
TL;DR: It is found that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis, and it is suggested that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.
Abstract: Deep neural networks are highly expressive models that have recently achieved state of the art performance on speech and visual recognition tasks. While their expressiveness is the reason they succeed, it also causes them to learn uninterpretable solutions that could have counter-intuitive properties. In this paper we report two such properties. First, we find that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis. This suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks. Second, we find that deep neural networks learn input-output mappings that are fairly discontinuous to a significant extent. We can cause the network to misclassify an image by applying a certain imperceptible perturbation, which is found by maximizing the network's prediction error. In addition, the specific nature of these perturbations is not a random artifact of learning: the same perturbation can cause a different network, that was trained on a different subset of the dataset, to misclassify the same input.

9,561 citations
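The perturbations in this paper are found by box-constrained optimization of the network's prediction error; a later and simpler gradient-sign variant of the same idea can be demonstrated on a toy logistic classifier. All numbers and the classifier itself are illustrative, not from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy linear classifier: predicts class 1 when w·x > 0.
w = np.array([2.0, -3.0])
x = np.array([0.2, 0.1])   # w·x = 0.1, so x is (barely) classified as 1
y = 1.0                    # true label

# Gradient of the logistic loss w.r.t. the *input*, then a small step in the
# sign direction: each coordinate moves by at most eps, so the change is tiny.
grad_x = (sigmoid(w @ x) - y) * w
x_adv = x + 0.05 * np.sign(grad_x)

print(w @ x > 0, w @ x_adv > 0)  # True False: an eps=0.05 step flips the prediction
```

The point mirrors the abstract: a perturbation chosen to increase the loss can flip the decision even though it is small in every coordinate, whereas a random perturbation of the same size usually would not.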


Journal ArticleDOI
Keith A. Olive1, Kaustubh Agashe2, Claude Amsler3, Mario Antonelli  +222 moreInstitutions (107)
TL;DR: The review as discussed by the authors summarizes much of particle physics and cosmology using data from previous editions, plus 3,283 new measurements from 899 papers, including the recently discovered Higgs boson, leptons, quarks, mesons and baryons.
Abstract: The Review summarizes much of particle physics and cosmology. Using data from previous editions, plus 3,283 new measurements from 899 papers, we list, evaluate, and average measured properties of gauge bosons and the recently discovered Higgs boson, leptons, quarks, mesons, and baryons. We summarize searches for hypothetical particles such as heavy neutrinos, supersymmetric and technicolor particles, axions, dark photons, etc. All the particle properties and search limits are listed in Summary Tables. We also give numerous tables, figures, formulae, and reviews of topics such as Supersymmetry, Extra Dimensions, Particle Detectors, Probability, and Statistics. Among the 112 reviews are many that are new or heavily revised including those on: Dark Energy, Higgs Boson Physics, Electroweak Model, Neutrino Cross Section Measurements, Monte Carlo Neutrino Generators, Top Quark, Dark Matter, Dynamical Electroweak Symmetry Breaking, Accelerator Physics of Colliders, High-Energy Collider Parameters, Big Bang Nucleosynthesis, Astrophysical Constants and Cosmological Parameters.

7,337 citations


Journal ArticleDOI
TL;DR: A protocol for coin-tossing by exchange of quantum messages is presented, which is secure against traditional kinds of cheating, even by an opponent with unlimited computing power, but ironically can be subverted by use of a still subtler quantum phenomenon, the Einstein-Podolsky-Rosen paradox.

5,126 citations


Journal ArticleDOI
TL;DR: LCZ696 was superior to enalapril in reducing the risks of death and of hospitalization for heart failure and decreased the symptoms and physical limitations of heart failure.
Abstract: Background We compared the angiotensin receptor–neprilysin inhibitor LCZ696 with enalapril in patients who had heart failure with a reduced ejection fraction. In previous studies, enalapril improved survival in such patients. Methods In this double-blind trial, we randomly assigned 8442 patients with class II, III, or IV heart failure and an ejection fraction of 40% or less to receive either LCZ696 (at a dose of 200 mg twice daily) or enalapril (at a dose of 10 mg twice daily), in addition to recommended therapy. The primary outcome was a composite of death from cardiovascular causes or hospitalization for heart failure, but the trial was designed to detect a difference in the rates of death from cardiovascular causes. Results The trial was stopped early, according to prespecified rules, after a median follow-up of 27 months, because the boundary for an overwhelming benefit with LCZ696 had been crossed. At the time of study closure, the primary outcome had occurred in 914 patients (21.8%) in the LCZ696 group and 1117 patients (26.5%) in the enalapril group (hazard ratio in the LCZ696 group, 0.80; 95% confidence interval [CI], 0.73 to 0.87; P<0.001). A total of 711 patients (17.0%) receiving LCZ696 and 835 patients (19.8%) receiving enalapril died (hazard ratio for death from any cause, 0.84; 95% CI, 0.76 to 0.93; P<0.001); of these patients, 558 (13.3%) and 693 (16.5%), respectively, died from cardiovascular causes (hazard ratio, 0.80; 95% CI, 0.71 to 0.89; P<0.001). As compared with enalapril, LCZ696 also reduced the risk of hospitalization for heart failure by 21% (P<0.001) and decreased the symptoms and physical limitations of heart failure (P = 0.001). The LCZ696 group had higher proportions of patients with hypotension and nonserious angioedema but lower proportions with renal impairment, hyperkalemia, and cough than the enalapril group.
Conclusions LCZ696 was superior to enalapril in reducing the risks of death and of hospitalization for heart failure. (Funded by Novartis; PARADIGM-HF ClinicalTrials.gov number, NCT01035255.)

4,727 citations


Proceedings ArticleDOI
03 Sep 2014
TL;DR: In this paper, a gated recursive convolutional neural network (GRNN) was proposed to learn a grammatical structure of a sentence automatically, which performed well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.
Abstract: Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of the neural machine translation using two models: the RNN Encoder‐Decoder and a newly proposed gated recursive convolutional neural network. We show that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns a grammatical structure of a sentence automatically.

4,702 citations


Posted Content
TL;DR: This paper quantifies the generality versus specificity of neurons in each layer of a deep convolutional neural network and reports a few surprising results, including that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.
Abstract: Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively. In this paper we experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected. In an example network trained on ImageNet, we demonstrate that either of these two issues may dominate, depending on whether features are transferred from the bottom, middle, or top of the network. We also document that the transferability of features decreases as the distance between the base task and target task increases, but that transferring features even from distant tasks can be better than using random features. A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.

4,663 citations


Proceedings Article
08 Dec 2014
TL;DR: In this paper, the authors quantify the transferability of features from the first layer to the last layer of a deep neural network and show that transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task and (2) optimization difficulties related to splitting networks between co-adapted neurons.
Abstract: Many deep neural networks trained on natural images exhibit a curious phenomenon in common: on the first layer they learn features similar to Gabor filters and color blobs. Such first-layer features appear not to be specific to a particular dataset or task, but general in that they are applicable to many datasets and tasks. Features must eventually transition from general to specific by the last layer of the network, but this transition has not been studied extensively. In this paper we experimentally quantify the generality versus specificity of neurons in each layer of a deep convolutional neural network and report a few surprising results. Transferability is negatively affected by two distinct issues: (1) the specialization of higher layer neurons to their original task at the expense of performance on the target task, which was expected, and (2) optimization difficulties related to splitting networks between co-adapted neurons, which was not expected. In an example network trained on ImageNet, we demonstrate that either of these two issues may dominate, depending on whether features are transferred from the bottom, middle, or top of the network. We also document that the transferability of features decreases as the distance between the base task and target task increases, but that transferring features even from distant tasks can be better than using random features. A final surprising result is that initializing a network with transferred features from almost any number of layers can produce a boost to generalization that lingers even after fine-tuning to the target dataset.

4,368 citations
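The transfer schemes quantified above (copy the first n layers of a network trained on a base task, optionally freeze them, and retrain the remainder on the target task) can be sketched schematically. Everything below is a toy stand-in: the layer sizes, the uniform "gradient," and the helper name are illustrative, not the paper's ImageNet setup.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Base" network trained on task A: three layers of toy weights.
base = [rng.normal(size=(4, 4)) for _ in range(3)]

def transfer(base_layers, n_copy, n_total=3, frozen=False):
    """Build a target-task network: copy the first n_copy layers from the
    base network and reinitialize the rest. frozen=True keeps the copied
    layers fixed during target-task training."""
    layers = [w.copy() for w in base_layers[:n_copy]]
    layers += [rng.normal(size=(4, 4)) for _ in range(n_total - n_copy)]
    return layers, [frozen] * n_copy + [False] * (n_total - n_copy)

net, frozen = transfer(base, n_copy=2, frozen=True)

# One mock SGD step: frozen layers keep their transferred weights.
for w, fz in zip(net, frozen):
    if not fz:
        w -= 0.1 * np.ones_like(w)  # stand-in for a real gradient update

print(np.allclose(net[0], base[0]), np.allclose(net[2], base[2]))  # True False
```

Setting frozen=False instead corresponds to the fine-tuned variant the abstract credits with the lingering generalization boost: the transferred weights serve only as initialization and are then updated along with the new layers.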


Posted Content
TL;DR: In this article, a generative adversarial network (GAN) is proposed to estimate generative models via an adversarial process, in which two models are simultaneously trained: a generator G and a discriminator D that estimates the probability that a sample came from the training data rather than G.
Abstract: We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.

Posted Content
TL;DR: Qualitatively, the proposed RNN Encoder‐Decoder model learns a semantically and syntactically meaningful representation of linguistic phrases.
Abstract: In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder-Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.

Journal ArticleDOI
TL;DR: Enzalutamide significantly decreased the risk of radiographic progression and death and delayed the initiation of chemotherapy in men with metastatic prostate cancer.
Abstract: BACKGROUND: Enzalutamide is an oral androgen-receptor inhibitor that prolongs survival in men with metastatic castration-resistant prostate cancer in whom the disease has progressed after chemotherapy. New treatment options are needed for patients with metastatic prostate cancer who have not received chemotherapy, in whom the disease has progressed despite androgen-deprivation therapy. METHODS: In this double-blind, phase 3 study, we randomly assigned 1717 patients to receive either enzalutamide (at a dose of 160 mg) or placebo once daily. The coprimary end points were radiographic progression-free survival and overall survival. RESULTS: The study was stopped after a planned interim analysis, conducted when 540 deaths had been reported, showed a benefit of the active treatment. The rate of radiographic progression-free survival at 12 months was 65% among patients treated with enzalutamide, as compared with 14% among patients receiving placebo (81% risk reduction; hazard ratio in the enzalutamide group, 0.19; 95% confidence interval [CI], 0.15 to 0.23; P<0.001). A total of 626 patients (72%) in the enzalutamide group, as compared with 532 patients (63%) in the placebo group, were alive at the data-cutoff date (29% reduction in the risk of death; hazard ratio, 0.71; 95% CI, 0.60 to 0.84; P<0.001). The benefit of enzalutamide was shown with respect to all secondary end points, including the time until the initiation of cytotoxic chemotherapy (hazard ratio, 0.35), the time until the first skeletal-related event (hazard ratio, 0.72), a complete or partial soft-tissue response (59% vs. 5%), the time until prostate-specific antigen (PSA) progression (hazard ratio, 0.17), and a rate of decline of at least 50% in PSA (78% vs. 3%) (P<0.001 for all comparisons). Fatigue and hypertension were the most common clinically relevant adverse events associated with enzalutamide treatment. 
CONCLUSIONS: Enzalutamide significantly decreased the risk of radiographic progression and death and delayed the initiation of chemotherapy in men with metastatic prostate cancer. (Funded by Medivation and Astellas Pharma; PREVAIL ClinicalTrials.gov number, NCT01212991.).

Journal ArticleDOI
TL;DR: The survey work and case studies will be useful for all those involved in developing software for data analysis using Ward’s hierarchical clustering method.
Abstract: The Ward error sum of squares hierarchical clustering method has been very widely used since its first description by Ward in a 1963 publication. It has also been generalized in various ways. Two algorithms are found in the literature and software, both announcing that they implement the Ward clustering method. When applied to the same distance matrix, they produce different results. One algorithm preserves Ward's criterion, the other does not. Our survey work and case studies will be useful for all those involved in developing software for data analysis using Ward's hierarchical clustering method.
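The distinction the abstract draws can be made concrete. Ward's criterion is the increase in within-cluster error sum of squares (ΔESS) caused by a merge, and a Lance–Williams recurrence applied to those ΔESS values (for singletons, half the squared Euclidean distance) reproduces it exactly; applying the same recurrence to plain distances does not. A small numpy check of the criterion-preserving route:

```python
import numpy as np

def ward_cost(A, B):
    """Increase in within-cluster error sum of squares when merging A and B:
    dESS = |A||B| / (|A|+|B|) * ||centroid(A) - centroid(B)||^2."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    return len(A) * len(B) / (len(A) + len(B)) * np.sum((ca - cb) ** 2)

def lance_williams_ward(d_ik, d_jk, d_ij, ni, nj, nk):
    """Ward's Lance-Williams update: merge cost between the union (i+j) and k,
    computed only from pairwise costs and cluster sizes."""
    n = ni + nj + nk
    return ((ni + nk) * d_ik + (nj + nk) * d_jk - nk * d_ij) / n

# Three 1-D points; merge the first two, then compare both routes.
x1, x2, x3 = np.array([[0.0]]), np.array([[1.0]]), np.array([[5.0]])
direct = ward_cost(np.vstack([x1, x2]), x3)
lw = lance_williams_ward(ward_cost(x1, x3), ward_cost(x2, x3),
                         ward_cost(x1, x2), 1, 1, 1)
print(direct, lw)  # both 13.5: the update preserves Ward's criterion
```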

Posted Content
TL;DR: It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.
Abstract: Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on analyzing the properties of the neural machine translation using two models: the RNN Encoder-Decoder and a newly proposed gated recursive convolutional neural network. We show that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase. Furthermore, we find that the proposed gated recursive convolutional network learns a grammatical structure of a sentence automatically.

Journal ArticleDOI
TL;DR: In patients with heart failure and a preserved ejection fraction, treatment with spironolactone did not significantly reduce the incidence of the primary composite outcome of death from cardiovascular causes, aborted cardiac arrest, or hospitalization for the management of heart failure.
Abstract: Background Mineralocorticoid-receptor antagonists improve the prognosis for patients with heart failure and a reduced left ventricular ejection fraction. We evaluated the effects of spironolactone in patients with heart failure and a preserved left ventricular ejection fraction. Methods In this randomized, double-blind trial, we assigned 3445 patients with symptomatic heart failure and a left ventricular ejection fraction of 45% or more to receive either spironolactone (15 to 45 mg daily) or placebo. The primary outcome was a composite of death from cardiovascular causes, aborted cardiac arrest, or hospitalization for the management of heart failure. Results With a mean follow-up of 3.3 years, the primary outcome occurred in 320 of 1722 patients in the spironolactone group (18.6%) and 351 of 1723 patients in the placebo group (20.4%) (hazard ratio, 0.89; 95% confidence interval [CI], 0.77 to 1.04; P = 0.14). Of the components of the primary outcome, only hospitalization for heart failure had a significantly lower incidence in the spironolactone group than in the placebo group (206 patients [12.0%] vs. 245 patients [14.2%]; hazard ratio, 0.83; 95% CI, 0.69 to 0.99, P = 0.04). Neither total deaths nor hospitalizations for any reason were significantly reduced by spironolactone. Treatment with spironolactone was associated with increased serum creatinine levels and a doubling of the rate of hyperkalemia (18.7%, vs. 9.1% in the placebo group) but reduced hypokalemia. With frequent monitoring, there were no significant differences in the incidence of serious adverse events, a serum creatinine level of 3.0 mg per deciliter (265 μmol per liter) or higher, or dialysis.
Conclusions In patients with heart failure and a preserved ejection fraction, treatment with spironolactone did not significantly reduce the incidence of the primary composite outcome of death from cardiovascular causes, aborted cardiac arrest, or hospitalization for the management of heart failure. (Funded by the National Heart, Lung, and Blood Institute; TOPCAT ClinicalTrials.gov number, NCT00094302.)

Journal ArticleDOI
Andrew R. Wood1, Tõnu Esko2, Jian Yang3, Sailaja Vedantam4  +441 moreInstitutions (132)
TL;DR: This article identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height, and all common variants together captured 60% of heritability.
Abstract: Using genome-wide data from 253,288 individuals, we identified 697 variants at genome-wide significance that together explained one-fifth of the heritability for adult height. By testing different numbers of variants in independent studies, we show that the most strongly associated ∼2,000, ∼3,700 and ∼9,500 SNPs explained ∼21%, ∼24% and ∼29% of phenotypic variance. Furthermore, all common variants together captured 60% of heritability. The 697 variants clustered in 423 loci were enriched for genes, pathways and tissue types known to be involved in growth and together implicated genes and pathways not highlighted in earlier efforts, such as signaling by fibroblast growth factors, WNT/β-catenin and chondroitin sulfate-related genes. We identified several genes and pathways not previously connected with human skeletal growth, including mTOR, osteoglycin and binding of hyaluronic acid. Our results indicate a genetic architecture for human height that is characterized by a very large but finite number (thousands) of causal variants.

Journal ArticleDOI
TL;DR: Global rates of change suggest that only 16 countries will achieve the MDG 5 target by 2015, with evidence of continued acceleration in the MMR, and MMR was highest in the oldest age groups in both 1990 and 2013.

Journal ArticleDOI
TL;DR: Similarity network fusion substantially outperforms single data type analysis and established integrative approaches when identifying cancer subtypes and is effective for predicting survival.
Abstract: Similarity network fusion (SNF) is an approach to integrate multiple data types on the basis of similarity between biological samples rather than individual measurements. The authors demonstrate SNF by constructing patient networks to identify disease subtypes with differential survival profiles.
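A much-simplified sketch of the fusion idea: row-normalized similarity networks for each data type are iteratively diffused through one another and then averaged, so that similarities supported by both data types are reinforced. The published SNF additionally uses a scaled exponential kernel and sparse k-nearest-neighbor transition matrices, both omitted here, and the patient data below are synthetic.

```python
import numpy as np

def normalize(w):
    """Row-normalize a similarity matrix into a transition matrix."""
    return w / w.sum(axis=1, keepdims=True)

def fuse(w1, w2, iters=20):
    """Cross-diffuse two patient-similarity networks, then average them."""
    p1, p2 = normalize(w1), normalize(w2)
    for _ in range(iters):
        p1_next = normalize(p1 @ p2 @ p1.T)  # diffuse network 1 through network 2
        p2_next = normalize(p2 @ p1 @ p2.T)  # and vice versa
        p1, p2 = p1_next, p2_next
    return (p1 + p2) / 2

# Four patients, two data types that only partly agree on the two subgroups.
w_expr = np.array([[1.0, 0.9, 0.1, 0.1],
                   [0.9, 1.0, 0.1, 0.1],
                   [0.1, 0.1, 1.0, 0.9],
                   [0.1, 0.1, 0.9, 1.0]])
w_meth = np.array([[1.0, 0.6, 0.2, 0.1],
                   [0.6, 1.0, 0.1, 0.2],
                   [0.2, 0.1, 1.0, 0.7],
                   [0.1, 0.2, 0.7, 1.0]])

fused = fuse(w_expr, w_meth)
print(fused.shape, np.allclose(fused.sum(axis=1), 1.0))
```

Subtype discovery then amounts to clustering the fused network (the paper uses spectral clustering), rather than clustering each data type separately and reconciling the results afterward.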

Journal ArticleDOI
TL;DR: Strong and robust support is found for a sister-group relationship between land plants and one group of streptophyte green algae, the Zygnematophyceae, and suggests that phylogenetic hypotheses used to understand the evolution of fundamental plant traits should be reevaluated.
Abstract: Reconstructing the origin and evolution of land plants and their algal relatives is a fundamental problem in plant phylogenetics, and is essential for understanding how critical adaptations arose, including the embryo, vascular tissue, seeds, and flowers. Despite advances in molecular systematics, some hypotheses of relationships remain weakly resolved. Inferring deep phylogenies with bouts of rapid diversification can be problematic; however, genome-scale data should significantly increase the number of informative characters for analyses. Recent phylogenomic reconstructions focused on the major divergences of plants have resulted in promising but inconsistent results. One limitation is sparse taxon sampling, likely resulting from the difficulty and cost of data generation. To address this limitation, transcriptome data for 92 streptophyte taxa were generated and analyzed along with 11 published plant genome sequences. Phylogenetic reconstructions were conducted using up to 852 nuclear genes and 1,701,170 aligned sites. Sixty-nine analyses were performed to test the robustness of phylogenetic inferences to permutations of the data matrix or to phylogenetic method, including supermatrix, supertree, and coalescent-based approaches, maximum-likelihood and Bayesian methods, partitioned and unpartitioned analyses, and amino acid versus DNA alignments. Among other results, we find robust support for a sister-group relationship between land plants and one group of streptophyte green algae, the Zygnematophyceae. Strong and robust support for a clade comprising liverworts and mosses is inconsistent with a widely accepted view of early land plant evolution, and suggests that phylogenetic hypotheses used to understand the evolution of fundamental plant traits should be reevaluated.

Journal ArticleDOI
19 Feb 2014-Neuron
TL;DR: It is proposed that astrocytes mainly signal through high-affinity slowly desensitizing receptors to modulate neurons and perform integration in spatiotemporal domains complementary to those of neurons.

Journal ArticleDOI
TL;DR: Overall, the review provides some support for the widespread recommendation that PFMT be included in first-line conservative management programmes for women with stress, urge, or mixed, urinary incontinence.
Abstract: Background Pelvic floor muscle training is the most commonly used physical therapy treatment for women with stress urinary incontinence (SUI). It is sometimes also recommended for mixed and, less commonly, urgency urinary incontinence. Objectives To determine the effects of pelvic floor muscle training for women with urinary incontinence in comparison to no treatment, placebo or sham treatments, or other inactive control treatments. Search methods We searched the Cochrane Incontinence Group Specialised Register, which contains trials identified from the Cochrane Central Register of Controlled Trials (CENTRAL) (1999 onwards), MEDLINE (1966 onwards) and MEDLINE In-Process (2001 onwards), and handsearched journals and conference proceedings (searched 15 April 2013) and the reference lists of relevant articles. Selection criteria Randomised or quasi-randomised trials in women with stress, urgency or mixed urinary incontinence (based on symptoms, signs, or urodynamics). One arm of the trial included pelvic floor muscle training (PFMT). Another arm was a no treatment, placebo, sham, or other inactive control treatment arm. Data collection and analysis Trials were independently assessed by two review authors for eligibility and methodological quality. Data were extracted then cross-checked. Disagreements were resolved by discussion. Data were processed as described in the Cochrane Handbook for Systematic Reviews of Interventions. Trials were subgrouped by diagnosis of urinary incontinence. Formal meta-analysis was undertaken when appropriate. Main results Twenty-one trials involving 1281 women (665 PFMT, 616 controls) met the inclusion criteria; 18 trials (1051 women) contributed data to the forest plots. The trials were generally small to moderate sized, and many were at moderate risk of bias, based on the trial reports. There was considerable variation in the interventions used, study populations, and outcome measures. 
There were no studies of women with mixed or urgency urinary incontinence alone. Women with SUI who were in the PFMT groups were 8 times more likely than the controls to report that they were cured (46/82 (56.1%) versus 5/83 (6.0%), RR 8.38, 95% CI 3.68 to 19.07) and 17 times more likely to report cure or improvement (32/58 (55%) versus 2/63 (3.2%), RR 17.33, 95% CI 4.31 to 69.64). In trials in women with any type of urinary incontinence, PFMT groups were also more likely to report cure, or more cure and improvement than the women in the control groups, although the effect size was reduced. Women with either SUI or any type of urinary incontinence were also more satisfied with the active treatment, while women in the control groups were more likely to seek further treatment. Women treated with PFMT leaked urine less often, lost smaller amounts on the short office-based pad test, and emptied their bladders less often during the day. Their sexual outcomes were also better. Two trials (one small and one moderate size) reported some evidence of the benefit persisting for up to a year after treatment. Of the few adverse effects reported, none were serious. The findings of the review were largely supported by the summary of findings tables, but most of the evidence was down-graded to moderate on methodological grounds. The exception was 'Participant perceived cure' in women with SUI, which was rated as high quality. Authors' conclusions The review provides support for the widespread recommendation that PFMT be included in first-line conservative management programmes for women with stress and any type of urinary incontinence. Long-term effectiveness of PFMT needs to be further researched.

Posted Content
TL;DR: The complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have is investigated.
Abstract: We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We improve complexity bounds from pre-existing work and investigate the behavior of units in higher layers.
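One concrete way to see the linear regions in this result: a ReLU network is piecewise linear, and each distinct on/off pattern of its units corresponds to one linear piece of the input-output map. Enumerating the patterns hit by a dense input grid therefore gives an empirical lower bound on the region count. The network sizes, seed, and grid below are arbitrary illustration choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# small random ReLU network: 2 inputs -> 8 hidden -> 8 hidden
W1, b1 = rng.normal(size=(8, 2)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 8)), rng.normal(size=8)

def activation_pattern(x):
    """On/off pattern of every ReLU; constant within one linear region."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0.0) + b2
    return tuple(h1 > 0) + tuple(h2 > 0)

# distinct patterns over a grid on [-2, 2]^2 lower-bound the region count
grid = np.linspace(-2.0, 2.0, 200)
patterns = {activation_pattern(np.array([a, b])) for a in grid for b in grid}
print(len(patterns))
```

Repeating the count with the same total number of units arranged in one shallow layer versus several deep layers is a quick way to observe the depth advantage the paper quantifies.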

Journal ArticleDOI
TL;DR: The Global Burden of Disease 2013 study provides a consistent and comprehensive approach to disease estimation for the period 1990 to 2013, and an opportunity to assess whether accelerated progress has occurred since the Millennium Declaration.

Journal ArticleDOI
Dalila Pinto1, Elsa Delaby2, Elsa Delaby3, Elsa Delaby4, Daniele Merico5, Mafalda Barbosa1, Alison K. Merikangas6, Lambertus Klei7, Bhooma Thiruvahindrapuram5, Xiao Xu1, Robert Ziman5, Zhuozhi Wang5, Jacob A. S. Vorstman8, Ann P. Thompson9, Regina Regan10, Regina Regan11, Marion Pilorge4, Marion Pilorge2, Marion Pilorge3, Giovanna Pellecchia5, Alistair T. Pagnamenta12, Bárbara Oliveira13, Bárbara Oliveira14, Christian R. Marshall5, Tiago R. Magalhaes11, Tiago R. Magalhaes10, Jennifer K. Lowe15, Jennifer L. Howe5, Anthony J. Griswold16, John R. Gilbert16, Eftichia Duketis17, Beth A. Dombroski18, Maretha de Jonge8, Michael L. Cuccaro16, Emily L. Crawford19, Catarina Correia14, Catarina Correia13, Judith Conroy20, Inȇs C. Conceição14, Inȇs C. Conceição13, Andreas G. Chiocchetti17, Jillian P. Casey11, Jillian P. Casey10, Guiqing Cai1, Christelle Cabrol2, Christelle Cabrol4, Christelle Cabrol3, Nadia Bolshakova6, Elena Bacchelli21, Richard Anney6, Steven Gallinger5, Michelle Cotterchio22, Graham Casey23, Lonnie Zwaigenbaum24, Kerstin Wittemeyer25, Kirsty Wing12, Simon Wallace12, Herman van Engeland8, Ana Tryfon26, Susanne Thomson19, Latha Soorya27, Bernadette Rogé, Wendy Roberts5, Fritz Poustka17, Susana Mouga28, Nancy J. Minshew7, L. Alison McInnes29, Susan G. McGrew19, Catherine Lord30, Marion Leboyer, Ann Le Couteur31, Alexander Kolevzon1, Patricia Jiménez González, Suma Jacob32, Suma Jacob33, Richard Holt12, Stephen J. Guter32, Jonathan Green, Andrew Green11, Andrew Green10, Christopher Gillberg34, Bridget A. Fernandez35, Frederico Duque28, Richard Delorme, Geraldine Dawson36, Pauline Chaste, Cátia Café, Sean Brennan6, Thomas Bourgeron37, Patrick Bolton38, Patrick Bolton39, Sven Bölte17, Raphael Bernier40, Gillian Baird39, Anthony J. Bailey12, Evdokia Anagnostou5, Joana Almeida, Ellen M. Wijsman40, Veronica J. Vieland41, Astrid M. Vicente14, Astrid M. Vicente13, Gerard D. Schellenberg18, Margaret A. Pericak-Vance16, Andrew D. Paterson5, Jeremy R. 
Parr31, Guiomar Oliveira28, John I. Nurnberger42, Anthony P. Monaco43, Anthony P. Monaco12, Elena Maestrini21, Sabine M. Klauck44, Hakon Hakonarson18, Jonathan L. Haines19, Daniel H. Geschwind15, Christine M. Freitag17, Susan E. Folstein16, Sean Ennis10, Sean Ennis11, Hilary Coon45, Agatino Battaglia, Peter Szatmari9, James S. Sutcliffe19, Joachim Hallmayer46, Michael Gill6, Edwin H. Cook32, Joseph D. Buxbaum1, Bernie Devlin7, Louise Gallagher6, Catalina Betancur3, Catalina Betancur4, Catalina Betancur2, Stephen W. Scherer5 
TL;DR: The authors analyzed 2,446 ASD-affected families and confirmed an excess of genic deletions and duplications in affected versus control groups (1.41-fold, p = 1.0 × 10⁻⁵) and an increase in affected subjects carrying exonic pathogenic CNVs overlapping known loci associated with dominant or X-linked ASD and intellectual disability.
Abstract: Rare copy-number variation (CNV) is an important source of risk for autism spectrum disorders (ASDs). We analyzed 2,446 ASD-affected families and confirmed an excess of genic deletions and duplications in affected versus control groups (1.41-fold, p = 1.0 × 10⁻⁵) and an increase in affected subjects carrying exonic pathogenic CNVs overlapping known loci associated with dominant or X-linked ASD and intellectual disability (odds ratio = 12.62, p = 2.7 × 10⁻¹⁵, ∼3% of ASD subjects). Pathogenic CNVs, often showing variable expressivity, included rare de novo and inherited events at 36 loci, implicating ASD-associated genes (CHD2, HDAC4, and GDI1) previously linked to other neurodevelopmental disorders, as well as other genes such as SETD5, MIR137, and HDAC9. Consistent with hypothesized gender-specific modulators, females with ASD were more likely to have highly penetrant CNVs (p = 0.017) and were also overrepresented among subjects with CNVs affecting fragile X syndrome protein targets (p = 0.02). Genes affected by de novo CNVs and/or loss-of-function single-nucleotide variants converged on networks related to neuronal signaling and development, synapse function, and chromatin regulation.

Journal ArticleDOI
TL;DR: CBD bears investigation in epilepsy and other neuropsychiatric disorders, including anxiety, schizophrenia, addiction, and neonatal hypoxic-ischemic encephalopathy; however, data from well-powered, double-blind, randomized controlled studies on the efficacy of pure CBD for any disorder are lacking.
Abstract: To present a summary of current scientific evidence about the cannabinoid, cannabidiol (CBD) with regard to its relevance to epilepsy and other selected neuropsychiatric disorders. We summarize the presentations from a conference in which invited participants reviewed relevant aspects of the physiology, mechanisms of action, pharmacology, and data from studies with animal models and human subjects. Cannabis has been used to treat disease since ancient times. Δ⁹-Tetrahydrocannabinol (Δ⁹-THC) is the major psychoactive ingredient and CBD is the major nonpsychoactive ingredient in cannabis. Cannabis and Δ⁹-THC are anticonvulsant in most animal models but can be proconvulsant in some healthy animals. The psychotropic effects of Δ⁹-THC limit tolerability. CBD is anticonvulsant in many acute animal models, but there are limited data in chronic models. The antiepileptic mechanisms of CBD are not known, but may include effects on the equilibrative nucleoside transporter; the orphan G-protein-coupled receptor GPR55; the transient receptor potential of vanilloid type-1 channel; the 5-HT1A receptor; and the α3 and α1 glycine receptors. CBD has neuroprotective and anti-inflammatory effects, and it appears to be well tolerated in humans, but small and methodologically limited studies of CBD in human epilepsy have been inconclusive. More recent anecdotal reports of high-ratio CBD:Δ⁹-THC medical marijuana have claimed efficacy, but studies were not controlled. CBD bears investigation in epilepsy and other neuropsychiatric disorders, including anxiety, schizophrenia, addiction, and neonatal hypoxic-ischemic encephalopathy. However, we lack data from well-powered double-blind randomized, controlled studies on the efficacy of pure CBD for any disorder. Initial dose-tolerability and double-blind randomized, controlled studies focusing on target intractable epilepsy populations such as patients with Dravet and Lennox-Gastaut syndromes are being planned.
Trials in other treatment-resistant epilepsies may also be warranted.

Journal ArticleDOI
TL;DR: Observations of Beta Pictoris clearly detect the planet, Beta Pictoris b, in a single 60-s exposure with minimal postprocessing, and fitting the Keplerian orbit of Beta Pic b using the new position together with previous astrometry gives a factor of 3 improvement in most parameters over previous solutions.
Abstract: The Gemini Planet Imager is a dedicated facility for directly imaging and spectroscopically characterizing extrasolar planets. It combines a very high-order adaptive optics system, a diffraction-suppressing coronagraph, and an integral field spectrograph with low spectral resolution but high spatial resolution. Every aspect of the Gemini Planet Imager has been tuned for maximum sensitivity to faint planets near bright stars. During first-light observations, we achieved an estimated H band Strehl ratio of 0.89 and a 5-σ contrast of 10⁶ at 0.75 arcseconds and 10⁵ at 0.35 arcseconds. Observations of Beta Pictoris clearly detect the planet, Beta Pictoris b, in a single 60-s exposure with minimal postprocessing. Beta Pictoris b is observed at a separation of 434 ± 6 milliarcseconds (mas) and position angle 211.8 ± 0.5°. Fitting the Keplerian orbit of Beta Pic b using the new position together with previous astrometry gives a factor of 3 improvement in most parameters over previous solutions. The planet orbits at a semimajor axis of [Formula: see text] near the 3:2 resonance with the previously known 6-AU asteroidal belt and is aligned with the inner warped disk. The observations give a 4% probability of a transit of the planet in late 2017.
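The quoted separation and position angle convert to on-sky offsets with the standard astrometric convention (position angle measured from north through east). This is a generic identity, not the authors' pipeline; a minimal sketch:

```python
import math

def sky_offsets(sep_mas, pa_deg):
    """Convert separation/position angle to (east, north) offsets in mas."""
    pa = math.radians(pa_deg)
    return sep_mas * math.sin(pa), sep_mas * math.cos(pa)

# Beta Pictoris b: 434 mas at position angle 211.8 degrees
d_east, d_north = sky_offsets(434.0, 211.8)
# both offsets are negative: the planet lies to the south-west of the star
```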

Journal ArticleDOI
TL;DR: Replacement provided a more durable correction of mitral regurgitation, but there was no significant between-group difference in clinical outcomes.
Abstract: Background Ischemic mitral regurgitation is associated with a substantial risk of death. Practice guidelines recommend surgery for patients with a severe form of this condition but acknowledge that the supporting evidence for repair or replacement is limited. Methods We randomly assigned 251 patients with severe ischemic mitral regurgitation to undergo either mitral-valve repair or chordal-sparing replacement in order to evaluate efficacy and safety. The primary end point was the left ventricular end-systolic volume index (LVESVI) at 12 months, as assessed with the use of a Wilcoxon rank-sum test in which deaths were categorized below the lowest LVESVI rank. Results At 12 months, the mean LVESVI among surviving patients was 54.6±25.0 ml per square meter of body-surface area in the repair group and 60.7±31.5 ml per square meter in the replacement group (mean change from baseline, −6.6 and −6.8 ml per square meter, respectively). The rate of death was 14.3% in the repair group and 17.6% in the replacement g...
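The primary analysis ranks deaths below the lowest observed LVESVI before applying the Wilcoxon rank-sum test. That device can be sketched by coding deaths as minus infinity, so they tie at the bottom of the ranking. The numbers below are invented for illustration; this is not the trial's analysis code.

```python
DEATH = float("-inf")  # deaths are ranked below every observed LVESVI value

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def rank_sum_u(group1, group2):
    """Mann-Whitney U statistic for group1 versus group2."""
    r = ranks(group1 + group2)
    return sum(r[: len(group1)]) - len(group1) * (len(group1) + 1) / 2

# toy LVESVI values (ml per m^2), with one death in each arm
repair = [50.0, 60.0, DEATH, 55.0]
replacement = [58.0, DEATH, 62.0, 54.0]
u = rank_sum_u(repair, replacement)
```

Coding deaths this way lets mortality penalize a treatment arm in the same ordinal analysis as the imaging outcome, instead of being censored out.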

Proceedings Article
30 Oct 2014
TL;DR: A deep learning framework for modeling complex high-dimensional densities called Non-linear Independent Component Estimation (NICE) is proposed, based on the idea that a good representation is one in which the data has a distribution that is easy to model.
Abstract: We propose a deep learning framework for modeling complex high-dimensional densities called Non-linear Independent Component Estimation (NICE). It is based on the idea that a good representation is one in which the data has a distribution that is easy to model. For this purpose, a non-linear deterministic transformation of the data is learned that maps it to a latent space so as to make the transformed data conform to a factorized distribution, i.e., resulting in independent latent variables. We parametrize this transformation so that computing the Jacobian determinant and inverse transform is trivial, yet we maintain the ability to learn complex non-linear transformations, via a composition of simple building blocks, each based on a deep neural network. The training criterion is simply the exact log-likelihood, which is tractable. Unbiased ancestral sampling is also easy. We show that this approach yields good generative models on four image datasets and can be used for inpainting.
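The building block behind these properties is the additive coupling layer: split the input, leave one half unchanged, and shift the other half by a learned function of the first. The Jacobian is triangular with unit diagonal (log-determinant exactly 0), and inversion is exact. A numpy sketch with an arbitrary one-layer coupling function standing in for NICE's deep ReLU networks:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # split an 8-dimensional input into two halves of size d

# arbitrary coupling function m(.); in NICE this is a deep ReLU network
W, b = rng.normal(size=(d, d)), rng.normal(size=d)
def m(h):
    return np.tanh(h @ W + b)

def forward(x):
    x1, x2 = x[:d], x[d:]
    # additive coupling: Jacobian is triangular with unit diagonal, so the
    # log-determinant is exactly 0 and the exact log-likelihood is tractable
    return np.concatenate([x1, x2 + m(x1)])

def inverse(y):
    y1, y2 = y[:d], y[d:]
    return np.concatenate([y1, y2 - m(y1)])

x = rng.normal(size=2 * d)
assert np.allclose(inverse(forward(x)), x)  # inversion is exact
```

Stacking several such layers with the roles of the two halves swapped between layers, plus a final diagonal scaling, gives the full NICE architecture.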

Proceedings ArticleDOI
03 Nov 2014
TL;DR: A new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents is proposed.
Abstract: In this paper, we propose a new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents. In order to capture the rich contextual structures in a query or a document, we start with each word within a temporal context window in a word sequence to directly capture contextual features at the word n-gram level. Next, the salient word n-gram features in the word sequence are discovered by the model and are then aggregated to form a sentence-level feature vector. Finally, a non-linear transformation is applied to extract high-level semantic information to generate a continuous vector representation for the full text string. The proposed convolutional latent semantic model (CLSM) is trained on clickthrough data and is evaluated on a Web document ranking task using a large-scale, real-world data set. Results show that the proposed model effectively captures salient semantic information in queries and documents for the task while significantly outperforming previous state-of-the-art semantic models.
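The pipeline described (word-trigram convolution, max pooling, then a non-linear semantic projection) can be sketched with toy dimensions. The real CLSM also applies letter-trigram word hashing before the convolution; plain word embeddings are used below as a simplification, and all sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab = {"deep": 0, "learning": 1, "for": 2, "web": 3, "search": 4}
E = rng.normal(size=(len(vocab), 6))   # toy word embeddings, dimension 6
Wc = rng.normal(size=(3 * 6, 10))      # convolution weights over word trigrams
Ws = rng.normal(size=(10, 8))          # semantic projection layer

def clsm_vector(words):
    ids = [vocab[w] for w in words]
    # contextual features: concatenate the embeddings in each word trigram
    grams = np.stack([np.concatenate([E[ids[i]], E[ids[i + 1]], E[ids[i + 2]]])
                      for i in range(len(ids) - 2)])
    conv = np.tanh(grams @ Wc)         # local n-gram feature vectors
    pooled = conv.max(axis=0)          # max pooling keeps salient features
    return np.tanh(pooled @ Ws)        # continuous semantic vector

v = clsm_vector(["deep", "learning", "for", "web", "search"])
print(v.shape)  # (8,)
```

Queries and documents are mapped through the same pipeline, and relevance is then scored by cosine similarity between their semantic vectors.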