
Showing papers by "Chris Pal" published in 2017


Journal ArticleDOI
TL;DR: A fast and accurate fully automatic method for brain tumor segmentation that is competitive with the state of the art in both accuracy and speed, and that introduces a novel cascaded architecture allowing the system to more accurately model local label dependencies.

2,538 citations
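The cascaded idea can be sketched concisely: a first CNN produces class probabilities that are concatenated as extra input channels to a second CNN, so the second network can condition on nearby label estimates. The layer sizes and names below are illustrative assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class CascadedSegmenter(nn.Module):
        """First CNN predicts class probabilities; second CNN sees the
        image plus those probabilities as extra input channels."""
        def __init__(self, in_ch=4, n_classes=5):  # e.g. 4 MRI modalities
            super().__init__()
            self.cnn1 = nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, n_classes, 1))
            self.cnn2 = nn.Sequential(
                nn.Conv2d(in_ch + n_classes, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, n_classes, 1))

        def forward(self, x):
            p1 = torch.softmax(self.cnn1(x), dim=1)      # first-stage beliefs
            return self.cnn2(torch.cat([x, p1], dim=1))  # refined prediction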


Journal ArticleDOI
TL;DR: The key concepts of deep learning for clinical radiologists are reviewed, technical requirements are discussed, emerging applications in clinical radiology are described, and limitations and future directions in this field are outlined.
Abstract: Deep learning is a class of machine learning methods that are gaining success and attracting interest in many domains, including computer vision, speech recognition, natural language processing, and playing games. Deep learning methods produce a mapping from raw inputs to desired outputs (eg, image classes). Unlike traditional machine learning methods, which require hand-engineered feature extraction from inputs, deep learning methods learn these features directly from data. With the advent of large datasets and increased computing power, these methods can produce models with exceptional performance. These models are multilayer artificial neural networks, loosely inspired by biologic neural systems. Weighted connections between nodes (neurons) in the network are iteratively adjusted based on example pairs of inputs and target outputs by back-propagating a corrective error signal through the network. For computer vision tasks, convolutional neural networks (CNNs) have proven to be effective. Recently, several clinical applications of CNNs have been proposed and studied in radiology for classification, detection, and segmentation tasks. This article reviews the key concepts of deep learning for clinical radiologists, discusses technical requirements, describes emerging applications in clinical radiology, and outlines limitations and future directions in this field. Radiologists should become familiar with the principles and potential applications of deep learning in medical imaging. ©RSNA, 2017.

687 citations
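The iterative weight adjustment the review describes is, in code, an ordinary gradient-descent training loop. A minimal PyTorch sketch with an assumed toy network and random stand-in data:

    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()

    x = torch.randn(128, 64)            # stand-in inputs (e.g. image features)
    y = torch.randint(0, 2, (128,))     # stand-in target labels

    for step in range(100):
        opt.zero_grad()
        loss = loss_fn(model(x), y)     # compare outputs to targets
        loss.backward()                 # back-propagate the error signal
        opt.step()                      # adjust the weighted connections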


Journal ArticleDOI
TL;DR: This paper proposes a common evaluation framework for automatic stroke lesion segmentation from MRI, describes the publicly available datasets, and presents the results of the two sub‐challenges: Sub‐Acute Stroke Lesion Segmentation (SISS) and Stroke Perfusion Estimation (SPES).

417 citations


Posted Content
TL;DR: This work relies on complex convolutions, presents algorithms for complex batch normalization and complex weight initialization strategies for complex-valued neural nets, uses them in experiments with end-to-end training schemes, and demonstrates that such complex-valued models are competitive with their real-valued counterparts.
Abstract: At present, the vast majority of building blocks, techniques, and architectures for deep learning are based on real-valued operations and representations. However, recent work on recurrent neural networks and older fundamental theoretical analysis suggests that complex numbers could have a richer representational capacity and could also facilitate noise-robust memory retrieval mechanisms. Despite their attractive properties and potential for opening up entirely new neural architectures, complex-valued deep neural networks have been marginalized due to the absence of the building blocks required to design such models. In this work, we provide the key atomic components for complex-valued deep neural networks and apply them to convolutional feed-forward networks and convolutional LSTMs. More precisely, we rely on complex convolutions and present algorithms for complex batch-normalization, complex weight initialization strategies for complex-valued neural nets and we use them in experiments with end-to-end training schemes. We demonstrate that such complex-valued models are competitive with their real-valued counterparts. We test deep complex models on several computer vision tasks, on music transcription using the MusicNet dataset and on Speech Spectrum Prediction using the TIMIT dataset. We achieve state-of-the-art performance on these audio-related tasks.

371 citations
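The core building block is the complex convolution, which can be assembled from four real convolutions using (a + ib)(x + iy) = (ax − by) + i(ay + bx). A minimal sketch under assumed shapes and names:

    import torch
    import torch.nn as nn

    class ComplexConv2d(nn.Module):
        """Complex convolution from real ops:
        (W_r + i W_i)*(x_r + i x_i) = (W_r*x_r - W_i*x_i) + i(W_r*x_i + W_i*x_r)."""
        def __init__(self, in_ch, out_ch, k=3):
            super().__init__()
            self.conv_r = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
            self.conv_i = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

        def forward(self, x_r, x_i):
            real = self.conv_r(x_r) - self.conv_i(x_i)
            imag = self.conv_r(x_i) + self.conv_i(x_r)
            return real, imag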


Proceedings Article
27 May 2017
TL;DR: In this paper, the authors provide the key atomic components for complex-valued deep neural networks and apply them to convolutional feed-forward networks, and demonstrate that such complexvalued models are competitive with their real-valued counterparts.
Abstract: At present, the vast majority of building blocks, techniques, and architectures for deep learning are based on real-valued operations and representations. However, recent work on recurrent neural networks and older fundamental theoretical analysis suggests that complex numbers could have a richer representational capacity and could also facilitate noise-robust memory retrieval mechanisms. Despite their attractive properties and potential for opening up entirely new neural architectures, complex-valued deep neural networks have been marginalized due to the absence of the building blocks required to design such models. In this work, we provide the key atomic components for complex-valued deep neural networks and apply them to convolutional feed-forward networks. More precisely, we rely on complex convolutions and present algorithms for complex batch-normalization, complex weight initialization strategies for complex-valued neural nets and we use them in experiments with end-to-end training schemes. We demonstrate that such complex-valued models are competitive with their real-valued counterparts. We test deep complex models on several computer vision tasks, on music transcription using the MusicNet dataset and on Speech spectrum prediction using TIMIT. We achieve state-of-the-art performance on these audio-related tasks.

288 citations


Journal ArticleDOI
TL;DR: The Large Scale Movie Description Challenge (LSMDC), as discussed by the authors, is a dataset of 128,118 sentences aligned to video clips from 200 movies (around 150 h of video in total).
Abstract: Audio description (AD) provides linguistic descriptions of movies and allows visually impaired people to follow a movie along with their peers. Such descriptions are by design mainly visual and thus naturally form an interesting data source for computer vision and computational linguistics. In this work we propose a novel dataset which contains transcribed ADs, which are temporally aligned to full length movies. In addition we also collected and aligned movie scripts used in prior work and compare the two sources of descriptions. We introduce the Large Scale Movie Description Challenge (LSMDC) which contains a parallel corpus of 128,118 sentences aligned to video clips from 200 movies (around 150 h of video in total). The goal of the challenge is to automatically generate descriptions for the movie clips. First we characterize the dataset by benchmarking different approaches for generating video descriptions. Comparing ADs to scripts, we find that ADs are more visual and describe precisely what is shown rather than what should happen according to the scripts created prior to movie production. Furthermore, we present and compare the results of several teams who participated in the challenges organized in the context of two workshops at ICCV 2015 and ECCV 2016.

212 citations


Posted Content
TL;DR: In this paper, a simple yet powerful pipeline for medical image segmentation that combines Fully Convolutional Networks (FCNs) with Fully Convolutional Residual Networks (FC-ResNets) is presented.
Abstract: In this paper, we introduce a simple, yet powerful pipeline for medical image segmentation that combines Fully Convolutional Networks (FCNs) with Fully Convolutional Residual Networks (FC-ResNets). We propose and examine a design that takes particular advantage of recent advances in the understanding of both Convolutional Neural Networks as well as ResNets. Our approach focuses upon the importance of a trainable pre-processing when using FC-ResNets and we show that a low-capacity FCN model can serve as a pre-processor to normalize medical input data. In our image segmentation pipeline, we use FCNs to obtain normalized images, which are then iteratively refined by means of a FC-ResNet to generate a segmentation prediction. As in other fully convolutional approaches, our pipeline can be used off-the-shelf on different image modalities. We show that using this pipeline, we exhibit state-of-the-art performance on the challenging Electron Microscopy benchmark, when compared to other 2D methods. We improve segmentation results on CT images of liver lesions, when contrasting with standard FCN methods. Moreover, when applying our 2D pipeline on a challenging 3D MRI prostate segmentation challenge we reach results that are competitive even when compared to 3D methods. The obtained results illustrate the strong potential and versatility of the pipeline by achieving highly accurate results on multi-modality images from different anatomical regions and organs.

160 citations
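The pipeline's structure is a low-capacity FCN acting as a trainable pre-processor whose normalized output is refined by an FC-ResNet, with both parts trained together end-to-end. A schematic PyTorch sketch in which both sub-networks stand in for the paper's actual architectures:

    import torch.nn as nn

    class SegmentationPipeline(nn.Module):
        """FCN normalizes the raw input; FC-ResNet refines it into a mask."""
        def __init__(self, fcn: nn.Module, fc_resnet: nn.Module):
            super().__init__()
            self.fcn = fcn               # low-capacity, trainable pre-processor
            self.fc_resnet = fc_resnet   # iterative refinement network

        def forward(self, x):
            normalized = self.fcn(x)            # modality-agnostic normalization
            return self.fc_resnet(normalized)   # segmentation prediction

Because the two networks are trained jointly, the FCN learns exactly the normalization that most helps the downstream FC-ResNet, rather than a hand-designed pre-processing step.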


Proceedings Article
06 Aug 2017
TL;DR: This paper proposes a weight matrix factorization and parameterization strategy through which the degree of expansivity induced during backpropagation can be controlled, and finds that hard constraints on orthogonality can negatively affect the speed of convergence and model performance.
Abstract: It is well known that it is challenging to train deep neural networks and recurrent neural networks for tasks that exhibit long term dependencies. The vanishing or exploding gradient problem is a well known issue associated with these challenges. One approach to addressing vanishing and exploding gradients is to use either soft or hard constraints on weight matrices so as to encourage or enforce orthogonality. Orthogonal matrices preserve gradient norm during back-propagation and may therefore be a desirable property. This paper explores issues with optimization convergence, speed and gradient stability when encouraging or enforcing orthogonality. To perform this analysis, we propose a weight matrix factorization and parameterization strategy through which we can bound matrix norms and therein control the degree of expansivity induced during backpropagation. We find that hard constraints on orthogonality can negatively affect the speed of convergence and model performance.

158 citations
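The factorization keeps a weight matrix W = U diag(s) Vᵀ with U, V orthogonal and each singular value confined to [1 − m, 1 + m], so the margin m interpolates between hard orthogonality (m = 0) and an unconstrained spectrum. The paper optimizes U and V with geodesic updates on the Stiefel manifold; the sketch below substitutes PyTorch's built-in orthogonal parametrization as a modern stand-in, so the module layout is an assumption rather than the paper's code.

    import torch
    import torch.nn as nn
    from torch.nn.utils.parametrizations import orthogonal

    class BoundedSpectrumLinear(nn.Module):
        """W = U diag(s) V^T with singular values s in [1 - m, 1 + m]."""
        def __init__(self, n, margin=0.1):
            super().__init__()
            self.U = orthogonal(nn.Linear(n, n, bias=False))
            self.V = orthogonal(nn.Linear(n, n, bias=False))
            self.p = nn.Parameter(torch.zeros(n))  # free spectrum parameters
            self.margin = margin

        def forward(self, x):
            s = 1.0 + self.margin * (2.0 * torch.sigmoid(self.p) - 1.0)
            W = self.U.weight @ torch.diag(s) @ self.V.weight.t()
            return x @ W.t()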


Posted Content
TL;DR: The authors use GANs to generate sentences from context-free and probabilistic context-free grammars and present qualitative language modeling results, noting that advances in adversarial text generation are not commensurate with the progress made in generating images and still lag far behind likelihood-based methods.
Abstract: Generative Adversarial Networks (GANs) have gathered a lot of attention from the computer vision community, yielding impressive results for image generation. Advances in the adversarial generation of natural language from noise however are not commensurate with the progress made in generating images, and still lag far behind likelihood based methods. In this paper, we take a step towards generating natural language with a GAN objective alone. We introduce a simple baseline that addresses the discrete output space problem without relying on gradient estimators and show that it is able to achieve state-of-the-art results on a Chinese poem generation dataset. We present quantitative results on generating sentences from context-free and probabilistic context-free grammars, and qualitative language modeling results. A conditional version is also described that can generate sequences conditioned on sentence characteristics.

155 citations


Proceedings Article
01 Jan 2017
TL;DR: This work presents a multichannel spatiotemporal CNN architecture for semi-supervised bounding box prediction and exploratory data analysis, and demonstrates that this approach is able to leverage temporal information and unlabeled data to improve the localization of extreme weather events.
Abstract: The detection and identification of extreme weather events in large-scale climate simulations is an important problem for risk management, informing governmental policy decisions and advancing our basic understanding of the climate system. Recent work has shown that fully supervised convolutional neural networks (CNNs) can yield acceptable accuracy for classifying well-known types of extreme weather events when large amounts of labeled data are available. However, many different types of spatially localized climate patterns are of interest, including hurricanes, extra-tropical cyclones, weather fronts, and blocking events among others. Existing labeled data for these patterns can be incomplete in various ways, such as covering only certain years or geographic areas and having false negatives. This type of climate data therefore poses a number of interesting machine learning challenges. We present a multichannel spatiotemporal CNN architecture for semi-supervised bounding box prediction and exploratory data analysis. We demonstrate that our approach is able to leverage temporal information and unlabeled data to improve the localization of extreme weather events. Further, we explore the representations learned by our model in order to better understand this important data. We present a dataset, ExtremeWeather, to encourage machine learning research in this area and to help facilitate further work in understanding and mitigating the effects of climate change. The dataset is available at extremeweatherdataset.github.io and the code is available at https://github.com/eracah/hur-detect.

118 citations


Posted Content
TL;DR: In this article, a weight matrix factorization and parameterization strategy is proposed to control the degree of expansivity induced during backpropagation, and the authors find that hard constraints on orthogonality can negatively affect convergence and model performance.
Abstract: It is well known that it is challenging to train deep neural networks and recurrent neural networks for tasks that exhibit long term dependencies. The vanishing or exploding gradient problem is a well known issue associated with these challenges. One approach to addressing vanishing and exploding gradients is to use either soft or hard constraints on weight matrices so as to encourage or enforce orthogonality. Orthogonal matrices preserve gradient norm during backpropagation and may therefore be a desirable property. This paper explores issues with optimization convergence, speed and gradient stability when encouraging or enforcing orthogonality. To perform this analysis, we propose a weight matrix factorization and parameterization strategy through which we can bound matrix norms and therein control the degree of expansivity induced during backpropagation. We find that hard constraints on orthogonality can negatively affect the speed of convergence and model performance.

Proceedings ArticleDOI
12 Apr 2017
TL;DR: In this article, an attention-based modular neural framework for computer vision is proposed, which consists of three modules: a recurrent attention module controlling where to look in an image or video frame, a feature-extraction module providing a representation of what is seen, and an objective module formalizing why the model learns its attentive behavior.
Abstract: We present an attention-based modular neural framework for computer vision. The framework uses a soft attention mechanism allowing models to be trained with gradient descent. It consists of three modules: a recurrent attention module controlling where to look in an image or video frame, a feature-extraction module providing a representation of what is seen, and an objective module formalizing why the model learns its attentive behavior. The attention module allows the model to focus computation on task-related information in the input. We apply the framework to several object tracking tasks and explore various design choices. We experiment with three datasets: bouncing ball, moving digits, and the real-world KTH dataset. The proposed RATM (Recurrent Attentive Tracking Model) performs well on all three tasks and can generalize to related but previously unseen sequences from a challenging tracking dataset.
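The "where to look" module relies on a soft, differentiable attention window; a common construction for such windows, and roughly the one used in this line of work, is a separable grid of Gaussian filters over the image, as in DRAW-style read operations. A simplified sketch with assumed parameter names (the attention module would emit cx, cy, stride, and sigma):

    import torch

    def gaussian_glimpse(img, cx, cy, stride, sigma, n=8):
        """Extract an n-by-n soft glimpse from img (H, W) around (cx, cy).
        Filter banks Fy (n, H) and Fx (n, W) hold Gaussian responses."""
        H, W = img.shape
        k = torch.arange(n, dtype=torch.float32) - (n - 1) / 2.0
        mu_x = cx + k * stride                 # glimpse grid centres, x
        mu_y = cy + k * stride                 # glimpse grid centres, y
        xs = torch.arange(W, dtype=torch.float32)
        ys = torch.arange(H, dtype=torch.float32)
        Fx = torch.exp(-(xs[None, :] - mu_x[:, None]) ** 2 / (2 * sigma ** 2))
        Fy = torch.exp(-(ys[None, :] - mu_y[:, None]) ** 2 / (2 * sigma ** 2))
        Fx = Fx / (Fx.sum(dim=1, keepdim=True) + 1e-8)   # normalize each filter
        Fy = Fy / (Fy.sum(dim=1, keepdim=True) + 1e-8)
        return Fy @ img @ Fx.t()               # (n, n) glimpse, differentiable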

Proceedings ArticleDOI
31 May 2017
TL;DR: A simple baseline is introduced that addresses the discrete output space problem without relying on gradient estimators and is able to achieve state-of-the-art results on a Chinese poem generation dataset.
Abstract: Generative Adversarial Networks (GANs) have gathered a lot of attention from the computer vision community, yielding impressive results for image generation. Advances in the adversarial generation of natural language from noise however are not commensurate with the progress made in generating images, and still lag far behind likelihood based methods. In this paper, we take a step towards generating natural language with a GAN objective alone. We introduce a simple baseline that addresses the discrete output space problem without relying on gradient estimators and show that it is able to achieve state-of-the-art results on a Chinese poem generation dataset. We present quantitative results on generating sentences from context-free and probabilistic context-free grammars, and qualitative language modeling results. A conditional version is also described that can generate sequences conditioned on sentence characteristics.
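The baseline sidesteps the discrete output problem by letting the generator emit a softmax distribution over the vocabulary at every position and feeding those continuous vectors straight to the discriminator, while real sentences enter as one-hot vectors; ordinary gradients then flow with no REINFORCE-style estimator. A schematic sketch under those assumptions (G, D, and the optimizer are placeholders):

    import torch
    import torch.nn.functional as F

    def generator_step(G, D, opt_G, z):
        """G(z) -> (batch, seq_len, vocab) logits; D scores sequences of
        continuous token distributions. Real data would enter D one-hot:
        F.one_hot(token_ids, vocab).float()."""
        opt_G.zero_grad()
        soft_tokens = torch.softmax(G(z), dim=-1)       # differentiable "words"
        target = torch.ones(z.size(0), 1)               # label: "real"
        g_loss = F.binary_cross_entropy_with_logits(D(soft_tokens), target)
        g_loss.backward()                               # plain gradients suffice
        opt_G.step()
        return g_loss.item()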

Proceedings ArticleDOI
21 Jul 2017
TL;DR: The MovieFIB dataset as mentioned in this paper is a large-scale question-answering dataset with over 300,000 examples based on descriptive video annotations for the visually impaired, which is used as a benchmark for video understanding.
Abstract: While deep convolutional neural networks frequently approach or exceed human-level performance in benchmark tasks involving static images, extending this success to moving images is not straightforward. Video understanding is of interest for many applications, including content recommendation, prediction, summarization, event/object detection, and understanding human visual perception. However, many domains lack sufficient data to explore and perfect video models. In order to address the need for a simple, quantitative benchmark for developing and understanding video, we present MovieFIB, a fill-in-the-blank question-answering dataset with over 300,000 examples, based on descriptive video annotations for the visually impaired. In addition to presenting statistics and a description of the dataset, we perform a detailed analysis of five different models' predictions, and compare these with human performance. We investigate the relative importance of language, static (2D) visual features, and moving (3D) visual features, and the effects of increasing dataset size, the number of frames sampled, and vocabulary size. We illustrate that this task is not solvable by a language model alone; that our model combining 2D and 3D visual information indeed provides the best result; and that all models perform significantly worse than human level. We provide human evaluation for responses given by different models and find that accuracy on the MovieFIB evaluation corresponds well with human judgment. We suggest avenues for improving video models, and hope that the MovieFIB challenge can be useful for measuring and encouraging progress in this very interesting field.

Proceedings Article
17 Jul 2017
TL;DR: This work proposes a straightforward technique to constrain discrete ordinal probability distributions to be unimodal via a combination of the Poisson probability mass function and the softmax nonlinearity.
Abstract: Probability distributions produced by the cross-entropy loss for ordinal classification problems can possess undesired properties. We propose a straightforward technique to constrain discrete ordinal probability distributions to be unimodal via the use of the Poisson and binomial probability distributions. We evaluate this approach in the context of deep learning on two large ordinal image datasets, obtaining promising results.
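For K ordinal classes, the construction lets the network output a single positive rate λ, takes the Poisson log-mass log p(k|λ) = k·log λ − λ − log k! for k = 0..K−1, and pushes those scores through a softmax; since softmax preserves ordering, the result is guaranteed unimodal in k. A minimal sketch (the temperature and names are assumptions):

    import torch

    def unimodal_poisson_probs(lam, K, tau=1.0):
        """Map a positive rate lam (batch,) to unimodal probabilities
        over K ordinal classes via the Poisson log-pmf + softmax."""
        k = torch.arange(K, dtype=torch.float32)          # classes 0..K-1
        log_pmf = (k * torch.log(lam[:, None])
                   - lam[:, None]
                   - torch.lgamma(k + 1.0))               # log k! = lgamma(k+1)
        return torch.softmax(log_pmf / tau, dim=1)

    lam = torch.nn.functional.softplus(torch.randn(4)) + 1e-6  # positive rates
    print(unimodal_poisson_probs(lam, K=8).sum(dim=1))         # rows sum to 1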

Posted Content
TL;DR: In this paper, the authors leverage the common situation where precise landmark locations are only provided for a small data subset, but where class labels for classification or regression tasks related to the landmarks are more abundantly available.
Abstract: We present two techniques to improve landmark localization in images from partially annotated datasets. Our primary goal is to leverage the common situation where precise landmark locations are only provided for a small data subset, but where class labels for classification or regression tasks related to the landmarks are more abundantly available. First, we propose the framework of sequential multitasking and explore it here through an architecture for landmark localization where training with class labels acts as an auxiliary signal to guide the landmark localization on unlabeled data. A key aspect of our approach is that errors can be backpropagated through a complete landmark localization model. Second, we propose and explore an unsupervised learning technique for landmark localization based on having a model predict equivariant landmarks with respect to transformations applied to the image. We show that these techniques improve landmark prediction considerably and can learn effective detectors even when only a small fraction of the dataset has landmark labels. We present results on two toy datasets and four real datasets, with hands and faces, and report new state-of-the-art on two datasets in the wild; e.g., with only 5% of labeled images we outperform previous state-of-the-art trained on the AFLW dataset.
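The unsupervised equivariance idea: predict landmarks on an image and on a transformed copy, then require the first prediction, mapped through the same transformation, to match the second — no landmark labels needed. A sketch using a horizontal flip as the (assumed) transformation:

    import torch

    def equivariance_loss(model, images):
        """model(images) -> (batch, n_landmarks, 2) with (x, y) in [0, 1].
        Uses a horizontal flip as the transformation T."""
        flipped = torch.flip(images, dims=[-1])            # T applied to image
        pts = model(images)
        pts_flipped = model(flipped)
        x_mapped = 1.0 - pts[..., 0:1]                     # T applied to x-coords
        pts_mapped = torch.cat([x_mapped, pts[..., 1:]], dim=-1)
        return ((pts_mapped - pts_flipped) ** 2).mean()

For bilaterally symmetric objects such as faces, a flip also swaps left/right landmark identities, so a real implementation would additionally permute landmark indices; that detail is omitted here for brevity.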

Posted Content
TL;DR: In this paper, a model for joint segmentation of the liver and liver lesions in computed tomography (CT) volumes is proposed, where two fully convolutional networks are connected in tandem and trained together end-to-end.
Abstract: We propose a model for the joint segmentation of the liver and liver lesions in computed tomography (CT) volumes. We build the model from two fully convolutional networks, connected in tandem and trained together end-to-end. We evaluate our approach on the 2017 MICCAI Liver Tumour Segmentation Challenge, attaining competitive liver and liver lesion detection and segmentation scores across a wide range of metrics. Unlike other top performing methods, our model output post-processing is trivial, we do not use data external to the challenge, and we propose a simple single-stage model that is trained end-to-end. However, our method nearly matches the top lesion segmentation performance and achieves the second highest precision for lesion detection while maintaining high recall.

Journal ArticleDOI
TL;DR: A deformable model that uses a voxel classifier based on a multilayer perceptron (MLP) to interpret the CT image and considers vertex displacement towards apparent tumour boundaries and regularization that promotes surface smoothness is proposed.
Abstract: The segmentation of liver tumours in CT images is useful for the diagnosis and treatment of liver cancer. Furthermore, an accurate assessment of tumour volume aids in the diagnosis and evaluation of treatment response. Currently, segmentation is performed manually by an expert, and because of the time required, a rough estimate of tumour volume is often done instead. We propose a semi-automatic segmentation method that makes use of machine learning within a deformable surface model. Specifically, we propose a deformable model that uses a voxel classifier based on a multilayer perceptron (MLP) to interpret the CT image. The new deformable model considers vertex displacement towards apparent tumour boundaries and regularization that promotes surface smoothness. During operation, a user identifies the target tumour and the mesh then automatically delineates the tumour from the MLP processed image. The method was tested on a dataset of 40 abdominal CT scans with a total of 95 colorectal metastases collected from a variety of scanners with variable spatial resolution. The segmentation results are encouraging, with a Dice similarity metric of [Formula: see text], and demonstrate that the proposed method can deal with highly variable data. This work motivates further research into tumour segmentation using machine learning with more data and deeper neural networks.
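The deformable model balances two forces on each mesh vertex: displacement along its normal toward boundary evidence in the MLP-processed image, and a smoothing term pulling it toward the centroid of its neighbours. A toy per-vertex update sketch, where the weights and the boundary-response function are assumptions:

    import numpy as np

    def deform_step(verts, normals, neighbors, boundary_response,
                    alpha=0.5, beta=0.2):
        """verts: (N, 3); normals: (N, 3); neighbors: list of index lists.
        boundary_response(v) > 0 pushes outward, < 0 pulls inward, as
        derived from the MLP-processed image."""
        new_verts = verts.copy()
        for i, v in enumerate(verts):
            image_force = alpha * boundary_response(v) * normals[i]
            centroid = verts[neighbors[i]].mean(axis=0)
            smooth_force = beta * (centroid - v)       # surface regularization
            new_verts[i] = v + image_force + smooth_force
        return new_verts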

Journal ArticleDOI
TL;DR: Generalized arc-consistency Expectation Maximization Message-Passing (GEM-MP), a novel message-passing approach to inference in an extended factor graph that combines constraint programming techniques with variational methods, is introduced.
Abstract: Many important real-world applications of machine learning, statistical physics, constraint programming and information theory can be formulated using graphical models that involve determinism and cycles. Accurate and efficient inference and training of such graphical models remains a key challenge. Markov logic networks (MLNs) have recently emerged as a popular framework for expressing a number of problems which exhibit these properties. While loopy belief propagation (LBP) can be an effective solution in some cases, when both determinism and cycles are present it frequently fails to converge or converges to inaccurate results. As such, sampling based algorithms have been found to be more effective and are more popular for general inference tasks in MLNs. In this paper, we introduce Generalized arc-consistency Expectation Maximization Message-Passing (GEM-MP), a novel message-passing approach to inference in an extended factor graph that combines constraint programming techniques with variational methods. We focus our experiments on Markov logic and Ising models but the method is applicable to graphical models in general. In contrast to LBP, GEM-MP formulates the message-passing structure as steps of variational expectation maximization. Moreover, in the algorithm we leverage the local structures in the factor graph by using generalized arc consistency when performing a variational mean-field approximation. Thus each such update increases a lower bound on the model evidence. Our experiments on Ising grids, entity resolution and link prediction problems demonstrate the accuracy and convergence of GEM-MP over existing state-of-the-art inference algorithms such as MC-SAT, LBP, and Gibbs sampling, as well as convergent message passing algorithms such as the concave-convex procedure, residual BP, and the L2-convex method.

Posted Content
TL;DR: This work proposes a first step toward the learning and synthesis of these using recent advances in deep generative modelling with openly available satellite imagery from NASA.
Abstract: Procedural terrain generation for video games has traditionally been done with smartly designed but handcrafted algorithms that generate heightmaps. We propose a first step toward the learning and synthesis of these using recent advances in deep generative modelling with openly available satellite imagery from NASA.

Posted Content
TL;DR: This work proposes an alternative algorithm, Sparse Attentive Backtracking, which might also be related to principles used by brains to learn long-term dependencies, and learns an attention mechanism over the hidden states of the past and selectively backpropagates through paths with high attention weights.
Abstract: A major drawback of backpropagation through time (BPTT) is the difficulty of learning long-term dependencies, coming from having to propagate credit information backwards through every single step of the forward computation. This makes BPTT both computationally impractical and biologically implausible. For this reason, full backpropagation through time is rarely used on long sequences, and truncated backpropagation through time is used as a heuristic. However, this usually leads to biased estimates of the gradient in which longer term dependencies are ignored. Addressing this issue, we propose an alternative algorithm, Sparse Attentive Backtracking, which might also be related to principles used by brains to learn long-term dependencies. Sparse Attentive Backtracking learns an attention mechanism over the hidden states of the past and selectively backpropagates through paths with high attention weights. This allows the model to learn long term dependencies while only backtracking for a small number of time steps, not just from the recent past but also from attended relevant past states.
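The mechanism keeps a memory of past hidden states, attends over them at each step, and lets gradients flow only through the few most-attended states (the rest are detached), so credit travels along sparse skip paths instead of through every timestep. A schematic sketch with assumed names and a simple bilinear attention:

    import torch

    def sparse_attentive_summary(h_t, memory, W_att, k=3):
        """h_t: (batch, d) current state; memory: list of past (batch, d)
        states. Backprop reaches only the k states with highest attention."""
        M = torch.stack(memory, dim=1)                    # (batch, T, d)
        scores = torch.einsum('bd,de,bte->bt', h_t, W_att, M)
        weights = torch.softmax(scores, dim=1)            # (batch, T)
        topk = torch.topk(weights, min(k, M.size(1)), dim=1).indices
        keep = torch.zeros_like(weights).scatter(1, topk, 1.0)
        # Detach the low-attention paths; keep gradients on the top-k only.
        M_sparse = keep.unsqueeze(-1) * M + (1 - keep).unsqueeze(-1) * M.detach()
        return (weights.unsqueeze(-1) * M_sparse).sum(dim=1)  # (batch, d)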

Posted Content
24 Apr 2017
TL;DR: The approach is able to leverage temporal information and unlabelled data to improve localization of extreme weather events and explore the representations learned by the model in order to better understand this important data, and facilitate further work in understanding and mitigating the effects of climate change.
Abstract: The detection and identification of extreme weather events in large scale climate simulations is an important problem for risk management, informing governmental policy decisions and advancing our basic understanding of the climate system. Recent work has shown that fully supervised convolutional neural networks (CNNs) can yield acceptable accuracy for classifying well-known types of extreme weather events when large amounts of labeled data are available. However, there are many different types of spatially localized climate patterns of interest (including hurricanes, extra-tropical cyclones, weather fronts, blocking events, etc.) found in simulation data for which labeled data is not available at large scale for all simulations of interest. We present a multichannel spatiotemporal encoder-decoder CNN architecture for semi-supervised bounding box prediction and exploratory data analysis. This architecture is designed to fully model multi-channel simulation data, temporal dynamics and unlabelled data within a reconstruction and prediction framework so as to improve the detection of a wide range of extreme weather events. Our architecture can be viewed as a 3D convolutional autoencoder with an additional modified one-pass bounding box regression loss. We demonstrate that our approach is able to leverage temporal information and unlabelled data to improve localization of extreme weather events. Further, we explore the representations learned by our model in order to better understand this important data, and facilitate further work in understanding and mitigating the effects of climate change.
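The semi-supervised objective combines a reconstruction loss, computed on every simulation frame, with a bounding-box regression loss computed only where labels exist, so the autoencoder also learns from unlabelled data. A schematic sketch of the combined loss, with the weighting and names assumed:

    import torch.nn.functional as F

    def semi_supervised_loss(model, frames, boxes=None, lam=1.0):
        """model returns (reconstruction, box_prediction) for a batch of
        multi-channel spatiotemporal frames. boxes is None when unlabelled."""
        recon, box_pred = model(frames)
        loss = F.mse_loss(recon, frames)          # unsupervised: all data
        if boxes is not None:                     # supervised: labelled subset
            loss = loss + lam * F.smooth_l1_loss(box_pred, boxes)
        return loss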

Posted Content
22 Aug 2017
TL;DR: This paper introduces a simple way of encouraging RNNs to plan for the future: an additional neural network is trained to generate the sequence in reverse order, and closeness is required between the states of the forward RNN and backward RNN that predict the same token.
Abstract: Being able to model long-term dependencies in sequential data, such as text, has been among the long-standing challenges of recurrent neural networks (RNNs). This issue is strictly related to the absence of explicit planning in current RNN architectures. More explicitly, the RNNs are trained to predict only the next token given previous ones. In this paper, we introduce a simple way of encouraging the RNNs to plan for the future. In order to accomplish this, we introduce an additional neural network which is trained to generate the sequence in reverse order, and we require closeness between the states of the forward RNN and backward RNN that predict the same token. At each step, the states of the forward RNN are required to match the future information contained in the backward states. We hypothesize that the approach eases modeling of long-term dependencies thus helping in generating more globally consistent samples. The model trained with conditional generation for a speech recognition task achieved 12% relative improvement (CER of 6.7 compared to a baseline of 7.6).
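Concretely, both RNNs are trained with their usual next-token losses, plus an L2 penalty tying each forward state to (a projection of) the backward state at the position that predicts the same token; the backward network is used only during training and discarded afterwards. A sketch of the combined objective, where the projection g and the weighting alpha are assumptions:

    import torch.nn as nn

    d = 256
    g = nn.Linear(d, d)   # learned projection of backward states

    def twin_loss(fwd_states, bwd_states, nll_fwd, nll_bwd, alpha=1.0):
        """fwd_states, bwd_states: (batch, T, d), aligned by the caller so
        that position t in both sequences predicts the same token."""
        match = ((fwd_states - g(bwd_states)) ** 2).sum(dim=-1).mean()
        return nll_fwd + nll_bwd + alpha * match   # backward net: train-time only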

Posted Content
TL;DR: In this article, a straightforward technique to constrain discrete ordinal probability distributions to be unimodal via the use of the Poisson and binomial probability distributions has been proposed.
Abstract: Probability distributions produced by the cross-entropy loss for ordinal classification problems can possess undesired properties. We propose a straightforward technique to constrain discrete ordinal probability distributions to be unimodal via the use of the Poisson and binomial probability distributions. We evaluate this approach in the context of deep learning on two large ordinal image datasets, obtaining promising results.

Posted Content
TL;DR: The GAN framework is reframed so that the generator is no longer trained using gradients through the discriminator, but is instead trained using a learned critic in the actor-critic framework with a Temporal Difference (TD) objective.
Abstract: Generative Adversarial Networks (GANs) are a powerful framework for deep generative modeling. Posed as a two-player minimax problem, GANs are typically trained end-to-end on real-valued data and can be used to train a generator of high-dimensional and realistic images. However, a major limitation of GANs is that training relies on passing gradients from the discriminator through the generator via back-propagation. This makes it fundamentally difficult to train GANs with discrete data, as generation in this case typically involves a non-differentiable function. These difficulties extend to the reinforcement learning setting when the action space is composed of discrete decisions. We address these issues by reframing the GAN framework so that the generator is no longer trained using gradients through the discriminator, but is instead trained using a learned critic in the actor-critic framework with a Temporal Difference (TD) objective. This is a natural fit for sequence modeling and we use it to achieve improvements on language modeling tasks over the standard Teacher-Forcing methods.
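Schematically, the generator becomes a policy emitting one token at a time, the discriminator's score of a finished sequence supplies the reward, and a critic is regressed onto temporal-difference targets so per-step values can update the generator without backpropagating through the discrete sampling step. A heavily simplified sketch of the critic's TD(0) update, with all names assumed:

    import torch.nn.functional as F

    def critic_td_update(critic, states, rewards, opt, gamma=1.0):
        """states: (batch, T, d) summaries of partial sequences;
        rewards: (batch,) discriminator scores, given only at the end."""
        values = critic(states).squeeze(-1)            # V(s_t): (batch, T)
        targets = gamma * values[:, 1:].detach()       # bootstrap next value
        targets[:, -1] = rewards                       # terminal reward from D
        loss = F.mse_loss(values[:, :-1], targets)     # TD(0) regression
        opt.zero_grad()
        loss.backward()
        opt.step()
        return loss.item()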

Posted Content
TL;DR: This work proposes a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies that is capable of learning word clusters with clear syntactical and semantic meaning during the language model training process.
Abstract: We propose a new self-organizing hierarchical softmax formulation for neural-network-based language models over large vocabularies. Instead of using a predefined hierarchical structure, our approach is capable of learning word clusters with clear syntactical and semantic meaning during the language model training process. We provide experiments on standard benchmarks for language modeling and sentence compression tasks. We find that this approach is as fast as other efficient softmax approximations, while achieving comparable or even better performance relative to similar full softmax models.
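A hierarchical softmax factors p(w) = p(c(w)) · p(w | c(w)) over word clusters c, cutting the per-step cost from O(|V|) toward O(√|V|). In the paper the cluster assignments themselves are reorganized during training; the sketch below assumes a fixed assignment for brevity, and computes all cluster blocks at once (a real implementation would evaluate only the target cluster's block to get the speedup):

    import torch
    import torch.nn as nn

    class TwoLevelSoftmax(nn.Module):
        """log p(w|h) = log p(cluster(w)|h) + log p(w | cluster(w), h)."""
        def __init__(self, d, n_clusters, cluster_size):
            super().__init__()
            self.cluster_head = nn.Linear(d, n_clusters)
            self.word_heads = nn.Linear(d, n_clusters * cluster_size)
            self.n_clusters, self.cluster_size = n_clusters, cluster_size

        def log_prob(self, h, word_ids):
            c = word_ids // self.cluster_size          # fixed cluster assignment
            j = word_ids % self.cluster_size           # index within cluster
            log_pc = torch.log_softmax(self.cluster_head(h), dim=-1)
            logits_w = self.word_heads(h).view(-1, self.n_clusters,
                                               self.cluster_size)
            log_pw = torch.log_softmax(logits_w, dim=-1)
            b = torch.arange(h.size(0))
            return log_pc[b, c] + log_pw[b, c, j]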

Posted Content
TL;DR: The authors propose to train a backward recurrent network to generate a given sequence in reverse order, and encourage states of the forward model to predict cotemporal states of a backward model.
Abstract: We propose a simple technique for encouraging generative RNNs to plan ahead. We train a "backward" recurrent network to generate a given sequence in reverse order, and we encourage states of the forward model to predict cotemporal states of the backward model. The backward network is used only during training, and plays no role during sampling or inference. We hypothesize that our approach eases modeling of long-term dependencies by implicitly forcing the forward states to hold information about the longer-term future (as contained in the backward states). We show empirically that our approach achieves 9% relative improvement for a speech recognition task, and achieves significant improvement on a COCO caption generation task.