Home
/
Authors
/
Grégoire Montavon

Author

Grégoire Montavon

Other affiliations: Johannes Kepler University of Linz

Bio: Grégoire Montavon is an academic researcher from Technical University of Berlin. The author has contributed to research in topics: Artificial neural network & Computer science. The author has an hindex of 35, co-authored 79 publications receiving 10125 citations. Previous affiliations of Grégoire Montavon include Johannes Kepler University of Linz.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2013
2012
2011
2010

Papers

PDF

Open Access

More filters

Journal Article•DOI•

On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation.

[...]

Sebastian Bach¹, Alexander Binder², Grégoire Montavon¹, Frederick Klauschen³, Klaus-Robert Müller¹, Wojciech Samek¹ - Show less +2 more•Institutions (3)

Technical University of Berlin¹, Singapore University of Technology and Design², Charité³

10 Jul 2015-PLOS ONE

TL;DR: This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers by introducing a methodology that allows to visualize the contributions of single pixels to predictions for kernel-based classifiers over Bag of Words features and for multilayered neural networks.

...read moreread less

Abstract: Understanding and interpreting classification decisions of automated image classification systems is of high value in many applications, as it allows to verify the reasoning of the system and provides additional information to the human expert. Although machine learning methods are solving very successfully a plethora of tasks, they have in most cases the disadvantage of acting as a black box, not providing any information about what made them arrive at a particular decision. This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers. We introduce a methodology that allows to visualize the contributions of single pixels to predictions for kernel-based classifiers over Bag of Words features and for multilayered neural networks. These pixel contributions can be visualized as heatmaps and are provided to a human expert who can intuitively not only verify the validity of the classification decision, but also focus further analysis on regions of potential interest. We evaluate our method for classifiers trained on PASCAL VOC 2009 images, synthetic image data containing geometric shapes, the MNIST handwritten digits data set and for the pre-trained ImageNet model available as part of the Caffe open source package.

...read moreread less

3,330 citations

Journal Article•DOI•

Methods for interpreting and understanding deep neural networks

[...]

Grégoire Montavon¹, Wojciech Samek², Klaus-Robert Müller¹, Klaus-Robert Müller³, Klaus-Robert Müller⁴ - Show less +1 more•Institutions (4)

Technical University of Berlin¹, Heinrich Hertz Institute², Max Planck Society³, Korea University⁴

01 Feb 2018-Digital Signal Processing

TL;DR: The second part of the tutorial focuses on the recently proposed layer-wise relevance propagation (LRP) technique, for which the author provides theory, recommendations, and tricks, to make most efficient use of it on real data.

...read moreread less

1,939 citations

Journal Article•DOI•

Explaining nonlinear classification decisions with deep Taylor decomposition

[...]

Grégoire Montavon¹, Sebastian Lapuschkin², Alexander Binder³, Wojciech Samek², Klaus-Robert Müller⁴ - Show less +1 more•Institutions (4)

Technical University of Berlin¹, Heinrich Hertz Institute², Singapore University of Technology and Design³, Korea University⁴

01 May 2017-Pattern Recognition

TL;DR: A novel methodology for interpreting generic multilayer neural networks by decomposing the network classification decision into contributions of its input elements by backpropagating the explanations from the output to the input layer is introduced.

...read moreread less

1,247 citations

Journal Article•DOI•

Evaluating the Visualization of What a Deep Neural Network Has Learned

[...]

Wojciech Samek¹, Alexander Binder², Grégoire Montavon³, Sebastian Lapuschkin¹, Klaus-Robert Müller³ - Show less +1 more•Institutions (3)

Heinrich Hertz Institute¹, Singapore University of Technology and Design², Technical University of Berlin³

01 Nov 2017-IEEE Transactions on Neural Networks

TL;DR: In this article, a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps is presented, and the authors compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets.

...read moreread less

Abstract: Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the “importance” of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.

...read moreread less

866 citations

Journal Article•DOI•

Unmasking Clever Hans Predictors and Assessing What Machines Really Learn

[...]

Sebastian Lapuschkin¹, Stephan Wäldchen², Alexander Binder³, Grégoire Montavon², Wojciech Samek¹, Klaus-Robert Müller⁴, Klaus-Robert Müller⁵, Klaus-Robert Müller² - Show less +4 more•Institutions (5)

Heinrich Hertz Institute¹, Technical University of Berlin², Singapore University of Technology and Design³, Korea University⁴, Max Planck Society⁵

26 Feb 2019-arXiv: Artificial Intelligence

TL;DR: The authors investigate how these methods approach learning in order to assess the dependability of their decision making and propose a semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines.

...read moreread less

Abstract: Current learning machines have successfully solved hard application problems, reaching high accuracy and displaying seemingly "intelligent" behavior. Here we apply recent techniques for explaining decisions of state-of-the-art learning machines and analyze various tasks from computer vision and arcade games. This showcases a spectrum of problem-solving behaviors ranging from naive and short-sighted, to well-informed and strategic. We observe that standard performance evaluation metrics can be oblivious to distinguishing these diverse problem solving behaviors. Furthermore, we propose our semi-automated Spectral Relevance Analysis that provides a practically effective way of characterizing and validating the behavior of nonlinear learning machines. This helps to assess whether a learned model indeed delivers reliably for the problem that it was conceived for. Furthermore, our work intends to add a voice of caution to the ongoing excitement about machine intelligence and pledges to evaluate and judge some of these recent successes in a more nuanced manner.

...read moreread less

614 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18

Collapse

Cited by

PDF

Open Access

More filters

Book•

Deep Learning

[...]

Ian Goodfellow¹, Yoshua Bengio², Aaron Courville²•Institutions (2)

Google¹, Université de Montréal²

18 Nov 2016

TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.

...read moreread less

Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

...read moreread less

38,208 citations

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Journal Article•DOI•

A survey on deep learning in medical image analysis

[...]

Geert Litjens¹, Thijs Kooi¹, Babak Ehteshami Bejnordi¹, Arnaud Arindra Adiyoso Setio¹, Francesco Ciompi¹, Mohsen Ghafoorian¹, Jeroen van der Laak¹, Bram van Ginneken¹, Clara I. Sánchez¹ - Show less +5 more•Institutions (1)

Radboud University Nijmegen¹

01 Dec 2017-Medical Image Analysis

TL;DR: This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year, to survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks.

...read moreread less

8,730 citations

Proceedings Article•

A unified approach to interpreting model predictions

[...]

Scott M. Lundberg¹, Su-In Lee¹•Institutions (1)

University of Washington¹

04 Dec 2017

TL;DR: In this article, a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), is presented, which assigns each feature an importance value for a particular prediction.

...read moreread less

Abstract: Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.

...read moreread less

7,309 citations

Proceedings Article•

Wasserstein Generative Adversarial Networks

[...]

Martin Arjovsky¹, Soumith Chintala², Léon Bottou²•Institutions (2)

Courant Institute of Mathematical Sciences¹, Facebook²

17 Jul 2017

TL;DR: This work introduces a new algorithm named WGAN, an alternative to traditional GAN training that can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches.

...read moreread less

Abstract: We introduce a new algorithm named WGAN, an alternative to traditional GAN training. In this new model, we show that we can improve the stability of learning, get rid of problems like mode collapse, and provide meaningful learning curves useful for debugging and hyperparameter searches. Furthermore, we show that the corresponding optimization problem is sound, and provide extensive theoretical work highlighting the deep connections to different distances between distributions.

...read moreread less

5,667 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse