Author

Luc Van Gool

Other affiliations: Microsoft, ETH Zurich, Politehnica University of Timișoara
Bio: Luc Van Gool is an academic researcher from Katholieke Universiteit Leuven. He has contributed to research in topics including computer science and object detection. He has an h-index of 133 and has co-authored 1,307 publications receiving 107,743 citations. Previous affiliations of Luc Van Gool include Microsoft and ETH Zurich.


Papers
Journal ArticleDOI
TL;DR: This work presents a 3-D measurement technique capable of optically measuring microchip devices using a camera-projector system and improves the dynamic range of the imaging system through the use of a set of gray-code and phase-shift measures with different CCD integration times.
Abstract: The industry dealing with microchip inspection requires fast, flexible, repeatable, and stable 3-D measuring systems. The typical devices used for this purpose are coordinate measurement machines (CMMs). These systems have limitations such as high cost, low measurement speed, and a small number of measured 3-D points. Optical techniques are now beginning to replace the typical touch probes because of their noncontact nature, their full-field measurement capability, their high measurement density, as well as their low cost and high measurement speed. However, typical properties of microchip devices, which include a strongly spatially varying reflectance, make the direct use of classical optical 3-D measurement techniques impossible. We present a 3-D measurement technique capable of optically measuring these devices using a camera-projector system. The proposed method improves the dynamic range of the imaging system through the use of a set of gray-code (GC) and phase-shift (PS) measurements with different CCD integration times. A set of extended-range GC and PS images is obtained and used to acquire a dense 3-D measurement of the object. We measured the 3-D shape of an integrated circuit and obtained satisfactory results.
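As a rough illustration of the extended-dynamic-range idea, the sketch below selects, per pixel, the longest unsaturated CCD integration time from a stack of phase-shift captures, computes the wrapped phase with a standard 4-step formula, and unwraps it with a gray-code fringe index. This is not the authors' implementation; the 4-step shift, the exposure-selection rule, the thresholds, and the array shapes are all assumptions made for the example.

```python
"""Minimal sketch of extended-dynamic-range phase-shift decoding (assumptions noted above)."""
import numpy as np

def wrapped_phase(i0, i1, i2, i3):
    # Standard 4-step phase-shift formula: patterns shifted by 0, 90, 180, 270 degrees.
    return np.arctan2(i3 - i1, i0 - i2)

def select_exposure(stack, saturation=250, darkness=10):
    # stack: (n_exposures, 4, H, W) phase-shift images at increasing integration times (8-bit assumed).
    # For each pixel, pick the longest integration time that is neither saturated nor too dark.
    n_exp = stack.shape[0]
    max_per_exp = stack.max(axis=1)                         # (n_exposures, H, W)
    valid = (max_per_exp < saturation) & (max_per_exp > darkness)
    # Index of the last (longest) valid exposure; fall back to 0 when none is valid.
    return np.where(valid.any(axis=0),
                    n_exp - 1 - np.argmax(valid[::-1], axis=0),
                    0)                                      # (H, W)

def extended_range_phase(stack):
    idx = select_exposure(stack)
    h, w = idx.shape
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    chosen = stack[idx, :, rows, cols].astype(np.float32)   # (H, W, 4), best exposure per pixel
    return wrapped_phase(chosen[..., 0], chosen[..., 1],
                         chosen[..., 2], chosen[..., 3])

def absolute_phase(wrapped, gray_code_period):
    # gray_code_period: integer fringe index per pixel decoded from the
    # gray-code sequence (the gray-code decoding itself is omitted here).
    return wrapped + 2.0 * np.pi * gray_code_period
```

The absolute phase would then be converted to projector coordinates and triangulated against the calibrated camera-projector pair to obtain 3-D points.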

17 citations

Posted Content
TL;DR: This work develops a sensor setup that provides data for a 360-degree view of the area surrounding the vehicle, the driving route to the destination, and the low-level driving maneuvers by human drivers, and learns a novel driving model by integrating information from the surround-view cameras and the route planner.
Abstract: For people, having a rear-view mirror and side-view mirrors is vital for safe driving. They deliver a better view of what happens around the car. Human drivers also heavily exploit their mental map for navigation. Nonetheless, several methods have been published that learn driving models with only a front-facing camera and without a route planner. This lack of information renders the self-driving task quite intractable. Hence, we investigate the problem with a more realistic setting, which consists of a surround-view camera system with eight cameras, a route planner, and a CAN bus reader. In particular, we develop a sensor setup that provides data for a 360-degree view of the area surrounding the vehicle, the driving route to the destination, and the low-level driving maneuvers (e.g. steering angle and speed) by human drivers. With such a sensor setup, we collect a new driving dataset covering diverse driving scenarios and varying weather/illumination conditions. Finally, we learn a novel driving model by integrating information from the surround-view cameras and the route planner. Two route planners are exploited: one based on OpenStreetMap and the other on TomTom Maps. The route planners are exploited in two ways: 1) by representing the planned routes as a stack of GPS coordinates, and 2) by rendering the planned routes on a map and recording the progression into a video. Our experiments show that: 1) 360-degree surround-view cameras help avoid failures made with a single front-view camera for the driving task; and 2) a route planner helps the driving task significantly. We acknowledge that our method is not the best-ever driving model, but that is not our focus. Rather, it provides a strong basis for further academic research, especially on driving-relevant tasks that integrate information from street-view images and the planned driving routes. Code and data will be made available.
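The sketch below illustrates, in broad strokes, how surround-view camera features and a route given as a stack of GPS coordinates might be fused to predict steering angle and speed. It is not the authors' released model; the ResNet-18 backbone, the layer sizes, and the flat route encoding are illustrative assumptions.

```python
"""Illustrative fusion of surround-view images and a planned route (not the paper's exact model)."""
import torch
import torch.nn as nn
import torchvision.models as models

class SurroundViewDriver(nn.Module):
    def __init__(self, num_cameras=8, route_points=64, feat_dim=128):
        super().__init__()
        # Shared CNN backbone applied to each surround-view camera (torchvision >= 0.13).
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Linear(backbone.fc.in_features, feat_dim)
        self.backbone = backbone
        # Encoder for the planned route given as a stack of (lat, lon) points.
        self.route_encoder = nn.Sequential(
            nn.Linear(route_points * 2, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim), nn.ReLU(),
        )
        # Fusion head predicting the low-level maneuvers.
        self.head = nn.Sequential(
            nn.Linear(num_cameras * feat_dim + feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 2),                       # [steering_angle, speed]
        )

    def forward(self, images, route):
        # images: (B, num_cameras, 3, H, W); route: (B, route_points, 2)
        b, n, c, h, w = images.shape
        cam_feats = self.backbone(images.view(b * n, c, h, w)).view(b, -1)
        route_feats = self.route_encoder(route.view(b, -1))
        return self.head(torch.cat([cam_feats, route_feats], dim=1))
```

The alternative route representation described in the abstract (rendered route videos) would replace the flat GPS encoder with a small image or video encoder feeding the same fusion head.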

17 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: The algorithm that is proposed - coined 'Make My Day', or MMD for short - is akin to the previously published BM3D denoising algorithm and outperforms other state-of-the-art denoising methods in terms of PSNR, texture quality, and color fidelity.
Abstract: We address the task of restoring RGB images taken under low illumination (e.g. night time), when an aligned near-infrared (NIR or simply N) image taken under stronger NIR illumination is available. Such restoration holds the promise that algorithms designed to work under daylight conditions could be used around the clock. RGBN cameras are increasingly becoming available, as car cameras tend to include a near-infrared (N) band next to the R, G, and B bands, and NIR artificial lighting is applied. Under low lighting conditions, the NIR band is less noisy than the others, all the more so if stronger illumination is only available in the NIR band. We address the task of restoring the R, G, and B bands on the basis of the NIR band in such cases. Even if the NIR band is less strongly correlated with the R, G, and B bands than these bands are mutually, there is sufficient correlation to pick up important textural and gradient information in the NIR band and inject it into the others. The algorithm that we propose - coined 'Make My Day', or MMD for short - is akin to the previously published BM3D denoising algorithm. MMD denoises the three (visible - NIR) differential images and then adds the original NIR image back. It not only effectively reduces the noise but also recovers texture and edge information in the high spatial-frequency range. MMD outperforms other state-of-the-art denoising methods in terms of PSNR, texture quality, and color fidelity. We publish our code and images.
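The differential-image idea lends itself to a short sketch: subtract the NIR band from each visible band, denoise the difference, and add the NIR band back. The paper builds on a BM3D-style denoiser; here a plain Gaussian filter stands in for it, so this only illustrates the structure of the pipeline, not its actual denoising quality.

```python
"""Sketch of the differential-image pipeline with a placeholder denoiser."""
import numpy as np
from scipy.ndimage import gaussian_filter

def mmd_like_restore(rgb, nir, sigma=1.5):
    # rgb: (H, W, 3) noisy visible image; nir: (H, W) cleaner NIR band;
    # both as floats in [0, 1] and spatially aligned.
    restored = np.empty_like(rgb)
    for c in range(3):
        diff = rgb[..., c] - nir                     # (visible - NIR) differential image
        diff_dn = gaussian_filter(diff, sigma)       # placeholder for a BM3D-style denoiser
        restored[..., c] = np.clip(diff_dn + nir, 0.0, 1.0)   # add the NIR band back
    return restored
```

Because the high-frequency texture and edges live mostly in the clean NIR band, adding it back after denoising the smoother differential images preserves detail that a direct denoiser on the RGB bands would blur away.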

17 citations

Posted Content
TL;DR: The proposed Efficient Video Segmentation (EVS) pipeline achieves accuracy levels competitive with existing real-time methods for semantic image segmentation (mIoU above 60%), while achieving much higher frame rates.
Abstract: This paper tackles the problem of real-time semantic segmentation of high-definition videos using a hybrid GPU/CPU approach. We propose an Efficient Video Segmentation (EVS) pipeline that combines: (i) on the CPU, a very fast optical flow method that is used to exploit the temporal aspect of the video and propagate semantic information from one frame to the next; it runs in parallel with the GPU; (ii) on the GPU, two convolutional neural networks: a main segmentation network that is used to predict dense semantic labels from scratch, and a Refiner that is designed to improve predictions from previous frames with the help of a fast Inconsistencies Attention Module (IAM). The latter can identify regions that cannot be propagated accurately. We suggest several operating points depending on the desired frame rate and accuracy. Our pipeline achieves accuracy levels competitive with existing real-time methods for semantic image segmentation (mIoU above 60%), while achieving much higher frame rates. On the popular Cityscapes dataset with high-resolution frames (2048 x 1024), the proposed operating points range from 80 to 1000 Hz on a single GPU and CPU.
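A simplified sketch of the propagation side of such a pipeline: a (GPU) segmentation network is run only on keyframes, and on the CPU the previous label map is warped to the current frame with dense optical flow. The Farneback flow, the fixed keyframe interval, and the `segment` placeholder are assumptions for the example; the Refiner and the Inconsistencies Attention Module are omitted.

```python
"""Flow-based label propagation between keyframes (illustrative, not the paper's exact components)."""
import cv2
import numpy as np

def propagate_labels(prev_labels, prev_gray, cur_gray):
    # Flow from the current frame back to the previous one, so every current
    # pixel knows where to fetch its label from (backward warping).
    flow = cv2.calcOpticalFlowFarneback(cur_gray, prev_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = cur_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(prev_labels, map_x, map_y, cv2.INTER_NEAREST)

def run_pipeline(frames, segment, keyframe_interval=5):
    # frames: list of uint8 grayscale frames; segment: callable running the full
    # (GPU) segmentation network and returning a uint8 label map.
    labels, prev_gray, prev_labels = [], None, None
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0 or prev_labels is None:
            prev_labels = segment(frame)              # expensive network, keyframes only
        else:
            prev_labels = propagate_labels(prev_labels, prev_gray, frame)
        prev_gray = frame
        labels.append(prev_labels)
    return labels
```

In the full pipeline the propagated labels would additionally be passed to the Refiner, which corrects the regions the IAM marks as unreliable, rather than being used directly.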

17 citations

Journal ArticleDOI
TL;DR: In this paper, the authors define a formal framework for the representation and processing of incongruent events and derive algorithms to detect these events from different types of hierarchies, different applications and a variety of data types.
Abstract: Unexpected stimuli are a challenge to any machine learning algorithm. Here, we identify distinct types of unexpected events when general-level and specific-level classifiers give conflicting predictions. We define a formal framework for the representation and processing of incongruent events: Starting from the notion of label hierarchy, we show how partial order on labels can be deduced from such hierarchies. For each event, we compute its probability in different ways, based on adjacent levels in the label hierarchy. An incongruent event is an event where the probability computed based on some more specific level is much smaller than the probability computed based on some more general level, leading to conflicting predictions. Algorithms are derived to detect incongruent events from different types of hierarchies, different applications, and a variety of data types. We present promising results for the detection of novel visual and audio objects, and new patterns of motion in video. We also discuss the detection of Out-Of-Vocabulary words in speech recognition, and the detection of incongruent events in a multimodal audiovisual scenario.
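A toy version of the incongruence criterion can be written down directly: flag an event when the probability assigned at a more general level of the label hierarchy is much larger than the best probability assigned at the specific level beneath it. The two-level hierarchy and the ratio threshold below are illustrative assumptions, not the paper's exact formulation.

```python
"""Toy incongruence test over a two-level label hierarchy (illustrative assumptions)."""

def is_incongruent(p_general, p_specific_children, ratio_threshold=10.0):
    # p_general: probability of the event under the general-level model
    # (e.g. "a dog is present").
    # p_specific_children: probabilities under each specific-level model
    # consistent with that general label (e.g. individual dog breeds).
    p_specific = max(p_specific_children) if p_specific_children else 0.0
    # Incongruent: the general level accepts the event, but no specific-level
    # model explains it nearly as well.
    return p_specific == 0.0 or (p_general / max(p_specific, 1e-12)) > ratio_threshold

# Example: the general "dog" detector is confident, but every known breed model
# assigns a low probability, suggesting a novel (incongruent) dog.
print(is_incongruent(0.9, [0.02, 0.05, 0.01]))   # True
```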

17 citations


Cited by
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers (8x deeper than VGG nets [40]) but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
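The residual reformulation described here amounts to letting a stack of layers learn F(x) and outputting F(x) + x through an identity shortcut. A minimal sketch of such a basic block, with illustrative 3x3 layer sizes and no downsampling, is given below.

```python
"""Minimal residual (basic) block sketch: output = F(x) + x with an identity shortcut."""
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # The stacked layers learn the residual F(x) ...
        residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))
        # ... and the identity shortcut adds the input back: F(x) + x.
        return self.relu(residual + x)

y = BasicResidualBlock(64)(torch.randn(2, 64, 32, 32))   # output keeps the input shape
```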

123,388 citations

Proceedings Article
04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
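The design principle described here, increasing depth by stacking very small 3x3 convolutions with pooling between stages, can be sketched as follows; the channel progression mirrors the VGG-style pattern but is truncated and otherwise illustrative.

```python
"""Sketch of VGG-style stages built from stacked 3x3 convolutions (truncated, illustrative)."""
import torch
import torch.nn as nn

def vgg_style_stage(in_ch, out_ch, num_convs):
    # A stage is num_convs 3x3 conv+ReLU layers followed by 2x2 max-pooling.
    layers = []
    for i in range(num_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2))
    return nn.Sequential(*layers)

features = nn.Sequential(
    vgg_style_stage(3, 64, 2),      # two 3x3 convs, then pool
    vgg_style_stage(64, 128, 2),
    vgg_style_stage(128, 256, 3),   # deeper stages use three 3x3 convs
)
print(features(torch.randn(1, 3, 224, 224)).shape)   # -> (1, 256, 28, 28)
```

Two stacked 3x3 convolutions cover the same receptive field as a single 5x5 one with fewer parameters and an extra non-linearity, which is what makes pushing the depth to 16-19 weight layers affordable.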

55,235 citations

Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

49,914 citations

Posted Content
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers---8x deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

44,703 citations

Proceedings ArticleDOI
07 Jun 2015
TL;DR: This paper proposes Inception, a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
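A compact sketch of a GoogLeNet-style Inception module as described here: parallel 1x1, 3x3, and 5x5 convolution branches plus a pooling branch, with 1x1 convolutions reducing channels to keep the computational budget in check. The specific channel counts below are illustrative.

```python
"""Sketch of an Inception-style module with parallel branches (illustrative channel counts)."""
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.branch1 = nn.Conv2d(in_ch, c1, 1)                                       # 1x1
        self.branch3 = nn.Sequential(nn.Conv2d(in_ch, c3_red, 1), nn.ReLU(inplace=True),
                                     nn.Conv2d(c3_red, c3, 3, padding=1))            # 1x1 reduce -> 3x3
        self.branch5 = nn.Sequential(nn.Conv2d(in_ch, c5_red, 1), nn.ReLU(inplace=True),
                                     nn.Conv2d(c5_red, c5, 5, padding=2))            # 1x1 reduce -> 5x5
        self.branch_pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1),
                                         nn.Conv2d(in_ch, pool_proj, 1))             # pool -> 1x1 projection

    def forward(self, x):
        # Concatenate all branch outputs along the channel dimension.
        return torch.cat([self.branch1(x), self.branch3(x),
                          self.branch5(x), self.branch_pool(x)], dim=1)

m = InceptionModule(192, 64, 96, 128, 16, 32, 32)
print(m(torch.randn(1, 192, 28, 28)).shape)   # -> (1, 256, 28, 28)
```

The 1x1 reductions before the larger filters are what let the network grow in both depth and width while the per-layer compute stays roughly constant, which is the multi-scale, budget-constrained design the abstract describes.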

40,257 citations