Luc Van Gool

Other affiliations: Microsoft, ETH Zurich, Politehnica University of Timișoara ...

Bio: Luc Van Gool is an academic researcher from Katholieke Universiteit Leuven. The author has contributed to research on topics including computer science and object detection. The author has an h-index of 133 and has co-authored 1,307 publications receiving 107,743 citations. Previous affiliations of Luc Van Gool include Microsoft and ETH Zurich.

Papers published on a yearly basis (chart; years listed run from 1984 to 2023)

Papers

Book Chapter•DOI•

Simultaneous object recognition and segmentation by image exploration

Vittorio Ferrari, Tinne Tuytelaars, Luc Van Gool

01 Jan 2006-Lecture Notes in Computer Science

TL;DR: The approach can extend any viewpoint-invariant feature extractor. It covers the object with matches while separating the correct matches from the wrong ones, and it produces approximate contours of the object.

Abstract: Methods based on local, viewpoint-invariant features have proven capable of recognizing objects in spite of viewpoint changes, occlusion and clutter. However, these approaches fail when these factors are too strong, due to the limited repeatability and discriminative power of the features. As additional shortcomings, the objects need to be rigid and only their approximate location is found. We present an object recognition approach which overcomes these limitations. An initial set of feature correspondences is first generated. The method anchors on it and then gradually explores the surrounding area, trying to construct more and more matching features, increasingly farther from the initial ones. The resulting process covers the object with matches, and simultaneously separates the correct matches from the wrong ones. Hence, recognition and segmentation are achieved at the same time. Only very few correct initial matches suffice for reliable recognition. Experimental results on still images and television news broadcasts demonstrate the strength of the presented method in dealing with extensive clutter, dominant occlusion, and large scale and viewpoint changes. Moreover, non-rigid deformations are explicitly taken into account, and the approximate contours of the object are produced. The approach can extend any viewpoint-invariant feature extractor.

159 citations

Proceedings Article•DOI•

Dense 3D Regression for Hand Pose Estimation

Chengde Wan¹, Thomas Probst¹, Luc Van Gool², Angela Yao³•Institutions (3)

ETH Zurich¹, Katholieke Universiteit Leuven², National University of Singapore³

01 Jun 2018

TL;DR: The authors decompose the pose parameters into a set of per-pixel estimations, i.e., 2D heat maps, 3D heat maps and unit 3D directional vector fields.

Abstract: We present a simple and effective method for 3D hand pose estimation from a single depth frame. As opposed to previous state-of-the-art methods based on holistic 3D regression, our method works on dense pixel-wise estimation. This is achieved by careful design choices in pose parameterization, which leverage both 2D and 3D properties of the depth map. Specifically, we decompose the pose parameters into a set of per-pixel estimations, i.e., 2D heat maps, 3D heat maps and unit 3D directional vector fields. The 2D/3D joint heat maps and 3D joint offsets are estimated via multitask network cascades, which are trained end-to-end. The pixel-wise estimations can be directly translated into a vote casting scheme. A variant of mean shift is then used to aggregate local votes while enforcing consensus between the estimated 3D pose and the pixel-wise 2D and 3D estimations by design. Our method is efficient and highly accurate. On the MSRA and NYU hand datasets, our method outperforms all previous state-of-the-art approaches by a large margin. On the ICVL hand dataset, our method achieves similar accuracy compared to the nearly saturated result obtained by [5] and outperforms various other proposed methods. Code is available online.
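The vote aggregation step mentioned in the abstract can be illustrated with a minimal sketch of plain Gaussian-kernel mean shift over 3D votes. This is generic mean shift, not the paper's variant: the function name `mean_shift_mode`, the bandwidth, the median initialisation and the synthetic votes are all assumptions made for illustration.

```python
import math
import random
import statistics


def mean_shift_mode(votes, bandwidth=0.05, n_iter=50, tol=1e-9):
    """Find a mode of a set of 3D votes with Gaussian-kernel mean shift.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    # Start from the coordinate-wise median so stray votes barely affect the start.
    estimate = [statistics.median(axis) for axis in zip(*votes)]
    for _ in range(n_iter):
        # Gaussian kernel weight of every vote relative to the current estimate.
        kernel = [
            math.exp(-0.5 * math.dist(v, estimate) ** 2 / bandwidth ** 2)
            for v in votes
        ]
        total = sum(kernel)
        if total == 0.0:
            break  # no vote lies close enough to contribute
        # Move the estimate to the kernel-weighted mean of the votes.
        shifted = [
            sum(k * v[d] for k, v in zip(kernel, votes)) / total
            for d in range(len(estimate))
        ]
        converged = math.dist(shifted, estimate) < tol
        estimate = shifted
        if converged:
            break
    return estimate


# Synthetic example (assumed data): 200 votes near one joint plus 20 stray votes.
random.seed(0)
true_joint = (0.10, 0.20, 0.50)
inliers = [tuple(c + random.gauss(0.0, 0.005) for c in true_joint) for _ in range(200)]
outliers = [tuple(random.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(20)]
joint = mean_shift_mode(inliers + outliers)
```

Because distant votes receive near-zero kernel weight, the estimate settles on the dense cluster rather than on the overall average, which is why mode seeking suits vote aggregation.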

159 citations

Posted Content•

Learning Discriminative Model Prediction for Tracking

Goutam Bhat¹, Martin Danelljan¹, Luc Van Gool¹, Radu Timofte¹•Institutions (1)

ETH Zurich¹

15 Apr 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, the authors propose an end-to-end tracking architecture, which is derived from a discriminative learning loss by designing a dedicated optimization process that is capable of predicting a powerful model in only a few iterations.

Abstract: The current drive towards end-to-end trainable computer vision systems imposes major challenges for the task of visual tracking. In contrast to most other vision problems, tracking requires the learning of a robust target-specific appearance model online, during the inference stage. To be end-to-end trainable, the online learning of the target model thus needs to be embedded in the tracking architecture itself. Due to the imposed challenges, the popular Siamese paradigm simply predicts a target feature template, while ignoring the background appearance information during inference. Consequently, the predicted model possesses limited target-background discriminability. We develop an end-to-end tracking architecture, capable of fully exploiting both target and background appearance information for target model prediction. Our architecture is derived from a discriminative learning loss by designing a dedicated optimization process that is capable of predicting a powerful model in only a few iterations. Furthermore, our approach is able to learn key aspects of the discriminative loss itself. The proposed tracker sets a new state-of-the-art on 6 tracking benchmarks, achieving an EAO score of 0.440 on VOT2018, while running at over 40 FPS. The code and models are available online.

159 citations

Book Chapter•DOI•

The Eighth Visual Object Tracking VOT2020 Challenge Results

Matej Kristan¹, Ales Leonardis², Jiří Matas³, Michael Felsberg⁴, Roman Pflugfelder⁵, Roman Pflugfelder⁶, Joni-Kristian Kamarainen, Martin Danelljan⁷, Luka Čehovin Zajc¹, Alan Lukežič¹, Ondrej Drbohlav³, Linbo He⁴, Yushan Zhang⁴, Yushan Zhang⁸, Song Yan, Jinyu Yang², Gustavo Fernandez⁶, Alexander G. Hauptmann⁹, Alireza Memarmoghadam¹⁰, Alvaro Garcia-Martin¹¹, Andreas Robinson⁴, Anton Varfolomieiev¹², Awet Haileslassie Gebrehiwot¹¹, Bedirhan Uzun¹³, Bin Yan¹⁴, Bing Li¹⁵, Chen Qian, Chi-Yi Tsai¹⁶, Christian Micheloni¹⁷, Dong Wang¹⁴, Fei Wang, Fei Xie¹⁸, Felix Järemo Lawin⁴, Fredrik K. Gustafsson¹⁹, Gian Luca Foresti¹⁷, Goutam Bhat⁷, Guangqi Chen, Haibin Ling²⁰, Haitao Zhang, Hakan Cevikalp¹³, Haojie Zhao¹⁴, Haoran Bai²¹, Hari Chandana Kuchibhotla²², Hasan Saribas, Heng Fan²⁰, Hossein Ghanei-Yakhdan²³, Houqiang Li²⁴, Houwen Peng²⁵, Huchuan Lu¹⁴, Hui Li²⁶, Javad Khaghani²⁷, Jesús Bescós¹¹, Jianhua Li¹⁴, Jianlong Fu²⁵, Jiaqian Yu²⁸, Jingtao Xu²⁸, Josef Kittler²⁹, Jun Yin, Junhyun Lee³⁰, Kaicheng Yu³¹, Kaiwen Liu¹⁵, Kang Yang³², Kenan Dai¹⁴, Li Cheng²⁷, Li Zhang³³, Lijun Wang¹⁴, Linyuan Wang, Luc Van Gool⁷, Luca Bertinetto, Matteo Dunnhofer¹⁷, Miao Cheng, Mohana Murali Dasari²², Ning Wang³², Pengyu Zhang¹⁴, Philip H. S. Torr³³, Qiang Wang, Radu Timofte⁷, Rama Krishna Sai Subrahmanyam Gorthi²², Seokeon Choi³⁴, Seyed Mojtaba Marvasti-Zadeh²⁷, Shaochuan Zhao²⁶, Shohreh Kasaei³⁵, Shoumeng Qiu¹⁵, Shuhao Chen¹⁴, Thomas B. Schön¹⁹, Tianyang Xu²⁹, Wei Lu, Weiming Hu¹⁵, Wengang Zhou²⁴, Xi Qiu, Xiao Ke³⁶, Xiaojun Wu²⁶, Xiaolin Zhang¹⁵, Xiaoyun Yang, Xue-Feng Zhu²⁶, Yingjie Jiang²⁶, Yingming Wang¹⁴, Yiwei Chen²⁸, Yu Ye³⁶, Yuezhou Li³⁶, Yuncon Yao¹⁸, Yunsung Lee³⁰, Yuzhang Gu¹⁵, Zezhou Wang¹⁴, Zhangyong Tang²⁶, Zhen-Hua Feng²⁹, Zhijun Mai³⁷, Zhipeng Zhang¹⁵, Zhirong Wu²⁵, Ziang Ma•Institutions (37)

University of Ljubljana¹, University of Birmingham², Czech Technical University in Prague³, Linköping University⁴, Vienna University of Technology⁵, Austrian Institute of Technology⁶, ETH Zurich⁷, Beijing Institute of Technology⁸, Carnegie Mellon University⁹, University of Isfahan¹⁰, Autonomous University of Madrid¹¹, National Technical University¹², Eskişehir Osmangazi University¹³, Dalian University of Technology¹⁴, Chinese Academy of Sciences¹⁵, Tamkang University¹⁶, University of Udine¹⁷, Southeast University¹⁸, Uppsala University¹⁹, Stony Brook University²⁰, Sichuan University²¹, Indian Institutes of Technology²², Yazd University²³, University of Science and Technology of China²⁴, Microsoft²⁵, Jiangnan University²⁶, University of Alberta²⁷, Samsung²⁸, University of Surrey²⁹, Korea University³⁰, Renmin University of China³¹, Nanjing University of Information Science and Technology³², University of Oxford³³, KAIST³⁴, Sharif University of Technology³⁵, Fuzhou University³⁶, University of Electronic Science and Technology of China³⁷

23 Aug 2020

TL;DR: A significant novelty is the introduction of a new VOT short-term tracking evaluation methodology and of segmentation ground truth in the VOT-ST2020 challenge; bounding boxes will no longer be used in the VOT-ST challenges.

Abstract: The Visual Object Tracking challenge VOT2020 is the eighth annual tracker benchmarking activity organized by the VOT initiative. Results of 58 trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The VOT2020 challenge was composed of five sub-challenges focusing on different tracking domains: (i) VOT-ST2020 challenge focused on short-term tracking in RGB, (ii) VOT-RT2020 challenge focused on “real-time” short-term tracking in RGB, (iii) VOT-LT2020 focused on long-term tracking, namely coping with target disappearance and reappearance, (iv) VOT-RGBT2020 challenge focused on short-term tracking in RGB and thermal imagery and (v) VOT-RGBD2020 challenge focused on long-term tracking in RGB and depth imagery. Only the VOT-ST2020 datasets were refreshed. A significant novelty is the introduction of a new VOT short-term tracking evaluation methodology and of segmentation ground truth in the VOT-ST2020 challenge; bounding boxes will no longer be used in the VOT-ST challenges. A new VOT Python toolkit that implements all these novelties was introduced. Performance of the tested trackers typically by far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net).

158 citations

Posted Content•

Dark Model Adaptation: Semantic Image Segmentation from Daytime to Nighttime

Dengxin Dai¹, Luc Van Gool¹•Institutions (1)

ETH Zurich¹

05 Oct 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: A novel method to progressively adapt the semantic models trained on daytime scenes, along with large-scale annotations therein, to nighttime scenes via the bridge of twilight time, reducing the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions.

Abstract: This work addresses the problem of semantic image segmentation of nighttime scenes. Although considerable progress has been made in semantic image segmentation, it is mainly related to daytime scenarios. This paper proposes a novel method to progressively adapt the semantic models trained on daytime scenes, along with large-scale annotations therein, to nighttime scenes via the bridge of twilight time, i.e., the time between dawn and sunrise, or between sunset and dusk. The goal of the method is to alleviate the cost of human annotation for nighttime images by transferring knowledge from standard daytime conditions. In addition to the method, a new dataset of road scenes is compiled; it consists of 35,000 images ranging from daytime to twilight time and to nighttime. Also, a subset of the nighttime images is densely annotated for method evaluation. Our experiments show that our method is effective for model adaptation from daytime scenes to nighttime scenes, without using extra human annotation.

158 citations

Cited by

Proceedings Article•DOI•

Deep Residual Learning for Image Recognition

Kaiming He¹, Xiangyu Zhang¹, Shaoqing Ren¹, Jian Sun¹•Institutions (1)

Microsoft¹

27 Jun 2016

TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; an ensemble of these residual nets won 1st place on the ILSVRC 2015 classification task.

Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
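The idea of "learning residual functions with reference to the layer inputs" can be sketched as a block that computes y = F(x) + x, so the layers only have to learn the difference F from the identity. The sketch below is a simplified stand-in under stated assumptions: it uses two small fully connected layers in plain Python instead of the convolutional layers of the actual networks, and the helper names (`relu`, `linear`, `residual_block`) and the ReLU applied after the addition are choices made for this illustration.

```python
def relu(vector):
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in vector]


def linear(weights, vector):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x) with F(x) = W2 * relu(W1 * x).

    Hypothetical, fully connected stand-in for a residual block.
    """
    residual = linear(w2, relu(linear(w1, x)))
    # Shortcut connection: add the unchanged input to the learned residual.
    return relu([r + xi for r, xi in zip(residual, x)])


# With all-zero weights the residual F(x) is zero, so the block passes a
# non-negative input through unchanged.
x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
identity_out = residual_block(x, zeros, zeros)
```

The zero-weight case shows why such blocks are easy to optimize: an identity mapping is available by driving the residual weights towards zero, rather than having to be learned through a stack of nonlinear layers.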

123,388 citations

Proceedings Article•

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan¹, Andrew Zisserman¹•Institutions (1)

University of Oxford¹

04 Sep 2014

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

55,235 citations

Proceedings Article•

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan¹, Andrew Zisserman¹•Institutions (1)

University of Oxford¹

01 Jan 2015

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.

49,914 citations

Posted Content•

Deep Residual Learning for Image Recognition

Kaiming He¹, Xiangyu Zhang¹, Shaoqing Ren¹, Jian Sun¹•Institutions (1)

Microsoft¹

10 Dec 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers, 8× deeper than VGG nets but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

44,703 citations

Proceedings Article•DOI•

Going deeper with convolutions

Christian Szegedy¹, Wei Liu², Yangqing Jia¹, Pierre Sermanet¹, Scott Reed³, Dragomir Anguelov¹, Dumitru Erhan¹, Vincent Vanhoucke¹, Andrew Rabinovich•Institutions (3)

Google¹, University of North Carolina at Chapel Hill², University of Michigan³

07 Jun 2015

TL;DR: This paper proposes Inception, a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).

Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.

40,257 citations
