Author

Hyeonseob Nam

Pohang University of Science and Technology

Bio: Hyeonseob Nam is an academic researcher from Pohang University of Science and Technology. The author has contributed to research on convolutional neural networks and artificial intelligence, has an h-index of 12, and has co-authored 16 publications receiving 4,935 citations.

Papers

Proceedings Article•DOI•

Learning Multi-domain Convolutional Neural Networks for Visual Tracking

Hyeonseob Nam¹, Bohyung Han¹•Institutions (1)

Pohang University of Science and Technology¹

27 Jun 2016

TL;DR: A visual tracking algorithm built on representations from a discriminatively trained Convolutional Neural Network, which is pretrained on a large set of videos with tracking ground-truths to obtain a generic target representation.

Abstract: We propose a novel visual tracking algorithm based on the representations from a discriminatively trained Convolutional Neural Network (CNN). Our algorithm pretrains a CNN using a large set of videos with tracking ground-truths to obtain a generic target representation. Our network is composed of shared layers and multiple branches of domain-specific layers, where domains correspond to individual training sequences and each branch is responsible for binary classification to identify the target in each domain. We train each domain in the network iteratively to obtain generic target representations in the shared layers. When tracking a target in a new sequence, we construct a new network by combining the shared layers in the pretrained CNN with a new binary classification layer, which is updated online. Online tracking is performed by evaluating the candidate windows randomly sampled around the previous target state. The proposed algorithm illustrates outstanding performance in existing tracking benchmarks.

1,960 citations
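
The abstract above states that online tracking evaluates candidate windows randomly sampled around the previous target state. The following is a minimal sketch of that single step, not the authors' implementation: the Gaussian sampling parameters (`pos_sigma`, `scale_sigma`), the function names, and the `score_fn` callable (a stand-in for the network's target score) are all assumptions made for illustration.

```python
import math
import random


def sample_candidates(prev_box, n, rng, pos_sigma=0.3, scale_sigma=0.05):
    """Draw n candidate boxes (cx, cy, w, h) around the previous target state.

    Centre offsets are Gaussian with a spread proportional to the target size;
    the scale factor is log-normal. Both choices are illustrative assumptions.
    """
    cx, cy, w, h = prev_box
    mean_size = (w + h) / 2.0
    candidates = []
    for _ in range(n):
        scale = math.exp(rng.gauss(0.0, scale_sigma))
        candidates.append((
            cx + rng.gauss(0.0, pos_sigma * mean_size),
            cy + rng.gauss(0.0, pos_sigma * mean_size),
            w * scale,
            h * scale,
        ))
    return candidates


def track_step(prev_box, score_fn, n=256, seed=0):
    """Return the candidate with the highest score under score_fn.

    score_fn is a hypothetical stand-in for the binary classifier's
    target score; a real tracker would evaluate a CNN here.
    """
    rng = random.Random(seed)
    return max(sample_candidates(prev_box, n, rng), key=score_fn)


if __name__ == "__main__":
    true_centre = (54.0, 52.0)  # toy ground truth for the demo scorer

    def toy_score(box):
        # Higher score for boxes whose centre is closer to the toy target.
        return -math.hypot(box[0] - true_centre[0], box[1] - true_centre[1])

    print(track_step((50.0, 50.0, 20.0, 20.0), toy_score))
```

In a real tracker the scorer would be the online-updated classification layer on top of the shared layers; the toy distance-based scorer here only makes the sketch runnable.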

Posted Content•

Learning Multi-Domain Convolutional Neural Networks for Visual Tracking

Hyeonseob Nam¹, Bohyung Han¹•Institutions (1)

Pohang University of Science and Technology¹

27 Oct 2015-arXiv: Computer Vision and Pattern Recognition

TL;DR: The authors propose a novel visual tracking algorithm based on the representations from a discriminatively trained Convolutional Neural Network (CNN), which is pretrained on a large set of videos with tracking ground-truths to obtain a generic target representation.

Abstract: We propose a novel visual tracking algorithm based on the representations from a discriminatively trained Convolutional Neural Network (CNN). Our algorithm pretrains a CNN using a large set of videos with tracking ground-truths to obtain a generic target representation. Our network is composed of shared layers and multiple branches of domain-specific layers, where domains correspond to individual training sequences and each branch is responsible for binary classification to identify the target in each domain. We train the network with respect to each domain iteratively to obtain generic target representations in the shared layers. When tracking a target in a new sequence, we construct a new network by combining the shared layers in the pretrained CNN with a new binary classification layer, which is updated online. Online tracking is performed by evaluating the candidate windows randomly sampled around the previous target state. The proposed algorithm illustrates outstanding performance compared with state-of-the-art methods in existing tracking benchmarks.

1,818 citations

Proceedings Article•DOI•

Dual Attention Networks for Multimodal Reasoning and Matching

Hyeonseob Nam, Jung-Woo Ha¹, Jeonghee Kim•Institutions (1)

Naver Corporation¹

01 Jul 2017

TL;DR: The authors propose Dual Attention Networks (DANs) which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language for VQA and image-text matching.

Abstract: We propose Dual Attention Networks (DANs) which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language. DANs attend to specific regions in images and words in text through multiple steps and gather essential information from both modalities. Based on this framework, we introduce two types of DANs for multimodal reasoning and matching, respectively. The reasoning model allows visual and textual attentions to steer each other during collaborative inference, which is useful for tasks such as Visual Question Answering (VQA). In addition, the matching model exploits the two attention mechanisms to estimate the similarity between images and sentences by focusing on their shared semantics. Our extensive experiments validate the effectiveness of DANs in combining vision and language, achieving the state-of-the-art performance on public benchmarks for VQA and image-text matching.

527 citations

Posted Content•

Dual Attention Networks for Multimodal Reasoning and Matching

Hyeonseob Nam, Jung-Woo Ha¹, Jeonghee Kim•Institutions (1)

Naver Corporation¹

02 Nov 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work proposes Dual Attention Networks which jointly leverage visual and textual attention mechanisms to capture fine-grained interplay between vision and language and introduces two types of DANs for multimodal reasoning and matching, respectively.

456 citations

Book Chapter•DOI•

The Visual Object Tracking VOT2014 challenge results

Matej Kristan¹, Roman Pflugfelder², Ales Leonardis³, Jiri Matas⁴, Luka Cehovin¹, Georg Nebehay², Tomas Vojir⁴, Gustavo Fernandez², Alan Lukezic¹, Aleksandar Dimitriev¹, Alfredo Petrosino⁵, Amir Saffari, Bo Li⁶, Bohyung Han⁷, Cherkeng Heng⁶, Christophe Garcia, Dominik Pangersic¹, Gustav Häger⁸, Fahad Shahbaz Khan⁸, Franci Oven¹, Horst Possegger⁹, Horst Bischof⁹, Hyeonseob Nam⁷, Jianke Zhu¹⁰, Jijia Li¹¹, Jin-Young Choi¹², Jin-Woo Choi¹³, João F. Henriques¹⁴, Joost van de Weijer¹⁵, Jorge Batista¹⁴, Karel Lebeda¹⁶, Kristoffer Öfjäll⁸, Kwang Moo Yi¹⁷, Lei Qin, Longyin Wen¹⁸, Mario Edoardo Maresca⁵, Martin Danelljan⁸, Michael Felsberg⁸, Ming-Ming Cheng¹⁹, Philip H. S. Torr¹⁹, Qingming Huang²⁰, Richard Bowden¹⁶, Sam Hare, Samantha Yueying Lim⁶, Seunghoon Hong⁷, Shengcai Liao¹⁸, Simon Hadfield¹⁶, Stan Z. Li¹⁸, Stefan Duffner, Stuart Golodetz¹⁹, Thomas Mauthner⁹, Vibhav Vineet¹⁹, Weiyao Lin¹¹, Yang Li¹⁰, Yuankai Qi²⁰, Zhen Lei¹⁸, Zhiheng Niu⁶•Institutions (20)

University of Ljubljana¹, Austrian Institute of Technology², University of Birmingham³, Czech Technical University in Prague⁴, Parthenope University of Naples⁵, Panasonic⁶, Pohang University of Science and Technology⁷, Linköping University⁸, Graz University of Technology⁹, Zhejiang University¹⁰, Shanghai Jiao Tong University¹¹, Seoul National University¹², Electronics and Telecommunications Research Institute¹³, University of Coimbra¹⁴, Autonomous University of Barcelona¹⁵, University of Surrey¹⁶, École Polytechnique Fédérale de Lausanne¹⁷, Chinese Academy of Sciences¹⁸, University of Oxford¹⁹, Harbin Institute of Technology²⁰

06 Sep 2014

TL;DR: The VOT2014 challenge compares short-term single-object visual trackers that do not apply pre-learned models of object appearance, and presents the results of 38 trackers on a new dataset annotated with rotated bounding boxes and per-frame attributes.

Abstract: The Visual Object Tracking challenge 2014, VOT2014, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 38 trackers are presented. The number of tested trackers makes VOT 2014 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features of the VOT2014 challenge that go beyond its VOT2013 predecessor are introduced: (i) a new VOT2014 dataset with full annotation of targets by rotated bounding boxes and per-frame attribute, (ii) extensions of the VOT2013 evaluation methodology, (iii) a new unit for tracking speed assessment less dependent on the hardware and (iv) the VOT2014 evaluation toolkit that significantly speeds up execution of experiments. The dataset, the evaluation kit as well as the results are publicly available at the challenge website (http://votchallenge.net).

391 citations

Cited by

Journal Article•

Study report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

杉山拓海

12 Sep 2017-Computers & Graphics

3,940 citations

Proceedings Article•DOI•

CutMix: Regularization Strategy to Train Strong Classifiers With Localizable Features

Sangdoo Yun¹, Dongyoon Han¹, Sanghyuk Chun¹, Seong Joon Oh, Youngjoon Yoo¹, Junsuk Choe²•Institutions (2)

Naver Corporation¹, Yonsei University²

07 Aug 2019

TL;DR: CutMix augments the training data by cutting and pasting patches among training images, with the ground truth labels mixed proportionally to the area of the patches.

Abstract: Regional dropout strategies have been proposed to enhance the performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend to less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it suffers from information loss, causing inefficiency in training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gains in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix can improve the model robustness against input corruptions and its out-of-distribution detection performance.

3,013 citations
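
The CutMix abstract above describes pasting a patch from one training image into another and mixing the labels in proportion to the patch area. A minimal sketch of that operation follows, assuming single-channel images stored as nested lists and one-hot labels. The function name `cutmix` and the explicit `box` argument are illustrative assumptions; the abstract does not say how the patch location is chosen, so the caller supplies it here.

```python
def cutmix(img_a, img_b, label_a, label_b, box):
    """Paste a rectangular patch of img_b onto img_a and mix the labels.

    img_a, img_b : 2-D lists of equal shape (single-channel images).
    label_a/b    : one-hot (or soft) label vectors of equal length.
    box          : (top, left, height, width) of the pasted patch,
                   an explicit argument in this sketch.
    """
    rows, cols = len(img_a), len(img_a[0])
    top, left, box_h, box_w = box
    # Clip the patch to the image bounds.
    bottom = min(top + box_h, rows)
    right = min(left + box_w, cols)

    mixed = [row[:] for row in img_a]  # copy so img_a is not modified
    for r in range(top, bottom):
        mixed[r][left:right] = img_b[r][left:right]

    # Fraction of pixels that still come from img_a.
    pasted = max(bottom - top, 0) * max(right - left, 0)
    lam = 1.0 - pasted / (rows * cols)

    # Labels are mixed in proportion to the area each image contributes.
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, mixed_label


if __name__ == "__main__":
    a = [[0] * 4 for _ in range(4)]
    b = [[1] * 4 for _ in range(4)]
    image, label = cutmix(a, b, [1.0, 0.0], [0.0, 1.0], box=(1, 1, 2, 2))
    for row in image:
        print(row)
    print(label)
```

With a 2x2 patch on a 4x4 image, a quarter of the pixels come from the second image, so the mixed label assigns it a weight of 0.25.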

Journal Article•DOI•

Object Tracking Benchmark

Yi Wu¹, Jongwoo Lim², Ming-Hsuan Yang³•Institutions (3)

Nanjing University of Information Science and Technology¹, Hanyang University², University of California, Merced³

01 Sep 2015-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria is carried out to identify effective approaches for robust tracking and provide potential future research directions in this field.

Abstract: Object tracking has been one of the most important and active research areas in the field of computer vision. A large number of tracking algorithms have been proposed in recent years with demonstrated success. However, the set of sequences used for evaluation is often not sufficient or is sometimes biased for certain types of algorithms. Many datasets do not have common ground-truth object positions or extents, and this makes comparisons among the reported quantitative results difficult. In addition, the initial conditions or parameters of the evaluated tracking algorithms are not the same, and thus, the quantitative results reported in literature are incomparable or sometimes contradictory. To address these issues, we carry out an extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria to understand how these methods perform within the same framework. In this work, we first construct a large dataset with ground-truth object positions and extents for tracking and introduce the sequence attributes for the performance analysis. Second, we integrate most of the publicly available trackers into one code library with uniform input and output formats to facilitate large-scale performance evaluation. Third, we extensively evaluate the performance of 31 algorithms on 100 sequences with different initialization settings. By analyzing the quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.

2,974 citations

Book Chapter•DOI•

Fully-Convolutional Siamese Networks for Object Tracking

Luca Bertinetto¹, Jack Valmadre¹, João F. Henriques¹, Andrea Vedaldi¹, Philip H. S. Torr¹•Institutions (1)

University of Oxford¹

08 Oct 2016

TL;DR: A basic tracking algorithm is equipped with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video and achieves state-of-the-art performance in multiple benchmarks.

Abstract: The problem of arbitrary object tracking has traditionally been tackled by learning a model of the object’s appearance exclusively online, using as sole training data the video itself. Despite the success of these methods, their online-only approach inherently limits the richness of the model they can learn. Recently, several attempts have been made to exploit the expressive power of deep convolutional networks. However, when the object to track is not known beforehand, it is necessary to perform Stochastic Gradient Descent online to adapt the weights of the network, severely compromising the speed of the system. In this paper we equip a basic tracking algorithm with a novel fully-convolutional Siamese network trained end-to-end on the ILSVRC15 dataset for object detection in video. Our tracker operates at frame-rates beyond real-time and, despite its extreme simplicity, achieves state-of-the-art performance in multiple benchmarks.

2,936 citations

Proceedings Article•DOI•

High Performance Visual Tracking with Siamese Region Proposal Network

Bo Li¹, Junjie Yan², Wei Wu³, Zheng Zhu⁴, Xiaolin Hu²•Institutions (4)

Beihang University¹, Tsinghua University², SenseTime³, Chinese Academy of Sciences⁴

18 Jun 2018

TL;DR: The Siamese region proposal network (Siamese-RPN) is proposed for visual object tracking; it is trained end-to-end off-line with large-scale image pairs and consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork with a classification branch and a regression branch.

Abstract: Visual object tracking has been a fundamental topic in recent years and many deep learning based trackers have achieved state-of-the-art performance on multiple benchmarks. However, most of these trackers can hardly achieve top performance at real-time speed. In this paper, we propose the Siamese region proposal network (Siamese-RPN), which is trained end-to-end off-line with large-scale image pairs. Specifically, it consists of a Siamese subnetwork for feature extraction and a region proposal subnetwork including the classification branch and regression branch. In the inference phase, the proposed framework is formulated as a local one-shot detection task. We can pre-compute the template branch of the Siamese subnetwork and formulate the correlation layers as trivial convolution layers to perform online tracking. Benefiting from the proposal refinement, traditional multi-scale test and online fine-tuning can be discarded. The Siamese-RPN runs at 160 FPS while achieving leading performance in VOT2015, VOT2016 and VOT2017 real-time challenges.

2,016 citations
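
The Siamese-RPN abstract above notes that the template branch can be pre-computed and the correlation layers treated as ordinary convolutions during tracking. The sketch below illustrates only that general idea on plain 2-D lists: a fixed template is slid over a search region and the dot product at each offset forms a response map. It assumes raw single-channel arrays rather than learned feature maps, omits the classification and regression branches, and uses hypothetical function names.

```python
def cross_correlate(search, template):
    """Slide template over search and return the map of dot products.

    Both arguments are 2-D lists; the template must not be larger than the
    search region. The template plays the role of a fixed convolution kernel.
    """
    s_rows, s_cols = len(search), len(search[0])
    t_rows, t_cols = len(template), len(template[0])
    response = []
    for i in range(s_rows - t_rows + 1):
        row = []
        for j in range(s_cols - t_cols + 1):
            row.append(sum(
                search[i + u][j + v] * template[u][v]
                for u in range(t_rows)
                for v in range(t_cols)
            ))
        response.append(row)
    return response


def peak_location(response):
    """Return (row, col) of the largest value in the response map."""
    _, i, j = max(
        (value, i, j)
        for i, row in enumerate(response)
        for j, value in enumerate(row)
    )
    return i, j


if __name__ == "__main__":
    template = [[1, 2], [3, 4]]  # "pre-computed" once, then reused per frame
    search = [
        [0, 0, 0, 0],
        [0, 0, 1, 2],
        [0, 0, 3, 4],
        [0, 0, 0, 0],
    ]
    response = cross_correlate(search, template)
    print(response)
    print(peak_location(response))
```

Because the template is computed once and reused for every frame, each tracking step reduces to a single sliding-window pass over the search region.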
