Home
/
Authors
/
Venice Erin Liong

Author

Venice Erin Liong

Other affiliations: Agency for Science, Technology and Research, University of the Philippines Diliman, University of Illinois at Urbana–Champaign

Bio: Venice Erin Liong is an academic researcher from Nanyang Technological University. The author has contributed to research in topics: Discriminative model & Binary code. The author has an hindex of 18, co-authored 30 publications receiving 3138 citations. Previous affiliations of Venice Erin Liong include Agency for Science, Technology and Research & University of the Philippines Diliman.

Papers

PDF

Open Access

More filters

Posted Content•

nuScenes: A multimodal dataset for autonomous driving

[...]

Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom - Show less +6 more

26 Mar 2019-arXiv: Learning

TL;DR: nuScenes as mentioned in this paper is the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view.

...read moreread less

Abstract: Robust detection and tracking of objects is crucial for the deployment of autonomous vehicle technology. Image based benchmark datasets have driven development in computer vision tasks such as object detection, tracking and segmentation of agents in the environment. Most autonomous vehicles, however, carry a combination of cameras and range sensors such as lidar and radar. As machine learning based methods for detection and tracking become more prevalent, there is a need to train and evaluate such methods on datasets containing range sensor data along with images. In this work we present nuTonomy scenes (nuScenes), the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view. nuScenes comprises 1000 scenes, each 20s long and fully annotated with 3D bounding boxes for 23 classes and 8 attributes. It has 7x as many annotations and 100x as many images as the pioneering KITTI dataset. We define novel 3D detection and tracking metrics. We also provide careful dataset analysis as well as baselines for lidar and image based detection and tracking. Data, development kit and more information are available online.

...read moreread less

1,939 citations

Proceedings Article•DOI•

nuScenes: A Multimodal Dataset for Autonomous Driving

[...]

Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, Oscar Beijbom - Show less +6 more

14 Jun 2020

TL;DR: nuScenes as discussed by the authors is the first dataset to carry the full autonomous vehicle sensor suite: 6 cameras, 5 radars and 1 lidar, all with full 360 degree field of view.

...read moreread less

1,378 citations

Proceedings Article•DOI•

Deep hashing for compact binary codes learning

[...]

Venice Erin Liong¹, Jiwen Lu¹, Gang Wang¹, Pierre Moulin¹, Jie Zhou² - Show less +1 more•Institutions (2)

Agency for Science, Technology and Research¹, Tsinghua University²

07 Jun 2015

TL;DR: A deep neural network is developed to seek multiple hierarchical non-linear transformations to learn compact binary codes for large scale visual search and shows the superiority of the proposed approach over the state-of-the-arts.

...read moreread less

Abstract: In this paper, we propose a new deep hashing (DH) approach to learn compact binary codes for large scale visual search. Unlike most existing binary codes learning methods which seek a single linear projection to map each sample into a binary vector, we develop a deep neural network to seek multiple hierarchical non-linear transformations to learn these binary codes, so that the nonlinear relationship of samples can be well exploited. Our model is learned under three constraints at the top layer of the deep network: 1) the loss between the original real-valued feature descriptor and the learned binary vector is minimized, 2) the binary codes distribute evenly on each bit, and 3) different bits are as independent as possible. To further improve the discriminative power of the learned binary codes, we extend DH into supervised DH (SDH) by including one discriminative term into the objective function of DH which simultaneously maximizes the inter-class variations and minimizes the intra-class variations of the learned binary codes. Experimental results show the superiority of the proposed approach over the state-of-the-arts.

...read moreread less

569 citations

Journal Article•DOI•

Learning Compact Binary Face Descriptor for Face Recognition

[...]

Jiwen Lu¹, Venice Erin Liong², Xiuzhuang Zhou³, Jie Zhou¹•Institutions (3)

Tsinghua University¹, University of Illinois at Urbana–Champaign², Capital Normal University³

01 Oct 2015-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: A compact binary face descriptor (CBFD) feature learning method for face representation and recognition that reduces the modality gap of heterogeneous faces at the feature level to make the method applicable to heterogeneous face recognition.

...read moreread less

Abstract: Binary feature descriptors such as local binary patterns (LBP) and its variations have been widely used in many face recognition systems due to their excellent robustness and strong discriminative power. However, most existing binary face descriptors are hand-crafted, which require strong prior knowledge to engineer them by hand. In this paper, we propose a compact binary face descriptor (CBFD) feature learning method for face representation and recognition. Given each face image, we first extract pixel difference vectors (PDVs) in local patches by computing the difference between each pixel and its neighboring pixels. Then, we learn a feature mapping to project these pixel difference vectors into low-dimensional binary vectors in an unsupervised manner, where 1) the variance of all binary codes in the training set is maximized, 2) the loss between the original real-valued codes and the learned binary codes is minimized, and 3) binary codes evenly distribute at each learned bin, so that the redundancy information in PDVs is removed and compact binary codes are obtained. Lastly, we cluster and pool these binary codes into a histogram feature as the final representation for each face image. Moreover, we propose a coupled CBFD (C-CBFD) method by reducing the modality gap of heterogeneous faces at the feature level to make our method applicable to heterogeneous face recognition. Extensive experimental results on five widely used face datasets show that our methods outperform state-of-the-art face descriptors.

...read moreread less

371 citations

Journal Article•DOI•

Simultaneous Local Binary Feature Learning and Encoding for Homogeneous and Heterogeneous Face Recognition

[...]

Jiwen Lu¹, Venice Erin Liong², Jie Zhou¹•Institutions (2)

Tsinghua University¹, Nanyang Technological University²

01 Aug 2018-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Experimental results on six widely used face datasets including the LFW, YouTube Face, FERET, PaSC, CASIA VIS-NIR 2.0, and Multi-PIE datasets are presented to demonstrate the effectiveness of the proposed methods.

...read moreread less

Abstract: In this paper, we propose a simultaneous local binary feature learning and encoding (SLBFLE) approach for both homogeneous and heterogeneous face recognition. Unlike existing hand-crafted face descriptors such as local binary pattern (LBP) and Gabor features which usually require strong prior knowledge, our SLBFLE is an unsupervised feature learning approach which automatically learns face representation from raw pixels. Unlike existing binary face descriptors such as the LBP, discriminant face descriptor (DFD), and compact binary face descriptor (CBFD) which use a two-stage feature extraction procedure, our SLBFLE jointly learns binary codes and the codebook for local face patches so that discriminative information from raw pixels from face images of different identities can be obtained by using a one-stage feature learning and encoding procedure. Moreover, we propose a coupled simultaneous local binary feature learning and encoding (C-SLBFLE) method to make the proposed approach suitable for heterogenous face matching. Unlike most existing coupled feature learning methods which learn a pair of transformation matrices for each modality, we exploit both the common and specific information from heterogeneous face samples to characterize their underlying correlations. Experimental results on six widely used face datasets including the LFW, YouTube Face (YTF), FERET, PaSC, CASIA VIS-NIR 2.0, and Multi-PIE datasets are presented to demonstrate the effectiveness of the proposed methods.

...read moreread less

146 citations

1
2
3
4
…
5
6
7

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Deep Learning for 3D Point Clouds: A Survey

[...]

Yulan Guo¹, Hanyun Wang², Qingyong Hu³, Hao Liu¹, Li Liu⁴, Mohammed Bennamoun⁵ - Show less +2 more•Institutions (5)

Sun Yat-sen University¹, PLA Information Engineering University², University of Oxford³, National University of Defense Technology⁴, University of Western Australia⁵

01 Dec 2021-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: This paper presents a comprehensive review of recent progress in deep learning methods for point clouds, covering three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation.

...read moreread less

Abstract: Point cloud learning has lately attracted increasing attention due to its wide applications in many areas, such as computer vision, autonomous driving, and robotics As a dominating technique in AI, deep learning has been successfully used to solve various 2D vision problems However, deep learning on point clouds is still in its infancy due to the unique challenges faced by the processing of point clouds with deep neural networks Recently, deep learning on point clouds has become even thriving, with numerous methods being proposed to address different problems in this area To stimulate future research, this paper presents a comprehensive review of recent progress in deep learning methods for point clouds It covers three major tasks, including 3D shape classification, 3D object detection and tracking, and 3D point cloud segmentation It also presents comparative results on several publicly available datasets, together with insightful observations and inspiring future research directions

...read moreread less

1,021 citations

Posted Content•

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

[...]

Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, Vijay K. Vasudevan, Wei Han, Jiquan Ngiam, Hang Zhao, Aleksei Timofeev, Scott Ettinger, Maxim Krivokon, Amy Gao, Aditya Joshi, Sheng Zhao, Shuyang Cheng, Yu Zhang, Jonathon Shlens, Zhifeng Chen, Dragomir Anguelov - Show less +21 more

10 Dec 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work introduces a new large scale, high quality, diverse dataset, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies, and studies the effects of dataset size and generalization across geographies on 3D detection methods.

...read moreread less

Abstract: The research community has increasing interest in autonomous driving research, despite the resource intensity of obtaining representative real world data. Existing self-driving datasets are limited in the scale and variation of the environments they capture, even though generalization within and between operating regions is crucial to the overall viability of the technology. In an effort to help align the research community's contributions with real-world self-driving problems, we introduce a new large scale, high quality, diverse dataset. Our new dataset consists of 1150 scenes that each span 20 seconds, consisting of well synchronized and calibrated high quality LiDAR and camera data captured across a range of urban and suburban geographies. It is 15x more diverse than the largest camera+LiDAR dataset available based on our proposed diversity metric. We exhaustively annotated this data with 2D (camera image) and 3D (LiDAR) bounding boxes, with consistent identifiers across frames. Finally, we provide strong baselines for 2D as well as 3D detection and tracking tasks. We further study the effects of dataset size and generalization across geographies on 3D detection methods. Find data, code and more up-to-date information at this http URL.

...read moreread less

988 citations

Posted Content•

Person Re-identification: Past, Present and Future

[...]

Liang Zheng, Yi Yang, Alexander G. Hauptmann

10 Oct 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: The history of person re-identification and its relationship with image classification and instance retrieval is introduced and two new re-ID tasks which are much closer to real-world applications are described and discussed.

...read moreread less

Abstract: Person re-identification (re-ID) has become increasingly popular in the community due to its application and research significance. It aims at spotting a person of interest in other cameras. In the early days, hand-crafted algorithms and small-scale evaluation were predominantly reported. Recent years have witnessed the emergence of large-scale datasets and deep learning systems which make use of large data volumes. Considering different tasks, we classify most current re-ID methods into two classes, i.e., image-based and video-based; in both tasks, hand-crafted and deep learning systems will be reviewed. Moreover, two new re-ID tasks which are much closer to real-world applications are described and discussed, i.e., end-to-end re-ID and fast re-ID in very large galleries. This paper: 1) introduces the history of person re-ID and its relationship with image classification and instance retrieval; 2) surveys a broad selection of the hand-crafted systems and the large-scale methods in both image- and video-based re-ID; 3) describes critical future directions in end-to-end re-ID and fast retrieval in large galleries; and 4) finally briefs some important yet under-developed issues.

...read moreread less

984 citations

Proceedings Article•DOI•

Deep Hashing Network for Unsupervised Domain Adaptation

[...]

Hemanth Venkateswara¹, Jose Eusebio¹, Shayok Chakraborty¹, Sethuraman Panchanathan•Institutions (1)

Arizona State University¹

01 Jul 2017

TL;DR: In this article, the authors proposed a novel deep learning framework that can exploit labeled source data and unlabeled target data to learn informative hash codes, to accurately classify unseen target data.

...read moreread less

Abstract: In recent years, deep neural networks have emerged as a dominant machine learning tool for a wide variety of application domains. However, training a deep neural network requires a large amount of labeled data, which is an expensive process in terms of time, labor and human expertise. Domain adaptation or transfer learning algorithms address this challenge by leveraging labeled data in a different, but related source domain, to develop a model for the target domain. Further, the explosive growth of digital data has posed a fundamental challenge concerning its storage and retrieval. Due to its storage and retrieval efficiency, recent years have witnessed a wide application of hashing in a variety of computer vision applications. In this paper, we first introduce a new dataset, Office-Home, to evaluate domain adaptation algorithms. The dataset contains images of a variety of everyday objects from multiple domains. We then propose a novel deep learning framework that can exploit labeled source data and unlabeled target data to learn informative hash codes, to accurately classify unseen target data. To the best of our knowledge, this is the first research effort to exploit the feature learning capabilities of deep neural networks to learn representative hash codes to address the domain adaptation problem. Our extensive empirical studies on multiple transfer tasks corroborate the usefulness of the framework in learning efficient hash codes which outperform existing competitive baselines for unsupervised domain adaptation.

...read moreread less

984 citations

Proceedings Article•DOI•

Argoverse: 3D Tracking and Forecasting With Rich Maps

[...]

Ming-Fang Chang¹, Deva Ramanan, James Hays, John Lambert¹, Patsorn Sangkloy², Jasvinder A. Singh², Slawomir Bak², Andrew Hartnett², De Wang¹, Peter W. Carr, Simon Lucey - Show less +7 more•Institutions (2)

Georgia Institute of Technology¹, Carnegie Mellon University²

15 Jun 2019

TL;DR: Argoverse includes sensor data collected by a fleet of autonomous vehicles in Pittsburgh and Miami as well as 3D tracking annotations, 300k extracted interesting vehicle trajectories, and rich semantic maps, which contain rich geometric and semantic metadata which are not currently available in any public dataset.

...read moreread less

Abstract: We present Argoverse, a dataset designed to support autonomous vehicle perception tasks including 3D tracking and motion forecasting. Argoverse includes sensor data collected by a fleet of autonomous vehicles in Pittsburgh and Miami as well as 3D tracking annotations, 300k extracted interesting vehicle trajectories, and rich semantic maps. The sensor data consists of 360 degree images from 7 cameras with overlapping fields of view, forward-facing stereo imagery, 3D point clouds from long range LiDAR, and 6-DOF pose. Our 290km of mapped lanes contain rich geometric and semantic metadata which are not currently available in any public dataset. All data is released under a Creative Commons license at Argoverse.org. In baseline experiments, we use map information such as lane direction, driveable area, and ground height to improve the accuracy of 3D object tracking. We use 3D object tracking to mine for more than 300k interesting vehicle trajectories to create a trajectory forecasting benchmark. Motion forecasting experiments ranging in complexity from classical methods (k-NN) to LSTMs demonstrate that using detailed vector maps with lane-level information substantially reduces prediction error. Our tracking and forecasting experiments represent only a superficial exploration of the potential of rich maps in robotic perception. We hope that Argoverse will enable the research community to explore these problems in greater depth.

...read moreread less

950 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse