Author

Ankan Bansal

Other affiliations: Amazon.com, Indian Institute of Technology Kanpur

Bio: Ankan Bansal is an academic researcher at the University of Maryland, College Park, who has contributed to research on facial recognition systems and face detection. The author has an h-index of 14 and has co-authored 29 publications that have received 957 citations. Previous affiliations of Ankan Bansal include Amazon.com and the Indian Institute of Technology Kanpur.

Papers

Journal Article

Deep Learning for Understanding Faces: Machines May Be Just as Good, or Better, than Humans

Rajeev Ranjan¹, Swami Sankaranarayanan¹, Ankan Bansal¹, Navaneeth Bodla¹, Jun-Cheng Chen¹, Vishal M. Patel², Carlos D. Castillo¹, Rama Chellappa¹

University of Maryland, College Park¹, Rutgers University²

10 Jan 2018-IEEE Signal Processing Magazine

TL;DR: Provides an overview of deep-learning methods used for face recognition, discusses the modules involved in designing an automatic face recognition system, and examines the role of deep learning in each of them.

Abstract: Recent developments in deep convolutional neural networks (DCNNs) have shown impressive performance improvements on various object detection/recognition problems. This has been made possible due to the availability of large annotated data and a better understanding of the nonlinear mapping between images and class labels, as well as the affordability of powerful graphics processing units (GPUs). These developments in deep learning have also improved the capabilities of machines in understanding faces and automatically executing the tasks of face detection, pose estimation, landmark localization, and face recognition from unconstrained images and videos. In this article, we provide an overview of deep-learning methods used for face recognition. We discuss different modules involved in designing an automatic face recognition system and the role of deep learning for each of them. Some open issues regarding DCNNs for face recognition problems are then discussed. This article should prove valuable to scientists, engineers, and end users working in the fields of face recognition, security, visual surveillance, and biometrics.

183 citations

Book Chapter

Zero-shot object detection

Ankan Bansal¹, Karan Sikka², Gaurav Sharma, Rama Chellappa¹, Ajay Divakaran²

University of Maryland, College Park¹, SRI International²

08 Sep 2018

TL;DR: Introduces the problem of zero-shot object detection (ZSD), which aims to detect object classes not observed during training, discusses the problems associated with selecting a background class, and motivates two background-aware approaches for learning robust detectors.

Abstract: We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories as in prior works on zero-shot classification. We present a principled approach by first adapting visual-semantic embeddings for ZSD. We then discuss the problems associated with selecting a background class and motivate two background-aware approaches for learning robust detectors. One of these models uses a fixed background class and the other is based on iterative latent assignments. We also outline the challenge associated with using a limited number of training classes and propose a solution based on dense sampling of the semantic label space using auxiliary data with a large number of categories. We propose novel splits of two standard detection datasets – MSCOCO and VisualGenome, and present extensive empirical results in both the traditional and generalized zero-shot settings to highlight the benefits of the proposed methods. We provide useful insights into the algorithm and conclude by posing some open questions to encourage further research.

178 citations

Proceedings Article

UMDFaces: An annotated face dataset for training deep networks

Ankan Bansal¹, Anirudh Nanduri¹, Carlos D. Castillo¹, Rajeev Ranjan¹, Rama Chellappa¹

University of Maryland, College Park¹

01 Oct 2017-International Joint Conference on Biometrics (IJCB)

TL;DR: The UMDFaces dataset contains 367,888 annotated faces of 8,277 subjects; the quality of the keypoint annotations has been verified by humans for about 115,000 images.

Abstract: Recent progress in face detection (including keypoint detection), and recognition is mainly being driven by (i) deeper convolutional neural network architectures, and (ii) larger datasets. However, most of the large datasets are maintained by private companies and are not publicly available. The academic computer vision community needs larger and more varied datasets to make further progress. In this paper, we introduce a new face dataset, called UMDFaces, which has 367,888 annotated faces of 8,277 subjects. We also introduce a new face recognition evaluation protocol which will help advance the state-of-the-art in this area. We discuss how a large dataset can be collected and annotated using human annotators and deep networks. We provide human curated bounding boxes for faces. We also provide estimated pose (roll, pitch and yaw), locations of twenty-one key-points and gender information generated by a pre-trained neural network. In addition, the quality of keypoint annotations has been verified by humans for about 115,000 images. Finally, we compare the quality of the dataset with other publicly available face datasets at similar scales.

148 citations

Journal Article

A Fast and Accurate System for Face Detection, Identification, and Verification

Rajeev Ranjan¹, Ankan Bansal¹, Jingxiao Zheng¹, Hongyu Xu¹, Joshua Gleason¹, Boyu Lu¹, Anirudh Nanduri¹, Jun-Cheng Chen¹, Carlos D. Castillo¹, Rama Chellappa¹

University of Maryland, College Park¹

02 Apr 2019

TL;DR: Proposes a novel face detector, the deep pyramid single shot face detector (DPSSD), which is fast and detects faces with large scale variations (especially tiny faces), and a new loss function, called crystal loss, for the tasks of face verification and identification.

Abstract: The availability of large annotated datasets and affordable computation power have led to impressive improvements in the performance of convolutional neural networks (CNNs) on various face analysis tasks. In this paper, we describe a deep learning pipeline for unconstrained face identification and verification which achieves state-of-the-art performance on several benchmark datasets. We provide the design details of the various modules involved in automatic face recognition: face detection, landmark localization and alignment, and face identification/verification. We propose a novel face detector, deep pyramid single shot face detector (DPSSD), which is fast and detects faces with large scale variations (especially tiny faces). Additionally, we propose a new loss function, called crystal loss, for the tasks of face verification and identification. Crystal loss restricts the feature descriptors to lie on a hypersphere of a fixed radius, thus minimizing the angular distance between positive subject pairs and maximizing the angular distance between negative subject pairs. We provide evaluation results of the proposed face detector on challenging unconstrained face detection datasets. Then, we present experimental results for end-to-end face verification and identification on IARPA Janus Benchmarks A, B, and C (IJB-A, IJB-B, IJB-C), and the Janus Challenge Set 5 (CS5).

114 citations
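
The hypersphere constraint mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the constraint is paired with an ordinary softmax cross-entropy classifier, which the abstract does not state, and the function names, the radius parameter alpha and its default value are all hypothetical choices made for illustration.

```python
import math


def l2_normalize(features, alpha):
    """Rescale a feature vector so that its L2 norm equals alpha."""
    norm = math.sqrt(sum(x * x for x in features))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [alpha * x / norm for x in features]


def softmax_cross_entropy(logits, label):
    """Numerically stable cross-entropy of softmax(logits) against one label."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def crystal_loss(features, weights, biases, label, alpha=50.0):
    """Hypothetical sketch: project the descriptor onto a hypersphere of
    radius alpha, then apply a linear classifier and softmax cross-entropy
    (the classifier pairing is an assumption, not taken from the abstract)."""
    constrained = l2_normalize(features, alpha)
    logits = [
        sum(w * f for w, f in zip(row, constrained)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax_cross_entropy(logits, label)
```

Because the descriptor is rescaled before classification, two inputs that point in the same direction give the same loss whatever their magnitude, so only the angle between descriptors and class weights affects training.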

Posted Content

Zero-Shot Object Detection

Ankan Bansal¹, Karan Sikka², Gaurav Sharma, Rama Chellappa¹, Ajay Divakaran²

University of Maryland, College Park¹, SRI International²

12 Apr 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: Adapts visual-semantic embeddings for zero-shot object detection, which aims to detect object classes not observed during training, and proposes two background-aware approaches for learning robust detectors, one using a fixed background class and the other based on iterative latent assignments.

Abstract: We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories as in prior works on zero-shot classification. We present a principled approach by first adapting visual-semantic embeddings for ZSD. We then discuss the problems associated with selecting a background class and motivate two background-aware approaches for learning robust detectors. One of these models uses a fixed background class and the other is based on iterative latent assignments. We also outline the challenge associated with using a limited number of training classes and propose a solution based on dense sampling of the semantic label space using auxiliary data with a large number of categories. We propose novel splits of two standard detection datasets - MSCOCO and VisualGenome, and present extensive empirical results in both the traditional and generalized zero-shot settings to highlight the benefits of the proposed methods. We provide useful insights into the algorithm and conclude by posing some open questions to encourage further research.

110 citations

Cited by

Proceedings Article

VGGFace2: A Dataset for Recognising Faces across Pose and Age

Qiong Cao¹, Li Shen¹, Weidi Xie¹, Omkar M. Parkhi¹, Andrew Zisserman¹

University of Oxford¹

15 May 2018

TL;DR: VGGFace2 is a large-scale face dataset with 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject.

Abstract: In this paper, we introduce a new large-scale face dataset named VGGFace2. The dataset contains 3.31 million images of 9131 subjects, with an average of 362.6 images for each subject. Images are downloaded from Google Image Search and have large variations in pose, age, illumination, ethnicity and profession (e.g. actors, athletes, politicians). The dataset was collected with three goals in mind: (i) to have both a large number of identities and also a large number of images for each identity; (ii) to cover a large range of pose, age and ethnicity; and (iii) to minimise the label noise. We describe how the dataset was collected, in particular the automated and manual filtering stages to ensure a high accuracy for the images of each identity. To assess face recognition performance using the new dataset, we train ResNet-50 (with and without Squeeze-and-Excitation blocks) Convolutional Neural Networks on VGGFace2, on MS-Celeb-1M, and on their union, and show that training on VGGFace2 leads to improved recognition performance over pose and age. Finally, using the models trained on these datasets, we demonstrate state-of-the-art performance on the IJB-A and IJB-B face recognition benchmarks, exceeding the previous state-of-the-art by a large margin. The dataset and models are publicly available.

2,365 citations

Journal Article

Deep Learning for Generic Object Detection: A Survey

Li Liu¹,², Wanli Ouyang³, Xiaogang Wang⁴, Paul Fieguth⁵, Jie Chen¹, Xinwang Liu², Matti Pietikäinen¹

University of Oulu¹, National University of Defense Technology², University of Sydney³, The Chinese University of Hong Kong⁴, University of Waterloo⁵

01 Feb 2020-International Journal of Computer Vision

TL;DR: A comprehensive survey of the recent achievements in this field brought about by deep learning techniques, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics.

Abstract: Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful strategy for learning feature representations directly from data and have led to remarkable breakthroughs in the field of generic object detection. Given this period of rapid evolution, the goal of this paper is to provide a comprehensive survey of the recent achievements in this field brought about by deep learning techniques. More than 300 research contributions are included in this survey, covering many aspects of generic object detection: detection frameworks, object feature representation, object proposal generation, context modeling, training strategies, and evaluation metrics. We finish the survey by identifying promising directions for future research.

1,897 citations

Proceedings Article

Single-Image Crowd Counting via Multi-Column Convolutional Neural Network

Yingying Zhang¹, Desen Zhou¹, Siqin Chen¹, Shenghua Gao¹, Yi Ma¹

ShanghaiTech University¹

27 Jun 2016

TL;DR: With the proposed simple MCNN model, the method outperforms all existing methods, and experiments show that the model, once trained on one dataset, can be readily transferred to a new dataset.

Abstract: This paper aims to develop a method that can accurately estimate the crowd count from an individual image with arbitrary crowd density and arbitrary perspective. To this end, we have proposed a simple but effective Multi-column Convolutional Neural Network (MCNN) architecture to map the image to its crowd density map. The proposed MCNN allows the input image to be of arbitrary size or resolution. By utilizing filters with receptive fields of different sizes, the features learned by each column CNN are adaptive to variations in people/head size due to perspective effect or image resolution. Furthermore, the true density map is computed accurately based on geometry-adaptive kernels which do not require knowing the perspective map of the input image. Since existing crowd counting datasets do not adequately cover all the challenging situations considered in our work, we have collected and labelled a large new dataset that includes 1198 images with about 330,000 heads annotated. On this challenging new dataset, as well as all existing datasets, we conduct extensive experiments to verify the effectiveness of the proposed model and method. In particular, with the proposed simple MCNN model, our method outperforms all existing methods. In addition, experiments show that our model, once trained on one dataset, can be readily transferred to a new dataset.

1,603 citations

Understand It in 5 Minutes!? Skimming Famous Papers: Jacob Devlin et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

柴田知秀

15 Feb 2020

1,595 citations

Journal Article

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

Naveed Akhtar¹, Ajmal Mian¹

University of Western Australia¹

19 Feb 2018-IEEE Access

TL;DR: A comprehensive survey of adversarial attacks on deep learning in computer vision, reviewing the works that design adversarial attacks, analyze the existence of such attacks, and propose defenses against them.

Abstract: Deep learning is at the heart of the current rise of artificial intelligence. In the field of computer vision, it has become the workhorse for applications ranging from self-driving cars to surveillance and security. Whereas deep neural networks have demonstrated phenomenal success (often beyond human capabilities) in solving complex problems, recent studies show that they are vulnerable to adversarial attacks in the form of subtle perturbations to inputs that lead a model to predict incorrect outputs. For images, such perturbations are often too small to be perceptible, yet they completely fool the deep learning models. Adversarial attacks pose a serious threat to the success of deep learning in practice. This fact has recently led to a large influx of contributions in this direction. This paper presents the first comprehensive survey on adversarial attacks on deep learning in computer vision. We review the works that design adversarial attacks, analyze the existence of such attacks and propose defenses against them. To emphasize that adversarial attacks are possible in practical conditions, we separately review the contributions that evaluate adversarial attacks in real-world scenarios. Finally, drawing on the reviewed literature, we provide a broader outlook of this research direction.

1,542 citations
