Proceedings ArticleDOI

Deep joint discriminative learning for vehicle re-identification and retrieval

16 Sep 2017-pp 395-399
TL;DR: A novel vehicle re-identification method based on a Deep Joint Discriminative Learning (DJDL) model, which utilizes a deep convolutional network to effectively extract discriminative representations for vehicle images, is proposed.
Abstract: In this paper, we propose a novel vehicle re-identification method based on a Deep Joint Discriminative Learning (DJDL) model, which utilizes a deep convolutional network to effectively extract discriminative representations for vehicle images. To exploit the properties of and relationships among samples in different views, we design a unified framework that efficiently combines several different tasks, including identification, attribute recognition, verification, and triplet tasks. The whole network is optimized jointly via a specific batch composition design. Extensive experiments are conducted on the large-scale VehicleID [1] dataset. Experimental results demonstrate the effectiveness of our method and show that it achieves state-of-the-art performance on both vehicle re-identification and retrieval.
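The joint optimization described in the abstract can be pictured as a weighted sum of per-task losses computed over one composed batch. The sketch below is illustrative only, not the authors' implementation; all function names, the contrastive form of the verification loss, and the weights are assumptions (the attribute-recognition task would be another softmax head of the same form as the identification one):

```python
import math

def identification_loss(logits, label):
    # Softmax cross-entropy over vehicle-ID logits
    # (attribute recognition would use the same form over attribute logits).
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]

def verification_loss(dist_sq, same, margin=1.0):
    # Contrastive-style pair loss: pull same-ID pairs together,
    # push different-ID pairs apart beyond a margin.
    if same:
        return dist_sq
    return max(0.0, margin - math.sqrt(dist_sq)) ** 2

def joint_loss(task_losses, weights):
    # Joint objective: weighted sum of the per-task losses,
    # optimized together in one backward pass.
    return sum(w * l for w, l in zip(weights, task_losses))
```

A batch composed so that it contains ID-labeled images, same/different pairs, and triplets lets all task losses be evaluated on shared convolutional features in a single step.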
Citations
Journal ArticleDOI
TL;DR: Quadruple directional deep learning features (QD-DLF) are proposed for improving vehicle re-identification: four networks share the same basic deep learning architecture, a shortly and densely connected convolutional neural network, but differ in their directional feature-pooling layers.
Abstract: In order to resist the adverse effect of viewpoint variations, we design quadruple directional deep learning networks to extract quadruple directional deep learning features (QD-DLF) of vehicle images for improving vehicle re-identification performance. The quadruple directional deep learning networks are of similar overall architecture, including the same basic deep learning architecture but different directional feature pooling layers. Specifically, the same basic deep learning architecture that is a shortly and densely connected convolutional neural network is utilized to extract the basic feature maps of an input square vehicle image in the first stage. Then, the quadruple directional deep learning networks utilize different directional pooling layers, i.e., horizontal average pooling layer, vertical average pooling layer, diagonal average pooling layer, and anti-diagonal average pooling layer, to compress the basic feature maps into horizontal, vertical, diagonal, and anti-diagonal directional feature maps, respectively. Finally, these directional feature maps are spatially normalized and concatenated together as a quadruple directional deep learning feature for vehicle re-identification. The extensive experiments on both VeRi and VehicleID databases show that the proposed QD-DLF approach outperforms multiple state-of-the-art vehicle re-identification methods.
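The four directional pooling operations described in the abstract reduce to row, column, diagonal, and anti-diagonal averages over a square feature map. The following is a minimal sketch of those operations on plain nested lists, under my own naming, not the paper's code:

```python
def horizontal_avg_pool(fmap):
    # Average each row: H x W map -> H values.
    return [sum(row) / len(row) for row in fmap]

def vertical_avg_pool(fmap):
    # Average each column: H x W map -> W values.
    h, w = len(fmap), len(fmap[0])
    return [sum(fmap[i][j] for i in range(h)) / h for j in range(w)]

def diagonal_avg_pool(fmap):
    # Average along diagonals i - j = d of a square N x N map -> 2N - 1 values.
    n = len(fmap)
    out = []
    for d in range(-(n - 1), n):
        vals = [fmap[i][i - d] for i in range(n) if 0 <= i - d < n]
        out.append(sum(vals) / len(vals))
    return out

def antidiagonal_avg_pool(fmap):
    # Average along anti-diagonals i + j = s -> 2N - 1 values.
    n = len(fmap)
    out = []
    for s in range(2 * n - 1):
        vals = [fmap[i][s - i] for i in range(n) if 0 <= s - i < n]
        out.append(sum(vals) / len(vals))
    return out
```

In the paper's pipeline the four pooled vectors are then spatially normalized and concatenated into the final QD-DLF descriptor.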

123 citations

Journal ArticleDOI
TL;DR: A detailed analysis of different V-reID methods in terms of mean average precision (mAP) and cumulative matching curve (CMC) provides objective insight into the strengths and weaknesses of these methods.

116 citations


Cites background from "Deep joint discriminative learning ..."

  • ...…the triplet-wise training (TWT) Zhang et al. (2017), the feature fusing model (FFM) Tang et al. (2017), the deep joint discriminative learning (DJDL) Li et al. (2017b), the Null space based Fusion of Color and Attribute feature (NuFACT) Liu et al. (2018), the multi-view feature (MVF) Zhou et al.…...

    [...]

  • ...Li et al. (2017a) collected vehicle re-identification dataset-1 (VRID-1)....

    [...]

  • ...The 12 deep feature methods are: the progressive vehicle re-identification (PROVID) Liu et al. (2016c), the deep relative distance learning (DRDL) Liu et al. (2016a), the deep color and texture (DCT) Liu et al. (2016b), the orientation invariant model (OIM) Wang et al. (2017), the visual spatio-temporal model (VSTM) Shen et al. (2017), the cross-level vehicle recognition (CLVR) Kanacı et al. (2017), the triplet-wise training (TWT) Zhang et al. (2017), the feature fusing model (FFM) Tang et al. (2017), the deep joint discriminative learning (DJDL) Li et al. (2017b), the Null space based Fusion of Color and Attribute feature (NuFACT) Liu et al. (2018), the multi-view feature (MVF) Zhou et al. (2018), and the group sensitive triplet embedding (GSTE) Bai et al. (2018)....

    [...]

  • ...The DJDL model was optimized jointly via a specific batch composition design....

    [...]

  • ...Li et al. (2017b) proposed a deep joint discriminative learning (DJDL) model, which extracts discriminative representations for vehicle images....

    [...]

Journal ArticleDOI
TL;DR: This paper presents a novel framework that combines the advantages of both machine learning techniques and blockchain technology to improve the malware detection for Android IoT devices and uses the permissioned blockchain to store authentic information of extracted features in a distributed malware database blocks to increase the run-time detection of malware with more speed and accuracy.
Abstract: The Internet of Things (IoT) is revolutionizing the world with evolving applications in many aspects of life, such as sensing, healthcare, and remote monitoring. Android devices and applications work hand in hand to realize the vision of the IoT. Recently, there has been a rapid increase in threats and malware attacks on Android-based devices. Moreover, the extensive use of the Android platform in IoT devices makes securing them against such malware a challenging task. This paper presents a novel framework that combines machine learning techniques and blockchain technology to improve malware detection for Android IoT devices. The proposed technique is implemented as a sequential approach comprising clustering, classification, and blockchain. Machine learning automatically extracts malware information using clustering and classification and stores it in the blockchain; because all malware information stored in the blockchain history can be communicated through the network, the latest malware can be detected effectively. The clustering stage calculates weights for each feature set, performs a parametric study for optimization, and iteratively removes unnecessary features with small weights. The classification stage extracts the various features of Android malware using a naive Bayes classifier; the classifier builds on decision trees to extract the most important features, providing classification and regression with high accuracy and robustness.
Finally, the proposed framework uses a permissioned blockchain to store authentic information about the extracted features in distributed malware database blocks, increasing the speed and accuracy of run-time malware detection, and announces malware information to all users.

99 citations


Cites methods from "Deep joint discriminative learning ..."

  • ...[43], developed a SIGPID framework to detect the risky permissions....

    [...]

  • ...As shown in Table 12, the detection accuracy or the F-measure values of our framework were higher than the other methods including the deep learning based methods [43], [56], [70], [73]....

    [...]

Journal ArticleDOI
TL;DR: A novel regions-of-interest (ROIs)-based vehicle re-identification and retrieval method is proposed, in which the ROIs' deep features are used as discriminative identifiers that encode the structural information of a vehicle.
Abstract: Vehicle re-identification plays an important role in video surveillance applications. Despite the efforts made on this problem in the past few years, it remains a challenging task due to various factors such as pose variation, illumination changes, and subtle inter-class difference. We believe that the key information for identification has not been well explored in the literature. In this paper, we first collect a vehicle dataset ‘VAC21’ which contains 7129 images of five types of vehicles. Then, we carefully label the 21 classes of structural attributes hierarchically with bounding boxes. To our knowledge, this is the first dataset with several detailed attributes labeled. Based on this dataset, we use the state-of-the-art one-stage detection method, Single-shot Detection, as a baseline model for detecting attributes. Subsequently, we make a few important modifications tailored for this application to improve accuracy: 1) adding more proposals from low-level layers to improve the accuracy of detecting small objects and 2) employing the focal loss to improve the mean average precision. Furthermore, the results of the attribute detection can be applied to a series of vision tasks that focus on analyzing the images of vehicles. Finally, we propose a novel region of interests (ROIs)-based vehicle re-identification and retrieval method in which the ROIs’ deep features are used as discriminative identifiers, encoding the structure information of a vehicle. These deep features are input to a boosting model to improve the accuracy. A set of experiments are conducted on the dataset VehicleID and the experimental results show that our method outperforms the state-of-the-art methods.
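One of the detector modifications named in the abstract, the focal loss, has a standard binary form FL(p_t) = -(1 - p_t)^gamma * log(p_t), which down-weights easy examples so training focuses on hard ones. A minimal sketch under that standard formulation follows; the function name and default gamma are assumptions, not the paper's code:

```python
import math

def focal_loss(p, y, gamma=2.0):
    # p: predicted probability of the positive class; y: true label (0 or 1).
    # p_t is the probability assigned to the true class; (1 - p_t)^gamma
    # shrinks the loss on well-classified (easy) examples.
    p_t = p if y == 1 else 1.0 - p
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

With gamma = 0 this reduces to ordinary cross-entropy; larger gamma suppresses easy examples more aggressively.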

66 citations


Cites background from "Deep joint discriminative learning ..."

  • ...Existing works [6], [11] draw inspirations from face and person re-identification to typically use distance metric learning for vehicle re-identification and retrieval....

    [...]

  • ...[11] proposed a deep joint discriminative learning model for vehicle...

    [...]

Journal ArticleDOI
TL;DR: This work proposes a Group-Group Loss (GGL) to optimize the distance within and across vehicle image groups to accelerate the GRF learning and promote its discrimination power.
Abstract: Vehicle Re-Identification (Re-ID) is challenging because vehicles of the same model commonly show similar appearance. We tackle this challenge by proposing a Global-Regional Feature (GRF) that depicts extra local details to enhance discrimination power in addition to the global context. It is motivated by the observation that, vehicles of same color, maker, and model can be distinguished by their regional difference, e.g. , the decorations on the windshields. To accelerate the GRF learning and promote its discrimination power, we propose a Group-Group Loss (GGL) to optimize the distance within and across vehicle image groups. Different from the siamese or triplet loss, GGL is directly computed on image groups rather than individual sample pairs or triplets. By avoiding traversing numerous sample combinations, GGL makes the model training easier and more efficient. Those two contributions highlight this work from previous methods on vehicle Re-ID task, which commonly learn global features with triplet loss or its variants. We evaluate our methods on two large-scale vehicle Re-ID datasets, i.e. , VeRi and VehicleID . Experimental results show our methods achieve promising performance in comparison with recent works.

55 citations


Cites methods from "Deep joint discriminative learning ..."

  • ...Identi+Attr+Verifi+Triplet [38] adopts InceptionBN [26] as the backbone structure, which is much deeper than VGG_CNN_M_1024 used in our methods, and uses additional attribute information in training, while GRF+GGL still outperforms it by 4.8% in Top-1 accuracy on VehicleID small....

    [...]

  • ...[38] combine several different tasks, including identification, attribute recognition, verification and triplet tasks to train a vehicle Re-ID model....

    [...]


  • ...On VehicleID, the compared works include OIF [9], VGG+T [1], VGG+CCL [1], SCCN-Ft-CLBL-8-Ft [32], CLVR [27], VAMI [7], Identi+Attr+Verifi+Triplet [38], and Mixed Diff+CCL [1]....

    [...]

References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
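The residual reformulation described in the abstract, learning F(x) = H(x) - x and computing y = F(x) + x, can be illustrated in a few lines. This is a sketch of the idea only, not the paper's architecture; the `transform` callable stands in for a block's stacked layers:

```python
def residual_block(x, transform):
    # y = F(x) + x: the layers learn only the residual F,
    # so an identity mapping is easy to represent (F ~= 0).
    fx = transform(x)
    return [a + b for a, b in zip(fx, x)]
```

Because the identity path carries the input through unchanged, very deep stacks of such blocks remain easy to optimize, which is the paper's central claim.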

123,388 citations

Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

49,914 citations

Proceedings Article
Sergey Ioffe1, Christian Szegedy1
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
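The per-mini-batch normalization at the heart of the abstract can be sketched for a single activation: standardize across the batch, then apply a learned scale and shift. This is an illustrative sketch, not the paper's algorithm in full; the function name and defaults are assumptions, and a real layer additionally tracks running statistics for use at inference time:

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize one activation across the mini-batch to zero mean and
    # unit variance, then scale by gamma and shift by beta.
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
```

Making gamma and beta learnable lets the network recover the original activation distribution if that is optimal, while the normalization itself stabilizes gradients and permits higher learning rates.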

30,843 citations


"Deep joint discriminative learning ..." refers background or methods in this paper

  • ...The base network can be a common deep convolutional network such as Inception-BN [11], VGG [12] or ResNet [13], which is pre-trained on the ImageNet....

    [...]

  • ...We adopt Inception-BN [11] as the base convolutional network in the experiments....

    [...]

Proceedings ArticleDOI
07 Jun 2015
TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, achieving state-of-the-art face recognition performance using only 128 bytes per face.
Abstract: Despite significant recent advances in the field of face recognition [10, 14, 15, 17], implementing face verification and recognition efficiently at scale presents serious challenges to current approaches. In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.

8,289 citations


"Deep joint discriminative learning ..." refers background or methods in this paper

  • ...Although vehicle identification problem is of a great importance, most previous object identification works focus on human face or person [2, 3, 4, 5]....

    [...]

  • ...Inspired by some state-ofthe-art methods in face recognition [3, 5], in this paper, we propose a Deep Joint Discriminative Learning (DJDL) model for vehicle re-identification and retrieval problem....

    [...]

  • ...The triplet loss [5] wants to ensure that an anchor image i is closer to all positive images j of the same identity (vi = vj) than it is to other negative images k of different identities (vi ≠ vk)....

    [...]

  • ...Another successful deep learning framework is triplet loss for both face recognition [5] and person re-identification [8]....

    [...]
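The triplet criterion quoted above, that an anchor must be closer to every positive of the same identity than to any negative by some margin, can be sketched with squared L2 distances between embeddings. This is the standard FaceNet-style hinge form; the function name, margin value, and distance choice are illustrative assumptions:

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    # Hinge on d(a, p) - d(a, n) + margin: zero once the anchor-negative
    # distance exceeds the anchor-positive distance by the margin.
    d = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v))
    return max(0.0, d(anchor, positive) - d(anchor, negative) + margin)
```

Training minimizes this over mined triplets, pulling same-identity embeddings together while pushing different identities apart.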

Proceedings Article
01 Jan 2009
TL;DR: A system that answers the question, “What is the fastest approximate nearest-neighbor algorithm for my data?” and a new algorithm that applies priority search on hierarchical k-means trees, which is found to provide the best known performance on many datasets.
Abstract: For many computer vision problems, the most time consuming component consists of nearest neighbor matching in high-dimensional spaces. There are no known exact algorithms for solving these high-dimensional problems that are faster than linear search. Approximate algorithms are known to provide large speedups with only minor loss in accuracy, but many such algorithms have been published with only minimal guidance on selecting an algorithm and its parameters for any given problem. In this paper, we describe a system that answers the question, “What is the fastest approximate nearest-neighbor algorithm for my data?” Our system will take any given dataset and desired degree of precision and use these to automatically determine the best algorithm and parameter values. We also describe a new algorithm that applies priority search on hierarchical k-means trees, which we have found to provide the best known performance on many datasets. After testing a range of alternatives, we have found that multiple randomized k-d trees provide the best performance for other datasets. We are releasing public domain code that implements these approaches. This library provides about one order of magnitude improvement in query time over the best previously available software and provides fully automated parameter selection.
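For contrast with the approximate algorithms the abstract discusses, the exact baseline they aim to beat is a linear scan over the database. A minimal sketch with squared L2 distance follows (illustrative naming, not FLANN's API); FLANN's hierarchical k-means trees and randomized k-d trees trade a small loss in accuracy for large speedups over this scan:

```python
def linear_search_nn(query, database):
    # Exact nearest neighbor: compare the query against every vector.
    # Cost is O(N * D) per query, which approximate methods avoid.
    best_i, best_d = -1, float("inf")
    for i, vec in enumerate(database):
        d = sum((a - b) ** 2 for a, b in zip(query, vec))
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d
```

In a retrieval pipeline such as DJDL's, the "database" rows would be the gallery's deep feature vectors and the query a probe image's feature.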

2,934 citations


"Deep joint discriminative learning ..." refers methods in this paper

  • ...To accelerate the retrieval process, we use the fast approximation nearest neighbor searching library Flann [15]....

    [...]

