Journal ArticleDOI

Joint Feature and Similarity Deep Learning for Vehicle Re-identification

02 Aug 2018-IEEE Access (IEEE)-Vol. 6, pp 43724-43731
TL;DR: The proposed JFSDL method applies a siamese deep network to extract deep learning features for an input vehicle image pair simultaneously and is superior to multiple state-of-the-art vehicle re-identification methods on both the VehicleID and VeRi data sets.
Abstract: In this paper, a joint feature and similarity deep learning (JFSDL) method for vehicle re-identification is proposed. The proposed JFSDL method applies a siamese deep network to extract deep learning features for an input vehicle image pair simultaneously. The siamese deep network is learned under joint identification and verification supervision, which is realized by linearly combining two softmax functions and one hybrid similarity learning function. Moreover, based on the hybrid similarity learning function, the similarity score between the input vehicle image pair is obtained by simultaneously projecting the element-wise absolute difference and multiplication of the corresponding deep learning feature pair with a group of learned weight coefficients. Extensive experiments show that the proposed JFSDL method is superior to multiple state-of-the-art vehicle re-identification methods on both the VehicleID and VeRi data sets.
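The abstract describes the scoring scheme concretely enough to sketch. Below is a minimal, hedged illustration assuming PyTorch; the class names, the two-way similar/dissimilar output, the single linear projection, and the loss weight alpha are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class HybridSimilarityHead(nn.Module):
    """Scores an image pair from |f1 - f2| and f1 * f2 with learned weights (illustrative)."""
    def __init__(self, feat_dim: int):
        super().__init__()
        # Stand-in for the "group of learned weight coefficients": one linear
        # projection over the concatenated difference and product features.
        self.proj = nn.Linear(2 * feat_dim, 2)

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        diff = torch.abs(f1 - f2)   # element-wise absolute difference
        prod = f1 * f2              # element-wise multiplication
        return self.proj(torch.cat([diff, prod], dim=1))

class SiameseJointModel(nn.Module):
    """Shared backbone applied to both images, with identification and verification branches."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_ids: int):
        super().__init__()
        self.backbone = backbone                            # shared weights (siamese)
        self.id_classifier = nn.Linear(feat_dim, num_ids)   # softmax identification branch
        self.sim_head = HybridSimilarityHead(feat_dim)      # hybrid similarity branch

    def forward(self, x1, x2):
        f1, f2 = self.backbone(x1), self.backbone(x2)
        return self.id_classifier(f1), self.id_classifier(f2), self.sim_head(f1, f2)

def joint_loss(logits1, logits2, sim_logits, id1, id2, same_id, alpha: float = 1.0):
    """Linear combination of two softmax (cross-entropy) terms and the similarity term."""
    ce = nn.CrossEntropyLoss()
    return ce(logits1, id1) + ce(logits2, id2) + alpha * ce(sim_logits, same_id)
```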
Citations
Journal ArticleDOI
TL;DR: This survey gives a comprehensive review of the current five types of deep learning-based methods for vehicle re-identification and compares them in terms of their characteristics, advantages, and disadvantages.
Abstract: Vehicle re-identification is one of the core technologies of intelligent transportation systems, and it is crucial for the construction of smart cities. With the rapid development of deep learning, vehicle re-identification technologies have made significant progress in recent years. A comprehensive survey of deep learning-based vehicle re-identification methods is therefore indispensable. There are mainly five types of deep learning-based methods designed for vehicle re-identification, i.e., methods based on local features, methods based on representation learning, methods based on metric learning, methods based on unsupervised learning, and methods based on attention mechanisms. The major contributions of our survey come from three aspects. First, we give a comprehensive review of the current five types of deep learning-based methods for vehicle re-identification and compare them in terms of their characteristics, advantages, and disadvantages. Second, we sort out public vehicle datasets and compare them along multiple dimensions. Third, we discuss the challenges and possible future research directions of vehicle re-identification based on our survey.

39 citations


Cites methods from "Joint Feature and Similarity Deep L..."

  • ...[57] proposed a joint feature and similarity deep learning (JFSDL) method which applied a...

Journal ArticleDOI
TL;DR: Extensive experiments conducted on two commonly used datasets VeRi-776 and VehicleID have demonstrated that the proposed DQAL approach outperforms multiple recently reported vehicle Re-ID methods.
Abstract: Vehicle re-identification (Re-ID) plays an important role in intelligent transportation systems. It usually suffers from various challenges encountered in real-life environments, such as viewpoint variations, illumination changes, object occlusions, and other complicated scenarios. To effectively improve vehicle Re-ID performance, a new method, called deep quadruplet appearance learning (DQAL), is proposed in this paper. The novelty of the proposed DQAL lies in its consideration of a special difficulty in vehicle Re-ID: vehicles with the same model and color but different identities (IDs) are highly similar to each other. For that, the proposed DQAL designs the concept of the quadruplet and forms quadruplets as the input, where each quadruplet is composed of the anchor (or target), positive, negative, and the specially considered high-similar (i.e., the same model and color but a different ID with respect to the anchor) vehicle samples. Then, the quadruplet network, incorporating the proposed quadruplet loss and softmax loss, is developed to learn a more discriminative feature for vehicle Re-ID, especially for discerning those difficult high-similar cases. Extensive experiments conducted on two commonly used datasets, VeRi-776 and VehicleID, have demonstrated that the proposed DQAL approach outperforms multiple recently reported vehicle Re-ID methods.
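The quadruplet idea above lends itself to a short sketch. The following is a hedged illustration assuming PyTorch; the Euclidean distance, the two margins, and the way the high-similar sample contributes are assumptions for illustration, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def quadruplet_loss(anchor, positive, negative, high_similar,
                    margin_neg: float = 0.3, margin_hs: float = 0.1):
    """anchor/positive share an ID; negative has a different ID; high_similar has a
    different ID but the same model and color as the anchor (the hard case)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    d_hs = F.pairwise_distance(anchor, high_similar)
    # Push ordinary negatives beyond a larger margin and the hard, high-similar
    # samples beyond a smaller, dedicated margin.
    loss = F.relu(d_pos - d_neg + margin_neg) + F.relu(d_pos - d_hs + margin_hs)
    return loss.mean()

# In training, this term would typically be added to a softmax (ID classification)
# loss, as the abstract describes.
```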

26 citations


Cites background or methods from "Joint Feature and Similarity Deep L..."

  • ...[7], JFSDL [27], SDC-CNN [28], VGG+CCL [7], DGD [13], MixedDiff+CCL [7], VAMI [30], MLL+MLSR[29], Improved Triplet CNN [35], DJDL [36], GS-TRS loss W/mean[37]) obtain remarkable improvement than the hand-crafted features based methods (i....

  • ..., [27], [41]), which could be considered as a complementary part to the proposed quadruplet loss....

  • ...[27] propose a shared deep network to jointly learn the vehicles’ feature and similarity....

  • ...Firstly, it can be observed that the deep learning based methods (i.e., GoogLeNet [1], NuFACT [6], FACT [5], DRDL [7], JFSDL [27], SDC-CNN [28], VGG+CCL [7], DGD [13], MixedDiff+CCL [7], VAMI [30], MLL+MLSR[29], Improved Triplet CNN [35], DJDL [36], GS-TRS loss W/mean[37]) obtain remarkable improvement than the hand-crafted features based methods (i.e., BOW-SIFT [24], BOW-CN [23], LOMO [11])....

Journal ArticleDOI
TL;DR: The proposed MLL with MLSR approach can effectively improve the performance delivered by the baseline and outperform multiple state-of-the-art vehicle re-ID methods as well.

25 citations

Journal ArticleDOI
TL;DR: This study considers the uncertainty of pedestrian representation for small-scale person re-identification and designs an improved Monte Carlo strategy that considers both the average distance and shortest distance for matching and ranking.
Abstract: In recent years, deep learning has developed rapidly and is widely used in various fields, such as computer vision, speech recognition, and natural language processing. For end-to-end person re-identification, most deep learning methods rely on large-scale datasets. Relatively few methods work with small-scale datasets. Insufficient training samples will affect neural network accuracy significantly. This problem limits the practical application of person re-identification. For small-scale person re-identification, the uncertainty of person representation and the overfitting problem associated with deep learning remain to be solved. Quantifying the uncertainty is difficult owing to complex network structures and the large number of hyperparameters. In this study, we consider the uncertainty of pedestrian representation for small-scale person re-identification. To reduce the impact of uncertain person representations, we transform parameters into distributions and conduct multiple sampling by using multilevel dropout during testing. We design an improved Monte Carlo strategy that considers both the average distance and the shortest distance for matching and ranking. When compared with state-of-the-art methods, the proposed method significantly improves accuracy on two small-scale person re-identification datasets and is robust on four large-scale datasets.
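As a concrete illustration of the test-time sampling and matching idea described above, here is a hedged sketch assuming PyTorch; the number of samples, the Euclidean distance, the sample pairing, and the equal weighting of average and shortest distance are assumptions, not the paper's exact strategy.

```python
import torch

@torch.no_grad()
def mc_embeddings(model, images, n_samples: int = 20):
    """Draw n_samples stochastic embeddings per image by keeping dropout active."""
    model.train()  # common MC-dropout shortcut: keeps dropout layers stochastic at test time
    return torch.stack([model(images) for _ in range(n_samples)])  # (T, N, D)

def rank_gallery(query_samples, gallery_samples):
    """query_samples: (T, D) for one query; gallery_samples: (T, G, D). Lower score = better."""
    d = torch.cdist(query_samples.unsqueeze(1),   # (T, 1, D)
                    gallery_samples).squeeze(1)   # -> (T, G) distances per MC sample
    avg_dist = d.mean(dim=0)                      # average distance over samples
    min_dist = d.min(dim=0).values                # shortest distance over samples
    score = 0.5 * avg_dist + 0.5 * min_dist       # assumed equal weighting of the two criteria
    return torch.argsort(score)                   # gallery indices, best match first

# Usage sketch: q = mc_embeddings(model, query_img.unsqueeze(0))[:, 0]   # (T, D)
#               g = mc_embeddings(model, gallery_imgs)                   # (T, G, D)
#               ranking = rank_gallery(q, g)
```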

25 citations

Journal ArticleDOI
TL;DR: This paper proposes multi-label-based similarity learning (MLSL) for vehicle re-identification, obtaining an efficient deep-learning-based model that derives robust vehicle representations, and proves the superiority of the model over multiple state-of-the-art methods on the three mentioned datasets.
Abstract: The massive attention to surveillance video-based analysis makes vehicle re-identification one of the current hot areas of interest. Extracting discriminative visual representations for vehicle re-identification is a challenging task due to the low variance among vehicles that share the same model, brand, type, and color. Recently, several methods have been proposed for vehicle re-identification that use either a feature learning or a metric learning approach. However, designing an efficient and cost-effective model remains in significant demand. In this paper, we propose multi-label-based similarity learning (MLSL) for vehicle re-identification, obtaining an efficient deep-learning-based model that derives robust vehicle representations. Overall, our model features two main parts. The first is a multi-label-based similarity learner that employs a Siamese network on three different attributes of the vehicles: vehicle ID, color, and type. The second part is a regular CNN-based feature learner that is employed to learn feature representations with the vehicle ID attribute. The model is trained jointly with both parts. To validate the effectiveness of our model, a set of extensive experiments has been conducted on three of the largest well-known datasets: VeRi-776, VehicleID, and VERI-Wild. Furthermore, the parts of the proposed model are validated by exploring the influence of each part on the overall performance. The results prove the superiority of our model over multiple state-of-the-art methods on the three mentioned datasets.
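A hedged sketch of the two-part design described above, assuming PyTorch: a Siamese similarity learner over the vehicle ID, color, and type attributes, plus a regular ID-classification feature learner, trained jointly. The backbone, the absolute-difference pairing, and the loss weights are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiLabelSimilarityModel(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, num_ids: int):
        super().__init__()
        self.backbone = backbone
        # One binary same/different verification head per attribute.
        self.sim_heads = nn.ModuleDict(
            {attr: nn.Linear(feat_dim, 2) for attr in ("id", "color", "type")})
        self.id_classifier = nn.Linear(feat_dim, num_ids)  # CNN-based feature-learning part

    def forward(self, x1, x2):
        f1, f2 = self.backbone(x1), self.backbone(x2)
        diff = torch.abs(f1 - f2)
        sim_logits = {attr: head(diff) for attr, head in self.sim_heads.items()}
        return sim_logits, self.id_classifier(f1)

def mlsl_loss(sim_logits, id_logits, same_labels, id_labels, weights=None):
    """same_labels: dict of 0/1 tensors per attribute; id_labels: vehicle ID targets."""
    ce = nn.CrossEntropyLoss()
    weights = weights or {"id": 1.0, "color": 1.0, "type": 1.0}  # assumed equal weighting
    loss = ce(id_logits, id_labels)
    for attr, logits in sim_logits.items():
        loss = loss + weights[attr] * ce(logits, same_labels[attr])
    return loss
```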

24 citations


Cites background or methods from "Joint Feature and Similarity Deep L..."

  • ...To validate the effectiveness of our model, we have compared its performance with many recent related works, including: Siamese-Visual [9], GoogLeNet [1], FACT [22], Chain MRF model [9], SCCN-Ft [29], CLBL-8-Ft [29], XVGAN [46], OIF [13], Siamese-CNN [9], VAMI [11], NuFACT [8], PROVID [8], VR-PROUD [16], Path-LSTM [9], JFSDL [27], D-DLF [28], and Mob....

  • ...These methods can be categorized based on the learning type into semi-supervised/supervised learning methods [3], [8], [9], [11], [12], [14], [27] and unsupervised learning methods [15], [16]....

  • ...Most Siamese models in literature [9], [27], [39]–[42] label the similarity of an image pair of the same vehicle with 1, whereas 0 is the pair label with different vehicle IDs....

  • ...[27] train a Siamese network with classification and similarity learning jointly on vehicle ID labels....

  • ...For instance, ResNet feature extractor is employed in [23], VGG-CNN-M in [3], [24], and [34], Inception-based feature extractor in [1], [13], and [22], MobileNet v1 in [17] and [35], DenseNet121 in [23], whereas in some models, new base networks have been designed such as in [11] and [27]....

References
Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification, as discussed by the authors.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
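For reference, a minimal sketch of the architecture the abstract describes: five convolutional layers, some followed by max-pooling, then three fully connected layers with dropout and a 1000-way classifier. PyTorch and 227x227 inputs are assumed; channel counts follow the commonly cited configuration, and the original two-GPU split and local response normalization are omitted.

```python
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),     # conv1, non-saturating ReLU
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),   # conv2
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),  # conv3
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),  # conv4
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),  # conv5
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),  # fc6 with "dropout"
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),         # fc7 with "dropout"
    nn.Linear(4096, 1000),                                     # fc8: 1000-way softmax logits
)
```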

73,978 citations

Proceedings Article
04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
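To make the small-filter, greater-depth point concrete, here is a hedged sketch (PyTorch assumed) of stacking 3x3 convolutions into blocks; the configuration mimics the commonly cited 16-weight-layer variant (13 convolutions plus three fully connected layers, the latter omitted here).

```python
import torch.nn as nn

def vgg_block(in_ch: int, out_ch: int, n_convs: int) -> nn.Sequential:
    """Stack n_convs 3x3 convolutions, then halve the spatial resolution."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# 2 + 2 + 3 + 3 + 3 = 13 conv layers; three fully connected layers on top give 16 weight layers.
vgg16_features = nn.Sequential(
    vgg_block(3, 64, 2), vgg_block(64, 128, 2), vgg_block(128, 256, 3),
    vgg_block(256, 512, 3), vgg_block(512, 512, 3),
)
```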

55,235 citations


"Joint Feature and Similarity Deep L..." refers methods in this paper

  • ...For example, FACT [2], NuFACT [3], and DRDL [4] utilize AlexNet [7], GoogLeNet [9], and VGGNet [8] to extract features of vehicles, respectively....

  • ...In addition, some well known deep feature learning architectures, such as AlexNet [7], VGGNet [8] and GoogLeNet [9], are employed as feature extractors for vehicle re-identification....

  • ...In this work, a VGGNet [8] like deep network is designed as the basic deep network of the proposed JFSDL method, which is shown in Figure 3....

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
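A hedged sketch, assuming PyTorch, of the kind of multi-branch block this abstract refers to: parallel 1x1, 3x3, and 5x5 convolutions plus a pooled branch, concatenated along the channel dimension, with 1x1 reduction convolutions keeping the computational budget in check. Channel counts are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InceptionBlock(nn.Module):
    """Parallel 1x1 / 3x3 / 5x5 / pooling branches whose outputs are concatenated."""
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c3_red, kernel_size=1), nn.ReLU(),
                                nn.Conv2d(c3_red, c3, kernel_size=3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, c5_red, kernel_size=1), nn.ReLU(),
                                nn.Conv2d(c5_red, c5, kernel_size=5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                                nn.Conv2d(in_ch, pool_proj, kernel_size=1))

    def forward(self, x):
        # Multi-scale processing: every branch sees the same input and preserves
        # its spatial size, so the outputs can be concatenated channel-wise.
        return F.relu(torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1))
```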

40,257 citations

Proceedings Article
Sergey Ioffe, Christian Szegedy
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
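A worked sketch of the normalization step described above, written with plain tensor operations (PyTorch assumed) rather than nn.BatchNorm, to show exactly what is computed per mini-batch; running statistics for inference and the per-channel convolutional variant are omitted.

```python
import torch

def batch_norm_train(x, gamma, beta, eps: float = 1e-5):
    """x: (N, D) mini-batch of layer inputs; gamma, beta: (D,) learned scale and shift."""
    mean = x.mean(dim=0)                         # per-feature mini-batch mean
    var = x.var(dim=0, unbiased=False)           # per-feature mini-batch variance
    x_hat = (x - mean) / torch.sqrt(var + eps)   # normalize to zero mean, unit variance
    return gamma * x_hat + beta                  # learned scale/shift restores expressiveness
```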

30,843 citations


"Joint Feature and Similarity Deep L..." refers methods in this paper

  • ...For a convenient description, a convolutional layer, a batch normalization layer [17] and a Leaky ReLU [18] layer are sequentially packaged together to construct a CBLR block, as shown in Figure 3(a)....

  • ...The ℓ2 regularization term is applied to avoid over-fitting by following the common practices in many deep learning algorithms [7]–[9], [15], [17]....
