Journal ArticleDOI

Joint Feature and Similarity Deep Learning for Vehicle Re-identification

02 Aug 2018-IEEE Access (IEEE)-Vol. 6, pp 43724-43731
TL;DR: The proposed JFSDL method applies a siamese deep network to extract deep learning features for an input vehicle image pair simultaneously and is superior to multiple state-of-the-art vehicle re-identification methods on both the VehicleID and VeRi data sets.
Abstract: In this paper, a joint feature and similarity deep learning (JFSDL) method for vehicle re-identification is proposed. The proposed JFSDL method applies a siamese deep network to extract deep learning features for an input vehicle image pair simultaneously. The siamese deep network is learned under joint identification and verification supervision, which is realized by linearly combining two softmax functions and one hybrid similarity learning function. Moreover, based on the hybrid similarity learning function, the similarity score between the input vehicle image pair is obtained by simultaneously projecting the element-wise absolute difference and multiplication of the corresponding deep learning feature pair with a group of learned weight coefficients. Extensive experiments show that the proposed JFSDL method is superior to multiple state-of-the-art vehicle re-identification methods on both the VehicleID and VeRi data sets.
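The abstract describes the scoring scheme concretely enough to sketch. Below is a minimal, hedged illustration assuming PyTorch; the class names, the two-way similar/dissimilar output, the single linear projection, and the loss weight alpha are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class HybridSimilarityHead(nn.Module):
    """Scores an image pair from |f1 - f2| and f1 * f2 with learned weights (illustrative)."""
    def __init__(self, feat_dim: int):
        super().__init__()
        # Stand-in for the "group of learned weight coefficients": one linear
        # projection over the concatenated difference and product features.
        self.proj = nn.Linear(2 * feat_dim, 2)

    def forward(self, f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
        diff = torch.abs(f1 - f2)   # element-wise absolute difference
        prod = f1 * f2              # element-wise multiplication
        return self.proj(torch.cat([diff, prod], dim=1))

class SiameseJointModel(nn.Module):
    """Shared backbone applied to both images, with identification and verification branches."""
    def __init__(self, backbone: nn.Module, feat_dim: int, num_ids: int):
        super().__init__()
        self.backbone = backbone                            # shared weights (siamese)
        self.id_classifier = nn.Linear(feat_dim, num_ids)   # softmax identification branch
        self.sim_head = HybridSimilarityHead(feat_dim)      # hybrid similarity branch

    def forward(self, x1, x2):
        f1, f2 = self.backbone(x1), self.backbone(x2)
        return self.id_classifier(f1), self.id_classifier(f2), self.sim_head(f1, f2)

def joint_loss(logits1, logits2, sim_logits, id1, id2, same_id, alpha: float = 1.0):
    """Linear combination of two softmax (cross-entropy) terms and the similarity term."""
    ce = nn.CrossEntropyLoss()
    return ce(logits1, id1) + ce(logits2, id2) + alpha * ce(sim_logits, same_id)
```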
Citations
Journal ArticleDOI
TL;DR: This survey gives a comprehensive review of the current five types of deep learning-based methods for vehicle re-identification and compares them in terms of their characteristics, advantages, and disadvantages.
Abstract: Vehicle re-identification is one of the core technologies of intelligent transportation systems, and it is crucial for the construction of smart cities. With the rapid development of deep learning, vehicle re-identification technologies have made significant progress in recent years. A comprehensive survey of deep learning-based vehicle re-identification methods is therefore indispensable. There are mainly five types of deep learning-based methods designed for vehicle re-identification, i.e., methods based on local features, methods based on representation learning, methods based on metric learning, methods based on unsupervised learning, and methods based on attention mechanisms. The major contributions of our survey come from three aspects. First, we give a comprehensive review of the current five types of deep learning-based methods for vehicle re-identification and compare them in terms of their characteristics, advantages, and disadvantages. Second, we sort out public vehicle datasets and compare them along multiple dimensions. Third, we discuss the challenges and possible future research directions of vehicle re-identification based on our survey.

39 citations


Cites methods from "Joint Feature and Similarity Deep L..."

  • ...[57] proposed a joint feature and similarity deep learning (JFSDL) method which applied a...

Journal ArticleDOI
TL;DR: Extensive experiments conducted on two commonly used datasets VeRi-776 and VehicleID have demonstrated that the proposed DQAL approach outperforms multiple recently reported vehicle Re-ID methods.
Abstract: Vehicle re-identification (Re-ID) plays an important role in intelligent transportation systems. It usually suffers from various challenges encountered in real-life environments, such as viewpoint variations, illumination changes, object occlusions, and other complicated scenarios. To effectively improve vehicle Re-ID performance, a new method, called deep quadruplet appearance learning (DQAL), is proposed in this paper. The novelty of the proposed DQAL lies in its consideration of a special difficulty in vehicle Re-ID: vehicles with the same model and color but different identities (IDs) are highly similar to each other. For that, the proposed DQAL designs the concept of the quadruplet and forms quadruplets as the input, where each quadruplet is composed of the anchor (or target), positive, negative, and the specially considered high-similar (i.e., the same model and color but a different ID with respect to the anchor) vehicle samples. Then, the quadruplet network, incorporating the proposed quadruplet loss and softmax loss, is developed to learn a more discriminative feature for vehicle Re-ID, especially for discerning those difficult high-similar cases. Extensive experiments conducted on two commonly used datasets, VeRi-776 and VehicleID, have demonstrated that the proposed DQAL approach outperforms multiple recently reported vehicle Re-ID methods.
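The quadruplet idea above lends itself to a short sketch. The following is a hedged illustration assuming PyTorch; the Euclidean distance, the two margins, and the way the high-similar sample contributes are assumptions for illustration, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def quadruplet_loss(anchor, positive, negative, high_similar,
                    margin_neg: float = 0.3, margin_hs: float = 0.1):
    """anchor/positive share an ID; negative has a different ID; high_similar has a
    different ID but the same model and color as the anchor (the hard case)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    d_hs = F.pairwise_distance(anchor, high_similar)
    # Push ordinary negatives beyond a larger margin and the hard, high-similar
    # samples beyond a smaller, dedicated margin.
    loss = F.relu(d_pos - d_neg + margin_neg) + F.relu(d_pos - d_hs + margin_hs)
    return loss.mean()

# In training, this term would typically be added to a softmax (ID classification)
# loss, as the abstract describes.
```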

26 citations


Cites background or methods from "Joint Feature and Similarity Deep L..."

  • ...[7], JFSDL [27], SDC-CNN [28], VGG+CCL [7], DGD [13], MixedDiff+CCL [7], VAMI [30], MLL+MLSR[29], Improved Triplet CNN [35], DJDL [36], GS-TRS loss W/mean[37]) obtain remarkable improvement than the hand-crafted features based methods (i....

  • ..., [27], [41]), which could be considered as a complementary part to the proposed quadruplet loss....

  • ...[27] propose a shared deep network to jointly learn the vehicles’ feature and similarity....

  • ...Firstly, it can be observed that the deep learning based methods (i.e., GoogLeNet [1], NuFACT [6], FACT [5], DRDL [7], JFSDL [27], SDC-CNN [28], VGG+CCL [7], DGD [13], MixedDiff+CCL [7], VAMI [30], MLL+MLSR[29], Improved Triplet CNN [35], DJDL [36], GS-TRS loss W/mean[37]) obtain remarkable improvement than the hand-crafted features based methods (i.e., BOW-SIFT [24], BOW-CN [23], LOMO [11])....

Journal ArticleDOI
TL;DR: The proposed MLL with MLSR approach can effectively improve the performance delivered by the baseline and outperform multiple state-of-the-art vehicle re-ID methods as well.

25 citations

Journal ArticleDOI
TL;DR: This study considers the uncertainty of pedestrian representation for small-scale person re-identification and designs an improved Monte Carlo strategy that considers both the average distance and shortest distance for matching and ranking.
Abstract: In recent years, deep learning has developed rapidly and is widely used in various fields, such as computer vision, speech recognition, and natural language processing. For end-to-end person re-identification, most deep learning methods rely on large-scale datasets. Relatively few methods work with small-scale datasets. Insufficient training samples will affect neural network accuracy significantly. This problem limits the practical application of person re-identification. For small-scale person re-identification, the uncertainty of person representation and the overfitting problem associated with deep learning remain to be solved. Quantifying the uncertainty is difficult owing to complex network structures and the large number of hyperparameters. In this study, we consider the uncertainty of pedestrian representation for small-scale person re-identification. To reduce the impact of uncertain person representations, we transform parameters into distributions and conduct multiple sampling by using multilevel dropout during testing. We design an improved Monte Carlo strategy that considers both the average distance and the shortest distance for matching and ranking. When compared with state-of-the-art methods, the proposed method significantly improves accuracy on two small-scale person re-identification datasets and is robust on four large-scale datasets.
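As a concrete illustration of the test-time sampling and matching idea described above, here is a hedged sketch assuming PyTorch; the number of samples, the Euclidean distance, the sample pairing, and the equal weighting of average and shortest distance are assumptions, not the paper's exact strategy.

```python
import torch

@torch.no_grad()
def mc_embeddings(model, images, n_samples: int = 20):
    """Draw n_samples stochastic embeddings per image by keeping dropout active."""
    model.train()  # common MC-dropout shortcut: keeps dropout layers stochastic at test time
    return torch.stack([model(images) for _ in range(n_samples)])  # (T, N, D)

def rank_gallery(query_samples, gallery_samples):
    """query_samples: (T, D) for one query; gallery_samples: (T, G, D). Lower score = better."""
    d = torch.cdist(query_samples.unsqueeze(1),   # (T, 1, D)
                    gallery_samples).squeeze(1)   # -> (T, G) distances per MC sample
    avg_dist = d.mean(dim=0)                      # average distance over samples
    min_dist = d.min(dim=0).values                # shortest distance over samples
    score = 0.5 * avg_dist + 0.5 * min_dist       # assumed equal weighting of the two criteria
    return torch.argsort(score)                   # gallery indices, best match first

# Usage sketch: q = mc_embeddings(model, query_img.unsqueeze(0))[:, 0]   # (T, D)
#               g = mc_embeddings(model, gallery_imgs)                   # (T, G, D)
#               ranking = rank_gallery(q, g)
```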

25 citations

Journal ArticleDOI
TL;DR: This paper proposes multi-label-based similarity learning (MLSL) for vehicle re-identification, obtaining an efficient deep-learning-based model that derives robust vehicle representations, and proves the superiority of the model over multiple state-of-the-art methods on the three mentioned datasets.
Abstract: The massive attention to surveillance video-based analysis makes vehicle re-identification one of the current hot areas of interest. Extracting discriminative visual representations for vehicle re-identification is a challenging task due to the low variance among vehicles that share the same model, brand, type, and color. Recently, several methods have been proposed for vehicle re-identification that use either a feature learning or a metric learning approach. However, designing an efficient and cost-effective model remains in significant demand. In this paper, we propose multi-label-based similarity learning (MLSL) for vehicle re-identification, obtaining an efficient deep-learning-based model that derives robust vehicle representations. Overall, our model features two main parts. The first is a multi-label-based similarity learner that employs a Siamese network on three different attributes of the vehicles: vehicle ID, color, and type. The second part is a regular CNN-based feature learner that is employed to learn feature representations with the vehicle ID attribute. The model is trained jointly with both parts. To validate the effectiveness of our model, a set of extensive experiments has been conducted on three of the largest well-known datasets: VeRi-776, VehicleID, and VERI-Wild. Furthermore, the parts of the proposed model are validated by exploring the influence of each part on the overall performance. The results prove the superiority of our model over multiple state-of-the-art methods on the three mentioned datasets.
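A hedged sketch of the two-part design described above, assuming PyTorch: a Siamese similarity learner over the vehicle ID, color, and type attributes, plus a regular ID-classification feature learner, trained jointly. The backbone, the absolute-difference pairing, and the loss weights are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiLabelSimilarityModel(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int, num_ids: int):
        super().__init__()
        self.backbone = backbone
        # One binary same/different verification head per attribute.
        self.sim_heads = nn.ModuleDict(
            {attr: nn.Linear(feat_dim, 2) for attr in ("id", "color", "type")})
        self.id_classifier = nn.Linear(feat_dim, num_ids)  # CNN-based feature-learning part

    def forward(self, x1, x2):
        f1, f2 = self.backbone(x1), self.backbone(x2)
        diff = torch.abs(f1 - f2)
        sim_logits = {attr: head(diff) for attr, head in self.sim_heads.items()}
        return sim_logits, self.id_classifier(f1)

def mlsl_loss(sim_logits, id_logits, same_labels, id_labels, weights=None):
    """same_labels: dict of 0/1 tensors per attribute; id_labels: vehicle ID targets."""
    ce = nn.CrossEntropyLoss()
    weights = weights or {"id": 1.0, "color": 1.0, "type": 1.0}  # assumed equal weighting
    loss = ce(id_logits, id_labels)
    for attr, logits in sim_logits.items():
        loss = loss + weights[attr] * ce(logits, same_labels[attr])
    return loss
```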

24 citations


Cites background or methods from "Joint Feature and Similarity Deep L..."

  • ...To validate the effectiveness of our model, we have compared its performance with many recent related works, including: Siamese-Visual [9], GoogLeNet [1], FACT [22], Chain MRF model [9], SCCN-Ft [29], CLBL-8-Ft [29], XVGAN [46], OIF [13], Siamese-CNN [9], VAMI [11], NuFACT [8], PROVID [8], VR-PROUD [16], Path-LSTM [9], JFSDL [27], D-DLF [28], and Mob....

  • ...These methods can be categorized based on the learning type into semi-supervised/supervised learning methods [3], [8], [9], [11], [12], [14], [27] and unsupervised learning methods [15], [16]....

  • ...Most Siamese models in literature [9], [27], [39]–[42] label the similarity of an image pair of the same vehicle with 1, whereas 0 is the pair label with different vehicle IDs....

  • ...[27] train a Siamese network with classification and similarity learning jointly on vehicle ID labels....

  • ...For instance, ResNet feature extractor is employed in [23], VGG-CNN-M in [3], [24], and [34], Inception-based feature extractor in [1], [13], and [22], MobileNet v1 in [17] and [35], DenseNet121 in [23], whereas in some models, new base networks have been designed such as in [11] and [27]....

References
Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network, consisting of five convolutional layers (some followed by max-pooling layers) and three fully connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification, as discussed by the authors.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
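For reference, a minimal sketch of the architecture the abstract describes: five convolutional layers, some followed by max-pooling, then three fully connected layers with dropout and a 1000-way classifier. PyTorch and 227x227 inputs are assumed; channel counts follow the commonly cited configuration, and the original two-GPU split and local response normalization are omitted.

```python
import torch.nn as nn

alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),     # conv1, non-saturating ReLU
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),   # conv2
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),  # conv3
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),  # conv4
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),  # conv5
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),  # fc6 with "dropout"
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),         # fc7 with "dropout"
    nn.Linear(4096, 1000),                                     # fc8: 1000-way softmax logits
)
```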

73,978 citations

Proceedings Article
04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
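To make the small-filter, greater-depth point concrete, here is a hedged sketch (PyTorch assumed) of stacking 3x3 convolutions into blocks; the configuration mimics the commonly cited 16-weight-layer variant (13 convolutions plus three fully connected layers, the latter omitted here).

```python
import torch.nn as nn

def vgg_block(in_ch: int, out_ch: int, n_convs: int) -> nn.Sequential:
    """Stack n_convs 3x3 convolutions, then halve the spatial resolution."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
    return nn.Sequential(*layers)

# 2 + 2 + 3 + 3 + 3 = 13 conv layers; three fully connected layers on top give 16 weight layers.
vgg16_features = nn.Sequential(
    vgg_block(3, 64, 2), vgg_block(64, 128, 2), vgg_block(128, 256, 3),
    vgg_block(256, 512, 3), vgg_block(512, 512, 3),
)
```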

55,235 citations


"Joint Feature and Similarity Deep L..." refers methods in this paper

  • ...For example, FACT [2], NuFACT [3], and DRDL [4] utilize AlexNet [7], GoogLeNet [9], and VGGNet [8] to extract features of vehicles, respectively....

  • ...In addition, some well known deep feature learning architectures, such as AlexNet [7], VGGNet [8] and GoogLeNet [9], are employed as feature extractors for vehicle re-identification....

  • ...In this work, a VGGNet [8] like deep network is designed as the basic deep network of the proposed JFSDL method, which is shown in Figure 3....

Proceedings ArticleDOI
07 Jun 2015
TL;DR: Inception as mentioned in this paper is a deep convolutional neural network architecture that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Abstract: We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. By a carefully crafted design, we increased the depth and width of the network while keeping the computational budget constant. To optimize quality, the architectural decisions were based on the Hebbian principle and the intuition of multi-scale processing. One particular incarnation used in our submission for ILSVRC14 is called GoogLeNet, a 22 layers deep network, the quality of which is assessed in the context of classification and detection.
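A hedged sketch, assuming PyTorch, of the kind of multi-branch block this abstract refers to: parallel 1x1, 3x3, and 5x5 convolutions plus a pooled branch, concatenated along the channel dimension, with 1x1 reduction convolutions keeping the computational budget in check. Channel counts are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InceptionBlock(nn.Module):
    """Parallel 1x1 / 3x3 / 5x5 / pooling branches whose outputs are concatenated."""
    def __init__(self, in_ch, c1, c3_red, c3, c5_red, c5, pool_proj):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, c1, kernel_size=1)
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, c3_red, kernel_size=1), nn.ReLU(),
                                nn.Conv2d(c3_red, c3, kernel_size=3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, c5_red, kernel_size=1), nn.ReLU(),
                                nn.Conv2d(c5_red, c5, kernel_size=5, padding=2))
        self.bp = nn.Sequential(nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
                                nn.Conv2d(in_ch, pool_proj, kernel_size=1))

    def forward(self, x):
        # Multi-scale processing: every branch sees the same input and preserves
        # its spatial size, so the outputs can be concatenated channel-wise.
        return F.relu(torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1))
```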

40,257 citations

Proceedings Article
Sergey Ioffe, Christian Szegedy
06 Jul 2015
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
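A worked sketch of the normalization step described above, written with plain tensor operations (PyTorch assumed) rather than nn.BatchNorm, to show exactly what is computed per mini-batch; running statistics for inference and the per-channel convolutional variant are omitted.

```python
import torch

def batch_norm_train(x, gamma, beta, eps: float = 1e-5):
    """x: (N, D) mini-batch of layer inputs; gamma, beta: (D,) learned scale and shift."""
    mean = x.mean(dim=0)                         # per-feature mini-batch mean
    var = x.var(dim=0, unbiased=False)           # per-feature mini-batch variance
    x_hat = (x - mean) / torch.sqrt(var + eps)   # normalize to zero mean, unit variance
    return gamma * x_hat + beta                  # learned scale/shift restores expressiveness
```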

30,843 citations


"Joint Feature and Similarity Deep L..." refers methods in this paper

  • ...For a convenient description, a convolutional layer, a batch normalization layer [17] and a Leaky ReLU [18] layer are sequentially packaged together to construct a CBLR block, as shown in Figure 3(a)....

  • ...The ℓ2 regularization term is applied to avoid over-fitting by following the common practices in many deep learning algorithms [7]–[9], [15], [17]....
