Proceedings Article

Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification

TLDR
Both the orientation invariant feature embedding and the spatio-temporal regularization achieve considerable improvements in the vehicle Re-identification problem.
Abstract
In this paper, we tackle the vehicle Re-identification (ReID) problem, which is of great importance in urban surveillance and can be used for multiple applications. In our vehicle ReID framework, an orientation invariant feature embedding module and a spatial-temporal regularization module are proposed. With orientation invariant feature embedding, local region features of different orientations can be extracted based on 20 key point locations and can be well aligned and combined. With spatial-temporal regularization, the log-normal distribution is adopted to model the spatial-temporal constraints and the retrieval results can be refined. Experiments are conducted on public vehicle ReID datasets and our proposed method achieves state-of-the-art performance. Investigations of the proposed framework are conducted, including the landmark regressor and comparisons with attention mechanisms. Both the orientation invariant feature embedding and the spatio-temporal regularization achieve considerable improvements.
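
The spatial-temporal regularization described in the abstract can be illustrated with a small sketch. It is not the paper's exact formulation: it assumes that the time gap between two camera sightings follows a log-normal density with parameters `mu` and `sigma` fitted from training gaps, and that the negative log-likelihood of the gap is added to an appearance distance with a weight `alpha`. All function names, the weight `alpha`, and the smoothing constant `eps` are hypothetical choices made for illustration.

```python
import math


def lognormal_pdf(dt, mu, sigma):
    """Log-normal density of a positive time gap dt (for example, in seconds)."""
    if dt <= 0:
        return 0.0
    z = (math.log(dt) - mu) / sigma
    return math.exp(-0.5 * z * z) / (dt * sigma * math.sqrt(2.0 * math.pi))


def fit_lognormal(gaps):
    """Maximum-likelihood fit: mean and standard deviation of the log gaps."""
    logs = [math.log(g) for g in gaps]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)


def regularized_distance(appearance_dist, dt, mu, sigma, alpha=0.1, eps=1e-12):
    """Appearance distance plus a penalty for unlikely time gaps.

    The additive form and the weight alpha are assumptions of this sketch.
    """
    likelihood = lognormal_pdf(dt, mu, sigma)
    return appearance_dist - alpha * math.log(likelihood + eps)


def rerank(candidates, mu, sigma, alpha=0.1):
    """Sort (vehicle_id, appearance_dist, dt) tuples by regularized distance."""
    return sorted(
        candidates,
        key=lambda c: regularized_distance(c[1], c[2], mu, sigma, alpha),
    )
```

In this sketch, a gallery candidate whose appearance distance is slightly worse can still move ahead of another candidate when its time gap is far more plausible under the fitted distribution, which is the sense in which the retrieval results are refined.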


Citations
Journal Article

Coarse-to-fine sparse self-attention for vehicle re-identification

TL;DR: Li et al. decompose the self-attention process into a coarse stage and a fine stage, in which the pixel-level feature map is transformed into patch-level feature maps and the dependencies between similar vehicle parts are captured in a global context.
Posted Content

Attribute-guided Feature Extraction and Augmentation Robust Learning for Vehicle Re-identification

TL;DR: Zhang et al. propose a multi-guided learning approach that uses attribute information and introduces two novel random augmentations to improve robustness during training; it achieved an mAP of 66.83% and a rank-1 accuracy of 76.05% in the CVPR 2020 AI City Challenge.
Proceedings Article

Vehicle View Synthesis by Generative Adversarial Network

TL;DR: In this article, a novel view synthesis method based on Generative Adversarial Networks (GANs), named PTGAN, is proposed. It extracts identity-related and pose-unrelated feature representations from input images and concatenates them with the pose information to generate a fake image with the assigned pose.
Journal Article

Multi-Receptive Field Soft Attention Part Learning for Vehicle Re-Identification

Xi Yu Pang, +2 more
31 Mar 2023
TL;DR: The authors propose a multi-receptive field soft attention part learning (MRF-SAPL) model that learns semantically diverse vehicle part-level features under different receptive fields through multiple local branches, alleviating the problem of small differences in vehicle appearance.
Posted Content

Discriminative Feature Representation with Spatio-temporal Cues for Vehicle Re-identification.

TL;DR: Experimental results on two public datasets demonstrate that DFR-ST outperforms state-of-the-art methods, which validates the effectiveness of the proposed method.
References
Proceedings Article

Going deeper with convolutions

TL;DR: Inception is a deep convolutional neural network architecture that achieved a new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC14).
Journal Article

Visualizing Data using t-SNE

TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.
Proceedings Article

FaceNet: A unified embedding for face recognition and clustering

TL;DR: A system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, and achieves state-of-the-art face recognition performance using only 128 bytes per face.
Book Chapter

Stacked Hourglass Networks for Human Pose Estimation

TL;DR: This work introduces a novel convolutional network architecture for the task of human pose estimation that is described as a “stacked hourglass” network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions.
Proceedings Article

Scalable Person Re-identification: A Benchmark

TL;DR: As a minor contribution, inspired by recent advances in large-scale image search, an unsupervised Bag-of-Words descriptor is proposed that yields competitive accuracy on the VIPeR, CUHK03, and Market-1501 datasets and is scalable on the large-scale 500k dataset.