Study report on “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

A morphable model for the synthesis of 3D faces

Offline training for object tracking has recently shown great potential in balancing tracking accuracy and speed. However, it is still difficult to adapt an offline-trained model to a target tracked online. This work presents a Residual Attentional Siamese Network (RASNet) for high-performance object tracking. The RASNet model reformulates the correlation filter within a Siamese tracking framework and introduces different kinds of attention mechanisms to adapt the model without updating it online. In particular, by exploiting the offline-trained general attention, the target-adapted residual attention, and the channel-favored feature attention, RASNet not only mitigates the over-fitting problem in deep network training, but also enhances its discriminative capacity and adaptability through the separation of representation learning and discriminator learning. The proposed deep architecture is trained end to end and takes full advantage of the rich spatial-temporal information to achieve robust visual tracking. Experimental results on two recent benchmarks, OTB-2015 and VOT2017, show that the RASNet tracker achieves state-of-the-art tracking accuracy while running at more than 80 frames per second.

/pdf/learning-attentions-residual-attentional-siamese-network-for-5ajs3n697g.pdf

Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking

As a fundamental and critical task in various visual applications, image matching identifies and then puts into correspondence the same or similar structure/content from two or more images. Over the past decades, a growing number and diversity of methods have been proposed for image matching, particularly with the development of deep learning techniques in recent years. However, several questions remain open: which method is a suitable choice for a specific application, given different scenarios and task requirements, and how to design better image matching methods with superior accuracy, robustness, and efficiency. This encourages us to conduct a comprehensive and systematic review and analysis of both classical and recent techniques. Following the feature-based image matching pipeline, we first introduce feature detection, description, and matching techniques, from handcrafted methods to trainable ones, and provide an analysis of the development of these methods in theory and practice. Secondly, we briefly introduce several typical image matching-based applications for a comprehensive understanding of the significance of image matching. In addition, we provide a comprehensive and objective comparison of these classical and recent techniques through extensive experiments on representative datasets. Finally, we conclude with the current status of image matching technologies and deliver insightful discussions and prospects for future work. This survey can serve as a reference for (but is not limited to) researchers and engineers in image matching and related fields.

/pdf/image-matching-from-handcrafted-to-deep-features-a-survey-27kn6w026w.pdf

Image Matching from Handcrafted to Deep Features: A Survey
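
As a small illustration of the matching stage of the feature-based pipeline described in the abstract above, the following Python sketch pairs descriptors by nearest neighbour and filters ambiguous pairs with a ratio test. This is a common handcrafted baseline chosen here for illustration, not a method taken from the survey itself; the function name `match_descriptors`, the `ratio` default, and the toy descriptors are assumptions.

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    A match is kept only if the nearest distance is below `ratio` times the
    second-nearest distance (ratio test). Returns a list of (index_a, index_b).
    Hypothetical helper for illustration; descriptors are plain tuples.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Euclidean distance from d to every descriptor in desc_b, smallest first.
        ranked = sorted((math.dist(d, e), j) for j, e in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Toy 2-D "descriptors"; real descriptors have far more dimensions.
desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.1, 0.0), (5.0, 5.1), (10.0, 10.0)]
print(match_descriptors(desc_a, desc_b))
```

The ratio test discards a candidate when two descriptors in the second image are almost equally close, which is one simple way to trade recall for precision before any geometric verification.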

Emotion recognition in human-computer interaction

This paper presents a coupled discriminative feature learning (CDFL) method for heterogeneous face recognition (HFR). Unlike most existing HFR approaches, which use hand-crafted feature descriptors for face representation, our CDFL directly learns discriminative features from raw pixels for face representation. In particular, a couple of image filters is learned in CDFL to simultaneously exploit discriminative information and reduce the appearance difference of face images captured across different modalities. With the help of the learned filters, CDFL can maximize the interclass variations and minimize the intraclass variations of the learned feature vectors, while also maximizing the correlation of face images of the same person from different modalities by solving a generalized eigenvalue problem. Experimental results on three different heterogeneous face recognition applications show the effectiveness of our proposed approach.

/pdf/coupled-discriminative-feature-learning-for-heterogeneous-13uc1m4a4r.pdf

Coupled Discriminative Feature Learning for Heterogeneous Face Recognition

With the rapid development of Internet Web 2.0 technology, the demands of large-scale distributed service and storage in cloud computing have brought great challenges to traditional relational databases. NoSQL databases, which break the shackles of the RDBMS, are becoming the focus of attention. In this paper, the principles and implementation mechanisms of Auto-Sharding in the MongoDB database are first presented; then an improved algorithm based on the frequency of data operations is proposed to solve the problem of uneven data distribution in auto-sharding. The improved balancing strategy can effectively balance the data among shards and improve the cluster's concurrent read and write performance.

Research on the improvement of MongoDB Auto-Sharding in cloud environment
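
The abstract above does not spell out the improved algorithm, so the following Python sketch only illustrates the general idea of balancing by operation frequency rather than by chunk count. Everything in it is an assumption made for illustration: the function names, the greedy move rule, and the toy data. It does not use or model any real MongoDB API.

```python
def shard_loads(placement, freq, shards):
    """Sum the operation frequency of the chunks placed on each shard."""
    load = {s: 0 for s in shards}
    for chunk, shard in placement.items():
        load[shard] += freq[chunk]
    return load


def rebalance_by_frequency(placement, freq, shards, max_moves=100):
    """Greedily move chunks from the busiest shard to the least busy one.

    placement: dict chunk id -> shard name; freq: dict chunk id -> op count.
    Returns (new placement, list of (chunk, source, target) moves).
    Hypothetical balancing rule, not the algorithm from the paper.
    """
    placement = dict(placement)  # do not mutate the caller's mapping
    moves = []
    for _ in range(max_moves):
        load = shard_loads(placement, freq, shards)
        hot = max(shards, key=lambda s: load[s])
        cold = min(shards, key=lambda s: load[s])
        gap = load[hot] - load[cold]
        # Moving a chunk with frequency f changes the gap to |gap - 2f|,
        # so only chunks with 0 < f < gap make the two shards more even.
        candidates = [c for c, s in placement.items()
                      if s == hot and 0 < freq[c] < gap]
        if not candidates:
            break
        best = min(candidates, key=lambda c: abs(gap - 2 * freq[c]))
        placement[best] = cold
        moves.append((best, hot, cold))
    return placement, moves


shards = ["A", "B", "C"]
placement = {"c1": "A", "c2": "A", "c3": "A", "c4": "B"}
freq = {"c1": 50, "c2": 30, "c3": 20, "c4": 10}
new_placement, moves = rebalance_by_frequency(placement, freq, shards)
print(shard_loads(new_placement, freq, shards), moves)
```

In this toy run all chunks that receive traffic start on shard A; weighting by frequency spreads the hot chunks even though a count-based balancer would only look at how many chunks each shard holds.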

Recently, deep neural networks have been widely employed to deal with the visual tracking problem. In this work, we present a new deep architecture which incorporates temporal and spatial information to boost tracking performance. Our deep architecture contains three networks: a Feature Net, a Temporal Net, and a Spatial Net. The Feature Net extracts general feature representations of the target. With these feature representations, the Temporal Net encodes the trajectory of the target and directly learns temporal correspondences to estimate the object state from a global perspective. Based on the learning results of the Temporal Net, the Spatial Net further refines the object tracking state using local spatial object information. Extensive experiments on four of the largest tracking benchmarks, including VOT2014, VOT2016, OTB50, and OTB100, demonstrate competitive performance of the proposed tracker against a number of state-of-the-art algorithms.

Robust Object Tracking Based on Temporal and Spatial Deep Networks

Multi-Label Learning (MLL) aims to learn from training data where each example is represented by a single instance while associated with a set of candidate labels. Most existing MLL methods are designed to handle the problem of missing labels. However, in many real-world scenarios, the labeling information for multi-label data is redundant, which cannot be handled by classical MLL methods. A novel Partial Multi-label Learning (PML) framework has therefore been proposed to cope with this problem, i.e., removing the noisy labels from the multi-label sets. In this paper, in order to further improve the denoising capability of the PML framework, we utilize the low-rank and sparse decomposition scheme and propose a novel Partial Multi-label Learning by Low-Rank and Sparse decomposition (PML-LRS) approach. Specifically, we first reformulate the observed label set into a label matrix, and then decompose it into a ground-truth label matrix and an irrelevant label matrix, where the former is constrained to be low rank and the latter is assumed to be sparse. Next, we utilize the feature mapping matrix to explore the label correlations, and meanwhile constrain the feature mapping matrix to be low rank to prevent the proposed method from overfitting. Finally, we obtain the ground-truth labels by minimizing the label loss, where the Augmented Lagrange Multiplier (ALM) algorithm is incorporated to solve the optimization problem. Extensive experimental results demonstrate that PML-LRS can achieve superior or competitive performance against other state-of-the-art methods.

/pdf/partial-multi-label-learning-by-low-rank-and-sparse-3vwsuta6s9.pdf

Partial Multi-Label Learning by Low-Rank and Sparse Decomposition.
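
For orientation, the decomposition described in the abstract above has the shape of the generic low-rank plus sparse decomposition problem, written below in its standard convex form. The symbols are assumptions introduced here ($Y$ for the observed label matrix, $Y_g$ for the ground-truth part, $Y_n$ for the irrelevant part, $\lambda$ for a trade-off weight); the paper's actual objective also involves the feature mapping matrix and the label loss, and may differ from this sketch.

```latex
\min_{Y_g,\, Y_n} \; \lVert Y_g \rVert_{*} + \lambda \lVert Y_n \rVert_{1}
\quad \text{subject to} \quad Y = Y_g + Y_n
```

Here the nuclear norm $\lVert \cdot \rVert_{*}$ is the usual convex surrogate for rank and the $\ell_1$ norm is the usual convex surrogate for sparsity, which is what makes ALM-style solvers applicable to problems of this form.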

Learning-based approaches to graph matching have been developed and explored for more than a decade and have grown rapidly in scope and popularity in recent years. However, previous learning-based algorithms, with or without deep learning strategies, mainly focus on learning how to generate node and/or edge affinities, and pay less attention to learning the combinatorial solver. In this paper we propose a fully trainable framework for graph matching, in which learning of affinities and solving for combinatorial optimization are not explicitly separated as in much prior art. We first convert the problem of building node correspondences between two input graphs into the problem of selecting reliable nodes from a constructed assignment graph. Subsequently, the graph network block module is adopted to perform computation on the graph to form structured representations for each node. It finally predicts a label for each node that is used for node classification, and the training is performed under the supervision of both permutation differences and the one-to-one matching constraints. The proposed method is evaluated on four public benchmarks in comparison with several state-of-the-art algorithms, and the experimental results illustrate its excellent performance.

/pdf/learning-combinatorial-solver-for-graph-matching-55rtr04jjn.pdf
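
The conversion from node correspondence to node selection mentioned in the graph matching abstract above can be pictured with the usual assignment-graph construction, sketched below in Python. This is a minimal sketch under assumptions: the function names are hypothetical, the graphs are undirected and unattributed, and the learned components (affinities, graph network blocks, node classifier) are omitted entirely.

```python
from itertools import product


def build_assignment_graph(nodes1, edges1, nodes2, edges2):
    """Return (nodes, edges) of the assignment graph of two undirected graphs.

    Each assignment-graph node (i, a) is a candidate match of node i in the
    first graph with node a in the second. Two candidates (i, a) and (j, b)
    are connected when (i, j) is an edge of the first graph and (a, b) is an
    edge of the second.
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    nodes = list(product(nodes1, nodes2))
    edges = set()
    for (i, a), (j, b) in product(nodes, nodes):
        if frozenset((i, j)) in e1 and frozenset((a, b)) in e2:
            edges.add(frozenset(((i, a), (j, b))))
    return nodes, edges


def is_one_to_one(selected):
    """Check that no node of either graph is used by two selected candidates."""
    firsts = [i for i, _ in selected]
    seconds = [a for _, a in selected]
    return len(set(firsts)) == len(firsts) and len(set(seconds)) == len(seconds)


# A triangle matched against a three-node path.
nodes, edges = build_assignment_graph(
    [0, 1, 2], [(0, 1), (1, 2), (0, 2)],
    ["a", "b", "c"], [("a", "b"), ("b", "c")],
)
print(len(nodes), len(edges))
print(is_one_to_one([(0, "a"), (1, "b")]), is_one_to_one([(0, "a"), (0, "b")]))
```

Selecting a subset of assignment-graph nodes that passes `is_one_to_one` corresponds to a valid partial matching, which is why node classification on this graph can stand in for solving the matching directly.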

Yi Jin

Papers

Coupled Discriminative Feature Learning for Heterogeneous Face Recognition

Research on the improvement of MongoDB Auto-Sharding in cloud environment

Robust Object Tracking Based on Temporal and Spatial Deep Networks

Partial Multi-Label Learning by Low-Rank and Sparse Decomposition.

Learning Combinatorial Solver for Graph Matching