
Showing papers by "Yap-Peng Tan" published in 2015


Proceedings Article
07 Jun 2015
TL;DR: This paper proposes a new deep transfer metric learning (DTML) method to learn a set of hierarchical nonlinear transformations for cross-domain visual recognition by transferring discriminative knowledge from the labeled source domain to the unlabeled target domain.
Abstract: Conventional metric learning methods usually assume that the training and test samples are captured in similar scenarios and therefore share the same distribution. This assumption does not hold in many real visual recognition applications, especially when samples are captured across different datasets. In this paper, we propose a new deep transfer metric learning (DTML) method to learn a set of hierarchical nonlinear transformations for cross-domain visual recognition by transferring discriminative knowledge from the labeled source domain to the unlabeled target domain. Specifically, our DTML learns a deep metric network by maximizing the inter-class variations, minimizing the intra-class variations, and minimizing the distribution divergence between the source domain and the target domain at the top layer of the network. To better exploit the discriminative information from the source domain, we further develop a deeply supervised transfer metric learning (DSTML) method by including an additional objective on DTML where the outputs of both the hidden layers and the top layer are optimized jointly. Experimental results on cross-dataset face verification and person re-identification validate the effectiveness of the proposed methods.
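The three terms named in this abstract can be illustrated with a toy computation. The sketch below is not the authors' implementation: it assumes the features have already been produced by some network, averages over all sample pairs, and measures domain divergence as the squared distance between the source and target feature means. The function names and the weights `alpha` and `beta` are hypothetical.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vec(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def intra_inter(feats, labels):
    """Average squared distance over same-class and different-class pairs."""
    intra, inter = [], []
    for i, j in combinations(range(len(feats)), 2):
        d = sq_dist(feats[i], feats[j])
        (intra if labels[i] == labels[j] else inter).append(d)

    def avg(values):
        return sum(values) / len(values) if values else 0.0

    return avg(intra), avg(inter)


def domain_gap(src, tgt):
    """Assumed divergence: squared distance between the two domain means."""
    return sq_dist(mean_vec(src), mean_vec(tgt))


def dtml_style_loss(src, labels, tgt, alpha=1.0, beta=1.0):
    """Small intra-class spread, large inter-class spread, small domain gap."""
    intra, inter = intra_inter(src, labels)
    return intra - alpha * inter + beta * domain_gap(src, tgt)
```

Only the source features need labels here; the target features enter through the domain-gap term alone, which matches the labeled-source, unlabeled-target setting the abstract describes.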

190 citations


Journal Article
TL;DR: This paper proposes a complexity-scalable video encoding method and a power-rate-distortion (PRD) model for video chat that describes the PRD characteristics more accurately, yet with lower complexity in online updates of its coefficients.
Abstract: Wireless video chat is a power-consuming and bitrate-intensive application. Unlike video streaming, which is one-way traffic, video chat features distributed two-way traffic relayed via base stations, where resource allocation of a client affects the video quality seen by its communicating partner. In this paper, we study the mechanism design of this application via dynamic pricing, and seek efficiency and fairness of resource utilization. Specifically, we assume that the base station relays video bitstreams and charges a service price on the clients based on the transmission power consumption. Based on the price and a given power budget, the clients allocate bitrate and power for video coding and transmission such that the service price and the distortion seen by their partners are minimized. We study such network dynamics in a Stackelberg game-theoretic framework. To solve the problem, we propose a complexity-scalable video encoding method and a power-rate-distortion (PRD) model for video chat. The model is more accurate in describing the PRD characteristics, yet of lower complexity in online updates of its coefficients. Based on the PRD model, we derive the distributed rate and power allocations for the clients. We show that a simple pricing update in the base stations is sufficient for optimal pricing. The proposed algorithms are optimal and converge to the Stackelberg equilibrium. Existing SNR- and power-based pricing schemes cannot ensure fairness and efficiency simultaneously. We propose a hybrid pricing scheme that balances these conflicting criteria. Extensive simulations demonstrate superior performance of the proposed methods and solutions.

12 citations


Proceedings Article
19 May 2015
TL;DR: This work collects an unconstrained face dataset containing 455 pairs of identical twins, uses it to generate negative face pairs, and evaluates several baseline verification models for fine-grained unconstrained face verification.
Abstract: This paper investigates the problem of fine-grained face verification under unconstrained conditions. For the conventional face verification task, the verification model is trained with some positive and negative face pairs, where each positive sample pair contains two face images of the same person while each negative sample pair usually consists of two face images from different subjects. However, in many real applications, the faces of twins look very similar even though they form a negative pair in face verification. Therefore, it is important for a practical face verification system to determine whether a given face pair comes from the same person or from a pair of twins, because most existing face verification systems fail to work well in such a scenario. In this work, we define the problem as fine-grained face verification and collect an unconstrained face dataset which contains 455 pairs of identical twins to generate negative face pairs to evaluate several baseline verification models for fine-grained unconstrained face verification. Benchmark results on the unsupervised setting and restricted setting show the challenge of fine-grained face verification in the wild.

10 citations


Proceedings Article
19 May 2015
TL;DR: This paper presents a new discriminative transfer learning (DTL) approach for SSFR, where discriminant analysis is performed on a multiple-sample generic training set and then transferred into the single-sample gallery set.
Abstract: Discriminant analysis is an important technique for face recognition because it can extract discriminative features to classify different persons. However, most existing discriminant analysis methods fail to work for single-sample face recognition (SSFR) because there is only a single training sample per person, so the within-class variation of this person cannot be estimated in such a scenario. In this paper, we present a new discriminative transfer learning (DTL) approach for SSFR, where discriminant analysis is performed on a multiple-sample generic training set and then transferred into the single-sample gallery set. Specifically, our DTL learns a feature projection that simultaneously minimizes the intra-class variation and maximizes the inter-class variation of samples in the training set, and minimizes the difference between the generic training set and the gallery set. Experimental results on three face datasets, FERET, CAS-PEAL-R1, and LFW, are presented to show the efficacy of our method.

7 citations


Proceedings Article
10 Dec 2015
TL;DR: The proposed method significantly improves the localized search accuracy over the baseline, which treats each frame independently, and is able to find the top 100 object trajectories in the 5.5-hour dataset within 30 seconds.
Abstract: We present an efficient approach to search for and locate all occurrences of a specific object in large video volumes, given a single query example. Locations of object occurrences are returned as spatio-temporal trajectories in the 3D video volume. Although much work exists on object instance search in image datasets, these methods locate the object independently in each image and therefore do not preserve the spatio-temporal consistency in consecutive video frames. This results in sub-optimal performance if directly applied to videos, as will be shown in our experiments. We propose to locate the object jointly across video frames using spatio-temporal search. The efficiency and effectiveness of the proposed approach are demonstrated on a consumer video dataset consisting of crawled YouTube videos and mobile-captured consumer clips. Our method significantly improves the localized search accuracy over the baseline, which treats each frame independently. Moreover, it is able to find the top 100 object trajectories in the 5.5-hour dataset within 30 seconds.
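The contrast between per-frame localization and joint localization across frames can be shown with a small dynamic-programming sketch. This is a generic illustration, not the search algorithm of the paper: it assumes each frame already has a non-empty list of candidate boxes with match scores, and it rewards overlap (IoU) between boxes chosen in consecutive frames. The function names and the `smooth` weight are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def best_trajectory(frames, smooth=1.0):
    """Pick one candidate per frame, maximizing score plus temporal overlap.

    frames: list of non-empty lists of (box, score) pairs, one list per frame.
    Returns (indices of the chosen candidate in each frame, total value).
    """
    best = [score for _, score in frames[0]]
    back = []  # back[t-1][j]: best predecessor of candidate j in frame t
    for t in range(1, len(frames)):
        cur, ptr = [], []
        for box, score in frames[t]:
            cands = [
                best[k] + smooth * iou(frames[t - 1][k][0], box)
                for k in range(len(frames[t - 1]))
            ]
            k = max(range(len(cands)), key=cands.__getitem__)
            cur.append(cands[k] + score)
            ptr.append(k)
        best = cur
        back.append(ptr)

    # Trace the best final candidate back to the first frame.
    k = max(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path, total
```

With `smooth=0` the overlap term vanishes and the result reduces to taking the highest-scoring candidate in each frame independently, which corresponds to the per-frame baseline the abstract compares against.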

6 citations


Journal Article
TL;DR: After proving the NP-hardness of the QoERA problem, this work proposes an adaptive solution where resource block assignment, power allocation and modulation selection are jointly optimized to enhance multiuser QoE.

4 citations


Proceedings Article
13 Oct 2015
TL;DR: This paper proposes a graph-based optimization framework that leverages category-independent object proposals (candidate object regions) for logo search in a large-scale image database, together with an efficient feature descriptor, EdgeBoW, which yields promising results, especially for object categories defined primarily by their shape.
Abstract: We propose a graph-based optimization framework to leverage category-independent object proposals (candidate object regions) for logo search in a large-scale image database. The proposed contour-based feature descriptor EdgeBoW is robust to view-angle changes and varying illumination conditions, and can implicitly capture the significant object shape information. Equipped with a local descriptor, it can handle a fair amount of the occlusion and deformation frequently present in real-life scenarios. Given a small set of initially retrieved candidate object proposals, a fast graph-based short-listing scheme is designed to exploit the mutual similarities among these proposals for eliminating outliers. In contrast to a coarse image-level pairwise similarity measure, this search, focused on a few specific image regions, provides a more accurate method for matching. The proposed query expansion strategy aims to assess each of the remaining better-matched proposals against all its neighbors within the same image for a precise localization. Combined with the efficient feature descriptor EdgeBoW, a set of more insightful edge-weights and node-utility measures can yield promising results, especially for object categories primarily defined by their shape. An extensive set of experiments on a number of benchmark datasets demonstrates its effectiveness and superior generalization ability in both clutter-intensive real-life images and poor-quality binary document images.

4 citations


Journal Article
TL;DR: A novel method for motion-compensated reference-driven MR image reconstruction is presented, and an efficient algorithm is proposed to solve the minimization problem through joint application of the augmented Lagrangian method and the alternating direction minimization method.

1 citation