Author

Xueming Qian

Other affiliations: Chinese Ministry of Education
Bio: Xueming Qian is an academic researcher from Xi'an Jiaotong University. The author has contributed to research in topics: Computer science & Image retrieval. The author has an h-index of 31 and has co-authored 176 publications receiving 3572 citations. Previous affiliations of Xueming Qian include Chinese Ministry of Education.


Papers
Journal ArticleDOI
TL;DR: Comprehensive experiments demonstrate the superiority of the SDSAE-based high-level feature learning method and the effectiveness of the weakly supervised semantic annotation framework compared with state-of-the-art fully supervised annotation methods.
Abstract: In this paper, we focus on tackling the problem of automatic semantic annotation of high resolution (HR) optical satellite images, which aims to assign one or several predefined semantic concepts to an image according to its content. The main challenges arise from the difficulty of characterizing complex and ambiguous contents of the satellite images and the high human labor cost caused by preparing a large amount of training examples with high-quality pixel-level labels in fully supervised annotation methods. To address these challenges, we propose a unified annotation framework by combining discriminative high-level feature learning and weakly supervised feature transferring. Specifically, an efficient stacked discriminative sparse autoencoder (SDSAE) is first proposed to learn high-level features on an auxiliary satellite image data set for the land-use classification task. Motivated by the observation that the encoder of the prelearned SDSAE can be regarded as a generic high-level feature extractor for HR optical satellite images, we then transfer the learned high-level features to semantic annotation. To compensate for the difference between the auxiliary data set and the annotation data set, the transferred high-level features are further fine-tuned in a weakly supervised scheme by using the tile-level annotated training data. Finally, the fine-tuning process is formulated as an ultimate optimization problem, which can be solved efficiently with our proposed alternate iterative optimization method. Comprehensive experiments on a publicly available land-use classification data set and an annotation data set demonstrate the superiority of our SDSAE-based high-level feature learning method and the effectiveness of our weakly supervised semantic annotation framework compared with state-of-the-art fully supervised annotation methods.

317 citations

Journal ArticleDOI
TL;DR: Three social factors, personal interest, interpersonal interest similarity, and interpersonal influence, are fused into a unified personalized recommendation model based on probabilistic matrix factorization, and results show the proposed approach outperforms the existing RS approaches.
Abstract: With the advent and popularity of social networks, more and more users like to share their experiences, such as ratings, reviews, and blogs. New factors of social networks, such as interpersonal influence and interest based on circles of friends, bring opportunities and challenges for recommender systems (RS) to solve the cold start and sparsity problems of datasets. Some of the social factors have been used in RS, but have not been fully considered. In this paper, three social factors, personal interest, interpersonal interest similarity, and interpersonal influence, are fused into a unified personalized recommendation model based on probabilistic matrix factorization. The factor of personal interest can make the RS recommend items to meet users' individualities, especially for experienced users. Moreover, for cold start users, the interpersonal interest similarity and interpersonal influence can enhance the intrinsic link among features in the latent space. We conduct a series of experiments on three rating datasets: Yelp, MovieLens, and Douban Movie. Experimental results show the proposed approach outperforms the existing RS approaches. Index Terms: interpersonal influence, personal interest, recommender system, social networks.

293 citations
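
For readers unfamiliar with the base model named in the abstract above, the following is a minimal sketch of plain probabilistic matrix factorization fitted by stochastic gradient descent. It does not reproduce the paper's social-factor terms, whose exact form the abstract does not give. The function names, the rank, the learning rate, and the regularization weight are all illustrative assumptions.

```python
import random

def train_pmf(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.01,
              epochs=2000, seed=0):
    """Fit user factors U and item factors V so that U[u] . V[i] ~ rating.

    Plain PMF only: squared error plus L2 penalties on U and V.
    All hyperparameter defaults are assumptions for this sketch.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Prediction error for this observed (user, item, rating) triple.
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for k in range(rank):
                u_k, v_k = U[u][k], V[i][k]
                # Gradient step on the regularized squared error.
                U[u][k] += lr * (err * v_k - reg * u_k)
                V[i][k] += lr * (err * u_k - reg * v_k)
    return U, V

def rmse(ratings, U, V):
    """Root mean squared error of the factor model on observed ratings."""
    se = [(r - sum(a * b for a, b in zip(U[u], V[i]))) ** 2
          for u, i, r in ratings]
    return (sum(se) / len(se)) ** 0.5

# Toy data: (user index, item index, rating). Purely illustrative.
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
U, V = train_pmf(toy_ratings, n_users=3, n_items=3)
print(round(rmse(toy_ratings, U, V), 3))
```

A social extension in the spirit of the abstract would add further penalty terms tying a user's factors to those of friends or to the user's own interest profile; those terms are deliberately left out here rather than guessed.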

Journal ArticleDOI
TL;DR: An author topic model-based collaborative filtering (ATCF) method is proposed to facilitate comprehensive points of interest (POIs) recommendations for social users and advantages and superior performance of this approach are demonstrated by extensive experiments on a large collection of data.
Abstract: Social media has created a continuing need for automatic travel recommendations. Collaborative filtering (CF) is the most well-known approach. However, existing approaches generally suffer from various weaknesses. For example, sparsity can significantly degrade the performance of traditional CF. If a user only visits very few locations, accurate similar user identification becomes very challenging due to lack of sufficient information for effective inference. Moreover, existing recommendation approaches often ignore rich user information like textual descriptions of photos which can reflect users’ travel preferences. The topic model (TM) method is an effective way to solve the “sparsity problem,” but is still far from satisfactory. In this paper, an author topic model-based collaborative filtering (ATCF) method is proposed to facilitate comprehensive points of interest (POIs) recommendations for social users. In our approach, user preference topics, such as cultural, cityscape, or landmark, are extracted from the geo-tag constrained textual description of photos via the author topic model instead of only from the geo-tags (GPS locations). Advantages and superior performance of our approach are demonstrated by extensive experiments on a large collection of data.

215 citations

Journal ArticleDOI
TL;DR: The conventional local binary pattern (LBP) is extended to the pyramid transform domain (PLBP) by cascading the LBP information of hierarchical spatial pyramids. PLBP descriptors take texture resolution variations into account and show their effectiveness for texture representation.

193 citations
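
The idea summarized in the entry above, cascading LBP information across pyramid levels, can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method: the pyramid is built here by simple 2x2 block averaging, the LBP uses the basic 8-neighbour form with a `>=` comparison, and the function names are hypothetical.

```python
def lbp_histogram(img):
    """256-bin histogram of basic 8-neighbour LBP codes over interior pixels."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if img[y + dy][x + dx] >= img[y][x]:
                    code |= 1 << bit
            hist[code] += 1
    return hist

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (assumed pyramid step)."""
    h = len(img) // 2
    w = len(img[0]) // 2 if img else 0
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]

def pyramid_lbp(img, levels=2):
    """Concatenate the LBP histograms computed at each pyramid level."""
    feature = []
    for _ in range(levels):
        feature.extend(lbp_histogram(img))
        img = downsample(img)
    return feature

# A flat 8x8 image: every interior pixel gets code 255 at both levels.
flat = [[7] * 8 for _ in range(8)]
descriptor = pyramid_lbp(flat, levels=2)
print(len(descriptor), descriptor[255], descriptor[256 + 255])
```

Concatenating per-level histograms keeps the contribution of each resolution separate, which is one simple way to let a descriptor reflect texture at several scales.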

Journal ArticleDOI
TL;DR: A personalized travel sequence recommendation from both travelogues and community-contributed photos and the heterogeneous metadata (e.g., tags, geo-location, and date taken) associated with these photos is presented.
Abstract: Big data increasingly benefits both research and industrial areas such as health care, financial services, and commercial recommendation. This paper presents a personalized travel sequence recommendation from both travelogues and community-contributed photos and the heterogeneous metadata (e.g., tags, geo-location, and date taken) associated with these photos. Unlike most existing travel recommendation approaches, our approach is not only personalized to the user's travel interest but also able to recommend a travel sequence rather than individual Points of Interest (POIs). A topical package space, including representative tags and the distributions of cost, visiting time, and visiting season of each topic, is mined to bridge the vocabulary gap between user travel preference and travel routes. We take advantage of the complementarity of two kinds of social media: travelogues and community-contributed photos. We map both the user's and the routes’ textual descriptions to the topical package space to get a user topical package model and a route topical package model (i.e., topical interest, cost, time, and season). To recommend a personalized POI sequence, famous routes are first ranked according to the similarity between user package and route package. Then the top-ranked routes are further optimized using the travel records of socially similar users. Representative images with viewpoint and seasonal diversity of POIs are shown to offer a more comprehensive impression. We evaluate our recommendation system on a collection of 7 million Flickr images uploaded by 7,387 users and 24,008 travelogues covering 864 travel POIs in nine famous cities, and show its effectiveness. We also contribute a new dataset with more than 200 K photos with heterogeneous metadata in nine famous cities.

152 citations


Cited by
Journal ArticleDOI
TL;DR: A large-scale data set, termed “NWPU-RESISC45,” is proposed, which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU).
Abstract: Remote sensing image scene classification plays an important role in a wide range of applications and hence has been receiving remarkable attention. During the past years, significant efforts have been made to develop various datasets or present a variety of approaches for scene classification from remote sensing images. However, a systematic review of the literature concerning datasets and methods for scene classification is still lacking. In addition, almost all existing datasets have a number of limitations, including the small scale of scene classes and the image numbers, the lack of image variations and diversity, and the saturation of accuracy. These limitations severely limit the development of new approaches especially deep learning-based methods. This paper first provides a comprehensive review of the recent progress. Then, we propose a large-scale dataset, termed "NWPU-RESISC45", which is a publicly available benchmark for REmote Sensing Image Scene Classification (RESISC), created by Northwestern Polytechnical University (NWPU). This dataset contains 31,500 images, covering 45 scene classes with 700 images in each class. The proposed NWPU-RESISC45 (i) is large-scale on the scene classes and the total image number, (ii) holds big variations in translation, spatial resolution, viewpoint, object pose, illumination, background, and occlusion, and (iii) has high within-class diversity and between-class similarity. The creation of this dataset will enable the community to develop and evaluate various data-driven algorithms. Finally, several representative methods are evaluated using the proposed dataset and the results are reported as a useful baseline for future research.

1,424 citations

Journal ArticleDOI
TL;DR: This paper proposes a novel and effective approach to learn a rotation-invariant CNN (RICNN) model for advancing the performance of object detection, which is achieved by introducing and learning a new rotation-invariant layer on the basis of the existing CNN architectures.
Abstract: Object detection in very high resolution optical remote sensing images is a fundamental problem in remote sensing image analysis. Due to the advances of powerful feature representations, machine-learning-based object detection is receiving increasing attention. Although numerous feature representations exist, most of them are handcrafted or shallow-learning-based features. As the object detection task becomes more challenging, their description capability becomes limited or even impoverished. More recently, deep learning algorithms, especially convolutional neural networks (CNNs), have shown their much stronger feature representation power in computer vision. Despite the progress made in natural scene images, it is problematic to directly use the CNN feature for object detection in optical remote sensing images because it is difficult to effectively deal with the problem of object rotation variations. To address this problem, this paper proposes a novel and effective approach to learn a rotation-invariant CNN (RICNN) model for advancing the performance of object detection, which is achieved by introducing and learning a new rotation-invariant layer on the basis of the existing CNN architectures. However, different from the training of traditional CNN models that only optimizes the multinomial logistic regression objective, our RICNN model is trained by optimizing a new objective function via imposing a regularization constraint, which explicitly enforces the feature representations of the training samples before and after rotating to be mapped close to each other, hence achieving rotation invariance. To facilitate training, we first train the rotation-invariant layer and then domain-specifically fine-tune the whole RICNN network to further boost the performance. Comprehensive evaluations on a publicly available ten-class object detection data set demonstrate the effectiveness of the proposed method.

1,370 citations
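
The regularized objective described in the abstract above can be illustrated with a small sketch. The abstract states only that features before and after rotation are pulled close together; the squared Euclidean distance, the averaging over rotated copies, the weight `lam`, and both function names are assumptions made here for illustration.

```python
def rotation_invariance_penalty(feat_original, feats_rotated):
    """Mean squared distance between a sample's feature vector and the
    feature vectors of its rotated copies (assumed form of the constraint)."""
    total = 0.0
    for feat in feats_rotated:
        total += sum((a - b) ** 2 for a, b in zip(feat_original, feat))
    return total / len(feats_rotated)

def regularized_objective(classification_loss, feat_original, feats_rotated,
                          lam=0.1):
    """Classification loss plus a weighted rotation-invariance penalty.

    lam is a hypothetical trade-off weight, not a value from the paper.
    """
    penalty = rotation_invariance_penalty(feat_original, feats_rotated)
    return classification_loss + lam * penalty

# Identical features before and after rotation incur no penalty.
print(rotation_invariance_penalty([1.0, 2.0], [[1.0, 2.0], [1.0, 2.0]]))
# Features that move under rotation raise the objective.
print(regularized_objective(1.0, [0.0, 0.0], [[3.0, 4.0]], lam=0.1))
```

In a real network the feature vectors would come from the layer being regularized, and the penalty would be minimized jointly with the classification loss during training.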