scispace - formally typeset
Search or ask a question
Author

Evan Wei Xiang

Other affiliations: Baidu
Bio: Evan Wei Xiang is an academic researcher from Hong Kong University of Science and Technology. The author has contributed to research in topics: Transfer of learning & Collaborative filtering. The author has an hindex of 16, co-authored 22 publications receiving 1399 citations. Previous affiliations of Evan Wei Xiang include Baidu.

Papers
More filters
Proceedings Article
11 Jul 2010
TL;DR: This paper discovers the principle coordinates of both users and items in the auxiliary data matrices, and transfers them to the target domain in order to reduce the effect of data sparsity.
Abstract: Data sparsity is a major problem for collaborative filtering (CF) techniques in recommender systems, especially for new users and items. We observe that, while our target data are sparse for CF systems, related and relatively dense auxiliary data may already exist in some other more mature application domains. In this paper, we address the data sparsity problem in a target domain by transferring knowledge about both users and items from auxiliary data sources. We observe that in different domains the user feedbacks are often heterogeneous such as ratings vs. clicks. Our solution is to integrate both user and item knowledge in auxiliary data sources through a principled matrix-based transfer learning framework that takes into account the data heterogeneity. In particular, we discover the principle coordinates of both users and items in the auxiliary data matrices, and transfer them to the target domain in order to reduce the effect of data sparsity. We describe our method, which is known as coordinate system transfer or CST, and demonstrate its effectiveness in alleviating the data sparsity problem in collaborative filtering. We show that our proposed method can significantly outperform several state-of-the-art solutions for this problem.

359 citations

Proceedings ArticleDOI
16 Jul 2011
TL;DR: This paper explores how to use the binary preference data expressed in the form of like/dislike to help reduce the impact of data sparsity of more expressive numerical ratings.
Abstract: Data sparsity due to missing ratings is a major challenge for collaborative filtering (CF) techniques in recommender systems This is especially true for CF domains where the ratings are expressed numerically We observe that, while we may lack the information in numerical ratings, we may have more data in the form of binary ratings This is especially true when users can easily express themselves with their likes and dislikes for certain items In this paper, we explore how to use the binary preference data expressed in the form of like/dislike to help reduce the impact of data sparsity of more expressive numerical ratings We do this by transferring the rating knowledge from some auxiliary data source in binary form (that is, likes or dislikes), to a target numerical rating matrix Our solution is to model both numerical ratings and like/dislike in a principled way, using a novel framework of Transfer by Collective Factorization (TCF) In particular, we construct the shared latent space collectively and learn the data-dependent effect separately A major advantage of the TCF approach over previous collective matrix factorization (or bifactorization) methods is that we are able to capture the data-dependent effect when sharing the data-independent knowledge, so as to increase the over-all quality of knowledge transfer Experimental results demonstrate the effectiveness of TCF at various sparsity levels as compared to several state-of-the-art methods

132 citations

Proceedings ArticleDOI
26 Sep 2010
TL;DR: This paper extended the widely used neighborhood based algorithms by incorporating temporal information and developed an incremental algorithm for updating neighborhood similarities with new data.
Abstract: Collaborative filtering algorithms attempt to predict a user's interests based on his past feedback. In real world applications, a user's feedback is often continuously collected over a long period of time. It is very common for a user's interests or an item's popularity to change over a long period of time. Therefore, the underlying recommendation algorithm should be able to adapt to such changes accordingly. However, most existing algorithms do not distinguish current and historical data when predicting the users' current interests. In this paper, we consider a new problem - online evolutionary collaborative filtering, which tracks user interests over time in order to make timely recommendations. We extended the widely used neighborhood based algorithms by incorporating temporal information and developed an incremental algorithm for updating neighborhood similarities with new data. Experiments on two real world datasets demonstrated both improved effectiveness and efficiency of the proposed approach.

125 citations

Proceedings Article
14 Jul 2013
TL;DR: This paper proposes a framework to construct entity correspondence with limited budget by using active learning to facilitate knowledge transfer across recommender systems, and first iteratively select entities in the target system based on a proposed criterion to query their correspondences in the source system.
Abstract: Recommender systems, especially the newly launched ones, have to deal with the data-sparsity issue, where little existing rating information is available. Recently, transfer learning has been proposed to address this problem by leveraging the knowledge from related recommender systems where rich collaborative data are available. However, most previous transfer learning models assume that entity-correspondences across different systems are given as input, which means that for any entity (e.g., a user or an item) in a target system, its corresponding entity in a source system is known. This assumption can hardly be satisfied in real-world scenarios where entity-correspondences across systems are usually unknown, and the cost of identifying them can be expensive. For example, it is extremely difficult to identify whether a user A from Facebook and a user B from Twitter are the same person. In this paper, we propose a framework to construct entity correspondence with limited budget by using active learning to facilitate knowledge transfer across recommender systems. Specifically, for the purpose of maximizing knowledge transfer, we first iteratively select entities in the target system based on our proposed criterion to query their correspondences in the source system. We then plug the actively constructed entity-correspondence mapping into a general transferred collaborative-filtering model to improve recommendation quality. We perform extensive experiments on real world datasets to verify the effectiveness of our proposed framework for this crosssystem recommendation problem.

106 citations

Proceedings Article
13 Jul 2008
TL;DR: This work introduces a novel semi-supervised Hidden Markov Model (HMM) to transfer the learned model from one time period to another to reduce the calibration effort for the current time period.
Abstract: Learning-based localization methods typically consist of an offline phase to collect the wireless signal data to build a statistical model, and an online phase to apply the model on new data. Many of these methods treat the training data as if their distributions are fixed across time. However, due to complex environmental changes such as temperature changes and multi-path fading effect, the signals can significantly vary from time to time, causing the localization accuracy to drop. We address this problem by introducing a novel semi-supervised Hidden Markov Model (HMM) to transfer the learned model from one time period to another. This adaptive model is referred to as transferred HMM (TrHMM), in which we aim to transfer as much knowledge from the old model as possible to reduce the calibration effort for the current time period. Our contribution is that we can successfully transfer out-of-date model to fit a current model through learning, even though the training data have very different distributions. Experimental results show that the TrHMM method can greatly improve the localization accuracy while saving a great amount of the calibration effort.

100 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: The relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift are discussed.
Abstract: A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and have the same distribution. However, in many real-world applications, this assumption may not hold. For example, we sometimes have a classification task in one domain of interest, but we only have sufficient training data in another domain of interest, where the latter data may be in a different feature space or follow a different data distribution. In such cases, knowledge transfer, if done successfully, would greatly improve the performance of learning by avoiding much expensive data-labeling efforts. In recent years, transfer learning has emerged as a new learning framework to address this problem. This survey focuses on categorizing and reviewing the current progress on transfer learning for classification, regression, and clustering problems. In this survey, we discuss the relationship between transfer learning and other related machine learning techniques such as domain adaptation, multitask learning and sample selection bias, as well as covariate shift. We also explore some potential future issues in transfer learning research.

18,616 citations

Journal ArticleDOI
TL;DR: This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications applied toTransfer learning, which can be applied to big data environments.
Abstract: Machine learning and data mining techniques have been used in numerous real-world applications. An assumption of traditional machine learning methodologies is the training data and testing data are taken from the same domain, such that the input feature space and data distribution characteristics are the same. However, in some real-world machine learning scenarios, this assumption does not hold. There are cases where training data is expensive or difficult to collect. Therefore, there is a need to create high-performance learners trained with more easily obtained data from different domains. This methodology is referred to as transfer learning. This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications applied to transfer learning. Lastly, there is information listed on software downloads for various transfer learning solutions and a discussion of possible future research work. The transfer learning solutions surveyed are independent of data size and can be applied to big data environments.

2,900 citations

Journal ArticleDOI
01 Jan 2021
TL;DR: Transfer learning aims to improve the performance of target learners on target domains by transferring the knowledge contained in different but related source domains as discussed by the authors, in which the dependence on a large number of target-domain data can be reduced for constructing target learners.
Abstract: Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large number of target-domain data can be reduced for constructing target learners. Due to the wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. Due to the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning research studies, as well as to summarize and interpret the mechanisms and the strategies of transfer learning in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Unlike previous surveys, this survey article reviews more than 40 representative transfer learning approaches, especially homogeneous transfer learning approaches, from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, over 20 representative transfer learning models are used for experiments. The models are performed on three different data sets, that is, Amazon Reviews, Reuters-21578, and Office-31, and the experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.

2,433 citations

Journal ArticleDOI
01 Jun 2015
TL;DR: This paper reviews up-to-date application developments of recommender systems, clusters their applications into eight main categories, and summarizes the related recommendation techniques used in each category.
Abstract: A recommender system aims to provide users with personalized online product or service recommendations to handle the increasing online information overload problem and improve customer relationship management. Various recommender system techniques have been proposed since the mid-1990s, and many sorts of recommender system software have been developed recently for a variety of applications. Researchers and managers recognize that recommender systems offer great opportunities and challenges for business, government, education, and other domains, with more recent successful developments of recommender systems for real-world applications becoming apparent. It is thus vital that a high quality, instructive review of current trends should be conducted, not only of the theoretical research results but more importantly of the practical developments in recommender systems. This paper therefore reviews up-to-date application developments of recommender systems, clusters their applications into eight main categories: e-government, e-business, e-commerce/e-shopping, e-library, e-learning, e-tourism, e-resource services and e-group activities, and summarizes the related recommendation techniques used in each category. It systematically examines the reported recommender systems through four dimensions: recommendation methods (such as CF), recommender systems software (such as BizSeeker), real-world application domains (such as e-business) and application platforms (such as mobile-based platforms). Some significant new topics are identified and listed as new directions. By providing a state-of-the-art knowledge, this survey will directly support researchers and practical professionals in their understanding of developments in recommender system applications. Research papers on various recommender system applications are summarized.The recommender systems are examined systematically through four dimensions.The recommender system applications are classified into eight categories.Related recommendation techniques in each category are identified.Several new recommendation techniques and application areas are uncovered.

1,177 citations