Author

Jian Lu

Bio: Jian Lu is an academic researcher from Nanjing University. The author has contributed to research in topics: Collaborative filtering & Inference. The author has an h-index of 10 and has co-authored 36 publications receiving 355 citations.

Papers
Proceedings ArticleDOI
03 Nov 2014
TL;DR: This paper addresses the ambiguity challenge by integrating two state-of-the-art one-class collaborative filtering methods to enjoy the best of both worlds, and tackles the sparseness challenge by exploiting the side information from both users and items.
Abstract: Collaborative filtering is a fundamental building block in many recommender systems. While most of the existing collaborative filtering methods focus on explicit, multi-class settings (e.g., 1-5 stars in movie recommendation), many real-world applications actually belong to the one-class setting where user feedback is implicitly expressed (e.g., views in news recommendation and video recommendation). The main challenges in such one-class setting include the ambiguity of the unobserved examples and the sparseness of existing positive examples. In this paper, we propose a dual-regularized model for one-class collaborative filtering. In particular, we address the ambiguity challenge by integrating two state-of-the-art one-class collaborative filtering methods to enjoy the best of both worlds. We tackle the sparseness challenge by exploiting the side information from both users and items. Moreover, we propose efficient algorithms to solve the proposed model. Extensive experimental evaluations on two real data sets demonstrate that our method achieves significant improvement over the state-of-the-art methods. Overall, the proposed method leads to 7.9% - 21.1% improvement over its best known competitors in terms of prediction accuracy, while enjoying the linear scalability.

65 citations

Proceedings ArticleDOI
13 May 2013
TL;DR: The heart of the method is to view the problem as a recommendation problem, and hence opens the door to the rich methodologies in the field of collaborative filtering, and the proposed multi-aspect model directly characterizes multiple latent factors for each trustor and trustee from the locally-generated trust relationships.
Abstract: Trust inference, which is the mechanism to build new pair-wise trustworthiness relationship based on the existing ones, is a fundamental integral part in many real applications, e.g., e-commerce, social networks, peer-to-peer networks, etc. State-of-the-art trust inference approaches mainly employ the transitivity property of trust by propagating trust along connected users (a.k.a. trust propagation), but largely ignore other important properties, e.g., prior knowledge, multi-aspect, etc. In this paper, we propose a multi-aspect trust inference model by exploring an equally important property of trust, i.e., the multi-aspect property. The heart of our method is to view the problem as a recommendation problem, and hence opens the door to the rich methodologies in the field of collaborative filtering. The proposed multi-aspect model directly characterizes multiple latent factors for each trustor and trustee from the locally-generated trust relationships. Moreover, we extend this model to incorporate the prior knowledge as well as trust propagation to further improve inference accuracy. We conduct extensive experimental evaluations on real data sets, which demonstrate that our method achieves significant improvement over several existing benchmark approaches. Overall, the proposed method (MaTrI) leads to 26.7% - 40.7% improvement over its best known competitors in prediction accuracy; and up to 7 orders of magnitude speedup with linear scalability.
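
The idea of treating trust inference as a recommendation problem can be illustrated with a plain latent-factor model. The sketch below is not the authors' MaTrI algorithm: it is ordinary matrix factorization trained by stochastic gradient descent, and it leaves out the prior knowledge and trust propagation terms mentioned in the abstract. The function names, hyperparameters, and the toy trust values are all assumptions made for illustration.

```python
import random


def factorize_trust(observed, n_users, rank=2, lr=0.05, reg=0.01,
                    epochs=500, seed=0):
    """Fit trustor factors F and trustee factors G so that
    dot(F[i], G[j]) approximates the observed trust of user i in user j.

    `observed` maps (trustor, trustee) index pairs to trust values in [0, 1].
    All names and defaults here are hypothetical, not taken from the paper.
    """
    rng = random.Random(seed)
    # Small positive initial factors; one row per user in each role.
    F = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    G = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    pairs = sorted(observed)  # fixed order keeps the run deterministic
    for _ in range(epochs):
        for i, j in pairs:
            pred = sum(f * g for f, g in zip(F[i], G[j]))
            err = observed[(i, j)] - pred
            for k in range(rank):
                f, g = F[i][k], G[j][k]
                # Gradient step on squared error with L2 regularization.
                F[i][k] += lr * (err * g - reg * f)
                G[j][k] += lr * (err * f - reg * g)
    return F, G


def predict_trust(F, G, i, j):
    """Predicted trust of user i (as trustor) in user j (as trustee)."""
    return sum(f * g for f, g in zip(F[i], G[j]))


# Toy data: three users, four locally observed trust relationships.
observed = {(0, 1): 0.9, (1, 2): 0.8, (0, 2): 0.7, (2, 0): 0.2}
F, G = factorize_trust(observed, n_users=3)
unseen = predict_trust(F, G, 1, 0)  # a pair that was never observed
```

Each user gets two factor vectors, one for the trustor role and one for the trustee role, so an unobserved pair such as (1, 0) can still be scored from the learned factors.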

50 citations

Journal ArticleDOI
01 Mar 2019
TL;DR: This article briefly reviews the existing network embedding methods by two taxonomies, summarizes the main findings, analyzes their usefulness, and discusses future directions in this area.
Abstract: Learning the representations of nodes in a network can benefit various analysis tasks such as node classification, link prediction, clustering, and anomaly detection. Such a representation learning problem is referred to as network embedding, and it has attracted significant attention in recent years. In this article, we briefly review the existing network embedding methods by two taxonomies. The technical taxonomy focuses on the specific techniques used and divides the existing network embedding methods into two stages, i.e., context construction and objective design. The non-technical taxonomy focuses on the problem setting aspect and categorizes existing work based on whether to preserve special network properties, to consider special network types, or to incorporate additional inputs. Finally, we summarize the main findings based on the two taxonomies, analyze their usefulness, and discuss future directions in this area.

42 citations

Proceedings ArticleDOI
Yong Wu, Yuan Yao, Feng Xu, Hanghang Tong, Jian Lu
24 Oct 2016
TL;DR: A generative model (Tag2Word), where the words are generated based on the tag-word distribution as well as the tag itself, is proposed, which outperforms several existing methods in terms of recommendation accuracy, while enjoying linear scalability.
Abstract: Tag recommendation is helpful for the categorization and searching of online content. Existing tag recommendation methods can be divided into collaborative filtering methods and content based methods. In this paper, we put our focus on the content based tag recommendation due to its wider applicability. Our key observation is the tag-content co-occurrence, i.e., many tags have appeared multiple times in the corresponding content. Based on this observation, we propose a generative model (Tag2Word), where we generate the words based on the tag-word distribution as well as the tag itself. Experimental evaluations on real data sets demonstrate that the proposed method outperforms several existing methods in terms of recommendation accuracy, while enjoying linear scalability.

33 citations

Journal ArticleDOI
17 Jul 2019
TL;DR: The experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods in terms of recommendation accuracy, and that both content modeling and habit modeling contribute significantly to the overall recommendation accuracy.
Abstract: Hashtags can greatly facilitate content navigation and improve user engagement in social media. Meaningful as it might be, recommending hashtags for photo sharing services such as Instagram and Pinterest remains a daunting task due to the following two reasons. On the endogenous side, posts in photo sharing services often contain both images and text, which are likely to be correlated with each other. Therefore, it is crucial to coherently model both image and text as well as the interaction between them. On the exogenous side, hashtags are generated by users and different users might come up with different tags for similar posts, due to their different preferences and/or community effect. Therefore, it is highly desirable to characterize the users’ tagging habits. In this paper, we propose an integral and effective hashtag recommendation approach for photo sharing services. In particular, the proposed approach considers both the endogenous and exogenous effects by a content modeling module and a habit modeling module, respectively. For the content modeling module, we adopt the parallel co-attention mechanism to coherently model both image and text as well as the interaction between them; for the habit modeling module, we introduce an external memory unit to characterize the historical tagging habit of each user. The overall hashtag recommendations are generated on the basis of both the post features from the content modeling module and the habit influences from the habit modeling module. We evaluate the proposed approach on real Instagram data. The experimental results demonstrate that the proposed approach significantly outperforms the state-of-the-art methods in terms of recommendation accuracy, and that both content modeling and habit modeling contribute significantly to the overall recommendation accuracy.

30 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This work covers probability distributions, linear models for regression and classification, neural networks, kernel methods, sparse kernel machines, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

01 Jan 2003
TL;DR: In this article, the authors propose a web of trust, in which each user maintains trust in a small number of other users and then composes these trust values into trust values for all other users.
Abstract: Though research on the Semantic Web has progressed at a steady pace, its promise has yet to be realized. One major difficulty is that, by its very nature, the Semantic Web is a large, uncensored system to which anyone may contribute. This raises the question of how much credence to give each source. We cannot expect each user to know the trustworthiness of each source, nor would we want to assign top-down or global credibility values due to the subjective nature of trust. We tackle this problem by employing a web of trust, in which each user maintains trusts in a small number of other users. We then compose these trusts into trust values for all other users. The result of our computation is not an agglomerate "trustworthiness" of each user. Instead, each user receives a personalized set of trusts, which may vary widely from person to person. We define properties for combination functions which merge such trusts, and define a class of functions for which merging may be done locally while maintaining these properties. We give examples of specific functions and apply them to data from Epinions and our BibServ bibliography server. Experiments confirm that the methods are robust to noise, and do not put unreasonable expectations on users. We hope that these methods will help move the Semantic Web closer to fulfilling its promise.
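
To make the composition of direct trusts into personalized trust values concrete, here is a minimal sketch under one assumed pair of combination functions: trust along a path is the product of its edge values, and trust across alternative paths is the maximum. The abstract only says that a class of such functions is defined, so this particular choice, the function name `personalized_trust`, and the toy users are assumptions for illustration.

```python
import heapq


def personalized_trust(trust, source):
    """Return the best-path trust from `source` to every reachable user.

    `trust[u][v]` is u's direct trust in v, assumed to lie in [0, 1].
    Path trust is the product of edge values; competing paths are merged
    by taking the maximum (both are assumed combination functions).
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated scores
    while heap:
        neg_score, u = heapq.heappop(heap)
        score = -neg_score
        if score < best.get(u, 0.0):
            continue  # stale heap entry; a better path was already found
        for v, w in trust.get(u, {}).items():
            candidate = score * w
            if candidate > best.get(v, 0.0):
                best[v] = candidate
                heapq.heappush(heap, (-candidate, v))
    del best[source]  # a user's trust in themselves is not reported
    return best


# Each user states trust in only a few others.
web = {
    "alice": {"bob": 0.9, "carol": 0.5},
    "bob": {"carol": 0.8},
    "carol": {},
}
alice_view = personalized_trust(web, "alice")
bob_view = personalized_trust(web, "bob")
```

Because the computation starts from one user's own statements, `alice_view` and `bob_view` differ, which matches the abstract's point that each user receives a personalized set of trusts rather than a single global score.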

567 citations