Journal ArticleDOI

Shilling attacks against recommender systems: a comprehensive survey

01 Dec 2014-Artificial Intelligence Review (Springer Netherlands)-Vol. 42, Iss: 4, pp 767-799
TL;DR: Various attack types are described, new dimensions for attack classification are introduced, and the proposed detection and robust recommendation algorithms are described in detail.
Abstract: Online vendors employ collaborative filtering algorithms to provide recommendations to their customers so that they can increase their sales and profits. Although recommendation schemes are successful in e-commerce sites, they are vulnerable to shilling or profile injection attacks. On one hand, online shopping sites utilize collaborative filtering schemes to enhance their competitive edge over other companies. On the other hand, malicious users and/or competing vendors might decide to insert fake profiles into the user-item matrices so that they can shift the predicted ratings to their advantage. In the past decade, various studies have been conducted to scrutinize different shilling attack strategies, profile injection attack types, shilling attack detection schemes, and robust algorithms proposed to overcome such attacks, and to evaluate them with respect to accuracy, cost/benefit, and overall performance. Due to their popularity and importance, we survey shilling attacks in collaborative filtering algorithms. Giving an overall picture of various shilling attack types by introducing new classification attributes is imperative for further research. Explaining shilling attack detection schemes in detail, along with the robust algorithms proposed so far, might open the way to developing new detection schemes, enhancing such robust algorithms further, and even proposing new ones. Thus, we describe various attack types and introduce new dimensions for attack classification. Detailed descriptions of the proposed detection and robust recommendation algorithms are given. Moreover, we briefly explain evaluation of the proposed schemes. We conclude the paper by discussing various open questions.
Citations
Book ChapterDOI
01 Jan 2016
TL;DR: The current trends, issues, challenges, and research opportunities in developing high-quality recommender systems are investigated, with the aim of opening new research avenues toward fine-tuned, high-quality recommender systems.
Abstract: A recommender system is an Information Retrieval technology that improves access and proactively recommends relevant items to users by considering the users’ explicitly mentioned preferences and objective behaviors. A recommender system is one of the major techniques that handle the information overload problem of Information Retrieval by suggesting appropriate and relevant items to users. Today, several recommender systems have been developed for different domains; however, these are not precise enough to fulfil the information needs of users. Therefore, it is necessary to build high-quality recommender systems. In designing such recommenders, designers face several issues and challenges that need proper attention. This paper investigates and reports the current trends, issues, challenges, and research opportunities in developing high-quality recommender systems. If properly addressed, these issues and challenges will introduce new research avenues, and the goal of fine-tuned and high-quality recommender systems can be achieved.

128 citations

Journal ArticleDOI
TL;DR: This work clusters the users and calculates the reputation of users based on the clustering information by a beta reputation system, and identifies a set of similar services by clustering the services and makes prediction for active users by combining the QoS data of the trustworthy similar users and similar services.
Abstract: With the rapid development of service-oriented computing, cloud computing and big data, a large number of functionally equivalent web services are available on the Internet. Quality of Service (QoS) becomes a differentiating point of services to attract customers. Since the QoS of services varies widely among users due to the unpredicted network, physical location and other objective factors, many Collaborative Filtering based approaches are recently proposed to predict the unknown QoS by employing the historical user-contributed QoS data. However, most existing approaches ignore the data credibility problem and are thus vulnerable to the unreliable QoS data contributed by dishonest users. To address this problem, we propose a trust-aware approach TAP for reliable personalized QoS prediction. Firstly, we cluster the users and calculate the reputation of users based on the clustering information by a beta reputation system. Secondly, a set of trustworthy similar users is identified according to the calculated user reputation and similarity. Finally, we identify a set of similar services by clustering the services and make prediction for active users by combining the QoS data of the trustworthy similar users and similar services. Comprehensive real-world experiments are conducted to demonstrate the effectiveness and robustness of our approach compared with other state-of-the-art approaches.

88 citations


Cites background from "Shilling attacks against recommende..."

  • ...For immediate gain, some malicious users even exaggerate their partners’ services while badmouthing their competitors’ services, which are also known as push attack and nuke attack in shilling attacks [13]....

    [...]

Journal ArticleDOI
TL;DR: This paper briefly discusses the related survey papers about shilling attacks in CFRSs, explains profile injection attack strategies, shilling attack detection schemes and robust recommendation algorithms proposed so far in detail, and briefly explains evaluation metrics of the proposed schemes.
Abstract: Collaborative filtering recommender systems (CFRSs) have already been proved effective to cope with the information overload problem since they emerged in the past two decades. However, CFRSs are highly vulnerable to shilling or profile injection attacks because of their openness. Ratings injected by malicious users seriously affect the authenticity of the recommendations as well as users’ trust in the recommendation systems. In the past two decades, various studies have been conducted to scrutinize different profile injection attack strategies, shilling attack detection schemes, robust recommendation algorithms, and to evaluate them with respect to accuracy and robustness. Due to their popularity and importance, we survey shilling attacks in CFRSs. We first briefly discuss the related survey papers about shilling attacks and analyze their deficiencies to illustrate the necessity of this paper. Next we give an overall picture of various shilling attack types and their deployment modes. Then we explain profile injection attack strategies, shilling attack detection schemes and robust recommendation algorithms proposed so far in detail. Moreover, we briefly explain evaluation metrics of the proposed schemes. Last, we discuss some research directions to improve shilling attack detection rates and the robustness of collaborative recommendation, and conclude this paper.

78 citations

Journal ArticleDOI
TL;DR: A novel approach is proposed to detect and correct those inconsistent ratings that might bias recommendations, whose main advantage regarding previous proposals is that it uses only the current ratings in the dataset without needing any additional information.
Abstract: Recommender systems help users to find information that best fits their preferences and needs in an overloaded search space. Most recommender systems research has been focused on the accuracy improvement of recommendation algorithms. Despite this, recently new trends in recommender systems have become important research topics such as, cold start, group recommendations, context-aware recommendations, and natural noise. The concept of natural noise is related to the study and management of inconsistencies in datasets of users’ preferences used in recommender systems. In this paper a novel approach is proposed to detect and correct those inconsistent ratings that might bias recommendations, whose main advantage regarding previous proposals is that it uses only the current ratings in the dataset without needing any additional information. To do so, this proposal detects noisy ratings by characterizing items and users by their profiles, and then a strategy to fix these noisy ratings is carried out to increase the accuracy of such recommender systems. Finally a case study is developed to show the advantage of this proposal to deal with natural noise regarding previous methodologies.

78 citations


Cites background from "Shilling attacks against recommende..."

  • ...The natural noise is produced by different sources: while the malicious noise is usually associated to user profiles that match certain patterns [34], the natural noise identification is more difficult because it tends to appear in several ways dissimilar to each other [35]....

    [...]

  • ...(1) Malicious noise, associated to noise intentionally introduced by an external agent to bias recommender results [34], and (2) Natural noise, involuntarily introduced by users, and that could also affect the recommendation result [35]....

    [...]

Journal ArticleDOI
TL;DR: Reputation-based matrix factorization first calculates the reputation of each user based on their contributed QoS values to quantify the credibility of users, and then takes the users' reputation into consideration for achieving more accurate QoS prediction.
Abstract: With the fast development of Web services in service-oriented systems, the requirement of efficient Quality of Service (QoS) evaluation methods becomes strong. However, many QoS values are unknown in reality. Therefore, it is necessary to predict the unknown QoS values of Web services based on the obtainable QoS values. Generally, the QoS values of similar users are employed to make predictions for the current user. However, the QoS values may be contributed from unreliable users, leading to inaccuracy of the prediction results. To address this problem, we present a highly credible approach, called reputation-based Matrix Factorization (RMF), for predicting the unknown Web service QoS values. RMF first calculates the reputation of each user based on their contributed QoS values to quantify the credibility of users, and then takes the users' reputation into consideration for achieving more accurate QoS prediction. Reputation-based matrix factorization is applicable to the prediction of QoS data in the presence of unreliable user-provided QoS values. Extensive experiments are conducted with real-world Web service QoS data sets, and the experimental results show that our proposed approach outperforms other existing approaches.

75 citations


Cites background from "Shilling attacks against recommende..."

  • ...Indeed, unreliable users have been found in many QoS prediction systems [22]....

    [...]

References
Journal ArticleDOI
TL;DR: The key decisions in evaluating collaborative filtering recommender systems are reviewed: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole.
Abstract: Recommender systems have been evaluated in many, often incomparable, ways. In this article, we review the key decisions in evaluating collaborative filtering recommender systems: the user tasks being evaluated, the types of analysis and datasets being used, the ways in which prediction quality is measured, the evaluation of prediction attributes other than quality, and the user-based evaluation of the system as a whole. In addition to reviewing the evaluation strategies used by prior researchers, we present empirical results from the analysis of various accuracy metrics on one content domain where all the tested metrics collapsed roughly into three equivalence classes. Metrics within each equivalency class were strongly correlated, while metrics from different equivalency classes were uncorrelated.

5,686 citations


"Shilling attacks against recommende..." refers background or methods in this paper

  • ...(2006b), Hurley et al. (2007), O’Mahony (2004), where shilling attacks are investigated with respect to gains versus attack costs; Resnick and Sami (2008b) examine robust CF approaches....

    [...]

  • ...CF system estimates similarities between a and each user in the database, forms a neighborhood by selecting the best similar users, and estimates a prediction (p_aq) or a recommendation list (top-N recommendation) using a CF algorithm (Herlocker et al. 2004)....

    [...]
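
The three steps in the excerpt above (similarity estimation, neighborhood formation, prediction of p_aq) can be sketched as follows. This is a minimal illustration, not the method of any paper listed here: the Pearson similarity, the mean-centered weighted average, the neighborhood size `k`, the function names `pearson` and `predict`, and the toy matrix `R` are all assumptions chosen for the example.

```python
import numpy as np

def pearson(u, v):
    """Pearson correlation over the items both users rated (NaN = unrated)."""
    both = ~np.isnan(u) & ~np.isnan(v)
    if both.sum() < 2:
        return 0.0
    du = u[both] - u[both].mean()
    dv = v[both] - v[both].mean()
    denom = np.sqrt((du ** 2).sum() * (dv ** 2).sum())
    return float((du * dv).sum() / denom) if denom > 0 else 0.0

def predict(ratings, a, q, k=2):
    """Estimate p_aq for active user a on target item q (hypothetical helper)."""
    # Step 1: similarity between a and every other user who rated q.
    sims = [(pearson(ratings[a], ratings[u]), u)
            for u in range(len(ratings))
            if u != a and not np.isnan(ratings[u, q])]
    # Step 2: neighborhood = the k most similar users with positive similarity.
    neighbors = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    mean_a = float(np.nanmean(ratings[a]))
    if not neighbors:
        return mean_a  # fall back to the active user's own mean rating
    # Step 3: similarity-weighted average of the neighbors' mean-centered ratings.
    num = sum(s * (ratings[u, q] - np.nanmean(ratings[u])) for s, u in neighbors)
    den = sum(s for s, _ in neighbors)
    return float(mean_a + num / den)

# Toy user-item matrix: rows are users, columns are items, NaN means unrated.
R = np.array([
    [5.0, 3.0, 4.0, np.nan],  # active user a = 0, target item q = 3
    [5.0, 3.0, 4.0, 5.0],
    [1.0, 5.0, 2.0, 1.0],
    [4.0, 2.0, 3.0, 4.0],
])
p_aq = predict(R, a=0, q=3)
```

On this toy matrix users 1 and 3 form the neighborhood and the prediction is 4.75. Because the prediction depends directly on whichever profiles land in the neighborhood, fake profiles built to correlate with genuine users can shift p_aq, which is the weakness shilling attacks exploit.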

Posted Content
TL;DR: In this article, the authors compare the predictive accuracy of various methods in a set of representative problem domains, including correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods.
Abstract: Collaborative filtering or recommender systems use a database about user preferences to predict additional topics or products a new user might like. In this paper we describe several algorithms designed for this task, including techniques based on correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods. We compare the predictive accuracy of the various methods in a set of representative problem domains. We use two basic classes of evaluation metrics. The first characterizes accuracy over a set of individual predictions in terms of average absolute deviation. The second estimates the utility of a ranked list of suggested items. This metric uses an estimate of the probability that a user will see a recommendation in an ordered list. Experiments were run for datasets associated with 3 application areas, 4 experimental protocols, and the 2 evaluation metrics for the various algorithms. Results indicate that for a wide range of conditions, Bayesian networks with decision trees at each node and correlation methods outperform Bayesian-clustering and vector-similarity methods. Between correlation and Bayesian networks, the preferred method depends on the nature of the dataset, nature of the application (ranked versus one-by-one presentation), and the availability of votes with which to make predictions. Other considerations include the size of database, speed of predictions, and learning time.

4,883 citations

Proceedings Article
24 Jul 1998
TL;DR: Several algorithms designed for collaborative filtering or recommender systems are described, including techniques based on correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods, to compare the predictive accuracy of the various methods in a set of representative problem domains.
Abstract: Collaborative filtering or recommender systems use a database about user preferences to predict additional topics or products a new user might like. In this paper we describe several algorithms designed for this task, including techniques based on correlation coefficients, vector-based similarity calculations, and statistical Bayesian methods. We compare the predictive accuracy of the various methods in a set of representative problem domains. We use two basic classes of evaluation metrics. The first characterizes accuracy over a set of individual predictions in terms of average absolute deviation. The second estimates the utility of a ranked list of suggested items. This metric uses an estimate of the probability that a user will see a recommendation in an ordered list. Experiments were run for datasets associated with 3 application areas, 4 experimental protocols, and the 2 evaluation metrics for the various algorithms. Results indicate that for a wide range of conditions, Bayesian networks with decision trees at each node and correlation methods outperform Bayesian-clustering and vector-similarity methods. Between correlation and Bayesian networks, the preferred method depends on the nature of the dataset, nature of the application (ranked versus one-by-one presentation), and the availability of votes with which to make predictions. Other considerations include the size of database, speed of predictions, and learning time.

4,557 citations


"Shilling attacks against recommende..." refers background or methods in this paper

  • ...Model-based algorithms, on the other hand, first create a model off-line from user-item matrix; they then use that model to produce predictions online (Breese et al. 1998)....

    [...]

  • ...Memory-based ones operate over the entire user-item matrix to estimate predictions (Breese et al. 1998)....

    [...]

Journal ArticleDOI
TL;DR: Tapestry is intended to handle any incoming stream of electronic documents and serves both as a mail filter and repository; its components are the indexer, document store, annotation store, filterer, little box, remailer, appraiser and reader/browser.
Abstract: The Tapestry experimental mail system developed at the Xerox Palo Alto Research Center is predicated on the belief that information filtering can be more effective when humans are involved in the filtering process. Tapestry was designed to support both content-based filtering and collaborative filtering, which entails people collaborating to help each other perform filtering by recording their reactions to documents they read. The reactions are called annotations; they can be accessed by other people’s filters. Tapestry is intended to handle any incoming stream of electronic documents and serves both as a mail filter and repository; its components are the indexer, document store, annotation store, filterer, little box, remailer, appraiser and reader/browser. Tapestry’s client/server architecture, its various components, and the Tapestry query language are described.

4,299 citations


"Shilling attacks against recommende..." refers methods in this paper

  • ...The term “collaborative filtering” was first coined by the designers of Tapestry (Goldberg et al. 1992), a mail filtering software developed in the early nineties for the intranet at the Xerox Palo Alto Research Center....

    [...]

BookDOI
28 Oct 2010
TL;DR: This handbook illustrates how recommender systems can support the user in decision-making, planning and purchasing processes, and works for well known corporations such as Amazon, Google, Microsoft and AT&T.
Abstract: The explosive growth of e-commerce and online environments has made the issue of information search and selection increasingly serious; users are overloaded by options to consider and they may not have the time or knowledge to personally evaluate these options. Recommender systems have proven to be a valuable way for online users to cope with the information overload and have become one of the most powerful and popular tools in electronic commerce. Correspondingly, various techniques for recommendation generation have been proposed. During the last decade, many of them have also been successfully deployed in commercial environments. Recommender Systems Handbook, an edited volume, is a multi-disciplinary effort that involves world-wide experts from diverse fields, such as artificial intelligence, human computer interaction, information technology, data mining, statistics, adaptive user interfaces, decision support systems, marketing, and consumer behavior. Theoreticians and practitioners from these fields continually seek techniques for more efficient, cost-effective and accurate recommender systems. This handbook aims to impose a degree of order on this diversity, by presenting a coherent and unified repository of recommender systems major concepts, theories, methodologies, trends, challenges and applications. Extensive artificial applications, a variety of real-world applications, and detailed case studies are included. Recommender Systems Handbook illustrates how this technology can support the user in decision-making, planning and purchasing processes. It works for well known corporations such as Amazon, Google, Microsoft and AT&T. This handbook is suitable for researchers and advanced-level students in computer science as a reference.

2,401 citations