Proceedings ArticleDOI

Item-based collaborative filtering recommendation algorithms

01 Apr 2001-pp 285-295
TL;DR: This paper analyzes item-based collaborative filtering techniques and suggests that item-based algorithms provide dramatically better performance than user-based algorithms, while at the same time providing better quality than the best available user-based algorithms.
Abstract: Recommender systems apply knowledge discovery techniques to the problem of making personalized recommendations for information, products or services during a live interaction. These systems, especially the k-nearest neighbor collaborative filtering based ones, are achieving widespread success on the Web. The tremendous growth in the amount of available information and the number of visitors to Web sites in recent years poses some key challenges for recommender systems. These are: producing high quality recommendations, performing many recommendations per second for millions of users and items and achieving high coverage in the face of data sparsity. In traditional collaborative filtering systems the amount of work increases with the number of participants in the system. New recommender system technologies are needed that can quickly produce high quality recommendations, even for very large-scale problems. To address these issues we have explored item-based collaborative filtering techniques. Item-based techniques first analyze the user-item matrix to identify relationships between different items, and then use these relationships to indirectly compute recommendations for users. In this paper we analyze different item-based recommendation generation algorithms. We look into different techniques for computing item-item similarities (e.g., item-item correlation vs. cosine similarities between item vectors) and different techniques for obtaining recommendations from them (e.g., weighted sum vs. regression model). Finally, we experimentally evaluate our results and compare them to the basic k-nearest neighbor approach. Our experiments suggest that item-based algorithms provide dramatically better performance than user-based algorithms, while at the same time providing better quality than the best available user-based algorithms.

Summary (6 min read)

1. INTRODUCTION

  • The amount of information in the world is increasing far more quickly than our ability to process it.
  • A new user, Neo, is matched against the database to discover neighbors, which are other users who have historically had similar taste to Neo.
  • In some ways these two challenges are in conflict, since the less time an algorithm spends searching for neighbors, the more scalable it will be, and the worse its quality.
  • The bottleneck in conventional collaborative filtering algorithms is the search for neighbors among a large user population of potential neighbors [12].
  • Recommendations for users are computed by finding items that are similar to other items the user has liked.

1.3 Organization

  • The next section provides a brief background in collaborative filtering algorithms.
  • The authors first formally describe the collaborative filtering process and then discuss its two variants, memory-based and model-based approaches.
  • The authors then present some challenges associated with the memory-based approach.

2. COLLABORATIVE FILTERING BASED RECOMMENDER SYSTEMS

  • Recommender systems apply data analysis techniques to the problem of helping users find the items they would like to purchase at E-Commerce sites by producing a predicted likeliness score or a list of top-N recommended items for a given user.
  • In the collaborative filtering process, opinions can be explicitly given by the user as a rating score, generally within a certain numerical scale, or can be implicitly derived from purchase records, by analyzing timing logs, by mining web hyperlinks and so on [28, 16].
  • Researchers have devised a number of collaborative filtering algorithms that can be divided into two main categories: memory-based (user-based) and model-based (item-based) algorithms.
  • Memory-based algorithms utilize the entire user-item database to generate a prediction.

3. ITEM-BASED COLLABORATIVE FILTERING ALGORITHM

  • Unlike the user-based collaborative filtering algorithm discussed in Section 2, the item-based approach looks into the set of items the target user has rated, computes how similar they are to the target item i, and then selects the k most similar items {i_1, i_2, ..., i_k}.
  • At the same time their corresponding similarities {s_{i,1}, s_{i,2}, ..., s_{i,k}} are also computed.
  • Once the most similar items are found, the prediction is then computed by taking a weighted average of the target user's ratings on these similar items.
  • The authors describe these two aspects, namely the similarity computation and the prediction generation, in detail here.

3.1 Item Similarity Computation

  • One critical step in the item-based collaborative filtering algorithm is to compute the similarity between items and then to select the most similar items.
  • The basic idea in similarity computation between two items i and j is to first isolate the users who have rated both of these items and then to apply a similarity computation technique to determine the similarity s_{i,j}.
  • Figure 2 illustrates this process; here the matrix rows represent users and the columns represent items.
  • There are a number of different ways to compute the similarity between items.
  • These are cosine-based similarity, correlation-based similarity and adjusted-cosine similarity.

3.1.3 Adjusted Cosine Similarity

  • Computing similarity using the basic cosine measure in the item-based case has one important drawback: the differences in rating scale between different users are not taken into account.
  • The adjusted cosine similarity offsets this drawback by subtracting the corresponding user average from each co-rated pair.
  • Formally, the similarity between items i and j using this scheme is given by
    $$\mathrm{sim}(i,j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2}\,\sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}},$$
    where $\bar{R}_u$ is the average of the u-th user's ratings. Item-item similarity is computed by looking into co-rated items only.
  • The isolation of the co-rated items and the similarity computation are illustrated in Figure 2.
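The adjusted-cosine step above can be sketched in a few lines. This is an illustrative implementation, not the authors' code; the nested-dict rating layout and the function name are assumptions for the example.

```python
import math

def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity between items i and j.

    `ratings` maps each user to a dict of {item: rating}. Only users
    who rated both items (the co-rated pairs) contribute, and each
    user's mean rating is subtracted to offset differences in rating
    scale between users.
    """
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean = sum(user_ratings.values()) / len(user_ratings)
            di = user_ratings[i] - mean
            dj = user_ratings[j] - mean
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0.0 or den_j == 0.0:
        return 0.0  # no co-rated users, or zero variance
    return num / (math.sqrt(den_i) * math.sqrt(den_j))
```

Replacing the mean-centered deviations with raw ratings gives the basic cosine measure, which is exactly the drawback the adjustment removes.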

3.2.1 Weighted Sum

  • Each rating is weighted by the corresponding similarity s_{i,j} between items i and j.
  • Formally, using the notation shown in Figure 3, the prediction P_{u,i} can be denoted as
    $$P_{u,i} = \frac{\sum_{\text{all similar items } N} s_{i,N} \cdot R_{u,N}}{\sum_{\text{all similar items } N} |s_{i,N}|}.$$
    Basically, this approach tries to capture how the active user rates the similar items.
  • The weighted sum is scaled by the sum of the similarity terms to make sure the prediction is within the predefined range.
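The weighted-sum prediction can be sketched as follows; the function and its dict-based arguments are hypothetical names for illustration, not the authors' implementation.

```python
def predict_weighted_sum(user_ratings, neighbors):
    """Weighted-sum prediction for one user and one target item.

    `user_ratings` maps item -> the active user's rating; `neighbors`
    maps each similar item N -> its similarity s_{i,N} to the target
    item i. Dividing by the sum of |s_{i,N}| keeps the prediction
    inside the predefined rating scale.
    """
    num = sum(sim * user_ratings[n]
              for n, sim in neighbors.items() if n in user_ratings)
    den = sum(abs(sim)
              for n, sim in neighbors.items() if n in user_ratings)
    return num / den if den else 0.0
```

For example, with ratings {x: 4, y: 2} and equal similarities 0.5 for both neighbors, the prediction is (0.5*4 + 0.5*2) / (0.5 + 0.5) = 3.0, midway between the two ratings as expected.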

3.2.2 Regression

  • This approach is similar to the weighted sum method, but instead of directly using the ratings of similar items it uses an approximation of the ratings based on a regression model.
  • In practice, the similarities computed using cosine or correlation measures may be misleading in the sense that two rating vectors may be distant (in the Euclidean sense) yet may have very high similarity.
  • In that case, using the raw ratings of the "so-called" similar item may result in poor prediction.
  • The basic idea is to use the same formula as the weighted sum technique, but instead of using the similar item N's "raw" rating values R_{u,N}, this model uses their approximated values R'_{u,N} based on a linear regression model.
  • The regression model parameters α and β are determined by going over both of the rating vectors.
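A minimal sketch of how the regression parameters might be fit, assuming ordinary least squares over the two co-rated rating vectors (the summary does not spell out the fitting procedure, so least squares is an assumption here):

```python
def fit_linear(x, y):
    """Least-squares fit of y ≈ alpha * x + beta.

    `x` and `y` are the co-rated rating vectors of the similar item
    and the target item. The fitted line is then used to replace the
    similar item's raw ratings R_{u,N} with approximations
    R'_{u,N} = alpha * R_{u,N} + beta in the weighted-sum formula.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    alpha = sxy / sxx if sxx else 0.0
    beta = my - alpha * mx
    return alpha, beta
```

With perfectly correlated vectors such as x = [1, 2, 3] and y = [2, 4, 6], the fit recovers alpha = 2 and beta = 0, so the approximated ratings coincide with a simple rescaling of the raw ones.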

3.3 Performance Implications

  • The largest E-Commerce sites operate at a scale that stresses the direct implementation of collaborative filtering.
  • For each item j the authors compute the k most similar items, where k ≪ n, and record these item numbers and their similarities with j.
  • Based on this model building step, their prediction generation algorithm works as follows.
  • The authors observe a quality-performance trade-off here: to ensure good quality they must have a large model size, which leads to the performance problems discussed above.
  • The authors then perform experiments on prediction quality and response time to determine the impact of the model size on the quality and performance of the whole system.
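The model building step described above, retaining only the k most similar items per item, can be sketched as below. The `similarity` argument stands in for any of the measures from Section 3.1; names and data layout are illustrative assumptions.

```python
def build_model(similarity, items, k):
    """Precompute, for each item j, its k most similar items.

    `similarity(i, j)` is any item-item similarity function. Keeping
    only the top-k entries per item (the "model size") bounds the
    storage and the online prediction cost, at some cost in quality.
    """
    model = {}
    for j in items:
        scored = [(similarity(i, j), i) for i in items if i != j]
        scored.sort(reverse=True)  # most similar first
        model[j] = [(i, s) for s, i in scored[:k]]
    return model
```

Because item-item similarities are relatively static, this table can be built offline and reused across many online prediction requests.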

4.1 Data set

  • The authors used experimental data from their research website to evaluate different variants of item-based recommendation algorithms.
  • Each week hundreds of users visit MovieLens to rate and receive recommendations for movies.
  • For this purpose, the authors introduced a variable that determines what percentage of data is used as training and test sets; they call this variable x.
  • For their experiments, the authors also take another factor into consideration: the sparsity level of the data sets.

4.2 Evaluation Metrics

  • Recommender systems research has used several types of measures for evaluating the quality of a recommender system.
  • They can be mainly categorized into two classes: statistical accuracy metrics and decision support accuracy metrics. Statistical accuracy metrics evaluate the accuracy of a system by comparing the numerical recommendation scores against the actual user ratings for the user-item pairs in the test dataset.
  • Mean Absolute Error (MAE) between ratings and predictions is a widely used metric.
  • The MAE is computed by first summing the absolute errors of the N corresponding ratings-prediction pairs and then computing the average.
  • The most commonly used decision support accuracy metrics are reversal rate, weighted errors and ROC sensitivity [23].
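The MAE computation described above is direct to implement; a minimal sketch, with an illustrative function name:

```python
def mean_absolute_error(predictions, ratings):
    """MAE: sum the absolute errors of the N ratings-prediction
    pairs, then average."""
    assert len(predictions) == len(ratings)
    n = len(predictions)
    return sum(abs(p - r) for p, r in zip(predictions, ratings)) / n
```

Lower MAE means the predicted ratings sit closer to the held-out true ratings.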

4.2.1 Experimental Procedure

  • The authors started their experiments by first dividing the data set into a training and a test portion.
  • Before starting full experimental evaluation of the different algorithms, the authors determined the sensitivity of the different parameters; from the sensitivity plots they fixed the optimum values of these parameters and used them for the rest of the experiments.
  • They conducted a 10-fold cross-validation of their experiments by randomly choosing different training and test sets each time and taking the average of the MAE values.
  • To compare the performance of item-based prediction, the authors also entered the training ratings set into a collaborative filtering recommendation engine that employs the Pearson nearest neighbor algorithm (user-user).
  • All their experiments were implemented using C and compiled using optimization flag -O6.
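The evaluation protocol, repeated random train/test splits with the MAE values averaged, might look like the following sketch (function names and the tuple layout are illustrative assumptions):

```python
import random

def cross_validate(pairs, train_fraction, folds, evaluate):
    """Average a quality metric over repeated random train/test splits.

    `pairs` is a list of (user, item, rating) tuples; `evaluate`
    takes (train, test) lists and returns a score such as MAE.
    Randomly re-splitting each fold and averaging matches the
    protocol described above.
    """
    scores = []
    for _ in range(folds):
        shuffled = pairs[:]
        random.shuffle(shuffled)  # a fresh random split per fold
        cut = int(len(shuffled) * train_fraction)
        scores.append(evaluate(shuffled[:cut], shuffled[cut:]))
    return sum(scores) / folds
```

The `train_fraction` argument plays the role of the paper's variable x, the percentage of data used for training.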

4.3 Experimental Results

  • In this section the authors present their experimental results of applying item-based collaborative filtering techniques for generating predictions.
  • The authors' results are mainly divided into two parts: quality results and performance results.
  • In assessing the quality of recommendations, the authors first determined the sensitivity of some parameters before running the main experiment.
  • These parameters include the neighborhood size, the value of the training/test ratio x, and the effects of different similarity measures.
  • For determining the sensitivity of the various parameters, the authors focused only on the training data set, further divided it into a training and a test portion, and used them to learn the parameters.

4.3.1 Effect of Similarity Algorithms

  • The authors implemented three different similarity algorithms (basic cosine, adjusted cosine and correlation) as described in Section 3.1 and tested them on their data sets.
  • [Figure: relative performance of different similarity measures, measured by MAE.]
  • For each similarity algorithm, the authors computed the neighborhood and used the weighted sum algorithm to generate the prediction.
  • The authors ran these experiments on their training data and used the test set to compute the Mean Absolute Error (MAE).
  • It can be observed from the results that offsetting the user average for cosine similarity computation has a clear advantage, as the MAE is significantly lower in this case.
  • Hence, the authors selected the adjusted cosine similarity for the rest of their experiments.

4.3.2 Sensitivity of Training/Test Ratio

  • For each of these training/test ratio values the authors ran their experiments using the two prediction generation techniques: basic weighted sum and the regression-based approach.
  • The regression-based approach shows better results than the basic scheme for low values of x, but as the authors increase x the quality tends to fall below the basic scheme.

4.3.3 Experiments with neighborhood size

  • The size of the neighborhood has a significant impact on the prediction quality [12].
  • The authors observe that the size of the neighborhood does affect the quality of prediction.
  • But the two methods show different types of sensitivity.
  • The basic item-item algorithm improves as the neighborhood size increases from 10 to 30; after that the rate of increase diminishes and the curve tends to be flat.
  • On the other hand, the regression-based algorithm shows a decrease in prediction quality with an increased number of neighbors.

4.3.4 Quality Experiments

  • Once the authors obtained the optimal values of the parameters, they compared both of their item-based approaches with the benchmark user-based algorithm.
  • It can be observed from the charts (sensitivity of the parameter x and sensitivity of the neighborhood size, both measured by MAE) that the basic item-item algorithm outperforms the user-based algorithm at all values of x (neighborhood size = 30) and all values of neighborhood size (x = 0.8).
  • Similarly, at a neighborhood size of 60 the user-user and item-item schemes show MAEs of 0.732 and 0.726, respectively.
  • The authors draw two conclusions from these results.
  • Second, regression-based algorithms perform better with a very sparse data set, but as more data is added the quality goes down.

4.3.5 Performance Results

  • After showing that the item-based algorithm provides better quality of prediction than the user-based algorithm, the authors focus on the scalability issues.
  • As discussed earlier, item-based similarity is more static and allows us to precompute the item neighborhood.
  • This precomputation of the model has certain performance benefits.
  • To make the system even more scalable the authors looked into the sensitivity of the model size and then looked into the impact of model size on the response time and throughput.

4.4 Sensitivity of the Model Size

  • Using the training data set, the authors precomputed the item similarity using different model sizes and then used only the weighted sum prediction generation technique to provide the predictions.
  • The authors then used the test data set to compute MAE and plotted the values (sensitivity of the model size at selected train/test ratios).
  • The authors repeated the entire process for three different x values (training/test ratios).
  • The most important observation from these plots is the high accuracy that can be achieved using only a fraction of the items.
  • It appears from the plots that it is useful to precompute the item similarities using only a fraction of the items and yet possible to obtain good prediction quality.

4.4.1 Impact of the model size on run-time and throughput

  • Given that the quality of prediction is reasonably good with a small model size, the authors focus on the run-time and throughput of the system.
  • This difference is even more prominent with x = 0.8, where a model size of 200 requires only 1.292 seconds while the basic item-item case requires 36.34 seconds.
  • To make the numbers comparable, the authors compute the throughput (predictions generated per second) for the model-based and basic item-item schemes.

4.5 Discussion

  • From the experimental evaluation of the item-item collaborative filtering scheme the authors make some important observations.
  • First, the item-item scheme provides better quality of predictions than the user-user (k-nearest neighbor) scheme.
  • The improvement in quality is consistent over different neighborhood sizes and training/test ratios.
  • The improvement is not significantly large, however.
  • The authors' experimental results support that claim.

5. CONCLUSION

  • Recommender systems are a powerful new technology for extracting additional value for a business from its user databases.
  • These systems help users find items they want to buy from a business.
  • Conversely, they help the business by generating more sales.
  • Recommender systems are being stressed by the huge volume of user data in existing corporate databases, and will be stressed even more by the increasing volume of user data available on the Web.
  • In this paper the authors presented and experimentally evaluated a new algorithm for CF-based recommender systems.


Item-Based Collaborative Filtering Recommendation
Algorithms
Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl
{sarwar, karypis, konstan, riedl}@cs.umn.edu
GroupLens Research Group/Army HPC Research Center
Department of Computer Science and Engineering
University of Minnesota, Minneapolis, MN 55455
1. INTRODUCTION
The amount of information in the world is increasing far more quickly than our ability to process it. All of us have known the feeling of being overwhelmed by the number of new books, journal articles, and conference proceedings coming out each year. Technology has dramatically reduced the barriers to publishing and distributing information. Now it is time to create the technologies that can help us sift through all the available information to find that which is most valuable to us.

Copyright is held by the author/owner. WWW10, May 1-5, 2001, Hong Kong. ACM 1-58113-348-0/01/0005.
One of the most promising such technologies is collaborative filtering [19, 27, 14, 16]. Collaborative filtering works by building a database of preferences for items by users. A new user, Neo, is matched against the database to discover neighbors, which are other users who have historically had similar taste to Neo. Items that the neighbors like are then recommended to Neo, as he will probably also like them. Collaborative filtering has been very successful in both research and practice, and in both information filtering applications and E-commerce applications. However, there remain important research questions in overcoming two fundamental challenges for collaborative filtering recommender systems.

The first challenge is to improve the scalability of the collaborative filtering algorithms. These algorithms are able to search tens of thousands of potential neighbors in real-time, but the demands of modern systems are to search tens of millions of potential neighbors. Further, existing algorithms have performance problems with individual users for whom the site has large amounts of information. For instance, if a site is using browsing patterns as indications of content preference, it may have thousands of data points for its most frequent visitors. These "long user rows" slow down the number of neighbors that can be searched per second, further reducing scalability.

The second challenge is to improve the quality of the recommendations for the users. Users need recommendations they can trust to help them find items they will like. Users will "vote with their feet" by refusing to use recommender systems that are not consistently accurate for them.

In some ways these two challenges are in conflict, since the less time an algorithm spends searching for neighbors, the more scalable it will be, and the worse its quality. For this reason, it is important to treat the two challenges simultaneously so the solutions discovered are both useful and practical.

In this paper, we address these issues of recommender systems by applying a different approach: item-based algorithms. The bottleneck in conventional collaborative filtering algorithms is the search for neighbors among a large user population of potential neighbors [12]. Item-based algorithms avoid this bottleneck by exploring the relationships between items first, rather than the relationships between users. Recommendations for users are computed by finding items that are similar to other items the user has liked. Because the relationships between items are relatively static, item-based algorithms may be able to provide the same quality as the user-based algorithms with less online computation.
1.1 Related Work
In this section we briefly present some of the research literature related to collaborative filtering, recommender systems, data mining and personalization.

Tapestry [10] is one of the earliest implementations of collaborative filtering-based recommender systems. This system relied on the explicit opinions of people from a close-knit community, such as an office workgroup. However, recommender systems for large communities cannot depend on each person knowing the others. Later, several ratings-based automated recommender systems were developed. The GroupLens research system [19, 16] provides a pseudonymous collaborative filtering solution for Usenet news and movies. Ringo [27] and Video Recommender [14] are email and web-based systems that generate recommendations on music and movies, respectively. A special issue of Communications of the ACM [20] presents a number of different recommender systems.

Other technologies have also been applied to recommender systems, including Bayesian networks, clustering, and Horting. Bayesian networks create a model based on a training set with a decision tree at each node and edges representing user information. The model can be built off-line over a matter of hours or days. The resulting model is very small, very fast, and essentially as accurate as nearest neighbor methods [6]. Bayesian networks may prove practical for environments in which knowledge of user preferences changes slowly with respect to the time needed to build the model, but are not suitable for environments in which user preference models must be updated rapidly or frequently.

Clustering techniques work by identifying groups of users who appear to have similar preferences. Once the clusters are created, predictions for an individual can be made by averaging the opinions of the other users in that cluster. Some clustering techniques represent each user with partial participation in several clusters. The prediction is then an average across the clusters, weighted by degree of participation. Clustering techniques usually produce less-personal recommendations than other methods, and in some cases, the clusters have worse accuracy than nearest neighbor algorithms [6]. Once the clustering is complete, however, performance can be very good, since the size of the group that must be analyzed is much smaller. Clustering techniques can also be applied as a "first step" for shrinking the candidate set in a nearest neighbor algorithm or for distributing nearest-neighbor computation across several recommender engines. While dividing the population into clusters may hurt the accuracy of recommendations to users near the fringes of their assigned cluster, pre-clustering may be a worthwhile trade-off between accuracy and throughput.

Horting is a graph-based technique in which nodes are users, and edges between nodes indicate degree of similarity between two users [1]. Predictions are produced by walking the graph to nearby nodes and combining the opinions of the nearby users. Horting differs from nearest neighbor as the graph may be walked through other users who have not rated the item in question, thus exploring transitive relationships that nearest neighbor algorithms do not consider. In one study using synthetic data, Horting produced better predictions than a nearest neighbor algorithm [1].

Schafer et al. [26] present a detailed taxonomy and examples of recommender systems used in E-commerce and how they can provide one-to-one personalization and at the same time capture customer loyalty. Although these systems have been successful in the past, their widespread use has exposed some of their limitations, such as the problems of sparsity in the data set, problems associated with high dimensionality and so on. The sparsity problem in recommender systems has been addressed in [23, 11]. The problems associated with high dimensionality in recommender systems have been discussed in [4], and application of dimensionality reduction techniques to address these issues has been investigated in [24].

Our work explores the extent to which item-based recommenders, a new class of recommender algorithms, are able to solve these problems.
1.2 Contributions
This paper has three primary research contributions:
1. Analysis of the item-based prediction algorithms and identification of different ways to implement its sub-tasks.
2. Formulation of a precomputed model of item similarity to increase the online scalability of item-based recommendations.
3. An experimental comparison of the quality of several different item-based algorithms to the classic user-based (nearest neighbor) algorithms.

1.3 Organization
The rest of the paper is organized as follows. The next section provides a brief background in collaborative filtering algorithms. We first formally describe the collaborative filtering process and then discuss its two variants, memory-based and model-based approaches. We then present some challenges associated with the memory-based approach. In Section 3, we present the item-based approach and describe different sub-tasks of the algorithm in detail. Section 4 describes our experimental work. It provides details of our data sets, evaluation metrics, methodology and results of different experiments and discussion of the results. The final section provides some concluding remarks and directions for future research.
2. COLLABORATIVE FILTERING BASED RECOMMENDER SYSTEMS
Recommender systems apply data analysis techniques to the problem of helping users find the items they would like to purchase at E-Commerce sites by producing a predicted likeliness score or a list of top-N recommended items for a given user. Item recommendations can be made using different methods. Recommendations can be based on demographics of the users, overall top selling items, or past buying habits of users as a predictor of future items. Collaborative Filtering (CF) [19, 27] is the most successful recommendation technique to date. The basic idea of CF-based algorithms is to provide item recommendations or predictions based on the opinions of other like-minded users. The opinions of users can be obtained explicitly from the users or by using some implicit measures.
2.0.1 Overview of the Collaborative Filtering Process
The goal of a collaborative filtering algorithm is to suggest new items or to predict the utility of a certain item for a particular user based on the user's previous likings and the opinions of other like-minded users. In a typical CF scenario, there is a list of m users U = {u_1, u_2, ..., u_m} and a list of n items I = {i_1, i_2, ..., i_n}. Each user u_i has a list of items I_{u_i}, which the user has expressed his/her opinions about. Opinions can be explicitly given by the user as a rating score, generally within a certain numerical scale, or can be implicitly derived from purchase records, by analyzing timing logs, by mining web hyperlinks and so on [28, 16]. Note that I_{u_i} ⊆ I and it is possible for I_{u_i} to be a null-set. There exists a distinguished user u_a ∈ U called the active user for whom the task of a collaborative filtering algorithm is to find an item likeliness that can be of two forms.

Prediction is a numerical value, P_{a,j}, expressing the predicted likeliness of item i_j ∉ I_{u_a} for the active user u_a. This predicted value is within the same scale (e.g., from 1 to 5) as the opinion values provided by u_a.

Recommendation is a list of N items, I_r ⊆ I, that the active user will like the most. Note that the recommended list must be on items not already purchased by the active user, i.e., I_r ∩ I_{u_a} = ∅. This interface of CF algorithms is also known as Top-N recommendation.
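The Top-N interface just defined can be illustrated with a short sketch; the function name and the dict-of-scores layout are assumptions for the example.

```python
def top_n_recommendation(scores, purchased, n):
    """Return the N highest-scoring items not already purchased.

    `scores` maps item -> predicted likeliness P_{a,j} for the active
    user; the recommended list I_r must exclude items in I_{u_a}
    (the purchased set), i.e. I_r ∩ I_{u_a} = ∅.
    """
    candidates = [(score, item) for item, score in scores.items()
                  if item not in purchased]
    candidates.sort(reverse=True)  # best-predicted items first
    return [item for score, item in candidates[:n]]
```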
Figure 1 shows the schematic diagram of the collaborative filtering process. CF algorithms represent the entire m × n user-item data as a ratings matrix, A. Each entry a_{i,j} in A represents the preference score (rating) of the i-th user on the j-th item. Each individual rating is within a numerical scale, and it can as well be 0, indicating that the user has not yet rated that item. Researchers have devised a number of collaborative filtering algorithms that can be divided into two main categories: Memory-based (user-based) and Model-based (item-based) algorithms [6]. In this section we provide a detailed analysis of CF-based recommender system algorithms.
Memory-based Collaborative Filtering Algorithms. Memory-based algorithms utilize the entire user-item database to generate a prediction. These systems employ statistical techniques to find a set of users, known as neighbors, that have a history of agreeing with the target user (i.e., they either rate different items similarly or they tend to buy similar sets of items). Once a neighborhood of users is formed, these systems use different algorithms to combine the preferences of neighbors to produce a prediction or top-N recommendation for the active user. These techniques, also known as nearest-neighbor or user-based collaborative filtering, are more popular and widely used in practice.
Model-based Collaborative Filtering Algorithms. Model-based collaborative filtering algorithms provide item recommendation by first developing a model of user ratings. Algorithms in this category take a probabilistic approach and envision the collaborative filtering process as computing the expected value of a user prediction, given his/her ratings on other items. The model building process is performed by different machine learning algorithms such as Bayesian network, clustering, and rule-based approaches. The Bayesian network model [6] formulates a probabilistic model for the collaborative filtering problem. The clustering model treats collaborative filtering as a classification problem [2, 6, 29] and works by clustering similar users in the same class and estimating the probability that a particular user is in a particular class C, and from there computes the conditional probability of ratings. The rule-based approach applies association rule discovery algorithms to find associations between co-purchased items and then generates item recommendations based on the strength of the association between items [25].
2.0.2 Challenges of User-based Collaborative Filtering Algorithms
User-based collaborative filtering systems have been very successful in the past, but their widespread use has revealed some potential challenges, such as:
Sparsity. In practice, many commercial recommender systems are used to evaluate large item sets (e.g., Amazon.com recommends books and CDnow.com recommends music albums). In these systems, even active users may have purchased well under 1% of the items (1% of 2 million books is 20,000 books). Accordingly, a recommender system based on nearest neighbor algorithms may be unable to make any item recommendations for a particular user. As a result, the accuracy of recommendations may be poor.
Scalability. Nearest neighbor algorithms require computation that grows with both the number of users and the number of items. With millions of users and items, a typical web-based recommender system running existing algorithms will suffer serious scalability problems.
The weakness of nearest neighbor algorithms for large, sparse databases led us to explore alternative recommender system algorithms. Our first approach attempted to bridge the sparsity by incorporating semi-intelligent filtering agents into the system [23, 11]. These agents evaluated and rated each item using syntactic features. By providing a dense ratings set, they helped alleviate coverage problems and improved quality. The filtering agent solution, however, did not address the fundamental problem of poor relationships among like-minded but sparse-rating users. To explore that, we took an algorithmic approach and used Latent Semantic Indexing (LSI) to capture the similarity between users and items in a reduced dimensional space [24, 25]. In this paper we look into another technique, the model-based approach, to address these challenges, especially the scalability challenge. The main idea here is to analyze the user-item representation matrix to identify relations between different items and then to use these relations to compute the prediction score for a given user-item pair. The intuition behind this approach is that a user would be interested in purchasing items that are similar to the items the user liked earlier and would tend to avoid items that are similar to the items the user didn't like earlier. These techniques don't require identifying the neighborhood of similar users when a recommendation is requested; as a result, they tend to produce much faster recommendations. A number of different
Figure 1: The Collaborative Filtering Process. (The figure shows the m x n ratings table with users u_1, u_2, ..., u_a, ..., u_m as rows and items i_1, i_2, ..., i_j, ..., i_n as columns; the active user u_a and the item j for which a prediction is sought are highlighted. The CF-algorithm consumes this input, and its output interface produces either a prediction P_{a,j} on item j for the active user or a top-N list {T_{i1}, T_{i2}, ..., T_{iN}} of items for the active user.)
schemes have been proposed to compute the association between items, ranging from a probabilistic approach [6] to more traditional item-item correlations [15, 13]. We present a detailed analysis of our approach in the next section.
3. ITEM-BASED COLLABORATIVE FILTERING ALGORITHM
In this section we study a class of item-based recommendation algorithms for producing predictions to users. Unlike the user-based collaborative filtering algorithm discussed in Section 2, the item-based approach looks into the set of items the target user has rated, computes how similar they are to the target item i, and then selects the k most similar items {i_1, i_2, ..., i_k}. At the same time, their corresponding similarities {s_{i1}, s_{i2}, ..., s_{ik}} are also computed. Once the most similar items are found, the prediction is then computed by taking a weighted average of the target user's ratings on these similar items. We describe these two aspects, namely the similarity computation and the prediction generation, in detail here.
3.1 Item Similarity Computation
One critical step in the item-based collaborative filtering algorithm is to compute the similarity between items and then to select the most similar items. The basic idea in similarity computation between two items i and j is to first isolate the users who have rated both of these items and then to apply a similarity computation technique to determine the similarity s_{i,j}. Figure 2 illustrates this process; here the matrix rows represent users and the columns represent items.
There are a number of different ways to compute the similarity between items. Here we present three such methods: cosine-based similarity, correlation-based similarity, and adjusted-cosine similarity.
3.1.1 Cosine-based Similarity
In this case, two items are thought of as two vectors in the m-dimensional user-space. The similarity between them is measured by computing the cosine of the angle between these two vectors. Formally, in the m x n ratings matrix of Figure 2, the similarity between items i and j, denoted by sim(i, j), is given by

    sim(i, j) = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\|_2 \, \|\vec{j}\|_2}

where "\cdot" denotes the dot-product of the two vectors.
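As a concrete illustration (our sketch, not the authors' code), the cosine measure over two item columns of a small NumPy ratings matrix can be written as follows; the function name and the toy matrix are our own:

```python
import numpy as np

def cosine_similarity(ratings, i, j):
    """Cosine similarity between the column vectors of items i and j
    in an m x n user-item ratings matrix (0 means 'not rated')."""
    vi, vj = ratings[:, i], ratings[:, j]
    denom = np.linalg.norm(vi) * np.linalg.norm(vj)
    return float(vi @ vj / denom) if denom else 0.0

# Toy 3-user x 2-item matrix: identical columns have similarity 1.
R = np.array([[5.0, 5.0],
              [3.0, 3.0],
              [4.0, 4.0]])
sim = cosine_similarity(R, 0, 1)
```

Note that with unrated entries stored as 0, those zeros still contribute to the vector norms; the adjusted cosine measure of Section 3.1.3 addresses a related scale problem.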
3.1.2 Correlation-based Similarity
In this case, the similarity between two items i and j is measured by computing the Pearson-r correlation corr_{i,j}. To make the correlation computation accurate, we must first isolate the co-rated cases (i.e., cases where the users rated both i and j), as shown in Figure 2. Let the set of users who rated both i and j be denoted by U; then the correlation similarity is given by

    sim(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}}.

Here R_{u,i} denotes the rating of user u on item i, and \bar{R}_i is the average rating of the i-th item.
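A minimal sketch of this measure (ours, not the paper's implementation) might look like the following; we take the item means over the co-rated set U, which is one common reading of the formula, and treat 0 as "not rated":

```python
import numpy as np

def pearson_similarity(ratings, i, j):
    """Pearson-r correlation between items i and j over the co-rated
    users only. Item means here are taken over the co-rated set U."""
    vi, vj = ratings[:, i], ratings[:, j]
    co = (vi > 0) & (vj > 0)   # users u in U who rated both i and j
    if co.sum() < 2:
        return 0.0
    x = vi[co] - vi[co].mean()
    y = vj[co] - vj[co].mean()
    denom = np.sqrt((x ** 2).sum()) * np.sqrt((y ** 2).sum())
    return float((x * y).sum() / denom) if denom else 0.0

# Rows are users; the last user rated item 1 but not item 0, so only
# the first three users form the co-rated set U.
R = np.array([[1.0, 2.0],
              [2.0, 3.0],
              [3.0, 4.0],
              [0.0, 5.0]])
sim = pearson_similarity(R, 0, 1)
```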
3.1.3 Adjusted Cosine Similarity
One fundamental difference between the similarity computation in user-based CF and item-based CF is that in the user-based case the similarity is computed along the rows of the matrix, whereas in the item-based case the similarity is computed along the columns, i.e., each pair in the co-rated set corresponds to a different user (Figure 2). Computing similarity using the basic cosine measure in the item-based case has one important drawback: the differences in rating scale between different users are not taken into account. The adjusted cosine similarity offsets this drawback by subtracting the corresponding user average from each co-rated pair. Formally, the similarity between items i and j using this scheme is given by

    sim(i, j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2} \, \sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}.

Here \bar{R}_u is the average of the u-th user's ratings.
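To make the user-mean subtraction concrete, here is an illustrative sketch (our own function and toy data, with 0 again meaning "not rated"):

```python
import numpy as np

def adjusted_cosine(ratings, i, j):
    """Adjusted cosine similarity: subtract each user's mean rating
    from every co-rated pair before the cosine computation."""
    rated = ratings > 0
    counts = np.maximum(rated.sum(axis=1), 1)
    user_means = ratings.sum(axis=1) / counts      # \bar{R}_u per user
    co = rated[:, i] & rated[:, j]                 # co-rated set U
    x = ratings[co, i] - user_means[co]
    y = ratings[co, j] - user_means[co]
    denom = np.sqrt((x ** 2).sum()) * np.sqrt((y ** 2).sum())
    return float((x * y).sum() / denom) if denom else 0.0

# Two users with opposite preferences on items 0 and 1; after
# mean-centering, the item vectors point in opposite directions.
R = np.array([[5.0, 1.0, 3.0],
              [1.0, 5.0, 3.0]])
sim = adjusted_cosine(R, 0, 1)
```

Both users have a mean rating of 3, so the centered pairs are (2, -2) and (-2, 2), giving a similarity of -1: the measure captures that the users rate the two items in opposite directions even though all raw ratings are positive.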
Figure 2: Isolation of the co-rated items and similarity computation. (In the m x n ratings matrix, item-item similarity is computed by looking into the co-rated items only: for items i and j, the similarity s_{i,j} is computed from the pairs of ratings that users gave to both columns. Each of these co-rated pairs is obtained from a different user; in this example, they come from users 1, u, and m-1.)
3.2 Prediction Computation
The most important step in a collaborative filtering system is to generate the output interface in terms of prediction. Once we isolate the set of most similar items based on the similarity measures, the next step is to look into the target user's ratings and use a technique to obtain predictions. Here we consider two such techniques.
3.2.1 Weighted Sum
As the name implies, this method computes the prediction on an item i for a user u by computing the sum of the ratings given by the user on the items similar to i. Each rating is weighted by the corresponding similarity s_{i,j} between items i and j. Formally, using the notation shown in Figure 3, we can denote the prediction P_{u,i} as

    P_{u,i} = \frac{\sum_{\text{all similar items } N} (s_{i,N} \cdot R_{u,N})}{\sum_{\text{all similar items } N} |s_{i,N}|}.

Basically, this approach tries to capture how the active user rates the similar items. The weighted sum is scaled by the sum of the similarity terms to make sure the prediction is within the predefined range.
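The weighted-sum step reduces to a few lines; this sketch (our own, with hypothetical names) takes the already-computed (similarity, rating) pairs as input:

```python
def predict_weighted_sum(neighbors):
    """Weighted-sum prediction P_{u,i}: `neighbors` holds
    (s_iN, R_uN) pairs -- the similarity of each similar item N to
    the target item i, and the active user's rating on N."""
    num = sum(s * r for s, r in neighbors)
    den = sum(abs(s) for s, _ in neighbors)
    return num / den if den else 0.0

# Two similar items the user rated 4 and 2, with similarities 0.8
# and 0.2; the normalization keeps the prediction inside the
# ratings' range.
p = predict_weighted_sum([(0.8, 4.0), (0.2, 2.0)])  # -> 3.6
```

Because the denominator is the sum of absolute similarities, the result is a convex combination of the neighbor ratings whenever all similarities are non-negative, which is what keeps P_{u,i} within the predefined rating scale.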
3.2.2 Regression
This approach is similar to the weighted sum method, but instead of directly using the ratings of similar items it uses an approximation of the ratings based on a regression model. In practice, the similarities computed using cosine or correlation measures may be misleading, in the sense that two rating vectors may be distant (in the Euclidean sense) yet may have very high similarity. In that case, using the raw ratings of the "so-called" similar item may result in poor prediction. The basic idea is to use the same formula as the weighted sum technique, but instead of using the similar item N's "raw" ratings values R_{u,N}, this model uses their approximated values R'_{u,N} based on a linear regression model. If we denote the respective vectors of the target item i and the similar item N by R_i and R_N, the linear regression model can be expressed as

    R'_N = \alpha R_i + \beta + \epsilon.

The regression model parameters \alpha and \beta are determined by going over both of the rating vectors; \epsilon is the error of the regression model.
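One way to realize this step (an illustrative sketch under our own reading of the scheme, not the authors' code) is an ordinary least-squares fit between the two items' rating vectors:

```python
import numpy as np

def fit_regression(r_i, r_n):
    """Least-squares fit of R'_N = alpha * R_i + beta over the rating
    vectors of the target item i and a similar item N. Returns the
    parameters and the approximated ratings R'_N that replace the raw
    R_{u,N} values in the weighted-sum formula."""
    alpha, beta = np.polyfit(r_i, r_n, 1)   # degree-1 polynomial fit
    return alpha, beta, alpha * np.asarray(r_i, dtype=float) + beta

# Item N's ratings are an exact linear function of item i's ratings,
# so the fit recovers alpha = 2, beta = 0 (up to float rounding).
alpha, beta, approx = fit_regression([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```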
3.3 Performance Implications
The largest E-Commerce sites operate at a scale that stresses the direct implementation of collaborative filtering. In neighborhood-based CF systems, the neighborhood formation process, especially the user-user similarity computation step, turns out to be the performance bottleneck, which in turn can make the whole process unsuitable for real-time recommendation generation. One way of ensuring high scalability is to use a model-based approach. Model-based systems have the potential to allow recommender systems to operate at a high scale. The main idea here is to isolate the neighborhood generation and prediction generation steps.
In this paper, we present a model-based approach to precompute item-item similarity scores. The similarity computation scheme is still correlation-based, but the computation is performed on the item space. In a typical E-Commerce scenario, we usually have a set of items that is static compared to the number of users, which changes most often. The static nature of items leads us to the idea of precomputing the item similarities. One possible way of precomputing the item similarities is to compute all-to-all similarities and then perform a quick table look-up to retrieve the required similarity values. This method, although it saves time, requires O(n^2) space for n items.
The fact that we only need a small fraction of similar items to compute predictions leads us to an alternate model-based scheme. In this scheme, we retain only a small number of similar items. For each item j we compute the k most similar items, where k << n, and record these item numbers and their similarities with j. We term k the model size.
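The model-building step above can be sketched as follows (our own illustration; the cosine helper and toy data are assumptions, and a production system would vectorize the all-pairs computation):

```python
import numpy as np

def cosine(ratings, i, j):
    """Cosine similarity between the columns of items i and j."""
    vi, vj = ratings[:, i], ratings[:, j]
    d = np.linalg.norm(vi) * np.linalg.norm(vj)
    return float(vi @ vj / d) if d else 0.0

def build_item_model(ratings, k):
    """For each item j, precompute and retain only its k most similar
    items with their similarity scores: O(n*k) storage for the model
    instead of O(n^2) for all-to-all similarities."""
    n = ratings.shape[1]
    return {j: [(other, s) for s, other in
                sorted(((cosine(ratings, j, o), o)
                        for o in range(n) if o != j), reverse=True)[:k]]
            for j in range(n)}

# 3 users x 4 items; with model size k = 2, each item keeps only its
# two nearest neighbors.
R = np.array([[5.0, 5.0, 1.0, 0.0],
              [4.0, 4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0, 4.0]])
model = build_item_model(R, k=2)
```

At prediction time, only the precomputed k-item lists are consulted, which is what decouples the expensive similarity computation from real-time recommendation generation.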
Based on this model-building step, our prediction generation algorithm works as follows. For generating a prediction for a user u on item i, our algorithm first retrieves the precomputed k most similar items corresponding to the target item i. Then it looks at how many of those k items were purchased by the user u; based on this intersection, then the