A cross-cultural user evaluation of product recommender interfaces

23 Oct 2008-pp 75-82
TL;DR: Through this user study, the dominating role of the recommender system's decision-aiding competence in stimulating both oriental and western users' return intention to an e-commerce website where the system is applied is identified.
Abstract: We present a cross-cultural user evaluation of an organization-based product recommender interface, comparing it with the traditional list view. The results show that it performed significantly better, for all study participants, in improving their competence perceptions, including perceived recommendation quality, perceived ease of use and perceived usefulness, and in positively impacting users' behavioral intentions such as intention to save effort in the next visit. Additionally, oriental users were observed to react significantly more strongly to the organization interface regarding some subjective aspects, compared to western subjects. Through this user study, we also identified the dominating role of the recommender system's decision-aiding competence in stimulating both oriental and western users' return intention to an e-commerce website where the system is applied.

Summary (4 min read)

1. INTRODUCTION

  • Online systems that help users select the most preferred item from a large electronic catalog are known as product search and recommender systems.
  • Studies show that customer trust is positively associated with customers’ intentions to transact, purchase a product, and return to the website [9].
  • The authors have primarily studied trust-building by the different design dimensions of explanation interfaces, given explanations’ potential benefits to improve users’ confidence about recommendations and their acceptance of the system [10,18].
  • In order to accelerate users’ decision process by saving the information-searching effort of reviewing all recommended items, the authors have proposed a so-called preference-based organization technique.

1.1 Summary of Previous Studies

  • A carefully conducted user survey (53 subjects) first showed some interesting observations regarding the influence of explanations on trust building and the effectiveness of the organization-based recommender interface [4].
  • Moreover, the organized view of recommendations was strongly favored over the traditional “why”-based list view, since it was perceived as more likely to accelerate the process of product comparison and choice making.

1.2 Contribution of Our Current Work

  • The previous two experiments pointed out promising benefits of the organization interface regarding its trust-inspiring ability.
  • They motivated the authors to further evaluate the interface’s practical performance in a more realistic and interactive system where it computes and explains personalized recommendations according to users’ preferences (rather than based on products’ general popularity).
  • In addition, the authors were interested in identifying whether people from different cultural backgrounds (i.e., oriental and western cultures) would all react actively to the organization-based system.
  • As for upper-level competence perceptions, perceived ease of use and perceived usefulness, the two primary determining elements of convincing users to accept a technology [6], were included, besides decision confidence, perceived effort and satisfaction.
  • This paper is hence organized as follows: Sections 2 and 3 describe the organization-based interface and its function in an implemented prototype system; Section 4 introduces the design and experimental procedure of the cross-cultural user evaluation; Section 5 presents results from the study; and Section 6 concludes the paper.

2. ORGANIZATION-BASED RECOMMENDER INTERFACE

  • The organization interface has been developed to compute and categorize recommended products, and use the category title (e.g. “these products have cheaper price and longer battery life, but slower processor speed and heavier weight”) as the explanation of multiple products.
  • To derive effective principles for this interface design, the authors tested 13 paper prototypes by means of pilot studies and user interviews, and finally derived five design principles.
  • Briefly speaking, the algorithm contains three main steps: Step 1: the user preferences over all products are represented as a weighted additive form of value functions according to the multiattribute utility theory (MAUT) [11].
  • Each tradeoff vector is a set of (attribute, tradeoff) pairs, where tradeoff indicates the improved (denoted as ↑) or compromised (↓) property of the product’s attribute value compared to the same attribute of the top candidate.
  • The authors select those with higher tradeoff utilities (i.e., gains against losses relative to the top candidate and user preferences), considering both category titles and their associated products.

3. PROTOTYPE SYSTEM

  • The authors implemented the organization interface in a product recommender system, designed in particular to assist users in searching for high-involvement products (e.g., notebooks, digital cameras, and cars), for which people are willing to spend considerable effort in locating a desired choice in order to avoid any financial damage or emotional burden.
  • A typical interaction procedure with the system can be as follows.
  • Among these products, the user can either choose one as her final choice, or select a near-target and click “Better Features” to view recommended products with some better values than the selected one.
  • Specifically, the weight of the improved attribute(s) appearing in the examined category title will be increased and the weight of the compromised one(s) will be decreased.
  • The user could choose to optimize any attributes’ values (e.g., $100 cheaper) and accept compromise(s) on one or more less important attributes; these revisions will be directly reflected in her preference model.

4.1 Cultural Difference

  • It is commonly recognized that elements of a user interface appropriate for one culture may not be appropriate for another.
  • Barber and Badre [2] claimed that Americans prefer websites with a white background, while Japanese dislike the white and Chinese favor the red background.
  • People are deeply influenced by the cultural values and norms they hold.
  • The most typical classification is Oriental vs. Western cultures.
  • In the experiment, the participants mainly came from two nations respectively representing the two different cultures: China (oriental culture) and Switzerland (western culture).

4.2 Participants and Materials

  • In total, 120 participants volunteered to take part in the experiment.
  • The other 60 subjects are mainly students at the authors’ university; 41 of them are Swiss and the others are from nearby European countries such as France, Italy and Germany.
  • The other system differs from it only with respect to the recommendation display.
  • Users can also freely specify and revise preferences, examine products’ detailed specifications, and compare near-targets in depth in a consideration set.
  • They were both developed with two product catalogs: 64 digital cameras each constrained by 8 main attributes (manufacturer, price, resolution, optical zoom, etc), and 55 tablet PCs by 10 main attributes (manufacturer, price, processor speed, weight, etc).

4.3 Evaluation Criteria

  • The measured variables used in previous user studies (e.g., perceived effort, return intention) [17] were extended to include more subjective aspects, which are essentially related to the competence-based trust model the authors have established for recommender systems [4].
  • The model consists of three main constructs: system-design features, competence-inspired trust, and trust-induced behavioral intentions.
  • Besides, the authors included questions about decision confidence, cognitive effort, and satisfaction.
  • Most of them came from the existing literature, where they have been repeatedly shown to exhibit strong content validity [12].
  • In addition to these subjective criteria, the authors also measured participants’ objective decision accuracy and effort.

4.4 Experiment Design and Procedure

  • A 2×2 full-factorial between-group experiment design was used.
  • The manipulated factors are cultural background (oriental culture, western culture) and system (ORG, LIST).
  • At the beginning, the participant was required to fill in a prequestionnaire about her/his personal information and subjective opinions on the priority order of different factors in influencing her/his general trust formation in an e-commerce website.
  • Then s/he was asked to use the assigned system to locate a product that s/he most preferred and would purchase if given the opportunity.
  • After the choice was made, the participant was asked to answer post-study questions related to all of the measured subjective variables.

4.5 Hypotheses

  • Regarding the culture difference, the authors postulated that it would not have significant influence on users’ decision behavior in either ORG or LIST.
  • That is, people would react similarly to the system no matter which cultural background they are from.
  • The ORG system was further hypothesized to outperform LIST, especially in terms of subjective constructs related to user trust, owing to the replacement of the list view of recommendations with the organized view.

5.1 Objective Measures

  • The authors first measured users’ objective performance in the two systems (see Table 3).
  • The authors compared the results between two groups of people from the same cultural background who used different systems, between two groups of people using the same system but from different cultures, and overall between ORG and LIST, taking all study participants into account.
  • The between-group analyses were done with the Student t-test assuming unequal variances, with an estimated power of 86% under the assumption of a “large” effect size; this power indicates a high likelihood of detecting a significant effect provided one exists.
  • None of the differences is significant.
  • The overall number of interaction cycles consumed in ORG is higher than in LIST, but the difference does not reach a significant level.
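
The t-test assuming unequal variances mentioned above is commonly known as Welch's t-test. The following is a minimal sketch of how its test statistic and degrees of freedom are computed, using only the Python standard library. The function name `welch_t` and the sample values are hypothetical placeholders and are not data from the study; turning the statistic into a p-value additionally requires the t distribution, which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t, df) for a two-sample t-test with unequal variances."""
    # Squared standard error of each group mean (sample variance / n).
    se_a = variance(group_a) / len(group_a)
    se_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(group_a) - 1) + se_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Placeholder measurements for two independent groups (not study data).
org_group = [1, 2, 3, 4, 5]
list_group = [2, 4, 6, 8, 10]
t_value, dof = welch_t(org_group, list_group)
```

The resulting `t_value` would then be compared against a t distribution with `dof` degrees of freedom to judge significance.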

5.2 Subjective Measures

  • The authors further examined whether cultural background would influence users’ subjective perceptions of the system, and which system would perform better with respect to these subjective aspects.
  • Analysis of all users’ responses indicates that ORG obtained higher scores on all of them, 6 of which are significantly better (see Table 4).
  • More concretely, the participants using ORG on average expressed significantly higher perceived recommendation quality, higher perceived ease of use, higher perceived usefulness, lower perceived effort, higher satisfaction and higher intention to save effort in repeated visits, compared to the ratings of the other group using LIST.
  • As for the other two system-design features, the two systems did not exhibit significant differences, which might be because they both provide explanations (for recommendation transparency) and preference revision tools (for user-control).
  • All of the results hence imply that oriental subjects’ reaction to ORG was indeed more strongly positive than western users’, which is primarily reflected in their perceived recommendation quality, decision confidence and cognitive effort.

5.3 Other Results

  • In the pre-questionnaire, the authors asked each participant to rate a set of statements about the relative importance of factors influencing their perception of an e-commerce website’s general trustworthiness, their intention to purchase a product on the website and intention to repeatedly visit it for products’ information.
  • Table 5 shows the priority order of these factors for each question from both oriental and western subjects.
  • All average scores are beyond the medium level (“moderately important”).
  • For the trustworthiness perception, the priority order of the five factors is the same between two groups of users: the website’s integrity (e.g., product quality, security, delivery service, etc) is the most important, followed by its reputation, price info, and competences in helping users find ideal products and providing good recommendations.
  • As a matter of fact, the most important factor leading to users’ return intention is that the website can help them effectively find a product they really like.

6. CONCLUSION

  • The authors presented a user study that evaluated the organization-based recommender system in a cross-cultural experiment setup.
  • In-depth analysis concerning cultural impacts further shows that some of these significant phenomena were observably stronger among oriental participants, implying that oriental users are likely to react more actively to the organization interface once it replaces the traditional list view.
  • Another implication is for the user evaluation of recommender systems.
  • The authors believe that other researchers will profit from their evaluation methods when they conduct similar types of experiments.


A Cross-Cultural User Evaluation of Product
Recommender Interfaces
Li Chen and Pearl Pu
Human Computer Interaction Group, School of Computer and Communication Sciences
Swiss Federal Institute of Technology in Lausanne (EPFL)
CH-1015, Lausanne, Switzerland
{li.chen, pearl.pu}@epfl.ch
ABSTRACT
We present a cross-cultural user evaluation of an organization-
based product recommender interface, by comparing it with the
traditional list view. The results show that it performed
significantly better, for all study participants, in improving
their competence perceptions, including perceived
recommendation quality, perceived ease of use and perceived
usefulness, and in positively impacting users’ behavioral intentions
such as intention to save effort in the next visit. Additionally,
oriental users were observed to react significantly more strongly
to the organization interface regarding some subjective aspects,
compared to western subjects. Through this user study, we also
identified the dominating role of the recommender system’s
decision-aiding competence in stimulating both oriental and
western users’ return intention to an e-commerce website where
the system is applied.
Categories and Subject Descriptors
H.5.2 [Information interfaces and presentation]: User
Interfaces – evaluation/methodology, graphical user interfaces
(GUI), user-centered design.
General Terms
Design, Experimentation, Human Factors.
Keywords
Product recommender systems, organization interface, list view,
cross-cultural user study.
1. INTRODUCTION
Online systems that help users select the most preferred item
from a large electronic catalog are known as product search and
recommender systems. In recent years, much research has
emphasized developing and improving the underlying
algorithms, whereas many user issues, such as acceptance of
recommendations and trust building, have received little attention.
Trust is seen as a long-term relationship between a user and the
organization that the online technology represents. It is critical to
study especially for e-commerce environments where the
traditional salesperson, and subsequent relationship, is replaced
by a virtual vendor or a more intelligent product recommender
agent. Studies show that customer trust is positively associated
with customers’ intentions to transact, purchase a product, and
return to the website [9]. However, these results have mainly been
derived from online shops’ ability to ensure security, privacy, and
reputation (i.e., the integrity and benevolence aspects of trust
formation) [8], and less from the website’s competence such as its
decision agent’s ability in providing good recommendations and
explaining its results.
We have long been investigating which recommender design
factors may positively impact users’ trust and, furthermore,
their behavioral intentions.
Previously, we conceptualized a competence-based
trust model for recommender systems [4]. We have primarily
studied trust-building by the different design dimensions of
explanation interfaces, given explanations’ potential benefits to
improve users’ confidence about recommendations and their
acceptance of the system [10,18].
The traditional strategy of displaying and explaining
recommendations, as popularly adopted in most case-based
reasoning recommender systems [15] and commercial websites
(www.activedecisions.com), is to display the recommendation
content in a rank-ordered list and use a “why” component along
with each item to explain the computational reasoning behind it.
In order to accelerate users’ decision process by saving the
information-searching effort of reviewing all recommended items,
we have proposed a so-called preference-based organization
technique. The main idea is that, rather than explaining each item
one by one, a group of products can be explained together by a
category title, provided that they have shared tradeoff characteristics
compared to a reference product (e.g., the top candidate) [17]. In the
following, we first summarize previous studies on the organization
method and then give the contribution of our current work.
1.1 Summary of Previous Studies
A carefully conducted user survey (53 subjects) first showed
some interesting observations regarding the influence of
explanations on trust building and the effectiveness of the
organization-based recommender interface [4]. That is, most
surveyed users strongly agreed that they would place more trust in a
system that explains how it computed the
recommended items. Moreover, the organized view of
recommendations was strongly favored over the traditional “why”-based
list view, since it was perceived as more likely to accelerate
the process of product comparison and choice making.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that
copies bear this notice and the full citation on the first page. To copy
otherwise, or republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
RecSys’08, October 23–25, 2008, Lausanne, Switzerland.
Copyright 2008 ACM 978-1-60558-093-7/08/10...$5.00.

A follow-up user study asked 72 participants to evaluate the two
types of recommender interfaces in a within-subject procedure
[17]. The user task was to find a product s/he most preferred
among a set of most popular products recommended in either an
organized view or a list view with “why” components.
Results show that while both interfaces enabled trust-building, the
organized view was significantly more effective in increasing
users’ task efficiency, saving their cognitive effort and prompting
them to intend to return to the interface for future use.
1.2 Contribution of Our Current Work
The previous two experiments pointed out promising benefits of
the organization interface regarding its trust-inspiring ability.
They motivated us to further evaluate the interface’s practical
performance in a more realistic and interactive system where it
computes and explains personalized
recommendations according to users’ preferences (rather than
based on products’ general popularity). In such a system,
preference specification/revision tools are provided for users to
input and refine their preferences, and the recommender interface
is regenerated whenever the user’s preferences are revised.
In addition, we were interested in identifying whether people
from different cultural backgrounds (i.e., oriental
and western cultures) would all react actively to the
organization-based system. Thus, a relatively large-scale cross-cultural
experiment was set up, and a comparative user study was
additionally included to compare the organization interface with
the “why”-based list view, which was implemented in a similar
interactive system setting.
An extended evaluation framework based on previous
measurements was also established to assess the system’s actual
benefits with respect to three design aspects: recommendation
quality, transparency, and user-control. As for upper-level
competence perceptions, perceived ease of use and perceived
usefulness, the two primary determinants of users’ acceptance
of a technology [6], were included, besides decision
confidence, perceived effort and satisfaction. Three trust-induced
behavioral intentions were also included: intention to
purchase, intention to return and intention to save effort in the
next visit.
This paper is hence organized as follows: Sections 2 and 3 describe
the organization-based interface and its function in an implemented
prototype system; Section 4 introduces the design and experimental
procedure of the cross-cultural user evaluation; Section 5 presents
results from the study; and Section 6 concludes the paper.
2. ORGANIZATION-BASED
RECOMMENDER INTERFACE
The organization interface has been developed to compute and
categorize recommended products, and use the category title (e.g.
“these products have cheaper price and longer battery life, but
slower processor speed and heavier weight”) as the explanation of
multiple products (see Figure 1). Each presented title essentially
details the representative tradeoff properties shared by a set of
recommended products by comparing them with the top candidate
(the best matching product according to the user’s current
preferences). It exposes the recommendation opportunities and
indicates why these products are recommended, by
revealing their superior values on some important attributes, and
compromises on less important ones.
To derive effective principles for this interface design, we tested
13 paper prototypes by means of pilot studies and user interviews,
and finally derived five design principles. The principles
include: proposing improvements and compromises in the
category title using conversational language, keeping the
number of tradeoff attributes in the category title under five,
including a few actual products within each category, and
diversifying the proposed category titles as well as associated
products (see details in [17]). We accordingly proposed an
algorithm to generate such organization interfaces [5]. Briefly
speaking, the algorithm contains three main steps:
Step 1: the user preferences over all products are represented as a
weighted additive form of value functions according to the multi-
attribute utility theory (MAUT) [11]. Based on this compensatory
preference model, we can resolve conflicting values explicitly by
considering tradeoffs between different attributes;
Step 2: all alternatives are ranked by their weighted utilities
calculated according to the MAUT model. Then, each of them,
except the ranked first one (i.e., the top candidate), is converted
into a tradeoff vector. Each tradeoff vector is a set of (attribute,
tradeoff) pairs, where tradeoff indicates the improved (denoted as ↑)
or compromised (↓) property of the product’s attribute value
compared to the same attribute of the top candidate. For the
attributes without explicitly stated preferences, default properties
are suggested (e.g., the cheaper, the better). For example, a
tradeoff vector is {(price, ↑), (processor speed, ↓), (memory, ↓),
(hard drive size, ↑), …}, meaning that the corresponding laptop has
lower price, slower processor speed, less memory, more hard drive
size, etc., in comparison with the top recommended laptop;
Step 3: all of the tradeoff vectors are then organized into different
categories by utilizing an association rule mining tool [1] to
discover the recurring subsets of (attribute, tradeoff) pairs among
them. Each subset hence represents a category of products with
the same tradeoff properties. Since a large number of category
candidates would be produced by the mining algorithm, they are
further ranked and diversified. We select those with higher tradeoff
utilities (i.e., gains against losses relative to the top candidate and
user preferences), considering both category titles and their
associated products.
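
The three steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation described in [5]: the toy catalog, the attribute names, the weights, the min-max value function and the brute-force subset counting (a simple stand-in for the association rule mining tool cited as [1]) are all hypothetical. The final ranking and diversification of categories is omitted.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy catalog; names and values are illustrative only.
PRODUCTS = {
    "A": {"price": 1000, "speed": 2.0, "memory": 4, "disk": 250},
    "B": {"price": 800, "speed": 1.6, "memory": 2, "disk": 320},
    "C": {"price": 850, "speed": 1.8, "memory": 2, "disk": 500},
    "D": {"price": 1200, "speed": 2.4, "memory": 8, "disk": 250},
}
# +1: larger is better; -1: smaller is better ("the cheaper, the better").
DIRECTION = {"price": -1, "speed": 1, "memory": 1, "disk": 1}
# Assumed user weights on the 1..5 importance scale.
WEIGHTS = {"price": 3, "speed": 5, "memory": 4, "disk": 1}


def value(attr, x):
    """Min-max normalized value function in [0, 1] (an assumed form)."""
    xs = [p[attr] for p in PRODUCTS.values()]
    lo, hi = min(xs), max(xs)
    if hi == lo:
        return 1.0
    v = (x - lo) / (hi - lo)
    return v if DIRECTION[attr] > 0 else 1.0 - v


def utility(product):
    """Step 1: weighted additive utility, normalized by the weight sum."""
    total = sum(WEIGHTS.values())
    return sum(WEIGHTS[a] * value(a, product[a]) for a in WEIGHTS) / total


def tradeoff_vector(product, top):
    """Step 2: mark each attribute as improved or compromised vs. the top."""
    pairs = set()
    for attr, sign in DIRECTION.items():
        gain = (product[attr] - top[attr]) * sign
        if gain > 0:
            pairs.add((attr, "improved"))
        elif gain < 0:
            pairs.add((attr, "compromised"))
    return frozenset(pairs)


def recurring_subsets(vectors, min_support=2, max_size=4):
    """Step 3 stand-in: count pair subsets shared by several products."""
    counts = Counter()
    for vec in vectors.values():
        for size in range(1, max_size + 1):
            for subset in combinations(sorted(vec), size):
                counts[subset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


ranked = sorted(PRODUCTS, key=lambda n: utility(PRODUCTS[n]), reverse=True)
top_name = ranked[0]
vectors = {
    n: tradeoff_vector(PRODUCTS[n], PRODUCTS[top_name]) for n in ranked[1:]
}
categories = recurring_subsets(vectors)
```

Each key of `categories` is a candidate category title (a recurring set of tradeoff pairs) and each value is the number of products sharing it.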
Therefore, the presented category titles can naturally stimulate users
to consider hidden needs and even guide them to conduct tradeoff
navigations for a better choice. For instance, after the user sees the
products that “have faster processor speed and longer battery life,
although they are slightly more expensive”, she may well
move in that direction from the top candidate if she realizes that
the processor speed is more important than the price to her, or that she
likes “longer battery life” although she did not state any
preference on this attribute before. Support for this kind of
tradeoff navigation process has been demonstrated to have a
significant effect on increasing users’ decision accuracy and
preference certainty [16]. We have previously compared our
organization algorithm with other typical tradeoff-supporting
approaches (such as the data-driven dynamic critiquing system
[14]), and found that it achieved significantly higher accuracy in
predicting the tradeoff criteria and target products that users
actually chose, mainly owing to its preference-focused clustering
and selection strategies [5].

Figure 1. Screenshot of the organization-based recommender interface.
Figure 2. Screenshot of the list view of recommendations.

3. PROTOTYPE SYSTEM
We implemented the organization interface in a product
recommender system, designed in particular to assist users in
searching for high-involvement products (e.g., notebooks, digital
cameras, and cars), for which people are willing to spend
considerable effort in locating a desired choice in order to avoid
any financial damage or emotional burden.
A typical interaction procedure with the system can be as follows.
A user initially starts her search by specifying any number of
preferences in a query area. Each preference is composed of one
acceptable attribute value and its relative weight from 1 “least
important” to 5 “most important”. A preference structure is hence
a set of (attribute value, weight) pairs of all participating
attributes, as required by the MAUT model. After a user states her
initial preferences, the best matching product will be computed
and returned at the top, followed by k categories of other
recommended products as outcomes of the organization algorithm
(k = 4 in our prototype, see Figure 1). If the user is interested in
one of the suggested categories, she can click “Show All” to see
more products (up to 6) belonging to it. Among these products,
the user can either choose one as her final choice, or select a
near-target and click “Better Features” to view recommended products
with some better values than the selected one. In the latter case,
the user’s preference model will be automatically refined to
respect her current needs. Specifically, the weight of the improved
attribute(s) appearing in the examined category title will be
increased and the weight of the compromised one(s) will be decreased. All
attributes’ acceptable values will also be updated according to the
selected new reference product.
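
The automatic refinement just described can be sketched as below. The paper states only the direction of the weight change, so the step size of 1, the clamping to the 1-to-5 importance scale, the choice to leave attributes without a stated weight untouched, and the function name `refine_preferences` are all assumptions made for illustration.

```python
def refine_preferences(weights, category_title, reference,
                       step=1, lowest=1, highest=5):
    """Return (new_weights, new_acceptable_values) after "Better Features".

    weights:        {attribute: weight on the 1..5 scale}
    category_title: iterable of (attribute, "improved" | "compromised")
    reference:      attribute values of the newly selected near-target
    """
    new_weights = dict(weights)
    for attribute, tradeoff in category_title:
        if attribute not in new_weights:
            # Assumption: attributes without a stated weight are skipped.
            continue
        change = step if tradeoff == "improved" else -step
        # Assumption: weights stay within the 1..5 importance scale.
        new_weights[attribute] = max(
            lowest, min(highest, new_weights[attribute] + change)
        )
    # Acceptable values follow the selected reference product.
    new_acceptable = dict(reference)
    return new_weights, new_acceptable


# Hypothetical example: the user picked a product from a category titled
# "cheaper price and faster processor speed, but less memory".
weights = {"price": 3, "speed": 5, "memory": 4}
title = [("price", "improved"), ("speed", "improved"),
         ("memory", "compromised")]
reference = {"price": 850, "speed": 1.8, "memory": 2}
new_weights, new_acceptable = refine_preferences(weights, title, reference)
```

In this example the price weight rises, the memory weight falls, and the speed weight stays at the top of the scale because of the assumed clamping.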
On the other hand, the user can revise preferences on her own
by clicking the button “Specify your own criteria for ‘Better
Features’”. A critiquing page will then be activated that provides
her with options for specifying her own tradeoff criteria relative to a
near-target. For example, the user could choose to optimize any
attributes’ values (e.g., $100 cheaper) and accept compromise(s)
on one or more less important attributes; these revisions will be
directly reflected in her preference model. A small set of tradeoff
alternatives that best satisfy the stated tradeoff criteria will
then be returned, among which she either makes the final choice or
proceeds to conduct further tradeoff navigations, either in the
organization interface or in her own self-initiated way.
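
A self-specified critique such as “$100 cheaper, accepting compromises on speed and memory” can be read as a filter over the catalog. The sketch below illustrates one possible reading and is not the authors’ implementation: the catalog, the attribute names, the function name `critique` and the rule that attributes not opened to compromise must be no worse than the near-target are assumptions. Ranking the matches to pick the “small set” that best satisfies the criteria is omitted.

```python
# Hypothetical toy catalog; names and values are illustrative only.
PRODUCTS = {
    "A": {"price": 1000, "speed": 2.0, "memory": 4, "disk": 250},
    "B": {"price": 800, "speed": 1.6, "memory": 2, "disk": 320},
    "C": {"price": 850, "speed": 1.8, "memory": 2, "disk": 500},
    "D": {"price": 1200, "speed": 2.4, "memory": 8, "disk": 250},
}
# +1: larger is better; -1: smaller is better.
DIRECTION = {"price": -1, "speed": 1, "memory": 1, "disk": 1}


def critique(products, reference, improve, may_compromise, direction):
    """Return names of products that satisfy the user's tradeoff criteria.

    improve:        {attribute: minimum required gain over the reference}
    may_compromise: attributes the user accepts to be worse
    """
    matches = []
    for name, product in products.items():
        satisfied = True
        for attr, sign in direction.items():
            # Positive gain means "better than the reference" on attr.
            gain = (product[attr] - reference[attr]) * sign
            if attr in improve:
                satisfied = satisfied and gain >= improve[attr]
            elif attr not in may_compromise:
                # Assumption: other attributes must not get worse.
                satisfied = satisfied and gain >= 0
        if satisfied:
            matches.append(name)
    return matches


near_target = PRODUCTS["D"]
alternatives = critique(
    PRODUCTS, near_target,
    improve={"price": 100},               # at least $100 cheaper
    may_compromise={"speed", "memory"},   # accepted compromises
    direction=DIRECTION,
)
```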
Moreover, the system allows the user to view a product's
detailed specifications via a “detail” link, and to record all
products of interest in a consideration set to facilitate comparing
them before checking out.
4. CROSS-CULTURAL EVALUATION
4.1 Cultural Difference
It is commonly recognized that elements of a user interface
appropriate for one culture may not be appropriate for another.
For example, Barber and Badre [2] claimed that Americans prefer
websites with a white background, while Japanese users dislike
white and Chinese users favor a red background.
People are deeply influenced by the cultural values and norms
they hold. Many researchers have classified cultures around the
world in various categories. The most typical classification is
Oriental vs. Western cultures. The Oriental culture, influenced by
the ancient Chinese culture, focuses on holistic thought,
continuity, and interrelationships of objects. On the contrary, the
Western culture, influenced by the ancient Greek culture, puts
greater emphasis on analytical thought, detachment, and attributes
of objects [13].
In online user-experience research, one primary reason
identified for differences in consumer behavior is
the belief that western countries generally have an individualistic,
low-context culture, whereas eastern countries generally have
a collectivistic, high-context culture [3].
Thus, we were interested in recruiting people from the two
different cultural backgrounds to see whether the cultural
difference would influence their actual behavior and subjective
perceptions of our product recommender system when they use
it to make a purchase decision. In our experiment, the participants
mainly came from two nations respectively representing
the two different cultures: China (oriental culture) and
Switzerland (western culture).
4.2 Participants and Materials
In total, 120 participants volunteered to take part in the
experiment. In collaboration with the HCI lab at Tsinghua
University in China, we recruited 60 native Chinese. Most of
them are students at the university pursuing Bachelor, Master or
PhD degrees, and a few of them work as engineers in domains such as
software development and architecture. The other 60 subjects are
mainly students at our university; 41 of them are Swiss and
the others are from nearby European countries such as France, Italy
and Germany. Table 1 lists the demographic profiles of the study
subjects from the two cultural backgrounds.
Table 1. Demographic profiles of study subjects from two
cultures (the number of users is in brackets)

| | Oriental Culture (60) | Western Culture (60) |
|---|---|---|
| Nation | China (60) | Switzerland (41); other European countries (19) |
| Gender | Female (23); Male (37) | Female (15); Male (45) |
| Average age | 21~30 (57); >30 (3) | <21 (14); 21~30 (44); >30 (2) |
| Major/job domain | Computer, mathematics, environment, electronics, architecture, etc. | Computer, education, mechanics, electronics, architecture, etc. |
| Computer knowledge | 4.34 (advanced) | 4.08 (advanced) |
| Internet usage | 4.83 (almost daily) | 4.98 (almost daily) |
| e-commerce site visits | 3.69 (1-3 times a month) | 3.36 (a few times every 3 months) |
| e-shopping experiences | 3.25 (a few times every 3 months) | 2.92 (a few times every 3 months) |

Two systems were prepared for this user study. One is the
prototype system with the organization-based recommender
interface, as described in Section 3. The other differs from it
only in how recommendations are displayed: instead of an organized
view, it shows a traditional ranked list with a “why” component
that explains each recommended product. More specifically, the list
view presents the k products (k = 25 in our implementation) with
the highest weighted utilities according to the user’s current
preferences, and the “why” component gives the reason the
corresponding product is presented (i.e., its pros and cons
compared to the top candidate) (see Figure 2). In this system,
users can also freely specify and revise preferences, examine
products’ detailed specifications, and compare near-targets in
depth in a consideration set.
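The LIST display described above can be illustrated with a minimal sketch. It assumes a weighted additive utility over attribute scores normalized to [0, 1] (higher is better); the text only says "weighted utilities", so this form, the function names, and the sample data are illustrative assumptions rather than the authors' implementation.

```python
from typing import Dict, List, Tuple

# Attribute name -> score normalized to [0, 1], higher is better (assumed encoding).
Product = Dict[str, float]


def weighted_utility(product: Product, weights: Dict[str, float]) -> float:
    """Weighted additive utility; an assumed form, not confirmed by the paper."""
    return sum(w * product.get(attr, 0.0) for attr, w in weights.items())


def list_view(
    products: Dict[str, Product], weights: Dict[str, float], k: int = 25
) -> List[Tuple[str, List[str], List[str]]]:
    """Return the top-k products with a "why": pros and cons versus the top candidate."""
    if not products:
        return []
    ranked = sorted(
        products,
        key=lambda name: weighted_utility(products[name], weights),
        reverse=True,
    )[:k]
    top = products[ranked[0]]
    rows = []
    for name in ranked:
        item = products[name]
        # Attributes on which this item beats (pros) or trails (cons) the top candidate.
        pros = sorted(a for a in weights if item.get(a, 0.0) > top.get(a, 0.0))
        cons = sorted(a for a in weights if item.get(a, 0.0) < top.get(a, 0.0))
        rows.append((name, pros, cons))
    return rows


# Hypothetical catalog and preference weights, for illustration only.
cameras = {
    "A": {"price": 0.9, "zoom": 0.5},
    "B": {"price": 0.4, "zoom": 0.8},
    "C": {"price": 0.2, "zoom": 0.1},
}
prefs = {"price": 0.7, "zoom": 0.3}
top_two = list_view(cameras, prefs, k=2)
```

Here the top candidate has empty pros and cons by construction, since it is compared with itself.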
Henceforth, the two compared systems are abbreviated as ORG and
LIST, respectively. Both were developed with two product catalogs:
64 digital cameras, each described by 8 main attributes
(manufacturer, price, resolution, optical zoom, etc.), and 55
tablet PCs, each described by 10 main attributes (manufacturer,
price, processor speed, weight, etc.). All products were extracted
from a real e-commerce website.
4.3 Evaluation Criteria
In this experiment, the measured variables used in previous user
studies (e.g., perceived effort, return intention) [17] were
extended to include more subjective aspects, which are essentially
related to the competence-based trust model we have established
for recommender systems [4]. The model consists of three main
constructs: system-design features, competence-inspired trust, and
trust-induced behavioral intentions. As for system-design features
that may directly contribute to the promotion of overall
competence perceptions, we included three dimensions:
recommendation quality, transparency, and user-control. The
overall competence is composed of two crucial variables:
perceived ease of use and perceived usefulness, which have been
identified as the primary factors in persuading users to accept
and use a technology [6]. Besides, we included questions about
decision confidence, cognitive effort, and satisfaction. Trusting
intentions are behavioral attitudes expected from users once their
trust has been built. In addition to commonly addressed purchase
and return intentions, we were interested in the intention to save
effort, because it examines whether users will potentially reduce
their decision-making effort in repeated visits upon establishing a
certain trust level with the recommender system.
Table 2 lists all of the questions used to measure these
subjective variables. Most of them came from the existing
literature, where they have been repeatedly shown to exhibit strong
content validity [12]. Each question was answered on a 5-point
Likert scale from “strongly disagree” to “strongly agree”.
In addition to these subjective criteria, we also measured
participants’ objective decision accuracy and effort. The objective
accuracy was defined as the percentage of users who stood by
the choice they made using the assigned recommender system when
given the chance to review all alternatives in the database.
The objective effort was measured quantitatively in terms of both
task completion time and interaction cycles.
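As a rough sketch of how the two objective measures could be computed from logged sessions: the `Session` record and its field names are hypothetical, and averaging is an assumed way to summarize effort, since the text does not describe the log format or the aggregation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Tuple


@dataclass
class Session:
    """One participant's logged session (hypothetical record layout)."""
    kept_choice: bool  # True if the participant stood by the choice after seeing all products
    seconds: float     # task completion time
    cycles: int        # number of interaction cycles


def objective_accuracy(sessions: List[Session]) -> float:
    """Percentage of users who stood by the choice made with the assigned system."""
    return 100.0 * sum(s.kept_choice for s in sessions) / len(sessions)


def objective_effort(sessions: List[Session]) -> Tuple[float, float]:
    """Mean task completion time and mean interaction cycles (assumed summary)."""
    return mean(s.seconds for s in sessions), mean(s.cycles for s in sessions)


# Made-up sessions, for illustration only.
demo = [
    Session(True, 300.0, 4),
    Session(False, 420.0, 6),
    Session(True, 240.0, 2),
    Session(True, 360.0, 4),
]
accuracy = objective_accuracy(demo)
avg_seconds, avg_cycles = objective_effort(demo)
```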
4.4 Experiment Design and Procedure
A 2 × 2 full-factorial between-group experiment design was used.
The manipulated factors are: (oriental culture, western culture)
and (ORG, LIST). Participants were evenly distributed into the
four conditions, resulting in a sample size of 30 for each condition
cell. Each participant was further randomly assigned one product
catalog (digital camera or tablet PC) to search.
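The allocation described above can be sketched as follows. Culture is a property of the participant, so only the interface and the catalog are assigned. The text states that participants were evenly distributed and that catalogs were assigned randomly; shuffling within each culture before splitting the group in half is an assumption made here to obtain equal cells.

```python
import random
from collections import Counter
from typing import Dict, List, Tuple


def assign_conditions(
    participants: List[Tuple[str, str]], seed: int = 0
) -> List[Dict[str, str]]:
    """Assign each (participant_id, culture) pair an interface and a product catalog."""
    rng = random.Random(seed)
    by_culture: Dict[str, List[str]] = {}
    for pid, culture in participants:
        by_culture.setdefault(culture, []).append(pid)

    assignments = []
    for culture, ids in by_culture.items():
        rng.shuffle(ids)          # assumed randomization within each culture
        half = len(ids) // 2      # first half gets ORG, second half gets LIST
        for i, pid in enumerate(ids):
            assignments.append({
                "id": pid,
                "culture": culture,
                "interface": "ORG" if i < half else "LIST",
                "catalog": rng.choice(["digital camera", "tablet PC"]),
            })
    return assignments


def cell_sizes(assignments: List[Dict[str, str]]) -> Counter:
    """Count participants in each (culture, interface) cell."""
    return Counter((a["culture"], a["interface"]) for a in assignments)


# 60 participants per culture, as in the study; the IDs are placeholders.
people = [(f"c{i}", "oriental") for i in range(60)]
people += [(f"w{i}", "western") for i in range(60)]
plan = assign_conditions(people, seed=42)
sizes = cell_sizes(plan)
```

With 60 participants per culture, each of the four cells receives 30 participants, matching the sample size reported above.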
An online procedure containing instructions, the evaluated
interfaces and questionnaires was implemented, so that participants
could easily follow it and we could record all of their actions in
a log file. At the beginning, the participant was required to fill
in a pre-questionnaire about her/his personal information and
subjective opinions on the priority order of different factors
influencing her/his general trust formation in an e-commerce
website. Then s/he was asked to use the assigned system to locate
the product that s/he most preferred and would purchase if given
the opportunity. After the choice was made, the participant was
asked to answer post-study questions related to all of the measured
subjective variables. Finally, the interface’s decision accuracy
was assessed by revealing all of the products to the participant to
determine whether s/he preferred another product in the catalog or
stuck with the choice just made using the recommender system.
Table 2. Questions to measure subjective variables
Measured variable | Question responded on a 5-point Likert scale from “strongly disagree” to “strongly agree”
Subjective perceptions of system-design features
Recommendation quality | This interface gave me some really good recommendations.
Transparency | I understand why the products were returned through the explanations in the interface.
User control | I felt in control of specifying and changing my preferences in this interface.
Overall competence perceptions
Perceived ease of use | I find this interface easy to use.
Perceived usefulness (Cronbach’s alpha = 0.69) | (1) This interface is competent to help me effectively find products I really like. (2) I find this interface is useful to improve my “shopping” performance.
Decision confidence | I am confident that the product I just “purchased” is really the best choice for me.
Perceived effort (Cronbach’s alpha = 0.54) | (1) I easily found the information I was looking for. (2) Looking for a product using this interface required too much effort (reverse scale).
Satisfaction | My overall satisfaction with the interface is high.
Trusting intentions
Intention to purchase | I would purchase the product I just chose if given the opportunity.
Intention to return (Cronbach’s alpha = 0.80) | (1) If I had to search for a product online in the future and an interface like this was available, I would be very likely to use it. (2) I don't like this interface, so I would not use it again (reverse scale).
Intention to save effort in next visit | If I had a chance to use this interface again, I would likely make my choice more quickly.
Note: The Cronbach’s alpha value represents how well the two items are
related and unified to one construct.
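For readers unfamiliar with the statistic, the following sketch shows how Cronbach's alpha can be computed for a two-item construct in which one item is reverse-scaled. It uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of summed scores); the response data are made up for illustration and are not taken from the study.

```python
from statistics import variance
from typing import List, Sequence


def reverse_score(response: int, scale_max: int = 5) -> int:
    """Flip a reverse-scaled Likert response (1 <-> 5, 2 <-> 4 on a 5-point scale)."""
    return scale_max + 1 - response


def cronbach_alpha(items: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; `items` holds one list of responses per questionnaire item."""
    k = len(items)
    # Each respondent's total score across the k items.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Made-up answers from five respondents to a two-item construct.
item_1: List[int] = [5, 4, 4, 2, 3]
item_2_raw: List[int] = [1, 2, 3, 4, 3]                # reverse-scaled item as answered
item_2 = [reverse_score(r) for r in item_2_raw]        # aligned with item_1's direction
alpha = cronbach_alpha([item_1, item_2])
```

Reverse-scaled items must be flipped before alpha is computed; otherwise the two items correlate negatively and the statistic becomes meaningless.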
4.5 Hypotheses
Regarding cultural differences, we postulated that they would not
have a significant influence on users’ decision behavior in either
ORG or LIST. That is, people would react similarly to the system
regardless of their cultural background. The ORG
system was further hypothesized to outperform LIST, especially
in terms of subjective constructs related to user trust, owing to the
replacement of the list view of recommendations with the
organized view.
Citations
Proceedings ArticleDOI
23 Oct 2011
TL;DR: A unifying evaluation framework, called ResQue (Recommender systems' Quality of user experience), which aimed at measuring the qualities of the recommended items, the system's usability, usefulness, interface and interaction qualities, users' satisfaction with the systems, and the influence of these qualities on users' behavioral intentions.
Abstract: This research was motivated by our interest in understanding the criteria for measuring the success of a recommender system from users' point view. Even though existing work has suggested a wide range of criteria, the consistency and validity of the combined criteria have not been tested. In this paper, we describe a unifying evaluation framework, called ResQue (Recommender systems' Quality of user experience), which aimed at measuring the qualities of the recommended items, the system's usability, usefulness, interface and interaction qualities, users' satisfaction with the systems, and the influence of these qualities on users' behavioral intentions, including their intention to purchase the products recommended to them and return to the system. We also show the results of applying psychometric methods to validate the combined criteria using data collected from a large user survey. The outcomes of the validation are able to 1) support the consistency, validity and reliability of the selected criteria; and 2) explain the quality of user experience and the key determinants motivating users to adopt the recommender technology. The final model consists of thirty two questions and fifteen constructs, defining the essential qualities of an effective and satisfying recommender system, as well as providing practitioners and scholars with a cost-effective way to evaluate the success of a recommender system and identify important areas in which to invest development resources.

705 citations


Cites methods from "A cross-cultural user evaluation of..."

  • ...In over 10 user studies, we have carefully and progressively developed and employed user satisfaction questionnaires to evaluate recommenders’ perceived qualities such as ease of use, perceived usefulness and users’ satisfaction and behavioral intentions [4,5,6,12,13,14,23,24]....

    [...]

  • ...Between 2005 and 2010, we have administered 11 subjective questionnaires on a total of 807 subjects [4,5,6,12,13,14,23,24]....

    [...]

Journal ArticleDOI
TL;DR: This paper proposes a framework that takes a user-centric approach to recommender system evaluation that links objective system aspects to objective user behavior through a series of perceptual and evaluative constructs (called subjective system aspects and experience, respectively).
Abstract: Research on recommender systems typically focuses on the accuracy of prediction algorithms. Because accuracy only partially constitutes the user experience of a recommender system, this paper proposes a framework that takes a user-centric approach to recommender system evaluation. The framework links objective system aspects to objective user behavior through a series of perceptual and evaluative constructs (called subjective system aspects and experience, respectively). Furthermore, it incorporates the influence of personal and situational characteristics on the user experience. This paper reviews how current literature maps to the framework and identifies several gaps in existing work. Consequently, the framework is validated with four field trials and two controlled experiments and analyzed using Structural Equation Modeling. The results of these studies show that subjective system aspects and experience variables are invaluable in explaining why and how the user experience of recommender systems comes about. In all studies we observe that perceptions of recommendation quality and/or variety are important mediators in predicting the effects of objective system aspects on the three components of user experience: process (e.g. perceived effort, difficulty), system (e.g. perceived system effectiveness) and outcome (e.g. choice satisfaction). Furthermore, we find that these subjective aspects have strong and sometimes interesting behavioral correlates (e.g. reduced browsing indicates higher system effectiveness). They also show several tradeoffs between system aspects and personal and situational characteristics (e.g. the amount of preference feedback users provide is a tradeoff between perceived system usefulness and privacy concerns). These results, as well as the validated framework itself, provide a platform for future research on the user-centric evaluation of recommender systems.

651 citations

Journal ArticleDOI
TL;DR: This paper surveys the state of the art of user experience research in RS by examining how researchers have evaluated design methods that augment RS’s ability to help users find the information or product that they truly prefer, interact with ease with the system, and form trust with RS through system transparency, control and privacy preserving mechanisms.
Abstract: A recommender system is a Web technology that proactively suggests items of interest to users based on their objective behavior or explicitly stated preferences. Evaluations of recommender systems (RS) have traditionally focused on the performance of algorithms. However, many researchers have recently started investigating system effectiveness and evaluation criteria from users' perspectives. In this paper, we survey the state of the art of user experience research in RS by examining how researchers have evaluated design methods that augment RS's ability to help users find the information or product that they truly prefer, interact with ease with the system, and form trust with RS through system transparency, control and privacy preserving mechanisms. Finally, we examine how these system design features influence users' adoption of the technology. We summarize existing work concerning three crucial interaction activities between the user and the system: the initial preference elicitation process, the preference refinement process, and the presentation of the system's recommendation results. Additionally, we will also cover recent evaluation frameworks that measure a recommender system's overall perceptive qualities and how these qualities influence users' behavioral intentions. The key results are summarized in a set of design guidelines that can provide useful suggestions to scholars and practitioners concerning the design and development of effective recommender systems. The survey also lays groundwork for researchers to pursue future topics that have not been covered by existing methods.

323 citations


Cites background from "A cross-cultural user evaluation of..."

  • ...Because too much information impedes users' information processing speed, while overwhelming results require the designers to organize the results to ease and enhance information processing (Chen and Pu 2008)....

    [...]

  • ...results convinces the user that the best results are being found (Chen and Pu 2008)....

    [...]

Journal ArticleDOI
19 Sep 2017
TL;DR: This work provides a comprehensive overview on the existing literature on user interaction aspects in recommender systems, covering existing approaches for preference elicitation and result presentation, as well as proposals that consider recommendation as an interactive process.
Abstract: Automated recommendations have become a ubiquitous part of today’s online user experience. These systems point us to additional items to purchase in online shops, they make suggestions to us on movies to watch, or recommend us people to connect with on social websites. In many of today’s applications, however, the only way for users to interact with the system is to inspect the recommended items. Often, no mechanisms are implemented for users to give the system feedback on the recommendations or to explicitly specify preferences, which can limit the potential overall value of the system for its users. Academic research in recommender systems is largely focused on algorithmic approaches for item selection and ranking. Nonetheless, over the years a variety of proposals were made on how to design more interactive recommenders. This work provides a comprehensive overview on the existing literature on user interaction aspects in recommender systems. We cover existing approaches for preference elicitation and result presentation, as well as proposals that consider recommendation as an interactive process. Throughout the work, we furthermore discuss examples of real-world systems and outline possible directions for future works.

130 citations

Journal ArticleDOI
TL;DR: The adoption of an RS can affect both the lift factor and the conversion rate, determining an increased volume of sales and influencing the user’s decision to actually buy one of the recommended products, and the perceived novelty of recommendations is likely to be more influential than their perceived accuracy.
Abstract: Recommender Systems (RSs) help users search large amounts of digital contents and services by allowing them to identify the items that are likely to be more attractive or useful. RSs play an important persuasion role, as they can potentially augment the users’ trust towards an application and orient their decisions or actions towards specific directions. This article explores the persuasiveness of RSs, presenting two vast empirical studies that address a number of research questions. First, we investigate if a design property of RSs, defined by the statistically measured quality of algorithms, is a reliable predictor of their potential for persuasion. This factor is measured in terms of perceived quality, defined by the overall satisfaction, as well as by how users judge the accuracy and novelty of recommendations. For our purposes, we designed an empirical study involving 210 subjects and implemented seven full-sized versions of a commercial RS, each one using the same interface and dataset (a subset of Netflix), but each with a different recommender algorithm. In each experimental configuration we computed the statistical quality (recall and F-measures) and collected data regarding the quality perceived by 30 users. The results show us that algorithmic attributes are less crucial than we might expect in determining the user’s perception of an RS’s quality, and suggest that the user’s judgment and attitude towards a recommender are likely to be more affected by factors related to the user experience. Second, we explore the persuasiveness of RSs in the context of large interactive TV services. We report a study aimed at assessing whether measurable persuasion effects (e.g., changes of shopping behavior) can be achieved through the introduction of a recommender. Our data, collected for more than one year, allow us to conclude that (1) the adoption of an RS can affect both the lift factor and the conversion rate, determining an increased volume of sales and influencing the user’s decision to actually buy one of the recommended products, (2) the introduction of an RS tends to diversify purchases and orient users towards less obvious choices (the long tail), and (3) the perceived novelty of recommendations is likely to be more influential than their perceived accuracy. Overall, the results of these studies improve our understanding of the persuasion phenomena induced by RSs, and have implications that can be of interest to academic scholars, designers, and adopters of this class of systems.

113 citations


Cites background or methods from "A cross-cultural user evaluation of..."

  • ...2008, 2009], music [Hu and Pu 2009], and movies [Jones and Pu 2007]), or investigating different perceptions of quality in culturally heterogeneous user groups [Chen and Pu 2008], thus obtaining a variety of incomparable results....

    [...]

  • ...Several user-centric evaluations are reported in literature employing ResQue attributes [Chen and Pu 2008; Hu and Pu 2009; Jones and Pu 2007; Pu and Chen 2006; Pu et al. 2008, 2009]....

    [...]

References
Journal ArticleDOI
TL;DR: The study finds that perceived control and shopping enjoyment can increase the intention of new Web customers to return, but seemingly do not influence repeat customers to return, and that a Web store that utilizes value-added search mechanisms and presents a positively challenging experience can increase customers' shopping enjoyment.
Abstract: Electronic commerce challenges companies to design electronic systems and interactions that retain customers and increase sales. This exploratory study examines the impact of consumer experience and attitudes on intention to return and unplanned purchases on-line. It also examines how certain consumer and Web site factors influence the on-line consumer experience. The study finds that perceived control and shopping enjoyment can increase the intention of new Web customers to return, but seemingly do not influence repeat customers to return. It also finds that a Web store that utilizes value-added search mechanisms and presents a positively challenging experience can increase customers' shopping enjoyment. Further, the more often customers return to a Web store, the more their shopping enjoyment is determined by their product involvement. Customers with low need specificity (i.e., who do not know what they are looking for) are more likely to use value-added search mechanisms. Finally, neither perceived control nor shopping enjoyment has any significant impact on unplanned purchases.

472 citations


"A cross-cultural user evaluation of..." refers background in this paper

  • ...Most of them came from existing literatures where they have been repeatedly shown to exhibit strong content validity [12]....

    [...]

Journal ArticleDOI
TL;DR: Understanding how different cultures use the Net---as well as perceive the same Web sites---can translate to truly global e-commerce.
Abstract: Understanding how different cultures use the Net---as well as perceive the same Web sites---can translate to truly global e-commerce.

389 citations


"A cross-cultural user evaluation of..." refers background in this paper

  • ...In online user-experience researches, one primary reason identified for consumer behavior differences has been based on the belief that western countries generally have individualism and a low context culture, whereas eastern countries generally have collectivism and a high context culture [3]....

    [...]

Proceedings ArticleDOI
29 Jan 2006
TL;DR: Results of a significant-scale user study indicate that the organization-based explanation is highly effective in building users' trust in the recommendation interface, with the benefit of increasing users' intention to return to the agent and save cognitive effort.
Abstract: Based on our recent work on the development of a trust model for recommender agents and a qualitative survey, we explore the potential of building users' trust with explanation interfaces. We present the major results from the survey, which provided a roadmap identifying the most promising areas for investigating design issues for trust-inducing interfaces. We then describe a set of general principles derived from an in-depth examination of various design dimensions for constructing explanation interfaces, which most contribute to trust formation. We present results of a significant-scale user study, which indicate that the organization-based explanation is highly effective in building users' trust in the recommendation interface, with the benefit of increasing users' intention to return to the agent and save cognitive effort.

288 citations


"A cross-cultural user evaluation of..." refers background in this paper

  • ...The principles include: proposing improvements and compromises in the category title using the conversational language, keeping the number of tradeoff attributes in the category title under five, including a few of actual products within each category, and diversifying the proposed category titles as well as associated products (see details in [17])....

    [...]

  • ..., perceived effort, return intention) [17] were extended to include more subjective aspects, which are essentially related to the competence-based trust model we have established for recommender systems [4]....

    [...]

  • ...A follow-up user study asked 72 participants to evaluate the two types of recommender interfaces in a within-subject procedure [17]....

    [...]

Journal ArticleDOI
David McSherry
TL;DR: It is shown how the relevance of any question the user is asked can be explained in terms of its ability to discriminate between competing cases, thus giving users a unique insight into the recommendation process.
Abstract: There is increasing awareness in recommender systems research of the need to make the recommendation process more transparent to users. Following a brief review of existing approaches to explanation in recommender systems, we focus in this paper on a case-based reasoning (CBR) approach to product recommendation that offers important benefits in terms of the ease with which the recommendation process can be explained and the system's recommendations can be justified. For example, recommendations based on incomplete queries can be justified on the grounds that the user's preferences with respect to attributes not mentioned in her query cannot affect the outcome. We also show how the relevance of any question the user is asked can be explained in terms of its ability to discriminate between competing cases, thus giving users a unique insight into the recommendation process.

201 citations


"A cross-cultural user evaluation of..." refers background in this paper

  • ...The traditional strategy of displaying and explaining recommendations, as popularly adopted in most of case-based reasoning recommender systems [15] and commercial websites (www....

    [...]

Proceedings ArticleDOI
10 Jan 2005
TL;DR: A novel approach to Critiquing is reviewed, dynamic critiquing, that allows users to modify multiple features simultaneously by choosing from a range of so-called compound critiques that are automatically proposed based on their current position within the product-space.
Abstract: Conversational recommender systems are commonly used to help users to navigate through complex product-spaces by alternatively making product suggestions and soliciting user feedback in order to guide subsequent suggestions. Recently, there has been a surge of interest in developing effective interfaces that support user interaction in domains of limited user expertise. Critiquing has proven to be a popular and successful user feedback mechanism in this regard, but is typically limited to the modification of single features. We review a novel approach to critiquing, dynamic critiquing, that allows users to modify multiple features simultaneously by choosing from a range of so-called compound critiques that are automatically proposed based on their current position within the product-space. In addition, we introduce the results of an important new live-user study that evaluates the practical benefits of dynamic critiquing.

98 citations


"A cross-cultural user evaluation of..." refers result in this paper

  • ...We have previously compared our organization algorithm with other typical tradeoff supporting approaches (such as the data-driven dynamic critiquing system [14]), and found that it achieved significantly higher accuracy in predicting tradeoff criteria and targeted products that users actually made, mainly owing to its preference-focused clustering and selection strategies [5]....

    [...]