A New Information Theory-Based Serendipitous Algorithm Design

doi:10.1007/978-3-319-58524-6_26


Xiaosong Zhou¹, Zhan Xu¹, Xu Sun¹(✉), and Qingfeng Wang²

¹ Faculty of Science and Engineering, University of Nottingham Ningbo China, Ningbo, China
Xu.sun@nottingham.edu.cn
² Business School, University of Nottingham Ningbo China, Ningbo, China

Abstract. The development of information technology has stimulated an increasing number of researchers to investigate how to provide serendipitous experiences to users in the digital environment, especially in the fields of information research and recommendation systems. Although a number of achievements have been made in understanding the nature of serendipity in the context of information research, few of them have been employed in the design of information systems. This paper proposes a new serendipitous recommendation algorithm, based on previous empirical studies, that takes into consideration the three important elements of serendipity, namely “unexpectedness”, “insight” and “value”. We consider the design of the algorithm an important attempt to bridge the research results of the two areas of information research and recommendation systems. By applying the algorithm to a game-based application in a real-life experiment with target users, we found that, compared with the conventionally designed method, the proposed algorithm provided participants with more opportunities to experience serendipitous encounters.

Keywords: Serendipity · Recommendation system · Information theory

1 Introduction

Serendipity is widely experienced in human history. It is defined as “an unexpected experience prompted by an individual’s valuable interaction with ideas, information, objects, or phenomena” [1]. So far, studies relating to serendipity have mainly followed two directions: theoretical studies in the area of information research, which aim to investigate the nature of serendipity [2–4], and empirical studies, which aim to develop applications or algorithms that provide users with serendipitous encounters, especially in the digital environment [5–7].

One of the areas that tries to employ serendipity is the design of recommender systems. Information overload in cyberspace means that users are no longer satisfied with being recommended merely “accurate” information; instead, they want to be recommended information that is more serendipitous and interesting to them [8–10]. However, a concern identified in our review of relevant studies is that the discoveries from information research regarding the nature of serendipity do not receive sufficient attention in recommender system design. This paper proposes a new algorithm to support serendipitous recommendation by applying recent research results on serendipity from the area of information research.

S. Yamamoto (Ed.): HIMI 2017, Part II, LNCS 10274, pp. 314–327, 2017.

2 Problem and Research Question

Recommender system researchers often consider serendipity as “unexpected” and “useful” [11], and have designed recommendation algorithms through either content-based filtering [12] or collaborative filtering [13]. However, most recommendation algorithms focus mainly on providing “unexpectedness” to users, and treat “usefulness” only as a metric for measuring the effectiveness of the algorithm rather than as a design clue [14].

By comparison, serendipity in information research is often considered to have three main characteristics: unexpectedness, insight and value [4]. “Unexpectedness” means that the encountered information should be unexpected or a surprise to the information actor, while “value” specifies that the encountered information should be considered useful and beneficial to the information actor. These two understandings of “unexpectedness” and “value” are consistent with the current view of serendipity in designing recommender systems [11, 14]; however, the “insight” aspect tends to be neglected.

“Insight” is considered the ability to find some clue in the current environment, to “make connections” between the clue and one’s previous knowledge or experience, and finally to shift attention to the newly discovered clue [15]. Some researchers have found that this ability of “making connections” is actually a key facet of experiencing serendipity [4]; it can differ considerably among individuals, resulting in a range of serendipity encounterers from super-encounterers to occasional encounterers [16]. Connections can be made between different pieces of information, people and ideas [3]; therefore, to support or “trigger” connection-making, in order to bring more possibilities of experiencing serendipity, has always been considered an important design clue by information researchers [17, 18].

Based on the issues discussed above, we raise our research question: is it possible to incorporate the theoretical studies of serendipity in information research, especially the neglected aspect of “insight” or “making connections”, into recommender system design?

Following this research question, we propose a collaborative-filtering-based algorithm that draws on the theoretical discoveries about serendipity from the area of information research. Because information research has found that serendipity is often encountered in a relaxed and leisurely personal state [1, 3], we applied the algorithm in a game-based application and conducted an empirical experiment.

3 Proposed Algorithm

There are two major concerns in providing serendipitous encounters in recommendation system design. The first is how to balance “unexpectedness” and “usefulness”. As pointed out by [14], there should be “a most preferred distance” between the two values: a high level of unexpectedness may leave the user dissatisfied with the recommended information, while users may also lose interest in information with low unexpectedness. The second concern is how to incorporate “insight” into system design to stimulate the process of “making connections”.

The two concerns are addressed from the perspective of “relevance”, with two hypotheses:

• Hypothesis 1: Information that is highly relevant to a user’s personal profile is also of high potential value to the user;

• Hypothesis 2: Information that is relevant to a user’s profile but not previously known to the user will be unexpected to that user.

Consider a target user A, who is the user to be provided with the recommended information; a user B, who is highly relevant to user A; and a user C, who is highly relevant to user B but is not known to user A. User A may experience serendipity when provided with the information of user C, which is unexpected to him/her, together with the relationship between user B and user C, which may further make that information interesting or useful to user A. The rest of this section describes the implementation of the algorithm in detail.

1. Target user

Consider the profile table of a target user, $U_1$, with a category set $C = \{C_1, C_2, C_3, \ldots, C_i, \ldots, C_n\}$, where $C_i$ represents the i-th category of the user profile. A category weight can either be given by the dataset or calculated through clustering analysis [19]. To simplify the introduction of the proposed algorithm, we assume here that the weight of each $C_i$ is given by the dataset at the outset. The categories are arranged in descending order of their weights in the user profile, so that $C_1$ carries the largest weight and the weight of $C_i$ is no smaller than that of $C_j$ whenever $i < j$:

\[ w_C = \{\, w_{C_1}, w_{C_2}, \ldots, w_{C_i}, \ldots, w_{C_j}, \ldots, w_{C_n} \mid w_{C_i} \ge w_{C_j},\ i < j \,\} \tag{1} \]

For each category $C_i$, consider $C_i = \{a_1, a_2, a_3, \ldots, a_i, \ldots, a_n\}$, where $a_i$ is an attribute of the vector $C_i$. In particular, each $a_i$ represents a dimension according to which a new user profile may be produced (e.g. authors of literature; musicians). The values of the $a_i$ are also arranged by their weight within each vector $C_i$, and the weights can be calculated through semantic analysis such as the tf*idf weight (term frequency times inverse document frequency) calculation [20]:

\[ w(t,d) = \frac{tf_{t,d}\,\log\left(\frac{N}{df_t}\right)}{\sqrt{\sum_i \left(tf_{t_i,d}\right)^2 \left(\log\frac{N}{df_{t_i}}\right)^2}} \tag{2} \]

where $w(t,d)$ is the weight of a term $t$ in a document $d$; it is a function of the frequency of $t$ in the document ($tf_{t,d}$), the number of documents that contain the term ($df_t$) and the number of documents in the collection ($N$). As a result, the weight of a category $C_i$ is determined by the weights of the attributes in the set, which are likewise arranged in descending order so that $a_1$ carries the largest weight:

\[ w_{C_i} = \{\, w_{a_1}, w_{a_2}, \ldots, w_{a_i}, \ldots, w_{a_j}, \ldots, w_{a_n} \mid w_{a_i} \ge w_{a_j},\ i < j \,\} \tag{3} \]
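As a rough illustration of Eq. (2), the following Python sketch computes cosine-normalised tf*idf weights for the attributes of one document and ranks them in descending order. The function name `tfidf_weights` and the toy corpus are assumptions made for illustration only; the sketch also assumes that the document being weighted is itself part of the corpus, so that every $df_t$ is at least 1.

```python
import math
from collections import Counter

def tfidf_weights(doc_tokens, corpus):
    """Cosine-normalised tf*idf weight of every term in one document (Eq. 2)."""
    n_docs = len(corpus)  # N: number of documents in the collection
    # df_t: number of documents that contain term t
    df = Counter(term for doc in corpus for term in set(doc))
    tf = Counter(doc_tokens)  # tf_{t,d}: raw frequency of t in the document
    raw = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    norm = math.sqrt(sum(v * v for v in raw.values()))
    if norm == 0.0:  # every term occurs in every document
        return {t: 0.0 for t in raw}
    return {t: v / norm for t, v in raw.items()}

# Hypothetical toy corpus: each "document" lists the authors stored in one library.
corpus = [
    ["a1", "a1", "a2", "a3"],
    ["a2", "b1"],
    ["a3", "b1", "b2"],
]
weights = tfidf_weights(corpus[0], corpus)
# Attributes arranged from the largest to the smallest weight
ranked = sorted(weights, key=weights.get, reverse=True)
```

In this toy corpus the author that appears twice in the first library and nowhere else ends up at the head of `ranked`, which mirrors the descending arrangement of attribute weights assumed by the algorithm.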

2. Screen the weight

As defined above, $C_1$ has the largest weight in the set $C$, and $a_1$ has the largest weight in each set $C_i$. Set a threshold $s$ to eliminate the low-weight categories from the user profile $U_1$:

\[ w_C = \{\, w_{C_1}, w_{C_2}, \ldots, w_{C_i} \mid w_{C_i} \ge s \,\} \tag{4} \]

Similarly, set a threshold $h$ to eliminate the low-weight attributes from each set $C_i$:

\[ w_{C_i} = \{\, w_{C_i,a_1}, w_{C_i,a_2}, w_{C_i,a_3}, \ldots, w_{C_i,a_i} \mid w_{C_i,a_i} \ge h \,\} \tag{5} \]
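The screening step can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the profile is represented as a dictionary mapping each category to a pair of (category weight, attribute weights), the function name `screen_profile` is hypothetical, and the attribute weights for categories B and C as well as the threshold values are invented for illustration. Only the weights of category A and the three category weights come from the example later in this section.

```python
def screen_profile(profile, s, h):
    """Drop categories with weight below s and attributes with weight below h."""
    kept_profile = {}
    for category, (w_cat, attributes) in profile.items():
        if w_cat < s:  # category-level threshold s
            continue
        # attribute-level threshold h
        kept = {a: w for a, w in attributes.items() if w >= h}
        kept_profile[category] = (w_cat, kept)
    return kept_profile

ann = {
    "A": (0.5, {"a1": 0.6, "a2": 0.3, "a3": 0.1}),
    "B": (0.3, {"b1": 0.7, "b2": 0.3}),  # attribute weights are hypothetical
    "C": (0.2, {"c1": 1.0}),             # attribute weights are hypothetical
}
# Threshold values chosen only for illustration
screened = screen_profile(ann, s=0.25, h=0.2)
```

With these illustrative thresholds, category C and attribute a3 are removed, while the descending order of the remaining entries is preserved.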

3. Generate a new user proﬁle

A new user profile $U_{i+1}$ is produced according to each $a_i$ in the set $C_i$. The profiles are generated in order, from the attribute with the largest weight, $w_{C_i,a_1}$, to the one with the smallest weight, $w_{C_i,a_i}$.

4. Iteration and End condition

Given the weight arrangement in a user profile, it is intuitive that the larger the weight of an attribute $a_i$, the more likely the current user is to already know about the information of $a_i$. In other words, the probability that a current user $U_i$ makes a connection with the next user profile $U_{i+1}$ is proportional to the weight of the attribute in the current user profile:

\[ P(U_{i+1} \mid U_i) = k\, w_{C_i} \times w_{C_i,a_i} \tag{6} \]

where $k$ is the proportionality coefficient relating the probability to the relevant weight. The probability that the target user $U_1$ makes a connection to the i-th user can be extended further, provided that each generated user is new with respect to all previously generated ones:

\[ P(U_i \mid U_1) = P(U_2 \mid U_1) \times P(U_3 \mid U_2) \times \cdots \times P(U_i \mid U_{i-1}) \tag{7} \]

The iteration to find the next user continues until either of the following two end conditions is met:

• the generated user is no longer new with respect to the previously generated users;

• $P(U_i \mid U_1)$ reaches a threshold $d$, where $d$ represents an appropriate threshold of the probability.


The threshold $d$ is set to ensure the effectiveness of the iteration process. If $P(U_i \mid U_1)$ is too large, the recommended information may fail to give the target user a sense of unexpectedness, as the user is likely to know the recommendation already; if $P(U_i \mid U_1)$ is too small, the recommended information may be too irrelevant to the target user, who may lose interest in it. Hence the setting of the threshold $d$ is a very important step in the iteration process, and it needs to be identified through future empirical studies. Once the candidates within the threshold $d$ have been generated, the item with the highest value of $P(U_i \mid U_1)$ among them can be recommended to the target user.
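The iteration and its two end conditions can be sketched as below. This is a simplified, greedy reading of the step and not necessarily the authors' implementation: at each hop it follows only the single attribute with the largest value of Eq. (6), it assumes that attribute names double as identifiers of the next user profile, and the function name `serendipity_walk`, the data layout and the toy profiles are all hypothetical.

```python
import math

def serendipity_walk(profiles, target, d, k=1.0):
    """Follow the heaviest attribute hop by hop until an end condition holds."""
    path, prob = [target], 1.0
    current = target
    while True:
        # Step probability k * w_C * w_{C,a} for every attribute (Eq. 6)
        candidates = [
            (k * w_cat * w_attr, attr)
            for w_cat, attrs in profiles.get(current, {}).values()
            for attr, w_attr in attrs.items()
        ]
        if not candidates:  # no profile available for the current user
            break
        step, nxt = max(candidates)
        if nxt in path:  # end condition 1: the generated user is not new
            break
        prob *= step  # chained probability P(U_i | U_1) (Eq. 7)
        path.append(nxt)
        current = nxt
        if prob <= d or math.isclose(prob, d):  # end condition 2: threshold d
            break
    return path, prob

# Hypothetical profiles: user -> {category: (category weight, {attribute: weight})}
profiles = {
    "u1": {"X": (0.6, {"u2": 0.7, "u3": 0.3}), "Y": (0.4, {"u3": 1.0})},
    "u2": {"Z": (1.0, {"u4": 0.8, "u1": 0.2})},
    "u4": {"W": (1.0, {"u1": 1.0})},
}
path_cycle, p_cycle = serendipity_walk(profiles, "u1", d=0.01)
path_thresh, p_thresh = serendipity_walk(profiles, "u1", d=0.45)
```

With the small threshold the walk ends because the next generated user has already been visited; with the larger threshold it ends after the first hop because the chained probability has already fallen to the threshold.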

5. Recommendation

When the iteration is finished, the content of the most heavily weighted category in the current candidate’s profile is provided to the target user, together with the relevant information about the previously visited users that led to the current user.

6. An example of the proposed algorithm

An example of the proposed algorithm is provided in Fig. 1. Consider Ann as the target user ($U_1$), with literature categories {A, B, C} in her personal library whose weights are {0.5, 0.3, 0.2} (Fig. 1a). The author names of the literature are set as the attributes of each category and, according to the tf*idf weight calculation, there are three values $\{a_1, a_2, a_3\}$ in category A with the weights $W'_A$ = {0.6, 0.3, 0.1}. Setting $k = 1$ for each probability of the current user finding the next user profile, the probability of Ann finding a1’s profile ($U_2$) can be calculated according to Eq. (6):

\[ P(U_2 \mid U_1) = w_A \times w_{A,a_1} = 0.5 \times 0.6 = 0.3 \tag{8} \]

The profile of a1 is then produced as in Fig. 1b. Likewise, among the four authors in category D, author d1 ($U_3$) has the largest weight, and d1’s profile is then produced (Fig. 1c):

\[ P(U_3 \mid U_2) = w_D \times w_{D,d_1} = 0.4 \times 0.5 = 0.2 \tag{9} \]

According to Eq. (7), the probability of Ann ($U_1$) finding d1’s profile ($U_3$) is:

\[ P(U_3 \mid U_1) = P(U_2 \mid U_1) \times P(U_3 \mid U_2) = 0.3 \times 0.2 = 0.06 \tag{10} \]

With the threshold $d$ set to 0.06, the iteration of the algorithm stops and the literature of category F in d1’s profile is recommended to Ann, together with the relevant information about d1 and a1. For example, the recommended information can be “these papers (category F) are most stored by d1, who had published papers (d1, d2, d3, d4) with a1 before”.
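The arithmetic of this example can be checked with a short Python sketch. The hop weights are those of Eqs. (8) and (9); the category weights in d1's profile and the variable names are hypothetical, since the text only states that category F is the one recommended.

```python
import math

K = 1.0  # proportionality coefficient, set to 1 in the example

# (current user, next user, category weight, attribute weight) for each hop
hops = [
    ("Ann", "a1", 0.5, 0.6),  # category A, author a1 (Eq. 8)
    ("a1", "d1", 0.4, 0.5),   # category D, author d1 (Eq. 9)
]

step_probs = [K * w_cat * w_attr for _, _, w_cat, w_attr in hops]
chain_prob = math.prod(step_probs)  # Eq. (10), an instance of Eq. (7)

THRESHOLD_D = 0.06
# The comparison tolerates floating-point rounding in the product
stop = chain_prob <= THRESHOLD_D or math.isclose(chain_prob, THRESHOLD_D)

# Hypothetical category weights in d1's profile; only category F is named in the text
d1_profile = {"F": 0.6, "G": 0.4}
top_category = max(d1_profile, key=d1_profile.get)

# Chain of users that led to the recommendation, shown alongside it
provenance = " -> ".join([hops[0][0]] + [nxt for _, nxt, _, _ in hops])
```

Here `stop` is true once the chained probability reaches the threshold, and `provenance` records the chain of users (Ann, a1, d1) whose relationship is presented to the target user together with the recommended category.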

7. Description of the Proposed Algorithm

The proposed algorithm is based on collaborative filtering; hence it is better suited to datasets whose content is generated by different users, since the next user’s profile is then easier to produce for the current user.


References

On Unexpectedness in Recommender Systems: Or How to Better Expect the Unexpected

Introducing Serendipity in a Content-Based Recommender System

Metrics for evaluating the serendipity of recommendation lists

Investigation of information encountering in the controlled research environment

Coming across information serendipitously - Part 1: A process model
