Author

Xiaoyuan Su

Other affiliations: University of Miami, Miami University, University of Alberta

Bio: Xiaoyuan Su is an academic researcher from Florida Atlantic University. The author has contributed to research on collaborative filtering and imputation (statistics). The author has an h-index of 11 and has co-authored 19 publications receiving 3,509 citations. Previous affiliations of Xiaoyuan Su include University of Miami and Miami University.

Papers

Journal Article

A survey of collaborative filtering techniques

Xiaoyuan Su¹, Taghi M. Khoshgoftaar¹•Institutions (1)

Florida Atlantic University¹

01 Jan 2009-Advances in Artificial Intelligence

TL;DR: From basic techniques to the state of the art, this paper attempts to present a comprehensive survey of CF techniques that can serve as a roadmap for research and practice in this area.

Abstract: As one of the most successful approaches to building recommender systems, collaborative filtering (CF) uses the known preferences of a group of users to make recommendations or predictions of the unknown preferences for other users. In this paper, we first introduce CF tasks and their main challenges, such as data sparsity, scalability, synonymy, gray sheep, shilling attacks, privacy protection, etc., and their possible solutions. We then present three main categories of CF techniques: memory-based, model-based, and hybrid CF algorithms (that combine CF with other recommendation techniques), with examples for representative algorithms of each category, and analysis of their predictive performance and their ability to address the challenges. From basic techniques to the state-of-the-art, we attempt to present a comprehensive survey for CF techniques, which can be served as a roadmap for research and practice in this area.

3,406 citations

Journal Article

Structural Extension to Logistic Regression: Discriminative Parameter Learning of Belief Net Classifiers

Russell Greiner¹, Xiaoyuan Su², Bin Shen¹, Wei Zhou³•Institutions (3)

University of Alberta¹, University of Miami², University of Waterloo³

01 Jun 2005-Machine Learning

TL;DR: This article presents ELR, a discriminative parameter-learning algorithm for belief net classifiers that seeks the parameters maximizing conditional likelihood rather than likelihood; it applies to arbitrary BN structures and works effectively even with incomplete training data.

Abstract: Bayesian belief nets (BNs) are often used for classification tasks--typically to return the most likely class label for each specified instance. Many BN-learners, however, attempt to find the BN that maximizes a different objective function--viz., likelihood, rather than classification accuracy--typically by first learning an appropriate graphical structure, then finding the parameters for that structure that maximize the likelihood of the data. As these parameters may not maximize the classification accuracy, "discriminative parameter learners" follow the alternative approach of seeking the parameters that maximize conditional likelihood (CL), over the distribution of instances the BN will have to classify. This paper first formally specifies this task, shows how it extends standard logistic regression, and analyzes its inherent sample and computational complexity. We then present a general algorithm for this task, ELR, that applies to arbitrary BN structures and that works effectively even when given incomplete training data. Unfortunately, ELR is not guaranteed to find the parameters that optimize conditional likelihood; moreover, even the optimal-CL parameters need not have minimal classification error. This paper therefore presents empirical evidence that ELR produces effective classifiers, often superior to the ones produced by the standard "generative" algorithms, especially in common situations where the given BN-structure is incorrect.

138 citations

Proceedings Article

Collaborative Filtering for Multi-class Data Using Belief Nets Algorithms

Xiaoyuan Su¹, Taghi M. Khoshgoftaar¹•Institutions (1)

Florida Atlantic University¹

13 Nov 2006

TL;DR: This work applies advanced BN models to CF tasks instead of simple ones, and works on real-world multi-class CF data instead of synthetic binary-class data, showing that the ELR-optimized BN CF models consistently perform better than the state-of-the-art Pearson correlation-based CF algorithm.

Abstract: As one of the most successful recommender systems, collaborative filtering (CF) algorithms can deal with high sparsity and high requirement of scalability amongst other challenges. Bayesian belief nets (BNs), one of the most frequently used classifiers, can be used for CF tasks. Previous works of applying BNs to CF tasks were mainly focused on binary-class data, and used simple or basic Bayesian classifiers (Miyahara and Pazzani, 2002; Breese et al., 1998). In this work, we apply advanced BNs models to CF tasks instead of simple ones, and work on real-world multi-class CF data instead of synthetic binary-class data. Empirical results show that with their ability to deal with incomplete data, extended logistic regression on naive Bayes and tree augmented naive Bayes (NB-ELR and TAN-ELR) models (Greiner et al., 2005) consistently perform better than the state-of-the-art Pearson correlation-based CF algorithm. In addition, the ELR-optimized BNs CF models are robust in terms of the ability to make predictions, while the robustness of the Pearson correlation-based CF algorithm degrades as the sparseness of the data increases.

122 citations

Proceedings Article

Imputation-boosted collaborative filtering using machine learning classifiers

Xiaoyuan Su¹, Taghi M. Khoshgoftaar¹, Xingquan Zhu¹, Russell Greiner²•Institutions (2)

Florida Atlantic University¹, University of Alberta²

16 Mar 2008

TL;DR: This paper proposes a framework of imputation-boosted collaborative filtering (IBCF) that first uses an imputation technique, or a machine-learned classifier, to fill in the sparse user-item rating matrix, then runs a traditional Pearson correlation-based CF algorithm on this matrix to predict a novel rating.

Abstract: As data sparsity remains a significant challenge for collaborative filtering (CF), we conjecture that predicted ratings based on imputed data may be more accurate than those based on the originally very sparse rating data. In this paper, we propose a framework of imputation-boosted collaborative filtering (IBCF), which first uses an imputation technique, or perhaps machine learned classifier, to fill-in the sparse user-item rating matrix, then runs a traditional Pearson correlation-based CF algorithm on this matrix to predict a novel rating. Empirical results show that IBCF using machine learning classifiers can improve predictive accuracy of CF tasks. In particular, IBCF using a classifier capable of dealing well with missing data, such as naive Bayes, can outperform the content-boosted CF (a representative hybrid CF algorithm) and IBCF using PMM (predictive mean matching, a state-of-the-art imputation technique), without using external content information.

68 citations
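
The two-stage IBCF pipeline described in the abstract above (fill in the sparse rating matrix, then run Pearson correlation-based CF on the filled matrix) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: item-mean imputation is used here only as a simple stand-in for the imputers the paper studies (such as PMM or a naive Bayes classifier), and the function names `impute_item_means` and `pearson_predict` are hypothetical.

```python
import numpy as np


def impute_item_means(ratings):
    """Stage 1 (stand-in imputer): replace each NaN with that item's mean rating.

    Assumes every item (column) has at least one observed rating.
    """
    filled = ratings.copy()
    item_means = np.nanmean(ratings, axis=0)
    rows, cols = np.where(np.isnan(ratings))
    filled[rows, cols] = item_means[cols]
    return filled


def pearson_predict(filled, user, item):
    """Stage 2: user-based, Pearson-weighted prediction on the dense matrix."""
    # Similarities are computed on every item except the one being predicted.
    others = np.delete(filled, item, axis=1)
    with np.errstate(invalid="ignore", divide="ignore"):
        weights = np.corrcoef(others)[user]  # rows are users
    weights = np.nan_to_num(weights)         # constant rows give NaN; treat as 0
    weights[user] = 0.0                      # a user is not their own neighbor
    means = others.mean(axis=1)
    total = np.abs(weights).sum()
    if total == 0.0:
        return float(means[user])            # no usable neighbors: fall back to the mean
    deviations = filled[:, item] - means
    return float(means[user] + weights @ deviations / total)


# Toy 4-user x 4-item matrix; NaN marks an unknown rating.
ratings = np.array([
    [5.0, 4.0, np.nan, 1.0],
    [4.0, 5.0, 2.0, 1.0],
    [1.0, 2.0, 5.0, 4.0],
    [np.nan, 1.0, 4.0, 5.0],
])
filled = impute_item_means(ratings)
prediction = pearson_predict(filled, user=0, item=2)
```

In this toy matrix, user 0 agrees most with user 1, who rated item 2 low, so the prediction lands below the item-mean value that plain imputation would assign. Swapping `impute_item_means` for a stronger imputer leaves the second stage unchanged, which is the separation the framework relies on.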

Proceedings Article

Hybrid Collaborative Filtering Algorithms Using a Mixture of Experts

Xiaoyuan Su¹, Russell Greiner², Taghi M. Khoshgoftaar¹, Xingquan Zhu¹•Institutions (2)

Florida Atlantic University¹, University of Alberta²

02 Nov 2007

TL;DR: This paper proposes two hybrid CF algorithms, sequential mixture CF and joint mixture CF, each combining advice from multiple experts for effective recommendation, and shows that these algorithms outperform their peers, especially when the underlying data are very sparse.

Abstract: Collaborative filtering (CF) is one of the most successful approaches for recommendation. In this paper, we propose two hybrid CF algorithms, sequential mixture CF and joint mixture CF, each combining advice from multiple experts for effective recommendation. These proposed hybrid CF models work particularly well in the common situation when data are very sparse. By combining multiple experts to form a mixture CF, our systems are able to cope with sparse data to obtain satisfactory performance. Empirical studies show that our algorithms outperform their peers, such as memory-based, pure model-based, pure content-based CF algorithms, and the content-boosted CF (a representative hybrid CF algorithm), especially when the underlying data are very sparse.

39 citations

Cited by

Journal Article

Machine learning

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Data Mining - Concepts and Techniques.

Petra Perner

01 Jan 2002

9,314 citations

Journal Article

A survey of collaborative filtering techniques

Xiaoyuan Su¹, Taghi M. Khoshgoftaar¹•Institutions (1)

Florida Atlantic University¹

01 Jan 2009-Advances in Artificial Intelligence

TL;DR: From basic techniques to the state of the art, this paper attempts to present a comprehensive survey of CF techniques that can serve as a roadmap for research and practice in this area.

3,406 citations

Journal Article

Recommender systems survey

Jesús Bobadilla¹, Fernando Ortega¹, Antonio Hernando¹, Abraham Gutiérrez¹•Institutions (1)

Technical University of Madrid¹

01 Jul 2013-Knowledge Based Systems

TL;DR: This article provides an overview of recommender systems and of collaborative filtering methods and algorithms; it explains their evolution, provides an original classification for these systems, identifies areas of future implementation, and develops certain areas selected for past, present or future importance.

Abstract: Recommender systems have developed in parallel with the web. They were initially based on demographic, content-based and collaborative filtering. Currently, these systems are incorporating social information. In the future, they will use implicit, local and personal information from the Internet of things. This article provides an overview of recommender systems as well as collaborative filtering methods and algorithms; it also explains their evolution, provides an original classification for these systems, identifies areas of future implementation and develops certain areas selected for past, present or future importance.

2,639 citations

Journal Article

Link prediction in complex networks: A survey

Linyuan Lü¹ ² ³, Tao Zhou² ⁴•Institutions (4)

University of Shanghai for Science and Technology¹, University of Electronic Science and Technology of China², University of Fribourg³, University of Science and Technology of China⁴

15 Mar 2011-Physica A-statistical Mechanics and Its Applications

TL;DR: Recent progress on link prediction algorithms is summarized, with emphasis on contributions from physical perspectives and approaches, such as random-walk-based methods and maximum likelihood methods.

Abstract: Link prediction in complex networks has attracted increasing attention from both physical and computer science communities. The algorithms can be used to extract missing information, identify spurious interactions, evaluate network evolving mechanisms, and so on. This article summarizes recent progress about link prediction algorithms, emphasizing the contributions from physical perspectives and approaches, such as the random-walk-based methods and the maximum likelihood methods. We also introduce three typical applications: reconstruction of networks, evaluation of network evolving mechanism and classification of partially labeled networks. Finally, we introduce some applications and outline future challenges of link prediction algorithms.

2,530 citations
