Author

Sarabjot Singh Anand

Other affiliations: Ulster University
Bio: Sarabjot Singh Anand is an academic researcher from the University of Warwick. The author has contributed to research in topics: Personalization & Web intelligence. The author has an h-index of 21 and has co-authored 80 publications receiving 2205 citations. Previous affiliations of Sarabjot Singh Anand include Ulster University.


Papers
Journal Article
TL;DR: A comprehensive overview of the state-of-the-art in web personalization can be found in this article, where the authors discuss the various sources of data available to personalization systems, the modelling approaches employed and the current approaches to evaluating these systems.
Abstract: In this chapter we provide a comprehensive overview of the topic of Intelligent Techniques for Web Personalization. Web Personalization is viewed as an application of data mining and machine learning techniques to build models of user behaviour that can be applied to the task of predicting user needs and adapting future interactions with the ultimate goal of improved user satisfaction. This chapter surveys the state-of-the-art in Web personalization. We start by providing a description of the personalization process and a classification of the current approaches to Web personalization. We discuss the various sources of data available to personalization systems, the modelling approaches employed and the current approaches to evaluating these systems. A number of challenges faced by researchers developing these systems are described, as are solutions to these challenges proposed in the literature. The chapter concludes with a discussion of the open challenges that must be addressed by the research community if this technology is to make a positive impact on user satisfaction with the Web.

240 citations

Journal Article
TL;DR: The approach is novel in measuring similarity between users in that it first derives factors, referred to as impacts, driving the observed user behavior and then uses these factors within the similarity computation.
Abstract: Traditional collaborative filtering generates recommendations for the active user based solely on ratings of items by other users. However, most businesses today have item ontologies that provide a useful source of content descriptors that can be used to enhance the quality of recommendations generated. In this article, we present a novel approach to integrating user rating vectors with an item ontology to generate recommendations. The approach is novel in measuring similarity between users in that it first derives factors, referred to as impacts, driving the observed user behavior and then uses these factors within the similarity computation. In doing so, a more comprehensive user model is learned that is sensitive to the context of the user visit. An evaluation of our recommendation algorithm was carried out using data from an online retailer of movies with over 94,000 movies, 44,000 actors, and 10,000 directors within the item knowledge base. The evaluation showed a statistically significant improvement in the prediction accuracy over traditional collaborative filtering. Additionally, the algorithm was shown to generate recommendations for visitors that belong to sparse sections of the user space, areas where traditional collaborative filtering would generally fail to generate accurate recommendations.

99 citations

15 May 1999
TL;DR: A new algorithm called MiDAS is introduced that extends traditional sequence discovery with a wide range of web-specific features and allows the detection of sequences across monitored attributes, such as URLs and http referrers.
Abstract: Electronic commerce sites need to learn as much as possible about their customers and those browsing their virtual premises, in order to maximize their marketing effort. The discovery of marketing-related navigation patterns requires the development of data mining algorithms capable of discovering sequential access patterns from web logs. This paper introduces a new algorithm called MiDAS that extends traditional sequence discovery with a wide range of web-specific features. Domain knowledge is described as flexible navigation templates that can specify navigational behavior, as network structures for the capture of web site topologies, in addition to concept hierarchies and syntactic constraints. Unlike existing approaches, field dependency has been implemented, which allows the detection of sequences across monitored attributes, such as URLs and http referrers. Three different types of contained-in relationships are supported, which express different types of browsing behavior. The experimental evaluation carried out has shown promising results in terms of functionality as well as scalability.

99 citations

Proceedings Article
02 Dec 1995
TL;DR: The advantages of using domain knowledge within the discovery process are highlighted by providing results from the application of the STRIP algorithm in the actuarial domain.
Abstract: The ideal situation for a Data Mining or Knowledge Discovery system would be for the user to be able to pose a query of the form “Give me something interesting that could be useful” and for the system to discover some useful knowledge for the user. But such a system would be unrealistic as databases in the real world are very large and so it would be too inefficient to be workable. So the role of the human within the discovery process is essential. Moreover, the measure of what is meant by “interesting to the user” is dependent on the user as well as the domain within which the Data Mining system is being used. In this paper we discuss the use of domain knowledge within Data Mining. We define three classes of domain knowledge: Hierarchical Generalization Trees (HG-Trees), Attribute Relationship Rules (AR-rules) and Environment-Based Constraints (EBC). We discuss how each one of these types of domain knowledge is incorporated into the discovery process within the EDM (Evidential Data Mining) framework for Data Mining proposed earlier by the authors [ANAN94], and in particular within the STRIP (Strong Rule Induction in Parallel) algorithm [ANAN95] implemented within the EDM framework. We highlight the advantages of using domain knowledge within the discovery process by providing results from the application of the STRIP algorithm in the actuarial domain.

98 citations


Cited by
Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Book
01 May 2012
TL;DR: Sentiment analysis and opinion mining, the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language, is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining.
Abstract: Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining. In fact, this research has spread outside of computer science to the management sciences and social sciences due to its importance to business and society as a whole. The growing importance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social networks. For the first time in human history, we now have a huge volume of opinionated data recorded in digital form for analysis. Sentiment analysis systems are being applied in almost every business and social domain because opinions are central to almost all human activities and are key influencers of our behaviors. Our beliefs and perceptions of reality, and the choices we make, are largely conditioned on how others see and evaluate the world. For this reason, when we need to make a decision we often seek out the opinions of others. This is true not only for individuals but also for organizations. This book is a comprehensive introductory and survey text. It covers all important topics and the latest developments in the field with over 400 references. It is suitable for students, researchers and practitioners who are interested in social media analysis in general and sentiment analysis in particular. Lecturers can readily use it in class for courses on natural language processing, social media analysis, text mining, and data mining. Lecture slides are also available online.

4,515 citations

Journal Article
TL;DR: An overview of recommender systems as well as collaborative filtering methods and algorithms is provided, which explains their evolution, provides an original classification for these systems, identifies areas of future implementation and develops certain areas selected for past, present or future importance.
Abstract: Recommender systems have developed in parallel with the web. They were initially based on demographic, content-based and collaborative filtering. Currently, these systems are incorporating social information. In the future, they will use implicit, local and personal information from the Internet of things. This article provides an overview of recommender systems as well as collaborative filtering methods and algorithms; it also explains their evolution, provides an original classification for these systems, identifies areas of future implementation and develops certain areas selected for past, present or future importance.

2,639 citations

Journal Article
TL;DR: In this paper, the authors review and synthesize the literature about service quality delivery through Web sites, describe what is known about the topic, and develop an agenda for needed research.
Abstract: Evidence exists that service quality delivery through Web sites is an essential strategy to success, possibly more important than low price and Web presence. To deliver superior service quality, managers of companies with Web presences must first understand how customers perceive and evaluate online customer service. Information on this topic is beginning to emerge from both academic and practitioner sources, but this information has not yet been examined as a whole. The goals of this article are to review and synthesize the literature about service quality delivery through Web sites, describe what is known about the topic, and develop an agenda for needed research.

2,520 citations