Proceedings ArticleDOI

Web Search Personalization by User Profiling

TL;DR: The mathematics behind these 'link analysis algorithms' is analyzed, along with their effective use in e-commerce applications where they could be used for displaying 'personalized information'.
Abstract: The World Wide Web is growing at a rate of about a million pages per day, making it tougher for search engines to extract relevant information for their users. Earlier search engines used simple indexing techniques to search for keywords in websites and gave more weight to pages with a higher frequency of keyword occurrences. This technique was easy to game: authors stuffed meta-tags with popular search terms their pages did not actually cover, which eventually made meta-tags useless to search engines. Another widely used trick was to repeat popular search terms in invisible text (white text on a white background) to fool engines. These abuses called for a set of algorithms that would sort results by an unbiased parameter. The currently employed link analysis algorithms make use of the structure present in 'hyperlinks'; results are sorted and displayed according to a 'popularity index' derived from the pages linking to each result. In this work, we analyze the mathematics behind these 'link analysis algorithms' and their effective use in e-commerce applications, where they could be used for displaying 'personalized information'.
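The 'popularity index' the abstract alludes to is, in PageRank's case, the stationary distribution of a random surfer over the link graph. Below is a minimal, illustrative sketch of PageRank power iteration in Python; the toy graph, damping factor, and tolerance are assumptions for demonstration, not values from the paper.

```python
# Minimal PageRank power iteration on a toy link graph.
# The graph, damping factor, and tolerance are illustrative choices.

def pagerank(links, damping=0.85, tol=1e-8, max_iter=100):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank uniformly over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        if sum(abs(new_rank[p] - rank[p]) for p in pages) < tol:
            return new_rank
        rank = new_rank
    return rank

toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
print(pagerank(toy_web))
```

Each page's score ends up proportional to how much rank flows to it from its in-links, which is exactly the kind of unbiased, structure-derived parameter the abstract calls for.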
Citations

Posted Content
TL;DR: The main objective of this paper is to explore the field of personalization in the context of user profiling and to make researchers aware of user profiling.
Abstract: The personalization of information has taken recommender systems to a very high level. With personalization, these systems can generate user-specific recommendations accurately and efficiently. User profiling supports personalization: information retrieval is tailored to a scenario in which a separate profile is maintained for each individual user. The main objective of this paper is to explore the field of personalization in the context of user profiling and to make researchers aware of user profiling. Various trends, techniques, and applications that serve this objective are discussed in the paper.

55 citations

Journal ArticleDOI
TL;DR: This paper aims at finding, extracting, and integrating keyword-based information from various web sources to generate a structured profile, and performs experiments on the profiled information to generate knowledge from it.
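As a rough illustration of the kind of keyword-based profiling the TL;DR describes, the sketch below aggregates term frequencies from text gathered from a user's (hypothetical) web sources into a structured profile. The tokenizer, stop-word list, and source texts are simplistic assumptions, not the paper's method.

```python
# A hedged sketch of keyword-based profiling: aggregate term frequencies
# from several (hypothetical) web sources into one structured profile.

import re
from collections import Counter

STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for"}

def extract_keywords(text):
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]

def build_profile(documents, top_k=10):
    """documents: iterable of raw text strings from a user's web sources."""
    counts = Counter()
    for doc in documents:
        counts.update(extract_keywords(doc))
    return dict(counts.most_common(top_k))

sources = [
    "Posts about machine learning and neural networks",
    "A blog on search engines and link analysis",
]
print(build_profile(sources))
```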

18 citations

Proceedings ArticleDOI
18 Mar 2012
TL;DR: A new architecture for user personalization is designed which combines social network data and context data: it aggregates a user's preference data from various social networking services and builds a centralized user profile accessible through public Web services.
Abstract: In recommender systems, social networks are considered as a trusted source for user interests. In addition, user context can enhance users' decision making. In this paper, we design a new architecture for user personalization which combines both social network data and context data. Our system aggregates a user's preference data from various social networking services and then builds a centralized user profile which is accessible through public Web services. We also collect user's contextual information and store it in a central space which is also accessible through public Web services. Based on Service Oriented Architecture, recommender systems can flexibly utilize users' preference information and context to provide more desirable recommendations. We present how our system can integrate both types of data together and how they can be mapped in a meaningful way.
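A minimal sketch of the aggregation step the abstract describes: preference data from several social networking services is merged with contextual data into one centralized profile. The service names, field layout, and additive weighting below are hypothetical, and the public Web-service layer the paper exposes is omitted.

```python
# Merge per-service preference data and contextual data into one
# centralized user profile. Names, fields, and weights are hypothetical.

from collections import defaultdict

def aggregate_preferences(service_feeds):
    """service_feeds: dict mapping service name -> {interest: weight}."""
    profile = defaultdict(float)
    for service, prefs in service_feeds.items():
        for interest, weight in prefs.items():
            profile[interest] += weight
    return dict(profile)

def build_user_profile(user_id, service_feeds, context):
    return {
        "user_id": user_id,
        "preferences": aggregate_preferences(service_feeds),
        "context": context,  # e.g. location, device, time of day
    }

feeds = {
    "social_service_a": {"music": 0.8, "travel": 0.3},
    "social_service_b": {"music": 0.5, "sports": 0.6},
}
profile = build_user_profile("u42", feeds,
                             {"location": "office", "device": "mobile"})
print(profile)
```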

14 citations

Proceedings ArticleDOI
01 Dec 2009
TL;DR: This paper introduces a Personal Search Engine that provides results relevant to the user's interest, relying on the importance of the document's category to the user, the user's interest-based page rank, and the document's degree of relevance to ensure relevant and accurate results.
Abstract: With the tremendous growth of the web and the diversity of its contents, users need specialized, accurate results that depend on their behavior and vary according to their interests. In this paper, we introduce a Personal Search Engine which provides results relevant to the user's interest. Our search engine depends on three factors to ensure relevant and accurate results. The first factor is the degree of importance of the document's category to the user. The second factor is the user's interest page rank, which depends on the user's browsing of the page. The third factor is the degree of relevance of the document.
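One plausible way to combine the three factors into a single ranking score is a weighted linear combination, sketched below. The weights and the normalization assumption are illustrative; the paper does not necessarily combine the factors this way.

```python
# Combine the three factors the abstract lists into one ranking score.
# The linear combination and weights are assumptions, not the paper's formula.

def personal_score(category_importance, interest_page_rank, doc_relevance,
                   weights=(0.3, 0.3, 0.4)):
    """All three factors assumed normalized to [0, 1]."""
    w1, w2, w3 = weights
    return (w1 * category_importance
            + w2 * interest_page_rank
            + w3 * doc_relevance)

docs = [
    {"id": "d1", "cat_imp": 0.9, "int_pr": 0.4, "rel": 0.7},
    {"id": "d2", "cat_imp": 0.5, "int_pr": 0.8, "rel": 0.6},
]
ranked = sorted(
    docs,
    key=lambda d: personal_score(d["cat_imp"], d["int_pr"], d["rel"]),
    reverse=True,
)
print([d["id"] for d in ranked])
```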

9 citations

References
Journal ArticleDOI
01 Apr 1998
TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

14,696 citations


Additional excerpts

  • ...Kleinberg's HITS algorithm [8] and Google's PageRank [2, 3, 6, 7] algorithm are eigenvector-based methods....


Journal Article
TL;DR: Google as discussed by the authors is a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems.

13,327 citations

Journal ArticleDOI
Jon Kleinberg1
TL;DR: This work proposes and tests an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure; the formulation has connections to the eigenvectors of certain matrices associated with the link graph.
Abstract: The network structure of a hyperlinked environment can be a rich source of information about the content of the environment, provided we have effective means for understanding it. We develop a set of algorithmic tools for extracting information from the link structures of such environments, and report on experiments that demonstrate their effectiveness in a variety of contexts on the World Wide Web. The central issue we address within our framework is the distillation of broad search topics, through the discovery of "authoritative" information sources on such topics. We propose and test an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of "hub pages" that join them together in the link structure. Our formulation has connections to the eigenvectors of certain matrices associated with the link graph; these connections in turn motivate additional heuristics for link-based analysis.
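The mutually reinforcing hub/authority computation the abstract describes can be sketched in a few lines: authority scores are summed from in-linking hubs, hub scores from out-linked authorities, with normalization each round. The toy graph and iteration count below are illustrative assumptions.

```python
# A minimal sketch of the HITS hub/authority iteration on a toy graph.

import math

def hits(links, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of hub scores of pages linking in.
        for p in pages:
            auth[p] = sum(hub[q] for q in pages if p in links[q])
        # Hub score: sum of authority scores of pages linked to.
        for p in pages:
            hub[p] = sum(auth[q] for q in links[p])
        # Normalize so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in auth.values()))
        h_norm = math.sqrt(sum(v * v for v in hub.values()))
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
hub, auth = hits(toy_web)
print("authorities:", auth)
```

The fixed points of this iteration are the principal eigenvectors of A^T A and A A^T for the adjacency matrix A, which is the eigenvector connection the abstract mentions.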

8,328 citations


"Web Search Personalization by User ..." refers methods in this paper

  • ...Kleinberg's HITS algorithm [8] and Google's PageRank [2, 3, 6, 7] algorithm are eigenvector-based methods....


Journal ArticleDOI
Thomas Hofmann1
TL;DR: This paper proposes to make use of a temperature-controlled version of the Expectation Maximization algorithm for model fitting, which has shown excellent performance in practice and results in a more principled approach with a solid foundation in statistical inference.
Abstract: This paper presents a novel statistical method for factor analysis of binary and count data which is closely related to a technique known as Latent Semantic Analysis. In contrast to the latter method, which stems from linear algebra and performs a Singular Value Decomposition of co-occurrence tables, the proposed technique uses a generative latent class model to perform a probabilistic mixture decomposition. This results in a more principled approach with a solid foundation in statistical inference. More precisely, we propose to make use of a temperature-controlled version of the Expectation Maximization algorithm for model fitting, which has shown excellent performance in practice. Probabilistic Latent Semantic Analysis has many applications, most prominently in information retrieval, natural language processing, machine learning from text, and in related areas. The paper presents perplexity results for different types of text and linguistic data collections and discusses an application in automated document indexing. The experiments indicate substantial and consistent improvements of the probabilistic method over standard Latent Semantic Analysis.
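A compact, illustrative sketch of pLSA fitted with tempered EM follows: the E-step raises the posterior term to a power beta < 1 (the "temperature" control), and the M-step re-estimates P(w|z) and P(z|d) from expected counts. The toy term-count matrix, topic count, and beta value are assumptions, not the paper's setup.

```python
# pLSA via tempered EM on a toy term-count matrix (illustrative only).

import numpy as np

rng = np.random.default_rng(0)

def plsa(counts, n_topics=2, n_iter=50, beta=0.9):
    """counts: (n_docs, n_words) term-count matrix."""
    n_docs, n_words = counts.shape
    p_w_z = rng.random((n_topics, n_words))            # P(w|z)
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))             # P(z|d)
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Tempered E-step: P(z|d,w) proportional to (P(w|z) P(z|d))^beta.
        joint = (p_z_d[:, :, None] * p_w_z[None, :, :]) ** beta   # (d, z, w)
        p_z_dw = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * p_z_dw                     # (d, z, w)
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True)
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    return p_w_z, p_z_d

toy_counts = np.array([[4, 2, 0, 0], [3, 3, 0, 1], [0, 0, 5, 2], [0, 1, 4, 3]])
p_w_z, p_z_d = plsa(toy_counts)
print(np.round(p_w_z, 2))
```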

2,574 citations

Proceedings ArticleDOI
07 May 2002
TL;DR: A set of PageRank vectors are proposed, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic, and are shown to generate more accurate rankings than with a single, generic PageRank vector.
Abstract: In the original PageRank algorithm for improving the ranking of search-query results, a single PageRank vector is computed, using the link structure of the Web, to capture the relative "importance" of Web pages, independent of any particular search query. To yield more accurate search results, we propose computing a set of PageRank vectors, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic. By using these (precomputed) biased PageRank vectors to generate query-specific importance scores for pages at query time, we show that we can generate more accurate rankings than with a single, generic PageRank vector. For ordinary keyword search queries, we compute the topic-sensitive PageRank scores for pages satisfying the query using the topic of the query keywords. For searches done in context (e.g., when the search query is performed by highlighting words in a Web page), we compute the topic-sensitive PageRank scores using the topic of the context in which the query appeared.
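The bias enters through the teleportation vector: instead of jumping uniformly to any page, the random surfer jumps only to pages in a topic's representative set, yielding one precomputed PageRank vector per topic. The sketch below illustrates this; the graph, topic set, and damping factor are assumptions for demonstration.

```python
# Topic-biased PageRank: teleportation is concentrated on a topic's pages.
# Graph, topic set, and damping factor are illustrative choices.

def topic_pagerank(links, topic_pages, damping=0.85, n_iter=100):
    pages = list(links)
    n = len(pages)
    # Teleportation distribution concentrated on the topic's pages.
    v = {p: (1.0 / len(topic_pages) if p in topic_pages else 0.0)
         for p in pages}
    rank = {p: 1.0 / n for p in pages}
    for _ in range(n_iter):
        new_rank = {p: (1.0 - damping) * v[p] for p in pages}
        for page, outlinks in links.items():
            targets = outlinks if outlinks else pages  # dangling -> everywhere
            share = damping * rank[page] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank

toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
# One precomputed vector per representative topic, as the abstract suggests;
# at query time the scores for the query's topic(s) are used.
print(topic_pagerank(toy_web, topic_pages={"A", "B"}))
```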

1,765 citations