scispace - formally typeset
Search or ask a question
Topic

Metasearch engine

About: Metasearch engine is a research topic. Over the lifetime, 2590 publications have been published within this topic receiving 79273 citations.


Papers
More filters
Journal ArticleDOI
01 Apr 1998
TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.
Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

14,696 citations

Proceedings ArticleDOI
01 Apr 2001
TL;DR: A set of techniques for the rank aggregation problem is developed and compared to that of well-known methods, to design rank aggregation techniques that can be used to combat spam in Web searches.
Abstract: We consider the problem of combining ranking results from various sources. In the context of the Web, the main applications include building meta-search engines, combining ranking functions, selecting documents based on multiple criteria, and improving search precision through word associations. We develop a set of techniques for the rank aggregation problem and compare their performance to that of well-known methods. A primary goal of our work is to design rank aggregation techniques that can e ectively combat \spam," a serious problem in Web searches. Experiments show that our methods are simple, e cient, and e ective.

1,982 citations

Journal ArticleDOI
01 Sep 1999
TL;DR: It is shown that web users type in short queries, mostly look at the first 10 results only, and seldom modify the query, suggesting that traditional information retrieval techniques may not work well for answering web search requests.
Abstract: In this paper we present an analysis of an AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents almost 285 million user sessions, each an attempt to fill a single information need. We present an analysis of individual queries, query duplication, and query sessions. We also present results of a correlation analysis of the log entries, studying the interaction of terms within queries. Our data supports the conjecture that web users differ significantly from the user assumed in the standard information retrieval literature. Specifically, we show that web users type in short queries, mostly look at the first 10 results only, and seldom modify the query. This suggests that traditional information retrieval techniques may not work well for answering web search requests. The correlation analysis showed that the most highly correlated items are constituents of phrases. This result indicates it may be useful for search engines to consider search terms as parts of phrases even if the user did not explicitly specify them as such.

1,255 citations

Journal ArticleDOI
TL;DR: It is found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features, and the language of Web queries is distinctive.
Abstract: In studying actual Web searching by the public at large, we analyzed over one million Web queries by users of the Excite search engine. We found that most people use few search terms, few modified queries, view few Web pages, and rarely use advanced search features. A small number of search terms are used with high frequency, and a great many terms are unique; the language of Web queries is distinctive. Queries about recreation and entertainment rank highest. Findings are compared to data from two other large studies of Web queries. This study provides an insight into the public practices and choices in Web searching.

1,153 citations

Proceedings ArticleDOI
06 Aug 2006
TL;DR: In this paper, the authors show that incorporating implicit feedback can augment other features, improving the accuracy of a competitive web search ranking algorithm by as much as 31% relative to the original performance.
Abstract: We show that incorporating user behavior data can significantly improve ordering of top results in real web search setting. We examine alternatives for incorporating feedback into the ranking process and explore the contributions of user feedback compared to other common web search features. We report results of a large scale evaluation over 3,000 queries and 12 million user interactions with a popular web search engine. We show that incorporating implicit feedback can augment other features, improving the accuracy of a competitive web search ranking algorithms by as much as 31% relative to the original performance.

1,119 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
86% related
Web service
57.6K papers, 989K citations
82% related
Information system
107.5K papers, 1.8M citations
81% related
Server
79.5K papers, 1.4M citations
78% related
Scalability
50.9K papers, 931.6K citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202316
202218
202112
20209
201919
201830