Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Journal Article
TL;DR: Subjective as well as quantitative evaluations show that the algorithm outperforms keyword-based cluster-labeling algorithms, is capable of accurately discovering the topic, and often ranks it in the top one or two extracted keyphrases.
Abstract: Discovering the topic of a large set of text documents using relevant keyphrases is usually regarded as a very tedious task if done by hand. Automatic keyphrase extraction from multi-document data sets or text clusters provides a very compact summary of the contents of the clusters, which often helps in locating information easily. We introduce an algorithm for topic discovery using keyphrase extraction from multi-document sets and clusters based on frequent and significant shared phrases between documents. The keyphrases extracted by the algorithm are highly accurate and fit the cluster topic. The algorithm is independent of the domain of the documents. Subjective as well as quantitative evaluations show that the algorithm outperforms keyword-based cluster-labeling algorithms, is capable of accurately discovering the topic, and often ranks it in the top one or two extracted keyphrases.

136 citations
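
The idea of extracting keyphrases from phrases shared between documents, as described in the abstract above, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how "significant" phrases are scored, so the sketch below only counts how many documents contain each word n-gram. The function names `ngrams` and `shared_phrases` and the parameters `min_n`, `max_n`, and `min_docs` are hypothetical.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous word n-grams of a token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def shared_phrases(docs, min_n=2, max_n=3, min_docs=2):
    """Rank phrases by the number of documents that contain them.

    Assumption: document frequency stands in for the paper's
    unspecified notion of "frequent and significant" phrases.
    """
    doc_freq = Counter()
    for doc in docs:
        tokens = re.findall(r"[a-z0-9]+", doc.lower())
        seen = set()  # count each phrase at most once per document
        for n in range(min_n, max_n + 1):
            seen.update(ngrams(tokens, n))
        doc_freq.update(seen)
    candidates = [(p, df) for p, df in doc_freq.items() if df >= min_docs]
    # Most widely shared first; prefer longer phrases, then alphabetical.
    candidates.sort(key=lambda x: (-x[1], -len(x[0].split()), x[0]))
    return candidates


docs = [
    "query expansion improves retrieval",
    "we study query expansion methods",
    "image segmentation",
]
print(shared_phrases(docs))
```

Counting each phrase once per document keeps a single long document from dominating the ranking, which is one plausible reading of "shared between documents".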

Patent
28 Jun 2005
TL;DR: A method for determining a facet from a web page to be indexed for searching, indexing the web page with the facet in a search index, augmenting a user's web search query with the facet, searching the search index for the facet-augmented query, and presenting to the user, as a search result, the web page based on a correlation of the query and the facet.
Abstract: A method includes determining a facet from a web page to be indexed for searching, indexing the web page with the facet in a search index, augmenting a user's web search query with the facet, searching the search index for the facet-augmented query, and presenting to the user, as a search result, the web page based on a correlation of the query and the facet. Another method includes receiving a user's web search query, augmenting the query with a facet, and transmitting the augmented query to a search engine. A further method includes agreeing with an advertiser to associate a web page with a facet, and indexing the web page with the facet in a search index.

135 citations
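
The indexing and query-augmentation steps in the patent abstract above can be sketched with a toy inverted index. This is an illustrative assumption, not the patented method: the abstract does not say how a facet is determined or how the correlation between query and facet is computed, so the facet is passed in by hand and matching is a plain intersection of posting sets. The class `FacetIndex`, the `facet:` token prefix, and the example URLs are hypothetical.

```python
from collections import defaultdict


class FacetIndex:
    """Toy inverted index that stores a facet alongside page terms."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add_page(self, url, text, facet):
        # Index ordinary terms, then the facet as a special token.
        for token in text.lower().split():
            self._postings[token].add(url)
        self._postings["facet:" + facet].add(url)

    def search(self, query, facet):
        # Augment the query with the facet token, then require every
        # term (assumption: strict AND matching) to appear on the page.
        terms = query.lower().split() + ["facet:" + facet]
        results = None
        for term in terms:
            pages = self._postings.get(term, set())
            results = pages if results is None else results & pages
        return sorted(results or [])


index = FacetIndex()
index.add_page("a.example/jaguar-car", "jaguar top speed", "automobiles")
index.add_page("b.example/jaguar-cat", "jaguar habitat speed", "animals")
print(index.search("jaguar speed", "animals"))
```

Here the same query text returns different pages depending on the facet it is augmented with, which is the disambiguating effect the abstract describes.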

Proceedings ArticleDOI
12 Sep 2016
TL;DR: This paper proposes to use word embeddings to incorporate and weight terms that do not occur in the query, but are semantically related to the query terms, and develops an embedding-based relevance model, an extension of the effective and robust relevance model approach.
Abstract: Word embeddings, which are low-dimensional vector representations of vocabulary terms that capture the semantic similarity between them, have recently been shown to achieve impressive performance in many natural language processing tasks. The use of word embeddings in information retrieval, however, has only begun to be studied. In this paper, we explore the use of word embeddings to enhance the accuracy of query language models in the ad-hoc retrieval task. To this end, we propose to use word embeddings to incorporate and weight terms that do not occur in the query, but are semantically related to the query terms. We describe two embedding-based query expansion models with different assumptions. Since pseudo-relevance feedback methods that use the top retrieved documents to update the original query model are well-known to be effective, we also develop an embedding-based relevance model, an extension of the effective and robust relevance model approach. In these models, we transform the similarity values obtained by the widely-used cosine similarity with a sigmoid function to have more discriminative semantic similarity values. We evaluate our proposed methods using three TREC newswire and web collections. The experimental results demonstrate that the embedding-based methods significantly outperform competitive baselines in most cases. The embedding-based methods are also shown to be more robust than the baselines.

135 citations
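
The embedding-based expansion described in the abstract above (weighting terms outside the query by their similarity to query terms, with cosine similarity passed through a sigmoid) can be sketched as follows. This is a rough illustration under stated assumptions rather than either of the paper's two expansion models: averaging the transformed similarities over query terms, the sigmoid parameters `a` and `c`, and the function name `expand_query` are all hypothetical choices, and the toy vectors are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sigmoid(x, a=10.0, c=0.8):
    """Sharpen similarities: values well below c are pushed toward 0."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def expand_query(query_terms, embeddings, top_k=2, a=10.0, c=0.8):
    """Return up to top_k expansion terms with normalized weights."""
    scores = {}
    for term, vec in embeddings.items():
        if term in query_terms:
            continue
        sims = [
            sigmoid(cosine(vec, embeddings[q]), a, c)
            for q in query_terms
            if q in embeddings
        ]
        if sims:
            # Assumption: average over query terms.
            scores[term] = sum(sims) / len(sims)
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]
    return [(term, score / total) for term, score in ranked]


embeddings = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
print(expand_query(["car"], embeddings, top_k=1))
```

The sigmoid makes the weights more discriminative than raw cosine values: with these toy vectors, "banana" has cosine 0 with "car" and ends up with a weight close to zero instead of sitting partway along a linear scale.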

Book ChapterDOI
01 Jan 2009
TL;DR: This paper proposes an approach which takes into account not only the amount of information related to an alternative (expressed by a distance from an ideal positive alternative) but also the reliability of information represented by an alternative meant as how sure the information is.
Abstract: In this paper we discuss the ranking of alternatives represented by elements of Atanassov’s intuitionistic fuzzy sets, to be called A-IFSs, for short. That is, alternatives are elements of the universe of discourse with a degree of membership and a degree of non-membership assigned. First, we show disadvantages of some approaches known from the literature, including a straightforward method based on the calculation of distances from the ideal positive alternative which can be viewed as a counterpart of the approach in the traditional fuzzy setting. Instead, we propose an approach which takes into account not only the amount of information related to an alternative (expressed by a distance from an ideal positive alternative) but also the reliability of information represented by an alternative meant as how sure the information is.

135 citations
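
The ranking idea in the abstract above combines two ingredients: the distance of an alternative from the ideal positive alternative, and how reliable (how sure) its information is. A minimal sketch follows. The abstract does not give a formula, so the score used here, half of (1 + hesitation margin) times the normalized Hamming distance to the ideal (1, 0, 0), is an assumption about one measure of this kind and may differ from the chapter's exact definition. The names `hesitation`, `distance_to_ideal`, `score`, and `rank` are hypothetical.

```python
def hesitation(mu, nu):
    """Hesitation margin pi = 1 - mu - nu of an A-IFS element."""
    if mu < 0 or nu < 0 or mu + nu > 1 + 1e-12:
        raise ValueError("need mu >= 0, nu >= 0 and mu + nu <= 1")
    return 1.0 - mu - nu


def distance_to_ideal(mu, nu):
    """Normalized Hamming distance to the ideal alternative (1, 0, 0)."""
    pi = hesitation(mu, nu)
    return 0.5 * (abs(1.0 - mu) + abs(0.0 - nu) + abs(0.0 - pi))


def score(mu, nu):
    """Assumed score: lower is better.

    The factor 0.5 * (1 + pi) penalizes less reliable information.
    """
    return 0.5 * (1.0 + hesitation(mu, nu)) * distance_to_ideal(mu, nu)


def rank(alternatives):
    """Sort a {name: (mu, nu)} mapping from best to worst."""
    return sorted(alternatives, key=lambda name: score(*alternatives[name]))


alternatives = {"A": (0.6, 0.3), "B": (0.6, 0.1), "C": (0.8, 0.1)}
print(rank(alternatives))
```

Alternatives A and B have the same membership degree and therefore the same distance to the ideal, so distance alone cannot separate them. Under the assumed score, A ranks ahead of B because its hesitation margin is smaller, which is the kind of distinction the abstract argues a distance-only method misses.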


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Ontology (information science)
57K papers, 869.1K citations
82% related
Graph (abstract data type)
69.9K papers, 1.2M citations
82% related
Feature learning
15.5K papers, 684.7K citations
81% related
Supervised learning
20.8K papers, 710.5K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168