Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have appeared within this topic, receiving 435,130 citations.

Papers

Proceedings Article

Passage retrieval revisited

Marcin Kaszkiel¹, Justin Zobel¹•Institutions (1)

RMIT University¹

01 Jul 1997

TL;DR: This paper compares the authors' scheme of arbitrary passage retrieval to several other document retrieval and passage retrieval methods and shows experimentally that, compared to these methods, ranking via fixed-length passages is robust and effective.

Abstract: Ranking based on passages addresses some of the shortcomings of whole-document ranking. It provides convenient units of text to return to the user, avoids the difficulties of comparing documents of different length, and enables identification of short blocks of relevant material amongst otherwise irrelevant text. In this paper we explore the potential of passage retrieval, based on an experimental evaluation of the ability of passages to identify relevant documents. We compare our scheme of arbitrary passage retrieval to several other document retrieval and passage retrieval methods; we show experimentally that, compared to these methods, ranking via fixed-length passages is robust and effective. Our experiments also show that, compared to whole-document ranking, ranking via fixed-length arbitrary passages significantly improves retrieval effectiveness, by 8% for TREC disks 2 and 4 and by 18%-37% for the Federal Register collection.

299 citations
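
The fixed-length passage idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tf-idf style window score, the function names (`build_idf`, `best_passage_score`, `rank_by_passage`), and the window length and stride are all assumptions chosen for illustration. Each document is scored by its best-scoring fixed-length window rather than by its full text.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed inverse document frequency over tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(1 + n / count) for term, count in df.items()}


def best_passage_score(doc, query, idf, length=5, stride=1):
    """Score every fixed-length window; return the best window's score.

    The score (term frequency times idf, summed over query terms) is an
    illustrative assumption, not the similarity measure used in the paper.
    With stride > 1 the final tokens of a document may fall outside every
    window, so stride=1 is the safe default here.
    """
    last_start = max(len(doc) - length, 0)
    best = 0.0
    for start in range(0, last_start + 1, stride):
        window = Counter(doc[start:start + length])
        score = sum(window[term] * idf.get(term, 0.0) for term in query)
        best = max(best, score)
    return best


def rank_by_passage(docs, query, length=5, stride=1):
    """Return document indices ordered by best passage score, highest first."""
    idf = build_idf(docs)
    scored = [
        (best_passage_score(doc, query, idf, length, stride), index)
        for index, doc in enumerate(docs)
    ]
    return [index for _, index in sorted(scored, key=lambda p: (-p[0], p[1]))]


docs = [
    "passage x x x x x x x x retrieval".split(),  # query terms far apart
    "y passage retrieval y".split(),              # query terms adjacent
]
ranking = rank_by_passage(docs, ["passage", "retrieval"], length=5)
```

In this toy collection both documents contain each query term exactly once, so whole-document term counts cannot separate them under this score, whereas the five-token window ranks the document with adjacent query terms first.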

Proceedings Article

Query type classification for web document retrieval

Inho Kang, Gil-Chang Kim

28 Jul 2003

TL;DR: This paper proposes a user query classification scheme that uses the difference of distribution, mutual information, the usage rate as anchor texts, and the POS information for the classification; the best performance was obtained when the scheme was used with the OKAPI scoring algorithm.

Abstract: The heterogeneous Web exacerbates IR problems, and short user queries make them worse. The contents of web documents are not enough to find good answer documents. Link information and URL information compensate for the insufficiencies of content information. However, a static combination of multiple sources of evidence may lower retrieval performance. We need different strategies to find target documents according to the query type. We can classify user queries into three categories: the topic relevance task, the homepage finding task, and the service finding task. In this paper, a user query classification scheme is proposed. This scheme uses the difference of distribution, mutual information, the usage rate as anchor texts, and the POS information for the classification. After classifying a user query, we apply different algorithms and information to obtain better results. For the topic relevance task, we emphasize the content information; for the homepage finding task, on the other hand, we emphasize the link information and the URL information. We obtained the best performance when our proposed classification method was used with the OKAPI scoring algorithm.

295 citations
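
The query-type-dependent strategy described in the abstract above can be sketched as a weighted combination of evidence whose weights depend on the query type. This is a minimal illustration under stated assumptions: the weight values, the evidence scores, and the names `WEIGHTS`, `combined_score`, and `rank_candidates` are hypothetical, the classifier that assigns the query type is not shown, and only the two task types whose strategies the abstract describes are covered.

```python
# Hypothetical per-type weights: content dominates for topic relevance,
# link and URL evidence dominate for homepage finding.
WEIGHTS = {
    "topic": {"content": 0.8, "link": 0.1, "url": 0.1},
    "homepage": {"content": 0.3, "link": 0.4, "url": 0.3},
}


def combined_score(evidence, query_type):
    """Linear combination of evidence scores using the weights for this type."""
    weights = WEIGHTS[query_type]
    return sum(weights[name] * evidence[name] for name in weights)


def rank_candidates(candidates, query_type):
    """Order candidate names by combined score, highest first."""
    return sorted(
        candidates,
        key=lambda name: -combined_score(candidates[name], query_type),
    )


# Made-up evidence scores in [0, 1] for two candidate pages.
candidates = {
    "article": {"content": 0.9, "link": 0.2, "url": 0.1},
    "home": {"content": 0.4, "link": 0.9, "url": 0.9},
}
topic_order = rank_candidates(candidates, "topic")
homepage_order = rank_candidates(candidates, "homepage")
```

With these made-up numbers the content-heavy page leads for a topic relevance query and the well-linked page leads for a homepage finding query, which is the behaviour a static weighting cannot provide for both query types at once.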

Proceedings Article

Combining document representations for known-item search

Paul Ogilvie¹, Jamie Callan¹•Institutions (1)

Carnegie Mellon University¹

28 Jul 2003

TL;DR: This paper investigates the pre-conditions for successful combination of document representations formed from structural markup for the task of known-item search, and presents a mixture-based language model to investigate several hypotheses.

Abstract: This paper investigates the pre-conditions for successful combination of document representations formed from structural markup for the task of known-item search. As this task is very similar to work in meta-search and data fusion, we adapt several hypotheses from those research areas and investigate them in this context. To investigate these hypotheses, we present a mixture-based language model and also examine many of the current meta-search algorithms. We find that compatible output from systems is important for successful combination of document representations. We also demonstrate that combining low performing document representations can improve performance, but not consistently. We find that the techniques best suited for this task are robust to the inclusion of poorly performing document representations. We also explore the role of variance of results across systems and its impact on the performance of fusion, with the surprising result that the correct documents have higher variance across document representations than highly ranking incorrect documents.

294 citations
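
A mixture-based language model over several document representations, as mentioned in the abstract above, can be sketched as follows. The abstract does not give the model's details, so this is an assumed form: each query term's probability is a weighted sum of unigram estimates from each representation, interpolated with a collection-wide background model. The representation names, the weights, `bg_weight`, and the function names are hypothetical.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram model for a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()} if total else {}


def mixture_log_likelihood(query, representations, weights, background, bg_weight=0.1):
    """Log query likelihood under a mixture of representation models.

    representations maps a name (for example "title") to a unigram model,
    weights maps the same names to mixture weights that sum to 1, and the
    background model keeps probabilities non-zero (an assumed smoothing choice).
    """
    log_p = 0.0
    for term in query:
        mixed = sum(
            weights[name] * model.get(term, 0.0)
            for name, model in representations.items()
        )
        p = (1 - bg_weight) * mixed + bg_weight * background.get(term, 1e-9)
        log_p += math.log(p)
    return log_p


doc_a = {"title": ["ranking", "models"], "body": ["text", "about", "ranking"]}
doc_b = {"title": ["cooking"], "body": ["ranking", "recipes", "here", "now"]}
all_tokens = [t for doc in (doc_a, doc_b) for tokens in doc.values() for t in tokens]
background = unigram(all_tokens)
weights = {"title": 0.5, "body": 0.5}

score_a = mixture_log_likelihood(
    ["ranking"], {k: unigram(v) for k, v in doc_a.items()}, weights, background
)
score_b = mixture_log_likelihood(
    ["ranking"], {k: unigram(v) for k, v in doc_b.items()}, weights, background
)
```

Here the document that mentions the query term in both its title and body scores higher than the one that mentions it only in the body.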

Proceedings Article

An Empirical Study on Learning to Rank of Tweets

Yajuan Duan¹, Long Jiang², Tao Qin², Ming Zhou², Heung-Yeung Shum²•Institutions (2)

University of Science and Technology of China¹, Microsoft²

23 Aug 2010

TL;DR: This paper proposes a new ranking strategy which uses not only the content relevance of a tweet, but also the account authority and tweet-specific features such as whether a URL link is included in the tweet.

Abstract: Twitter, as one of the most popular micro-blogging services, provides large quantities of fresh information including real-time news, comments, conversation, pointless babble and advertisements. Twitter presents tweets in chronological order. Recently, Twitter introduced a new ranking strategy that considers popularity of tweets in terms of number of retweets. This ranking method, however, has not taken into account content relevance or the Twitter account. Therefore, a large number of pointless tweets inevitably flood the relevant tweets. This paper proposes a new ranking strategy which uses not only the content relevance of a tweet, but also the account authority and tweet-specific features such as whether a URL link is included in the tweet. We employ learning to rank algorithms to determine the best set of features with a series of experiments. The experiments demonstrate that whether a tweet contains a URL, the length of the tweet, and account authority form the best combination of features.

293 citations
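
The three features the abstract above singles out (URL presence, tweet length, account authority) can be extracted with a short Python sketch. This covers feature extraction only; the learning to rank step is not shown. The function name `tweet_features`, the URL pattern, and the use of a follower count as a stand-in for account authority are assumptions for illustration, not the paper's definitions.

```python
import re

# Simple pattern for http(s) links; an illustrative assumption.
URL_PATTERN = re.compile(r"https?://\S+")


def tweet_features(text, follower_count):
    """Build a feature dictionary for one tweet.

    follower_count is a hypothetical proxy for account authority.
    """
    return {
        "has_url": 1.0 if URL_PATTERN.search(text) else 0.0,
        "length": float(len(text)),
        "authority": float(follower_count),
    }


with_link = tweet_features("New paper https://example.org/p", 1200)
without_link = tweet_features("just had lunch", 15)
```

Feature dictionaries of this kind would then be passed, together with relevance labels, to whichever learning to rank algorithm is being evaluated.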

Patent

Method and apparatus for retrieving documents based on information other than document content

Douglass Russell Judd, Paul Andre Gauthier, J. Eric Baldeschwieler

03 Nov 1998

TL;DR: A method and apparatus are provided for retrieving documents from a collection, such as the Web, based on information other than the contents of the desired document.

Abstract: A method and apparatus are provided for retrieving documents from a collection of documents based on information other than the contents of a desired document. The collection of documents, which may be a hypertext system or documents available via the World Wide Web, is indexed. In one embodiment, an indexing process of a search engine receives one or more specifications that identify documents, or document locations, and non-content information such as a tag word or code word. The indexing process searches the index to identify all documents in the index that match one or more of the specifications. If a match is found, the tag word is added to the index, and information about the matching document is stored in the index in association with the tag word. A search query is submitted to the search engine. The search query is automatically modified to add a reference to the tag word, such as a query term that will exclude any index entry for a document associated with the tag word. The search is executed against the index, and a set of search results is generated. Accordingly, the search results automatically exclude all documents associated with the tag word. These techniques may be used, for example, to implement a Web search service that produces more accurate search results or that prevents certain documents, such as pornographic materials, from appearing in search results.

292 citations
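
The tag-word mechanism described in the patent abstract above can be sketched in Python. This is a toy illustration, not the patented system: the class `TaggedIndex`, its method names, the use of URL glob patterns as the "specifications", and the tag word `__blocked__` are all assumptions. Documents matching a specification receive a tag word in the index, and the search step excludes every document associated with that tag.

```python
from fnmatch import fnmatch


class TaggedIndex:
    """Tiny inverted index that supports non-content tag words."""

    def __init__(self):
        self.postings = {}  # term or tag word -> set of document URLs
        self.docs = set()

    def add_document(self, url, text):
        """Index a document by its whitespace-separated, lowercased terms."""
        self.docs.add(url)
        for term in text.lower().split():
            self.postings.setdefault(term, set()).add(url)

    def apply_tag(self, url_pattern, tag_word):
        """Associate tag_word with every indexed URL matching the pattern."""
        matches = {url for url in self.docs if fnmatch(url, url_pattern)}
        if matches:
            self.postings.setdefault(tag_word, set()).update(matches)
        return matches

    def search(self, query, excluded_tags=()):
        """Return URLs containing all query terms, minus tagged documents."""
        terms = query.lower().split()
        if not terms:
            return []
        results = set.intersection(*(self.postings.get(t, set()) for t in terms))
        for tag in excluded_tags:
            results -= self.postings.get(tag, set())
        return sorted(results)


index = TaggedIndex()
index.add_document("http://good.example/a", "ranking methods")
index.add_document("http://bad.example/b", "ranking spam")
tagged = index.apply_tag("http://bad.example/*", "__blocked__")
unfiltered = index.search("ranking")
filtered = index.search("ranking", excluded_tags=["__blocked__"])
```

The tag word lives in the same index as ordinary terms, so it should be a string that cannot occur in document text; the exclusion is applied by modifying the query at search time rather than by deleting documents from the index.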

Network Information

Performance

Metrics

Papers: 30,908

Citations: 494,240

No. of papers in the topic in previous years
Year	Papers
2024	1
2023	3,112
2022	6,541
2021	1,105
2020	1,082
2019	1,168
