Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have appeared within this topic, receiving 435,130 citations.


Papers
Journal ArticleDOI
TL;DR: Compared to passage ranking with adaptations of current document ranking algorithms, the new “DO-TOS” passage-ranking algorithm requires only a fraction of the resources, at the cost of a small loss of effectiveness.
Abstract: Queries to text collections are resolved by ranking the documents in the collection and returning the highest-scoring documents to the user. An alternative retrieval method is to rank passages, that is, short fragments of documents, a strategy that can improve effectiveness and identify relevant material in documents that are too large for users to consider as a whole. However, ranking of passages can considerably increase retrieval costs. In this article we explore alternative query evaluation techniques, and develop new techniques for evaluating queries on passages. We show experimentally that, appropriately implemented, effective passage retrieval is practical in limited memory on a desktop machine. Compared to passage ranking with adaptations of current document ranking algorithms, our new “DO-TOS” passage-ranking algorithm requires only a fraction of the resources, at the cost of a small loss of effectiveness.

98 citations

Patent
28 Feb 2006
TL;DR: This patent discloses a system for and a method of using user-entered information to return more meaningful information in response to Internet search queries; the method includes managing a database and displaying search results from the database.
Abstract: A system for and a method of using user-entered information to return more meaningful information in response to Internet search queries are disclosed. A method in accordance with the present invention comprises managing a database in response to multiple user inputs and displaying search results from the database in response to a search query. The search results include a results list and supplemental data related to the search query. Managing the database includes, among other things, re-ranking elements in the results list, storing information related to relevancies of elements in the results list, blocking a link in the results list, storing links to documents related to the search query, or any combination of these. The supplemental data include descriptions of or indices to one or more concepts related to the search query.

98 citations

Book ChapterDOI
10 Apr 2006
TL;DR: In this article, a probabilistic user-item relevance model based on the classic probability ranking principle was proposed to re-formulate the problem of log-based collaborative filtering.
Abstract: Implicit acquisition of user preferences makes log-based collaborative filtering favorable in practice to accomplish recommendations. In this paper, we follow a formal approach in text retrieval to re-formulate the problem. Based on the classic probability ranking principle, we propose a probabilistic user-item relevance model. Under this formal model, we show that user-based and item-based approaches are only two different factorizations with different independence assumptions. Moreover, we show that smoothing is an important aspect to estimate the parameters of the models due to data sparsity. By adding linear interpolation smoothing, the proposed model gives a probabilistic justification of using TF×IDF-like item ranking in collaborative filtering. Besides providing insight into the problem of collaborative filtering, we also show experiments in which the proposed method provides a better recommendation performance on a music play-list data set.

98 citations
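
The linear interpolation smoothing mentioned in the abstract above can be illustrated with a minimal sketch. This is the generic interpolation form (a user-specific estimate mixed with a collection-wide background estimate), not the paper's actual model; the function name `smoothed_prob`, the count dictionaries, and the weight `lam` are all assumptions made for illustration.

```python
def smoothed_prob(item, user_counts, background_counts, lam=0.5):
    """Linearly interpolate a user-level estimate with a background estimate.

    user_counts and background_counts are hypothetical dicts mapping an item
    to how often it was observed (for one user, and over the whole log).
    lam is the assumed weight given to the background model.
    """
    user_total = sum(user_counts.values())
    bg_total = sum(background_counts.values())
    # Maximum-likelihood estimates; fall back to 0.0 when there is no data.
    p_user = user_counts.get(item, 0) / user_total if user_total else 0.0
    p_bg = background_counts.get(item, 0) / bg_total if bg_total else 0.0
    return (1 - lam) * p_user + lam * p_bg


user = {"a": 3, "b": 1}
background = {"a": 10, "b": 10, "c": 20}
# Item "c" was never played by this user, yet it keeps a non-zero score
# because the background model contributes to the estimate.
print(smoothed_prob("c", user, background, lam=0.5))
```

The point of the sketch is only that items unseen for a sparse user profile do not collapse to zero probability once the background term is mixed in.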

Journal ArticleDOI
TL;DR: This article proposes to use lightweight data summaries for determining relevant sources during query evaluation, and compares several data structures and hash functions with respect to their suitability for building such summaries, stressing benefits for queries that contain joins and require ranking of results and sources.
Abstract: A growing amount of Linked Data--graph-structured data accessible at sources distributed across the Web--enables advanced data integration and decision-making applications. Typical systems operating on Linked Data collect (crawl) and pre-process (index) large amounts of data, and evaluate queries against a centralised repository. Given that crawling and indexing are time-consuming operations, the data in the centralised index may be out of date at query execution time. An ideal query answering system for querying Linked Data live should return current answers in a reasonable amount of time, even on corpora as large as the Web. In such a live query system source selection--determining which sources contribute answers to a query--is a crucial step. In this article we propose to use lightweight data summaries for determining relevant sources during query evaluation. We compare several data structures and hash functions with respect to their suitability for building such summaries, stressing benefits for queries that contain joins and require ranking of results and sources. We elaborate on join variants, join ordering and ranking. We analyse the different approaches theoretically and provide results of an extensive experimental evaluation.

98 citations

Proceedings ArticleDOI
28 Jul 2013
TL;DR: The authors' L2R method was trained to learn the answer rating, based on the feedback users give to answers in Q&A forums, and was able to outperform a state-of-the-art baseline with gains of up to 21% in NDCG, a metric used to evaluate rankings.
Abstract: Collaborative web sites, such as collaborative encyclopedias, blogs, and forums, are characterized by a loose edit control, which allows anyone to freely edit their content. As a consequence, the quality of this content raises much concern. To deal with this, many sites adopt manual quality control mechanisms. However, given their size and change rate, manual assessment strategies do not scale and content that is new or unpopular is seldom reviewed. This has a negative impact on the many services provided, such as ranking and recommendation. To tackle this problem, we propose a learning to rank (L2R) approach for ranking answers in QA we also conducted a comprehensive study of the features, showing that (ii) review and user features are the most important in the QA and (iii) the best set of new features we proposed was able to yield the best quality rankings.

98 citations
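
NDCG, the ranking metric named in the entry above, can be sketched as follows. This is a minimal version using linear gain and a log2 rank discount; the entry does not state which NDCG variant the authors used (some formulations use an exponential gain of 2^rel - 1), so the function names and the choice of gain here are assumptions.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of relevance grades listed in ranked order."""
    rels = relevances if k is None else relevances[:k]
    # The item at 0-based position `rank` is discounted by log2(rank + 2),
    # so the top result is divided by log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(rels))


def ndcg(relevances, k=None):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


# A ranking already in descending relevance order scores 1.0;
# putting the best answer last lowers the score.
print(ndcg([3, 2, 1]))
print(ndcg([1, 2, 3]))
```

A score of 1.0 means the ranking matches the ideal ordering of the judged items; lower values indicate that relevant answers were placed further down the list.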


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Ontology (information science)
57K papers, 869.1K citations
82% related
Graph (abstract data type)
69.9K papers, 1.2M citations
82% related
Feature learning
15.5K papers, 684.7K citations
81% related
Supervised learning
20.8K papers, 710.5K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2024  1
2023  3,112
2022  6,541
2021  1,105
2020  1,082
2019  1,168