Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Journal ArticleDOI
TL;DR: The effectiveness and benefits of the new indices in unfolding the full potential of the h-index are exhibited with extensive experimental results obtained from DBLP, a widely known online digital library.
Abstract: What is the value of a scientist's work, and what is its impact upon scientific thinking? How can we measure the prestige of a journal or a conference? The evaluation of the scientific work of a scientist and the estimation of the quality of a journal or conference have long attracted significant interest, due to the benefits of obtaining an unbiased and fair criterion. Although it appears to be simple, defining a quality metric is not an easy task. To overcome the disadvantages of the present metrics used for ranking scientists and journals, J. E. Hirsch proposed a pioneering metric, the now famous h-index. In this article we demonstrate several inefficiencies of this index and develop a pair of generalizations and effective variants of it to deal with scientist ranking and publication forum ranking. The new citation indices are able to disclose trendsetters in scientific research, as well as researchers who constantly shape their field with influential work, no matter how old they are. We exhibit the effectiveness and the benefits of the new indices in unfolding the full potential of the h-index, with extensive experimental results obtained from DBLP, a widely known online digital library.
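
The article builds on Hirsch's h-index, which can be computed directly from a list of per-paper citation counts: the largest h such that at least h of an author's papers have received at least h citations each. A minimal sketch in Python (the citation counts below are purely illustrative, not data from the paper):

def h_index(citations):
    # Largest h such that at least h papers have >= h citations each.
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, c in enumerate(counts, start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation counts for one author:
print(h_index([25, 8, 5, 3, 3, 1]))  # -> 3: three papers have at least 3 citations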

399 citations

Proceedings ArticleDOI
David D. Lewis
01 Jul 1995
TL;DR: This work shows how to define what constitutes good effectiveness for binary text classification systems, tune the systems to achieve the highest possible effectiveness, and estimate how the effectiveness changes as new data is processed.
Abstract: Text retrieval systems typically produce a ranking of documents and let a user decide how far down that ranking to go. In contrast, programs that filter text streams, software that categorizes documents, agents which alert users, and many other IR systems must make decisions without human input or supervision. It is important to define what constitutes good effectiveness for these autonomous systems, tune the systems to achieve the highest possible effectiveness, and estimate how the effectiveness changes as new data is processed. We show how to do this for binary text classification systems, emphasizing that different goals for the system lead to different optimal behaviors. Optimizing and estimating effectiveness is greatly aided if classifiers that explicitly estimate the probability of class membership are used. Ranked retrieval is the information retrieval (IR) researcher's favorite tool for dealing with information overload. Ranked retrieval systems display documents in order of probability of relevance or some similar measure. Users see the best documents first, and decide how far down the ranking to go in examining the available information. The central role played by ranking in this approach has led researchers to evaluate IR systems primarily, often exclusively, on the quality of their rankings. (See, for instance, the TREC evaluations [1].) In some IR applications, however, ranking is not enough: A company provides an SDI (selective dissemination of information) service which filters newswire feeds. Relevant articles are faxed each morning to clients. Interaction between customer and system takes place infrequently. The cost of resources (tying up phone lines, fax machine paper, etc.) is a factor to consider in operating the system. A text categorization system assigns controlled vocabulary categories to incoming documents as they are stored in a text database. Cost cutting has eliminated manual checking of category assignments.
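
The abstract's point about classifiers that explicitly estimate the probability of class membership can be illustrated with a simple cost-based decision rule. A minimal sketch in Python (the cost model and document stream are hypothetical, not taken from the paper): deliver a document only when its estimated relevance probability clears the threshold implied by the relative costs of a false alarm and a miss.

def optimal_threshold(cost_fp, cost_fn):
    # Deliver iff p * cost_fn >= (1 - p) * cost_fp, i.e. p >= cost_fp / (cost_fp + cost_fn).
    return cost_fp / (cost_fp + cost_fn)

def filter_stream(docs_with_probs, cost_fp=1.0, cost_fn=4.0):
    # Keep documents whose estimated P(relevant) clears the cost-based threshold.
    t = optimal_threshold(cost_fp, cost_fn)
    return [doc for doc, p in docs_with_probs if p >= t]

# Hypothetical stream of (document id, estimated P(relevant)):
stream = [("d1", 0.9), ("d2", 0.3), ("d3", 0.15), ("d4", 0.6)]
print(filter_stream(stream))  # threshold 0.2 -> ['d1', 'd2', 'd4']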

397 citations

Proceedings ArticleDOI
03 Nov 2003
TL;DR: Two algorithms for determining expertise from email were compared: a content-based approach that takes into account only email text, and a graph-based ranking algorithm (HITS) that takes into account both text and communication patterns.
Abstract: A common method for finding information in an organization is to use social networks---ask people, following referrals until someone with the right information is found. Another way is to automatically mine documents to determine who knows what. Email documents seem particularly well suited to this task of "expertise location", as people routinely communicate what they know. Moreover, because people explicitly direct email to one another, social networks are likely to be contained in the patterns of communication. Can these patterns be used to discover experts on particular topics? Is this approach better than mining message content alone? To find answers to these questions, two algorithms for determining expertise from email were compared: a content-based approach that takes account only of email text, and a graph-based ranking algorithm (HITS) that takes account both of text and communication patterns. An evaluation was done using email and explicit expertise ratings from two different organizations. The rankings given by each algorithm were compared to the explicit rankings with the precision and recall measures commonly used in information retrieval, as well as the d' measure commonly used in signal-detection theory. Results show that the graph-based algorithm performs better than the content-based algorithm at identifying experts in both cases, demonstrating that the graph-based algorithm effectively extracts more information than is found in content alone.
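
The graph-based algorithm compared in the study is HITS. A minimal sketch of the standard iteration on a directed "who-emailed-whom" graph follows (the people and edges are hypothetical, and this is the textbook HITS update rather than the study's exact implementation): authority scores accumulate from the hub scores of those who write to a person, hub scores from the authority scores of those written to, and candidate experts are ranked by authority.

import math

def hits(edges, iterations=50):
    # Basic HITS on a directed graph given as (source, target) pairs.
    nodes = {n for e in edges for n in e}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        auth = {n: sum(hub[s] for s, t in edges if t == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        hub = {n: sum(auth[t] for s, t in edges if s == n) for n in nodes}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

# Hypothetical email graph: an edge (a, b) means "a emailed b about the topic".
emails = [("alice", "carol"), ("bob", "carol"), ("carol", "dave"), ("alice", "dave")]
hub, auth = hits(emails)
print(sorted(auth, key=auth.get, reverse=True))  # candidates ranked by authority score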

395 citations

Proceedings ArticleDOI
01 Jul 1997
TL;DR: This work explores the role of phrases in query expansion via local context analysis and local feedback, and shows how they can be used to significantly reduce the error associated with automatic dictionary translation.
Abstract: Dictionary methods for cross-language information retrieval give performance below that for mono-lingual retrieval. Failure to translate multi-term phrases has been shown to be one of the factors responsible for the errors associated with dictionary methods. First, we study the importance of phrasal translation for this approach. Second, we explore the role of phrases in query expansion via local context analysis and local feedback and show how they can be used to significantly reduce the error associated with automatic dictionary translation.
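
The failure mode the abstract attributes to dictionary methods is easy to see in a toy translation step: if multi-word phrases are not looked up as units, each word is translated independently and the phrase's meaning is lost. A minimal sketch in Python (the bilingual dictionary and query are hypothetical, not the paper's resources) that prefers the longest matching phrase entry and falls back to word-by-word translation:

# Hypothetical bilingual dictionary with phrase entries alongside single words.
DICTIONARY = {
    "hot dog": ["perrito caliente"],
    "hot": ["caliente"],
    "dog": ["perro"],
    "stand": ["puesto"],
}

def translate_query(terms, max_phrase_len=3):
    # Greedy left-to-right translation preferring the longest matching phrase.
    out, i = [], 0
    while i < len(terms):
        for length in range(min(max_phrase_len, len(terms) - i), 0, -1):
            phrase = " ".join(terms[i:i + length])
            if phrase in DICTIONARY:
                out.extend(DICTIONARY[phrase])
                i += length
                break
        else:
            out.append(terms[i])  # untranslatable term kept as-is
            i += 1
    return out

print(translate_query(["hot", "dog", "stand"]))
# Phrase lookup yields ['perrito caliente', 'puesto'];
# word-by-word translation would have produced the misleading ['caliente', 'perro', 'puesto'].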

394 citations

Journal ArticleDOI
17 Aug 2005-Nature
TL;DR: The h-index, as discussed in this paper, sums up a researcher's publication record in a single number.
Abstract: ‘H-index’ sums up publication record.

393 citations


Network Information
Related Topics (5)
Web page
50.3K papers, 975.1K citations
83% related
Ontology (information science)
57K papers, 869.1K citations
82% related
Graph (abstract data type)
69.9K papers, 1.2M citations
82% related
Feature learning
15.5K papers, 684.7K citations
81% related
Supervised learning
20.8K papers, 710.5K citations
81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168