Topic

Ranking (information retrieval)

About: Ranking (information retrieval) is a research topic. Over its lifetime, 21,109 publications have been published within this topic, receiving 435,130 citations.


Papers
Proceedings ArticleDOI
19 Jul 2009
TL;DR: This work proposes an analogical reasoning-based approach which measures the analogy between the new question-answer linkages and those of relevant knowledge which contains only positive links; the candidate answer which has the most analogous link is assumed to be the best answer.
Abstract: The method of finding high-quality answers has a significant impact on user satisfaction in community question answering systems. However, due to the lexical gap between questions and answers, as well as the spam typically found in user-generated content, filtering and ranking answers is very challenging. Previous solutions mainly focus on generating redundant features or on finding textual clues using machine learning techniques; none of them considers questions and their answers as relational data, but instead they model them as independent information. Moreover, they consider only the answers to the current question and ignore any previous knowledge that would help bridge the lexical and semantic gap. We assume that answers are connected to their questions by various types of latent links, i.e., positive links indicating high-quality answers and negative links indicating incorrect answers or user-generated spam, and propose an analogical reasoning-based approach which measures the analogy between the new question-answer linkages and those of relevant knowledge which contains only positive links; the candidate answer which has the most analogous link is assumed to be the best answer. We conducted experiments based on 29.8 million Yahoo! Answers question-answer threads and showed the effectiveness of our approach.

113 citations

Proceedings ArticleDOI
10 Oct 2004
TL;DR: This work introduces the rank-by-feature prism, a color-coded lower-triangular matrix that guides users to desired features and helps them interactively find interesting features in multidimensional data sets.
Abstract: Exploratory analysis of multidimensional data sets is challenging because of the difficulty in comprehending more than three dimensions. Two fundamental statistical principles for exploratory analysis are (1) to examine each dimension first and then find relationships among dimensions, and (2) to try graphical displays first and then find numerical summaries (D.S. Moore, 1999). We implement these principles in a novel conceptual framework called the rank-by-feature framework. In the framework, users can choose a ranking criterion that interests them and sort 1D or 2D axis-parallel projections according to the criterion. We introduce the rank-by-feature prism, a color-coded lower-triangular matrix that guides users to desired features. Statistical graphs (histogram, boxplot, and scatterplot) and information visualization techniques (overview, coordination, and dynamic query) are combined to help users effectively traverse 1D and 2D axis-parallel projections, and finally to help them interactively find interesting features in multidimensional data sets.

113 citations
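
The sorting step described in the rank-by-feature abstract above (choose a criterion, then order 1D or 2D axis-parallel projections by it) can be sketched roughly as follows. This is only an illustration under assumptions: the function names, the toy data, and the choice of standard deviation and Pearson correlation as criteria are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def rank_1d(data, criterion):
    """Order dimension names by a user-chosen score, highest first."""
    return sorted(data, key=lambda name: criterion(data[name]), reverse=True)

def rank_2d(data, criterion):
    """Order pairs of dimensions by the absolute value of a pairwise score."""
    pairs = combinations(data, 2)
    return sorted(
        pairs,
        key=lambda p: abs(criterion(data[p[0]], data[p[1]])),
        reverse=True,
    )

# Toy data set (made up): three dimensions, four records each.
data = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [5, 1, 3, 1],
}

order_1d = rank_1d(data, pstdev)    # dimensions with the widest spread first
order_2d = rank_2d(data, pearson)   # most strongly correlated pairs first
```

In an interactive tool, the ordered lists would drive which histogram or scatterplot the user inspects first; the color-coded matrix in the paper presents the same scores visually.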

Journal ArticleDOI
TL;DR: A novel category-aware service clustering and distributed recommending method is proposed for automatic mashup creation, and experiments on a real-world dataset show that the proposed approach not only significantly improves precision but also enhances the diversity of recommendation results.
Abstract: Mashup has emerged as a promising way to allow developers to compose existing APIs (services) to create new or value-added services. With the rapidly increasing number of services published on the Internet, service recommendation for automatic mashup creation is gaining a lot of momentum. Since a mashup inherently requires services with different functions, the recommendation result should contain services from various categories. However, most existing recommendation approaches only rank all candidate services in a single list, which has two deficiencies. First, ranking services without considering which categories they belong to may lead to meaningless service rankings and affect recommendation accuracy. Second, mashup developers are not always clear about which service categories they need and which categories of services cooperate better for mashup creation. Without explicit recommendations of which service categories are relevant for mashup creation, it remains difficult for mashup developers to select proper services from a mixed ranking list, which lowers the user friendliness of the recommendation. To overcome these deficiencies, a novel category-aware service clustering and distributed recommending method is proposed for automatic mashup creation. First, a K-means variant (vKmeans) method based on the Latent Dirichlet Allocation topic model is introduced for enhancing service categorization and providing a basis for recommendation. Second, on top of vKmeans, a service category relevance ranking (SCRR) model, which combines machine learning and collaborative filtering, is developed to decompose mashup requirements and explicitly predict relevant service categories. Finally, a category-aware distributed service recommendation (CDSR) model, which is based on a distributed machine learning framework, is developed for predicting the service ranking order within each category. Experiments on a real-world dataset show that the proposed approach not only significantly improves precision but also enhances the diversity of recommendation results.

113 citations
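
The contrast drawn in the mashup abstract above, between one mixed ranking list and a separate ranking inside each relevant category, can be illustrated with a small sketch. It does not reproduce vKmeans, SCRR, or CDSR; the service names, categories, scores, and the `rank_within_categories` helper are all hypothetical, and the scores stand in for whatever a trained model would predict.

```python
from collections import defaultdict

def rank_within_categories(services, relevant_categories, top_k=2):
    """Return the top_k services per relevant category, best score first.

    services: iterable of (name, category, score) tuples.
    relevant_categories: categories assumed to be predicted as relevant.
    """
    by_category = defaultdict(list)
    for name, category, score in services:
        by_category[category].append((score, name))

    result = {}
    for category in relevant_categories:
        # Sort by descending score; break ties by name for determinism.
        ranked = sorted(
            by_category.get(category, []),
            key=lambda item: (-item[0], item[1]),
        )
        result[category] = [name for _, name in ranked[:top_k]]
    return result

# Made-up candidate services with placeholder relevance scores.
services = [
    ("map_a", "mapping", 0.9),
    ("map_b", "mapping", 0.4),
    ("pay_a", "payment", 0.7),
    ("map_c", "mapping", 0.6),
    ("sms_a", "messaging", 0.8),
]

recommendation = rank_within_categories(services, ["mapping", "payment"])
```

A single mixed list sorted by score would place `sms_a` second even though messaging is not a relevant category here; grouping first keeps each category's candidates comparable with one another.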

Proceedings ArticleDOI
06 Nov 2007
TL;DR: This work designs a ranking formula that strikes a balance between producing good results and reducing query processing time, and uses data from the Orkut social network, which includes over 40 million users, to show that this ranking, augmented by this new signal, produces high-quality results while keeping query processing time small.
Abstract: In social networks such as Orkut, www.orkut.com, a large portion of the user queries refer to names of other people. Indeed, more than 50% of the queries in Orkut are about names of other users, with an average of 1.8 terms per query. Further, the users usually search for people with whom they maintain relationships in the network. These relationships can be modelled as edges in a friendship graph, a graph in which the nodes represent the users. In this context, search ranking can be modelled as a function that depends on the distances among users in the graph, more specifically, on shortest paths in the friendship graph. However, applying this idea to ranking is not straightforward because the large size of modern social networks (tens of millions of users) prevents efficient computation of shortest paths at query time. We overcome this by designing a ranking formula that strikes a balance between producing good results and reducing query processing time. Using data from the Orkut social network, which includes over 40 million users, we show that our ranking, augmented by this new signal, produces high-quality results while keeping query processing time small.

113 citations
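
The idea in the Orkut abstract above, ranking name matches by their distance from the searcher in the friendship graph, can be sketched as below. The abstract does not give the actual ranking formula, so this is only an assumed stand-in: a breadth-first search capped at a small depth (to bound query-time work), with all users beyond the cap treated as equally far. The graph, user names, and function names are hypothetical.

```python
from collections import deque

def bounded_distances(graph, source, max_depth=2):
    """Breadth-first search from source, exploring at most max_depth hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_depth:
            continue  # do not expand beyond the depth cap
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def rank_by_proximity(graph, searcher, candidates, max_depth=2):
    """Order candidate users by graph distance from the searcher.

    Users not reached within max_depth hops all share the rank
    max_depth + 1; ties are broken alphabetically.
    """
    dist = bounded_distances(graph, searcher, max_depth)
    far = max_depth + 1
    return sorted(candidates, key=lambda user: (dist.get(user, far), user))

# Made-up undirected friendship graph stored as adjacency lists.
graph = {
    "ana": ["bob", "carlos_z"],
    "bob": ["ana", "carlos_m"],
    "carlos_z": ["ana"],
    "carlos_m": ["bob"],
    "carlos_a": [],
}

# Suppose a name query for "carlos" matched these three users.
ranking = rank_by_proximity(graph, "ana", ["carlos_a", "carlos_m", "carlos_z"])
```

Capping the search depth is one simple way to trade result quality for query time; a production system would more likely combine a precomputed proximity signal with a text-match score.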

Proceedings ArticleDOI
26 Apr 2010
TL;DR: Facetedpedia is a faceted retrieval system for information discovery and exploration in Wikipedia that builds upon the collaborative vocabulary in Wikipedia, more specifically the intensive internal structures (hyperlinks) and folksonomy (category system).
Abstract: This paper proposes Facetedpedia, a faceted retrieval system for information discovery and exploration in Wikipedia. Given the set of Wikipedia articles resulting from a keyword query, Facetedpedia generates a faceted interface for navigating the result articles. Compared with other faceted retrieval systems, Facetedpedia is fully automatic and dynamic in both facet generation and hierarchy construction, and the facets are based on the rich semantic information from Wikipedia. The essence of our approach is to build upon the collaborative vocabulary in Wikipedia, more specifically the intensive internal structures (hyperlinks) and folksonomy (category system). Given the sheer size and complexity of this corpus, the space of possible choices of faceted interfaces is prohibitively large. We propose metrics for ranking individual facet hierarchies by user's navigational cost, and metrics for ranking interfaces (each with k facets) by both their average pairwise similarities and average navigational costs. We thus develop faceted interface discovery algorithms that optimize the ranking metrics. Our experimental evaluation and user study verify the effectiveness of the system.

113 citations


Network Information
Related Topics (5)
Web page: 50.3K papers, 975.1K citations, 83% related
Ontology (information science): 57K papers, 869.1K citations, 82% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 82% related
Feature learning: 15.5K papers, 684.7K citations, 81% related
Supervised learning: 20.8K papers, 710.5K citations, 81% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2024    1
2023    3,112
2022    6,541
2021    1,105
2020    1,082
2019    1,168