
Optimizing ranking functions: a connectionist approach to adaptive information retrieval

TLDR
This dissertation examines the use of adaptive methods to automatically improve the performance of ranked text retrieval systems, and proposes and empirically validates general adaptive methods which improve the ability of a large class of retrieval systems to rank documents effectively.
Abstract
This dissertation examines the use of adaptive methods to automatically improve the performance of ranked text retrieval systems. The goal of a ranked retrieval system is to manage a large collection of text documents and to order documents for a user based on the estimated relevance of the documents to the user's information need (or query). The ordering enables the user to quickly find documents of interest. Ranked retrieval is a difficult problem because of the ambiguity of natural language, the large size of the collections, and the varying needs of users and varying characteristics of collections. We propose and empirically validate general adaptive methods which improve the ability of a large class of retrieval systems to rank documents effectively. Our main adaptive method is to numerically optimize free parameters in a retrieval system by minimizing a non-metric criterion function. The criterion measures how well the system is ranking documents relative to a target ordering, defined by a set of training queries which include the users' desired document orderings. Thus, the system learns parameter settings which better enable it to rank relevant documents before irrelevant ones. The non-metric approach is interesting because it is a general adaptive method, an alternative to supervised methods for training neural networks in domains in which rank order or prioritization is important. A second adaptive method is also examined, which is applicable to a restricted class of retrieval systems but which permits an analytic solution. The adaptive methods are applied to a number of problems in text retrieval to validate their utility and practical efficiency. The applications include: a dimensionality reduction of vector-based document representations to a vector space in which inter-document similarity more accurately predicts semantic association; the estimation of a similarity measure which better predicts the relevance of documents to queries; and the estimation of a high-performance neural network combination of multiple retrieval systems into a single overall system. The applications demonstrate that the approaches improve performance and adapt to varying retrieval environments. We also compare the methods to numerous alternative adaptive methods in the text retrieval literature, with very positive results.
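To make the main method concrete: a free parameter of a scoring function is tuned by gradient descent on a smoothed pairwise criterion that penalizes ranking an irrelevant document above a relevant one for the training queries. The sketch below is illustrative only; the sigmoid smoothing, the numerical gradient, and all function names are assumptions rather than the dissertation's exact criterion or implementation.

    import numpy as np

    def scores(theta, query_vec, doc_vecs):
        # Illustrative scoring function: documents scored by a
        # theta-weighted inner product with the query vector.
        return doc_vecs @ (theta * query_vec)

    def rank_criterion(theta, queries):
        # Smoothed fraction of misordered (relevant, irrelevant) pairs,
        # averaged over the training queries; smaller is better.
        total = 0.0
        for query_vec, doc_vecs, rel_idx, irr_idx in queries:
            s = scores(theta, query_vec, doc_vecs)
            diffs = s[rel_idx][:, None] - s[irr_idx][None, :]
            total += np.mean(1.0 / (1.0 + np.exp(diffs)))
        return total / len(queries)

    def optimize(theta, queries, lr=0.1, steps=200, eps=1e-4):
        # Plain gradient descent using central-difference gradients.
        for _ in range(steps):
            grad = np.zeros_like(theta)
            for i in range(len(theta)):
                e = np.zeros_like(theta)
                e[i] = eps
                grad[i] = (rank_criterion(theta + e, queries)
                           - rank_criterion(theta - e, queries)) / (2 * eps)
            theta = theta - lr * grad
        return theta
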


Citations
Proceedings ArticleDOI

Models for metasearch

TL;DR: The experimental results show that metasearch algorithms based on the Borda and Bayesian models usually outperform the best input system and are competitive with, and often outperform, existing metasearch strategies.
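For illustration, the Borda model this summary refers to can be read as a simple positional point count over the input rankings; the sketch below assumes that reading, and the function name and handling of unranked documents are illustrative, not the paper's implementation.

    def borda_fuse(rankings):
        # rankings: list of ranked document-ID lists, best first.
        # Each system awards (n - position) points to the documents it
        # returns; documents a system does not rank get no points from it.
        points = {}
        for ranking in rankings:
            n = len(ranking)
            for pos, doc in enumerate(ranking):
                points[doc] = points.get(doc, 0) + (n - pos)
        return sorted(points, key=points.get, reverse=True)

    # Example: fusing three partially overlapping result lists.
    fused = borda_fuse([["d1", "d2", "d3"], ["d2", "d1"], ["d3", "d2", "d4"]])
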
Proceedings ArticleDOI

Automatic combination of multiple ranked retrieval systems

TL;DR: This work proposes a method by which the relevance estimates made by different experts can be automatically combined to result in superior retrieval performance and applies the method to two expert combination tasks.
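A common way to combine the relevance estimates of several experts is to standardize each expert's scores per query and take a weighted sum, with the weights then tuned on training data (for example with the rank criterion sketched under the abstract above). The sketch below assumes that scheme; the normalization and names are illustrative, not the paper's exact method.

    import numpy as np

    def combine_experts(expert_scores, weights):
        # expert_scores: one array per expert giving that expert's
        # relevance estimates for the same candidate documents.
        # Per-expert standardization lets differently scaled experts mix.
        combined = np.zeros(len(expert_scores[0]), dtype=float)
        for s, w in zip(expert_scores, weights):
            s = np.asarray(s, dtype=float)
            spread = s.std() if s.std() > 0 else 1.0
            combined += w * (s - s.mean()) / spread
        return combined  # rank documents by this combined score
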
Proceedings ArticleDOI

Condorcet fusion for improved retrieval

TL;DR: A graph-theoretic analysis is applied to one of the two major classes of voting procedures from Social Choice Theory, the Condorcet procedure, and yields a sorting-based algorithm that performs very well on TREC data, often outperforming existing metasearch algorithms whether or not relevance scores and training data are available.
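The Condorcet procedure referenced here orders documents with a pairwise-majority comparator: A precedes B if more input systems rank A above B. A minimal sorting-based sketch under that reading follows; the comparator and the treatment of unranked documents are illustrative assumptions, not the paper's exact algorithm.

    from functools import cmp_to_key

    def condorcet_fuse(rankings):
        # rankings: list of ranked document-ID lists, best first.
        docs = {d for r in rankings for d in r}
        # Position of each document in each ranking; unranked documents
        # are treated as ranked below every ranked one (an assumption).
        positions = [{d: i for i, d in enumerate(r)} for r in rankings]

        def prefer(a, b):
            # Negative when a majority of systems rank a above b.
            votes = 0
            for pos in positions:
                ra, rb = pos.get(a, len(pos)), pos.get(b, len(pos))
                votes += -1 if ra < rb else (1 if rb < ra else 0)
            return votes

        return sorted(docs, key=cmp_to_key(prefer))
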
Journal ArticleDOI

Semantically enhanced Information Retrieval: An ontology-based approach

TL;DR: The major contribution of this work is an innovative, comprehensive semantic search model, which extends the classic IR model, addresses the challenges of the massive and heterogeneous Web environment, and integrates the benefits of both keyword and semantic-based search.
Proceedings ArticleDOI

FRank: a ranking method with fidelity loss

TL;DR: An algorithm named FRank is proposed, based on a generalized additive model, for the sake of minimizing the fidelity loss and learning an effective ranking function; the experimental results show that the proposed algorithm outperforms other learning-based ranking methods on both conventional IR problems and Web search.
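As a reference point, the fidelity loss that FRank minimizes compares the target and modeled probabilities that one document should be ranked above another; stated from memory (and worth checking against the paper), for a document pair (i, j):

    F_{ij} = 1 - \left( \sqrt{P^{*}_{ij}\,P_{ij}} + \sqrt{(1 - P^{*}_{ij})(1 - P_{ij})} \right),
    \qquad
    P_{ij} = \frac{\exp(o_{ij})}{1 + \exp(o_{ij})}, \quad o_{ij} = f(x_i) - f(x_j),

where P^{*}_{ij} is the target pairwise preference and f is the learned ranking function; unlike the cross-entropy loss used by RankNet, the loss for each pair is bounded and reaches zero when the modeled probability matches the target.
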
References
Book ChapterDOI

Learning internal representations by error propagation

TL;DR: This chapter contains sections titled: The Problem, The Generalized Delta Rule, Simulation Results, Some Further Generalizations, Conclusion.
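For reference, the generalized delta rule derived in this chapter updates each weight in proportion to an error signal propagated backward through the network; in the standard notation:

    \Delta w_{ji} = \eta\,\delta_j\,o_i,
    \qquad
    \delta_j =
    \begin{cases}
      (t_j - o_j)\,f'(\mathrm{net}_j) & \text{for output units,} \\
      f'(\mathrm{net}_j)\sum_k \delta_k w_{kj} & \text{for hidden units,}
    \end{cases}

where o_i is the output of unit i, net_j the net input to unit j, t_j the target output, f the unit's activation function, and \eta the learning rate.
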
Book

Introduction to Modern Information Retrieval
