Open Access Proceedings Article

Okapi at TREC

TLDR
Much of the work involved investigating plausible methods of applying Okapi-style weighting to phrases; expansion using terms from the top documents retrieved by a pilot search on the topic terms was also used.
Abstract
City submitted two runs each for the automatic ad hoc, very large collection, automatic routing and Chinese tracks, and took part in the interactive and filtering tracks. The method used was expansion using terms from the top documents retrieved by a pilot search on the topic terms. Additional runs seem to show that we would have done better without expansion. Two runs using the method of city96al were also submitted for the Very Large Collection track. The training database and its relevant documents were partitioned into three parts. Working on a pool of terms extracted from the relevant documents for one partition, an iterative procedure added or removed terms and/or varied their weights. After each change in query content or term weights, a score was calculated by using the current query to search a second portion of the training database and evaluating the results against the corresponding set of relevant documents. Methods were compared by evaluating queries predictively against the third training partition. Queries from different methods were then merged and the results evaluated in the same way. Two runs were submitted, one based on character searching and the other on words or phrases. Much of the work involved investigating plausible methods of applying Okapi-style weighting to phrases.
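For readers who want the mechanics behind the abstract, the sketch below illustrates Okapi-style BM25 term weighting and blind expansion from the top documents of a pilot search. It is a minimal illustration, not City's actual code: the corpus representation (lists of tokens), the k1 and b values, and the use of the Robertson offer weight (r·w) to select expansion terms are all assumptions.

```python
# Minimal sketch (not City's implementation) of Okapi BM25 weighting plus
# blind expansion from the top documents of a pilot search on the topic terms.
import math
from collections import Counter

def bm25_score(query_terms, doc, docs, k1=1.2, b=0.75):
    """Score one document (a list of tokens) against a bag of query terms."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    tf = Counter(doc)
    score = 0.0
    for t in query_terms:
        n = sum(1 for d in docs if t in d)         # document frequency
        if n == 0:
            continue
        idf = math.log((N - n + 0.5) / (n + 0.5))  # RSJ-style idf
        norm = tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(doc) / avgdl))
        score += idf * norm
    return score

def expand_query(topic_terms, docs, top_r=10, n_new_terms=20):
    """Pilot search on the topic terms, then add terms from the top documents."""
    # Naive full-collection rescoring; fine for a sketch, slow at TREC scale.
    ranked = sorted(docs, key=lambda d: bm25_score(topic_terms, d, docs), reverse=True)
    top_docs = ranked[:top_r]
    N, R = len(docs), len(top_docs)
    candidates = {}
    for t in set(t for d in top_docs for t in d):
        r = sum(1 for d in top_docs if t in d)
        n = sum(1 for d in docs if t in d)
        w = math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                     ((n - r + 0.5) * (R - r + 0.5)))  # RSJ relevance weight
        candidates[t] = r * w                           # offer weight
    new_terms = sorted(candidates, key=candidates.get, reverse=True)[:n_new_terms]
    return list(topic_terms) + new_terms
```

Calling expand_query(topic_terms, docs) returns the original topic terms followed by the highest-offer-weight terms drawn from the pilot run's top documents.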



Citations
Book

Learning to Rank for Information Retrieval

TL;DR: Three major approaches to learning to rank are introduced, i.e., the pointwise, pairwise, and listwise approaches, the relationship between the loss functions used in these approaches and the widely-used IR evaluation measures are analyzed, and the performance of these approaches on the LETOR benchmark datasets is evaluated.
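As an informal illustration of the pairwise approach named in this summary (not code from the book), the fragment below scores documents with a linear function and penalises every pair ranked in the wrong order; the feature layout, the linear model, and the hinge margin are assumptions.

```python
# Pairwise learning-to-rank idea: a more relevant document should score
# higher than a less relevant one for the same query.
import numpy as np

def pairwise_hinge_loss(w, features, relevance, margin=1.0):
    """Sum of hinge penalties over all misordered document pairs for one query."""
    scores = features @ w                       # linear scoring function
    loss = 0.0
    for i in range(len(relevance)):
        for j in range(len(relevance)):
            if relevance[i] > relevance[j]:     # i should outrank j
                loss += max(0.0, margin - (scores[i] - scores[j]))
    return loss
```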
Book

The Probabilistic Relevance Framework

TL;DR: This work presents the PRF from a conceptual point of view, describing the probabilistic modelling assumptions behind the framework and the different ranking algorithms that result from its application: the binary independence model, relevance feedback models, BM25 and BM25F.
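A rough sketch of the BM25F idea mentioned here, under the simplifying assumption of one b parameter shared across fields (the framework allows a per-field b_f): per-field term frequencies are length-normalised, scaled by field weights, and pooled into a single pseudo-frequency before one BM25-style saturation. The field names, weights, and parameter values below are illustrative only.

```python
# BM25F-style field-weighted term score for a single term in a single document.
import math

def bm25f_term_weight(tf_per_field, field_len, avg_field_len,
                      field_weight, idf, k1=1.2, b=0.75):
    pseudo_tf = 0.0
    for f, tf in tf_per_field.items():
        norm = 1 - b + b * field_len[f] / avg_field_len[f]   # per-field length norm
        pseudo_tf += field_weight[f] * tf / norm
    return idf * pseudo_tf / (k1 + pseudo_tf)                # single saturation

# Example: a term appearing once in the title and twice in the body.
w = bm25f_term_weight(
    tf_per_field={"title": 1, "body": 2},
    field_len={"title": 8, "body": 300},
    avg_field_len={"title": 10, "body": 400},
    field_weight={"title": 3.0, "body": 1.0},
    idf=2.0,
)
```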
Proceedings Article

From Word Embeddings To Document Distances

TL;DR: It is demonstrated on eight real world document classification data sets, in comparison with seven state-of-the-art baselines, that the Word Mover's Distance metric leads to unprecedented low k-nearest neighbor document classification error rates.
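Exact Word Mover's Distance requires an optimal-transport solver, but the paper's relaxed lower bound is easy to state: each word's mass flows entirely to its nearest word (by embedding distance) in the other document. The sketch below assumes documents are already given as normalised bag-of-words weights plus word-embedding matrices.

```python
# Relaxed Word Mover's Distance lower bound between two documents.
import numpy as np

def relaxed_wmd(weights_a, vecs_a, weights_b, vecs_b):
    """Lower bound on WMD; weights_* sum to 1, vecs_* are (n_words, dim) arrays."""
    # pairwise Euclidean distances between the two documents' word vectors
    dist = np.linalg.norm(vecs_a[:, None, :] - vecs_b[None, :, :], axis=2)
    cost_ab = np.sum(weights_a * dist.min(axis=1))   # A's words to nearest in B
    cost_ba = np.sum(weights_b * dist.min(axis=0))   # B's words to nearest in A
    return max(cost_ab, cost_ba)                     # tighter of the two bounds
```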
Journal Article

Relevance-Based Language Models

TL;DR: This work proposes a novel technique for estimating a relevance model with no training data and demonstrates that it can produce highly accurate relevance models, addressing important notions of synonymy and polysemy.
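A hedged sketch of estimating a relevance model with no training data, in the spirit of this line of work: each top-ranked pilot document contributes its smoothed word distribution, weighted by how well it explains the query. The Dirichlet smoothing and the value of mu are assumptions, not the authors' exact estimator.

```python
# Relevance-model estimation from the top documents of a pilot search.
from collections import Counter

def relevance_model(query, top_docs, collection_counts, collection_len, mu=2000):
    def p_w_given_d(w, counts, dlen):
        # Dirichlet-smoothed document language model
        return (counts[w] + mu * collection_counts.get(w, 0) / collection_len) / (dlen + mu)

    rm = Counter()
    for doc in top_docs:                       # each doc is a list of tokens
        counts, dlen = Counter(doc), len(doc)
        q_lik = 1.0
        for q in query:                        # P(Q|D): query likelihood
            q_lik *= p_w_given_d(q, counts, dlen)
        for w in counts:                       # spread that weight over the doc's words
            rm[w] += p_w_given_d(w, counts, dlen) * q_lik
    total = sum(rm.values()) or 1.0
    return {w: p / total for w, p in rm.items()}
```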
Journal Article

A Study of Smoothing Methods for Language Models Applied to Ad Hoc Information Retrieval

TL;DR: This paper examines the sensitivity of retrieval performance to the smoothing parameters and compares several popular smoothing methods on different test collections.
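Two of the smoothing methods typically compared in this line of work, written out for a single term; the interpolation weight and Dirichlet prior below are illustrative values, not the paper's tuned settings.

```python
# Smoothed P(term | document) under two common smoothing schemes.
def jelinek_mercer(tf, doc_len, p_collection, lam=0.7):
    """Linear interpolation of document and collection language models."""
    return (1 - lam) * (tf / doc_len) + lam * p_collection

def dirichlet(tf, doc_len, p_collection, mu=2000):
    """Bayesian smoothing with a Dirichlet prior scaled by mu."""
    return (tf + mu * p_collection) / (doc_len + mu)
```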
References
Book

Relevance weighting of search terms

TL;DR: This paper examines statistical techniques for exploiting relevance information to weight search terms using information about the distribution of index terms in documents in general and shows that specific weighted search methods are implied by a general probabilistic theory of retrieval.
Journal Article

Relevance weighting of search terms

TL;DR: In this article, a series of relevance weighting functions is derived and is justified by theoretical considerations, in particular, it is shown that specific weighted search methods are implied by a general probabilistic theory of retrieval.
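The best-known member of this family of weights (the so-called F4 formula with the usual 0.5 point correction) can be written directly from the contingency counts; the toy numbers in the example are purely illustrative.

```python
# RSJ relevance weight for one term, given relevance-judgement counts.
import math

def relevance_weight(N, R, n, r):
    """N docs in total, R relevant; n contain the term, r of those are relevant."""
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

# e.g. a term occurring in 4 of 10 known relevant documents and 50 of 10000 overall
w = relevance_weight(N=10000, R=10, n=50, r=4)
```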
Proceedings Article

Probabilistic models of indexing and searching

TL;DR: There is a considerable body of related work by Salton, Yu and associates on automatic indexing using within-document frequencies of terms.
Journal Article

The use of term position devices in ranked output experiments

TL;DR: The use of term proximity devices is proposed here by analogy with Boolean techniques and seven algorithms are devised to incorporate the ideas of sentence matching, proximate terms, term order specification and term distance computations.
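One simple term-distance device of the kind evaluated here is the length of the shortest document span covering all query terms; the sketch below computes it, and folding it into a ranking score (for example as a bonus for small spans) is an assumption rather than the paper's exact algorithm.

```python
# Minimal term-proximity device: shortest window covering all query terms.
def min_cover_window(doc_tokens, query_terms):
    """Length of the shortest span containing all query terms, or None if absent."""
    query = set(query_terms)
    positions = [i for i, t in enumerate(doc_tokens) if t in query]
    best = None
    for start_idx, start in enumerate(positions):
        seen = set()
        for pos in positions[start_idx:]:
            seen.add(doc_tokens[pos])
            if seen == query:                  # all query terms covered
                span = pos - start + 1
                best = span if best is None else min(best, span)
                break
    return best
```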
Journal Article

An evaluation of automatic query expansion in an online library catalogue

TL;DR: An automatic query expansion (AQE) facility in an online catalogue was evaluated in an operational library setting and found that contrary to previous results, AQE was beneficial in a substantial number of searches.