Open Access, Book

A review of the use of inverted files for best match searching in information retrieval systems

Shirley A. Perry, +1 more
- pp 124-131
Abstract
The use of inverted files for the calculation of similarity coefficients and other types of matching function is discussed in the context of mechanised document retrieval systems. A critical evaluation is presented of a range of algorithms which have been described for the matching of documents with queries. Particular attention is paid to the computational efficiency of the various procedures, and improved search heuristics are given in some cases. It is suggested that the algorithms could be implemented sufficiently efficiently to permit the provision of nearest neighbour searching as a standard retrieval option.
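As an illustration of the general idea only (not one of the specific algorithms the review evaluates), the following minimal sketch shows how an inverted file lets a similarity coefficient be computed without scanning every document. It assumes binary term weighting and the Dice coefficient; the function names `build_inverted_file` and `best_match`, and the `doc_lengths` mapping, are hypothetical choices made for this example.

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each term to the sorted list of ids of documents containing it."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        for term in set(docs[doc_id]):
            postings[term].append(doc_id)
    return postings


def best_match(query_terms, postings, doc_lengths, top_k=3):
    """Rank documents by Dice coefficient, reading only the query terms' postings.

    doc_lengths maps each doc id to its number of distinct terms (assumed
    to be precomputed when the inverted file is built).
    """
    query = set(query_terms)
    common = defaultdict(int)  # doc id -> number of terms shared with the query
    for term in query:
        for doc_id in postings.get(term, ()):
            common[doc_id] += 1
    # Dice coefficient: 2 * |Q and D| / (|Q| + |D|)
    scored = [(2 * c / (len(query) + doc_lengths[d]), d) for d, c in common.items()]
    # Highest score first; ties broken by ascending doc id
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(d, round(s, 3)) for s, d in scored[:top_k]]


docs = {
    1: ["inverted", "file", "search"],
    2: ["nearest", "neighbour", "search"],
    3: ["cluster", "analysis"],
}
postings = build_inverted_file(docs)
doc_lengths = {d: len(set(terms)) for d, terms in docs.items()}
print(best_match(["inverted", "search"], postings, doc_lengths))
# → [(1, 0.8), (2, 0.4)]
```

Document 3 shares no term with the query, so it never receives an accumulator and is never examined. Restricting the work to documents that appear in the query terms' postings lists is the source of the efficiency that makes nearest neighbour searching over an inverted file practical.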

Citations
Journal Article

Inverted files for text search engines

TL;DR: This tutorial introduces the key techniques in the area of text indexing, describing both a core implementation and how the core can be enhanced through a range of extensions.
Journal Article

Recent trends in hierarchic document clustering: a critical review

TL;DR: Reviews algorithms that can be used to implement hierarchic agglomerative clustering methods for document retrieval; experimental evidence suggests that nearest neighbor clusters provide a reasonably efficient and effective means of including interdocument similarity information in document retrieval systems.
Journal Article

Filtered document retrieval with frequency-sorted indexes

TL;DR: An evaluation technique that uses early recognition of which documents are likely to be highly ranked to reduce costs is proposed and it is shown that frequency sorting can lead to a net reduction in index size, regardless of whether the index is compressed.
Journal Article

Retrieving Records from a Gigabyte of Text on a Minicomputer Using Statistical Ranking.

TL;DR: To show the feasibility of statistically based ranked retrieval of records using keywords, research was done to produce very fast search techniques using these ranking algorithms, and to test the results against large databases with many end users.
Journal Article

New techniques for best-match retrieval

TL;DR: A scheme is described for answering best-match queries from a file containing a collection of objects, designed to allow the optimum use of any given set of precomputed intrafile distances.