Showing papers on "Inverted index" published in 1977

Journal Article

Automatic ranked output from boolean searches in SIRE

Terry Noreault¹, Matthew B. Koll¹, Michael J. McGill¹

01 Nov 1977-Journal of the Association for Information Science and Technology

TL;DR: The study found that relevant documents were ranked significantly higher than nonrelevant documents in the set of documents retrieved in response to a Boolean query.

Abstract: This study examined the effectiveness and efficiency of employing a fully automatic algorithm for ranking the results of Boolean searches of an inverted file design document retrieval system. The study indicated that with minor modification of file designs, such as those implemented in the Syracuse Information Retrieval Experiment (SIRE), document retrieval systems could efficiently provide users with output lists on which the rank order of a document is a good indicator of its probable relevance to the user's information need. The study found that relevant documents were ranked significantly higher than nonrelevant documents in the set of documents retrieved in response to a Boolean query. By utilizing an augmented inverted file design the variable incremental cost for ranked output was only ten cents per query. There was no increased user effort.
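The abstract does not spell out the ranking algorithm, so the following is only a minimal sketch of the general idea: an inverted file whose postings are augmented with a per-document weight, a Boolean AND over the postings, and a ranking of the retrieved set by summed weight. The use of raw term frequency as the weight, and the names `build_index`, `boolean_and`, and `rank`, are assumptions made for illustration and are not taken from SIRE.

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to {doc_id: term frequency} (assumed 'augmented' postings)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return index

def boolean_and(index, terms):
    """Return the set of documents that contain every query term."""
    postings = [set(index.get(t, {})) for t in terms]
    return set.intersection(*postings) if postings else set()

def rank(index, terms, doc_ids):
    """Order retrieved documents by summed term frequency, highest first."""
    def score(doc_id):
        return sum(index.get(t, {}).get(doc_id, 0) for t in terms)
    # Ties are broken by document id so the output is deterministic.
    return sorted(doc_ids, key=lambda d: (-score(d), d))

# Hypothetical three-document collection.
docs = {
    1: "inverted file retrieval",
    2: "inverted file inverted index retrieval retrieval",
    3: "boolean query",
}
index = build_index(docs)
query = ["inverted", "retrieval"]
hits = boolean_and(index, query)
ranked = rank(index, query, hits)
```

Because the weights sit in the same postings that the Boolean step already reads, the ranking pass needs no extra file access in this sketch, which is one plausible reading of why the abstract reports a small incremental cost.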

92 citations

Journal Article

Minimum cost selection of secondary indexes for formatted files

Henry D. Anderson¹, P. Bruce Berra¹

Syracuse University¹

01 Mar 1977-ACM Transactions on Database Systems

TL;DR: A cost function is developed for the evaluation of candidate indexing choices and applied to the optimization of index selection, demonstrating the increased effectiveness of secondary indexes for large files, the effect of the relative rates of retrieval and maintenance, the greater cost of allowing for arbitrarily formulated queries, and the impact on cost of the use of different index structures.

Abstract: Secondary indexes are often used in database management systems for secondary key retrieval. Although their use can improve retrieval time significantly, the cost of index maintenance and storage increases the overhead of the file processing application. The optimal set of indexed secondary keys for a particular application depends on a number of application dependent factors. In this paper a cost function is developed for the evaluation of candidate indexing choices and applied to the optimization of index selection. Factors accounted for include file size, the relative rates of retrieval and maintenance and the distribution of retrieval and maintenance over the candidate keys, index structure, and system charging rates. Among the results demonstrated are the increased effectiveness of secondary indexes for large files, the effect of the relative rates of retrieval and maintenance, the greater cost of allowing for arbitrarily formulated queries, and the impact on cost of the use of different index structures.
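The paper's actual cost function is not given in the abstract. The sketch below is a toy model under stated assumptions, meant only to show the shape of the trade-off: an unindexed key is answered by a full scan, an indexed key costs a logarithmic lookup plus the matching records, every update to an indexed key pays a logarithmic maintenance cost, and each index pays a storage charge proportional to file size. All of these formulas, the parameter names, and the exhaustive search over subsets are assumptions for illustration.

```python
from itertools import combinations
from math import log2

def total_cost(indexed, keys, n_records, storage_rate):
    """Toy cost of one indexing choice; keys maps name -> (retrievals, updates, selectivity)."""
    cost = 0.0
    for name, (retrievals, updates, selectivity) in keys.items():
        if name in indexed:
            # Assumed: index lookup plus reading the matching records.
            cost += retrievals * (log2(n_records) + selectivity * n_records)
            # Assumed: each update to an indexed key also updates the index.
            cost += updates * log2(n_records)
            # Assumed: storage charge proportional to file size.
            cost += storage_rate * n_records
        else:
            # Assumed: without an index, every retrieval scans the file.
            cost += retrievals * n_records
    return cost

def best_index_set(keys, n_records, storage_rate):
    """Try every subset of candidate keys and return the cheapest one."""
    names = sorted(keys)
    candidates = (
        set(combo)
        for size in range(len(names) + 1)
        for combo in combinations(names, size)
    )
    return min(candidates, key=lambda s: total_cost(s, keys, n_records, storage_rate))

# Hypothetical workload: 'author' is read often, 'status' is updated often.
keys = {
    "author": (100, 5, 0.01),
    "status": (1, 500, 0.5),
}
choice = best_index_set(keys, n_records=10_000, storage_rate=0.1)
```

In this toy workload only the frequently retrieved key is worth indexing, since the maintenance cost on the heavily updated key outweighs the retrieval savings. Exhaustive search is only practical for a handful of candidate keys.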

65 citations

Journal Article

An Inverted File Processor for Information Retrieval

Stellhorn¹

United States Department of the Army¹

01 Dec 1977-IEEE Transactions on Computers

TL;DR: Using this equipment, a complicated sample search involving 70 terms and over 67 000 document references can be performed from 13 to 60 times faster than with a conventional machine.

Abstract: Response time in large, inverted file document retrieval systems is determined primarily by the time required to access files of document identifiers on disk and perform the processing associated with a Boolean search request. This paper describes a specialized computer system capable of performing these functions in hardware. Using this equipment, a complicated sample search involving 70 terms and over 67 000 document references can be performed from 13 to 60 times faster than with a conventional machine. Alternatively, many small searches can be processed concurrently with little effect upon system performance. Similar configurations can be applied to standard merging and sorting problems.
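The processing that the abstract describes moving into hardware is, at its core, the merging of sorted lists of document identifiers. A software sketch of that merge, assuming each posting list is a sorted Python list of integer ids with no duplicates, might look like the following; it illustrates the conventional technique and says nothing about how the specialized processor itself is built.

```python
def intersect(a, b):
    """Boolean AND: ids present in both sorted lists, in one linear pass."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    """Boolean OR: ids present in either sorted list, without duplicates."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # One list is exhausted; the rest of the other is appended unchanged.
    out.extend(a[i:])
    out.extend(b[j:])
    return out

# Hypothetical posting lists for two terms.
term_a = [1, 3, 5, 7]
term_b = [3, 4, 7, 9]
both = intersect(term_a, term_b)
either = union(term_a, term_b)
```

The same two-pointer pass is the basic step of merge sort, which is consistent with the abstract's remark that similar configurations apply to standard merging and sorting problems.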

25 citations

Proceedings Article

Self-adaptive automatic data base design

Michael Hammer¹

Massachusetts Institute of Technology¹

13 Jun 1977

TL;DR: The design principles of an automatic system that has the ability to choose the physical design for a data base and to adapt this design to changing requirements are presented.

Abstract: Physical data base design, the selection of organizational structures and access mechanisms for a data base, is one of the most important responsibilities of a Data Base Administrator (DBA). A DBA often has difficulty in performing this task; he lacks the information needed to choose a design that is well matched to the data base's mode of use. This paper presents the design principles of an automatic system that has the ability to choose the physical design for a data base and to adapt this design to changing requirements. The components of such a system include: an information gathering module that collects global statistics on the overall usage pattern of the data base; a predictor that projects observed usage statistics into the future; a design evaluator that computes a figure of merit for any proposed design; and a heuristic proposer that synthesizes a small set of candidate designs for detailed consideration. These principles have been applied to the design of a system that selects secondary indices for an inverted file system.
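The four components named in the abstract can be wired together in a few lines. The sketch below is illustrative only: the abstract does not say how the predictor, evaluator, or proposer work, so exponential smoothing for the forecast, a "forecast usage minus per-index penalty" figure of merit, and a proposer that offers prefixes of the most-used columns are all assumptions, as are the class and function names.

```python
from collections import Counter

class UsageMonitor:
    """Information gathering: counts how often each column is queried per period."""
    def __init__(self):
        self.periods = []

    def record_period(self, queried_columns):
        self.periods.append(Counter(queried_columns))

def predict(periods, alpha=0.5):
    """Predictor: exponentially smoothed usage per column (an assumed method)."""
    forecast = {}
    for counts in periods:
        for col in set(forecast) | set(counts):
            forecast[col] = alpha * counts.get(col, 0) + (1 - alpha) * forecast.get(col, 0.0)
    return forecast

def propose(forecast):
    """Heuristic proposer: candidate designs are prefixes of the columns by forecast usage."""
    ranked = sorted(forecast, key=lambda c: (-forecast[c], c))
    return [ranked[:k] for k in range(len(ranked) + 1)]

def evaluate(design, forecast, index_penalty):
    """Design evaluator: assumed figure of merit, usage served minus upkeep."""
    return sum(forecast.get(c, 0.0) for c in design) - index_penalty * len(design)

# Hypothetical usage over two observation periods.
monitor = UsageMonitor()
monitor.record_period(["name", "name", "city"])
monitor.record_period(["name", "zip", "zip", "zip"])

forecast = predict(monitor.periods)
best = max(propose(forecast), key=lambda d: evaluate(d, forecast, index_penalty=0.6))
```

Rerunning `predict` and the selection after each new period is what would make such a loop adaptive: a column whose usage fades drops out of the chosen set once its smoothed forecast falls below the penalty.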

13 citations

Book Chapter

Associative processing of non-numerical information

R. M. Lea¹

Brunel University London¹

01 Jan 1977

TL;DR: A common misconception is that associative processing is a special mode of computation which can only be achieved at high expense with complex hardware components, but this is not the case.

Abstract: A common misconception is that associative processing is a special mode of computation which can only be achieved at high expense with complex hardware components. Consequently, it is often maintained that associative processing can only be justified in certain dedicated computer applications for which conventional computer hardware is cost-ineffective. In truth, associative processing is a natural form of information processing and its features are independent of the machine on which it is implemented. Moreover, computer systems supporting the storage, retrieval and processing of non-numerical information are inevitably associative processing systems, whether or not this was intended by their designers. To understand this, perhaps controversial, contention it is helpful to reflect on the nature of information itself.

12 citations

Journal Article

An inverted index implementation

Ken J. McDonell¹

Monash University, Clayton campus¹

01 Jan 1977-The Computer Journal

12 citations

Journal Article

Design of geological data systems for developing nations

D. Gill, J. Beylin, S. Boehm, Y. Frenkel, E. Rosenthal (+1 more)

01 Apr 1977-Mathematical Geosciences

TL;DR: For analytical, inventory, and a variety of other basic types of geological data the main functions of an information management system can be accommodated by simple systems in which comprehensiveness is compromised in favor of practicality and ease of implementation.

Abstract: For analytical, inventory, and a variety of other basic types of geological data the main functions of an information management system can adequately be accommodated by simple systems in which comprehensiveness is compromised in favor of practicality and ease of implementation. Albeit possessing some shortcomings, such a strategy is likely to prove profitable particularly to geologists in developing nations who are confronted with the task of self-developing much needed geological data systems in the face of limited electronic data processing resources. Based on the experience of the Geological Survey of Israel, several considerations and practical guidelines for the design and implementation of such systems can be outlined. Data bases should be limited in their scope to specific subjects or projects, be designed to serve existing and only the more realistic foreseeable needs, and include provisions for merger and intelligent communication with related files. Such data bases typically contain logically simple-structured information and are small in size. Revision, deletion, and update transactions are infrequent; the search criteria for retrieval are for the most part predictable and a fast response time is not essential. These attributes prescribe a preference for simple fixed- or semi-free-format sequential files which, in turn, simplify appreciably the programming of the supporting software. Input forms should be meticulously planned with due consideration given to aspects of software compatibility, user convenience and acceptance, and efficiency in data gathering. The use of standard forms should be integrated into the institution's routine to facilitate direct data entry by each contributor, thereby improving and economizing the data collection process, and to secure data capture at its acquisition level (field, laboratory). 
The user's more immediate retrieval needs are adequately satisfied by a master list, documenting the entire data base and a number of external inverted index directories cross-referencing the master list according to the attributes by which the file is most likely to be searched. Further development of output capabilities should be directed to provide for flexible retrieval by multikey query functions and base map posting. For data files storing raw chemical analyses of rocks and water samples, the incorporation of processing capabilities to compute interpretative geochemical parameters as an integral part of the system's output is particularly useful.
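The master list with external inverted index directories, and the multikey query the abstract proposes as a further step, can be sketched briefly. The record fields (`id`, `rock`, `region`) and the function names are hypothetical, chosen only to make the example concrete; the abstract does not describe the Geological Survey of Israel's actual file layout.

```python
from collections import defaultdict

# Hypothetical master list: one record per sample.
master = [
    {"id": "S1", "rock": "basalt", "region": "north"},
    {"id": "S2", "rock": "limestone", "region": "south"},
    {"id": "S3", "rock": "basalt", "region": "south"},
]

def build_directory(records, attribute):
    """Inverted directory: attribute value -> ids of master-list records."""
    directory = defaultdict(list)
    for rec in records:
        directory[rec[attribute]].append(rec["id"])
    return dict(directory)

def multikey(directories, **criteria):
    """Multikey query: ids that match every attribute=value criterion."""
    sets = [set(directories[attr].get(value, [])) for attr, value in criteria.items()]
    return sorted(set.intersection(*sets)) if sets else []

# One directory per attribute the file is likely to be searched by.
directories = {attr: build_directory(master, attr) for attr in ("rock", "region")}
matches = multikey(directories, rock="basalt", region="south")
```

Each directory can be printed as a standalone cross-reference listing, while the multikey function only intersects the id lists that those listings already contain.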

3 citations

Journal Article

Inverted file structure for molecular formula and homologous series searching of large data bases

R. Geoffrey Dromey

01 Nov 1977-Analytical Chemistry

3 citations

Journal Article

Computer search of a free-text data base as a tool for investigating structure-effect relationships

Rudolph J. Marcus¹, Eugene E. Gloye¹, Edwin T. Florance¹

Office of Naval Research¹

01 Jan 1977-Computational Biology and Chemistry

TL;DR: In the course of validating searches in a free-text data base taken from the Merck Index, an inverted index was produced which can be entered by medical use rather than by compound name.

1 citation