Showing papers on "Document retrieval" published in 1978

Open Access

Journal Article•DOI•

An evaluation of feedback in document retrieval using co‐occurrence data

David J. Harper¹, C. J. van Rijsbergen

01 Mar 1978-Journal of Documentation

TL;DR: This paper reports experiments with a term weighting model incorporating relevance information in which it is assumed that index terms are distributed dependently and argues that if high recall searches are required, relevance feedback based on the modified dependence model may be superior to the widely used Boolean search.

Abstract: This paper reports experiments with a term weighting model incorporating relevance information in which it is assumed that index terms are distributed dependently. Initially this model was tested with complete relevance information against a similar model which assumes index terms are distributed independently. The experiments demonstrated conclusively that index terms are not independent for a number of diverse document collections. It was concluded that the use of relevance information together with dependence information could potentially improve retrieval effectiveness. As a result of further experiments the initial strict dependence model was modified and in particular a new relevance‐based term weight was developed. This modified dependence model was then used as the basis for relevance feedback, i.e. with partial relevance information only, and significant increases in retrieval effectiveness were achieved. The evaluation method used in the feedback experiments emphasized the effect of the feedback on documents which the potential user would not previously have seen. Finally the incorporation of relevance feedback in an operational system is considered and in particular it is argued that if high recall searches are required, relevance feedback based on the modified dependence model may be superior to the widely used Boolean search.

170 citations

Journal Article•DOI•

Can retrieval of information from citation indexes be simplified? Multiple mention of a reference as a characteristic of the link between cited and citing article

Gertrud Herlach¹

Novartis¹

01 Nov 1978-Journal of the Association for Information Science and Technology

TL;DR: This article tests and accepts the hypothesis that a mechanistically identifiable citation link characteristic, mention of a given reference more than once within the same research paper, indicates a close and useful relationship of a citing paper to a given cited paper.

Abstract: The hypothesis is tested and accepted that the mechanistically identifiable citation link characteristic, mention of a given reference more than once within the same research paper, indicates a close and useful relationship of a citing to a given cited paper. Closeness and usefulness of the relationship between papers linked by citation were determined by means of users' judgments. It is shown that as a selection criterion for document retrieval, multiple mention of a reference would yield good precision but low recall, since a considerable number of papers with corresponding single mention were judged closely related to the given cited paper. Frequency counts showed that approximately one-third of all bibliographic references in the research papers checked are mentioned in the text more than once.

62 citations

Book Chapter•DOI•

The reference string indexing method

Hans-Jörg Schek¹

IBM¹

10 Oct 1978

TL;DR: Common restrictions such as the usage only of a certain set of descriptors or (complete) keywords in document retrieval systems or the specification of only certain attributed values for queries in formatted files should be removed without losing the performance necessary for interactive usage.

Abstract: The motivation for the reference string indexing method may be derived from the intention to retrieve any piece of information by specifying arbitrary parts of it. Common restrictions such as the usage only of a certain set of descriptors or (complete) keywords in document retrieval systems or the specification of only certain (inverted) attributed values for queries in formatted files should be removed without losing the performance necessary for interactive usage.

26 citations

Journal Article•DOI•

WEIRD: an approach to concept-based information retrieval

Matthew B. Koll¹

Syracuse University¹

01 May 1978

TL;DR: The paper describes the basic characteristics of WEIRD, reports the results of a preliminary evaluation, and considers alternatives for further development of WEIRD.

Abstract: WEIRD is an automatic document retrieval system designed and implemented at Syracuse University, which attempts to advance the art of computerized retrieval from word-matching to judging conceptual similarity. WEIRD uses a vector space model to represent the relations among terms and documents. Items in the space are located according to their “meaning”, which is their proximity to all other items in the data base as measured by co-occurrence frequencies. This is done without manipulating large matrices. The dimensions of the space are not used to define relations; items are defined solely by their position relative to the other items. Retrieval is determined by Euclidean distance from the plotted query. In the first section of the paper the basic characteristics of WEIRD are described. Second, the results of a preliminary evaluation are reported. Alternatives for further development of WEIRD are then considered.
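
The retrieval step described in this abstract, ranking items by Euclidean distance from the plotted query, can be sketched as follows. This is a minimal illustration and not WEIRD's implementation: the coordinates, the document names, and the function `rank_by_distance` are hypothetical, and the sketch does not reproduce how WEIRD derives item positions from co-occurrence frequencies.

```python
import math


def rank_by_distance(query, items):
    """Return item names ordered by Euclidean distance from the query point."""
    return sorted(items, key=lambda name: math.dist(query, items[name]))


# Hypothetical item positions in a 2-D space (illustrative values only).
positions = {
    "doc_a": (0.0, 1.0),
    "doc_b": (3.0, 4.0),
    "doc_c": (1.0, 1.0),
}

# The query is plotted as a point in the same space; nearest items come first.
ranking = rank_by_distance((1.0, 0.5), positions)
print(ranking)
```

Only relative position matters in this sketch: no axis carries a meaning of its own, which mirrors the abstract's remark that items are defined solely by their position relative to other items.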

26 citations

Journal Article•DOI•

Opinion paper: Design criteria for documentation retrieval languages

Friedrich Gebhardt, Imant Stellmacher

01 Jul 1978-Journal of the Association for Information Science and Technology

TL;DR: The proposed solution is to provide the user with a simple, clearly designed subset of the language that nevertheless includes all important query functions, while the additions to modify, shorten, improve, and extend it are left to the experienced user.

Abstract: Query languages for document retrieval systems should be simple and easy to learn for the casual user; they should provide all conceivable facilities for the experienced user. These goals comprise the most serious contradictions that evolve between all the design criteria collected, compared, and evaluated in this paper. The proposed solution or, at least, relief to this conflict is to provide the user with a simple, clearly designed subset of the language that nevertheless includes all important query functions, while the additions to modify, shorten, improve, and extend it are left to the experienced user. It is stressed that the simple data formats available with most systems are insufficient; the need for more elaborate structures is substantiated. A point is made for a formal rather than a natural language for document retrieval.

26 citations

Proceedings Article•

Current research into specialized processors for text information retrieval

Lee A. Hollaar, David C. Roberts

13 Sep 1978

TL;DR: This paper summarizes development efforts to produce backend systems for the efficient searching and retrieval of full text databases, and presents the characteristics of text retrieval.

Abstract: While there have been a number of projects involved with the design and construction of specialized processors to aid in the efficient operation of large structured database systems, such as RAP or CASSM, very little work has been done on comparable hardware for text information retrieval. This paper summarizes development efforts being carried out to produce backend systems for the efficient searching and retrieval of full text databases. The characteristics of text retrieval, and its special problems when compared to other database systems, are presented. Two representative applications are discussed, one the retrieval of relevant items from a database being updated online from messages originating from a large number of sources, and the second a legal reference system consisting of all court decisions. Processors to scan large amounts of data at speeds comparable to the transfer rate of the disks on which it is stored are presented, along with a network of simple processors to allow rapid merging of directory information for inverted file systems.

12 citations

Automatic classification in information retrieval

van Rijsbergen

01 Apr 1978

9 citations

Journal Article•DOI•

Does relevance feedback improve document retrieval performance

Robert E. Williamson

01 May 1978

TL;DR: Relevance feedback techniques as implemented in Salton's SMART DRS appear to show that it is worthwhile for users to read abstracts prior to evaluation of full texts, and three reasonable, easily understood retrieval procedures are presented.

Abstract: Many authors (1, 2, 3, 5, 6, 7) have suggested that overall performance of a document retrieval system is improved by relevance feedback. Relevance feedback denotes the last three steps in the following process: 1) the searcher enters a query, 2) the system prepares a ranked list of suggested documents, 3) the searcher judges some of the documents for relevancy, 4) the searcher informs the system of these documents judged and of the judgement, 5) the system constructs a new query based on the descriptors used in the original query and the descriptors used in the documents judged, 6) the system prepares a second ranked list of suggested documents. The presumption is that the second list is better than the first. By all performance measures (e.g. “fluid ranking” and “frozen ranking”), the second list is better than the first. However, if one reranks documents in the original list so as to reflect the searcher's efforts (step 3), the corresponding performance measures are comparable to those for the second list. The marginal difference between the performance measures for the “reranked original” list (searcher's efforts alone) and the second list (which includes computer efforts) makes it unclear if the cost of steps 4 through 6 above can be justified. It is hoped that advocates of relevance feedback will present “reranked original” performance measures as a basis for any performance improvement claims. This paper also presents three reasonable, easily understood retrieval procedures for which the frozen ranking, the fluid ranking, and the reranked original evaluations are “obviously” the pertinent way to evaluate. Relevance feedback techniques as implemented in Salton's SMART DRS appear to show that it is worthwhile for users to read abstracts prior to evaluation of full texts.
The last indication presented in this paper is that the relevance feedback performance improvements noted using SMART are due mostly to the user making assessments; subsequent computer efforts appear to be most likely to result in no further change. For a query for which there is a subsequent change, the change is as likely to be harmful as helpful.
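
The six-step process and the “reranked original” baseline described in this abstract can be sketched roughly as below. This is an illustration under stated assumptions, not the SMART procedure or the paper's evaluation code: the inner-product scoring, the add/subtract rule used for step 5, the interpretation of “reranked original” (judged-relevant documents first, judged non-relevant last), and all document data are hypothetical.

```python
from collections import Counter


def score(query, doc):
    """Inner product of two descriptor-weight mappings (assumed scoring rule)."""
    return sum(weight * doc.get(term, 0) for term, weight in query.items())


def rank(query, docs):
    """Steps 2 and 6: order document ids by descending score, ties broken by id."""
    return sorted(docs, key=lambda d: (-score(query, docs[d]), d))


def feedback_query(query, docs, judgements):
    """Step 5 (assumed rule): add descriptors of documents judged relevant,
    subtract descriptors of documents judged non-relevant."""
    new_query = Counter(query)
    for doc_id, relevant in judgements.items():
        for term, weight in docs[doc_id].items():
            new_query[term] += weight if relevant else -weight
    return dict(new_query)


def rerank_original(ranking, judgements):
    """'Reranked original' (assumed reading): reuse the first list, moving
    judged-relevant documents to the top and judged non-relevant to the bottom."""
    group = {True: 0, None: 1, False: 2}
    # sorted() is stable, so the original order is kept inside each group.
    return sorted(ranking, key=lambda d: group[judgements.get(d)])


# Hypothetical collection: document id -> descriptor weights.
docs = {
    "d1": {"retrieval": 1, "boolean": 1},
    "d2": {"retrieval": 1, "feedback": 1},
    "d3": {"feedback": 1, "ranking": 1},
    "d4": {"boolean": 1},
}
query = {"retrieval": 1}                               # step 1
first = rank(query, docs)                              # step 2
judgements = {"d1": False, "d2": True}                 # steps 3 and 4
second = rank(feedback_query(query, docs, judgements), docs)  # steps 5 and 6
reranked = rerank_original(first, judgements)          # searcher's effort alone
print(first, second, reranked)
```

Comparing `second` with `reranked` rather than with `first` isolates what the system's query reconstruction adds beyond the searcher's own judgements, which is the comparison the abstract asks advocates of relevance feedback to report.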

8 citations

Journal Article•DOI•

Augmented Transition Networks as a design tool for personalized database systems

Alan L. Tharp

01 May 1978

TL;DR: This paper illustrates the use of Augmented Transition Networks as a design tool for constructing document retrieval systems for those personalized applications which are too small or specialized to attract a commercial vendor.

Abstract: This paper illustrates the use of Augmented Transition Networks (ATNs) as a design tool for constructing document retrieval systems for those personalized applications which are too small or specialized to attract a commercial vendor. ATNs, which are explained in the context of this application, are used not only to improve the human/computer interface with the retrieval system but also to conceptually organize its structure.

6 citations

Book•

Systems analysis for information retrieval

Helen M. Townley

01 Jan 1978

3 citations

Other•DOI•

Strong-Motion Information Retrieval System user's manual

April Converse

01 Jan 1978

Journal Article•

A Document-Retrieval Method Using Dependency-Relations of Titles

Shinobu Takamatsu, Fujio Nishida

31 Oct 1978-Bulletin of the University of Osaka Prefecture, Series A Engineering and Natural Sciences

Automatic query adjustment in document retrieval

Carlo Vernimb, Yoshio Sasaki

01 May 1978

Proceedings Article•DOI•

Use of an infinite-valued propositional calculus in a document retrieval system

John R. Miller

01 Jan 1978

TL;DR: It is proposed that a particular infinite-valued propositional calculus be used to define a retrieval function for a document retrieval system in which queries are well-formed formulas of the propositional calculus.

Abstract: It is proposed that a particular infinite-valued propositional calculus be used to define a retrieval function for a document retrieval system in which queries are well-formed formulas of the propositional calculus. With this approach, “weighted” queries may be processed in a simple, straightforward manner; queries may be represented as highly sparse vectors, and this representation may be transformed easily into the more conventional truth-table representation, and conversely; feedback techniques may be used; and a means is provided for circumventing the “independence assumption” with regard to subject identifiers.
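
A retrieval function of the kind described in this abstract can be sketched by evaluating a query formula over truth values in [0, 1]. The abstract does not name the calculus, so the sketch assumes Łukasiewicz-style connectives; that choice, the tuple encoding of formulas, the function `evaluate`, and the document values are all hypothetical and are not taken from the paper.

```python
def evaluate(formula, doc):
    """Evaluate a query formula against a document's term truth values in [0, 1].

    A formula is either a term name (str) or a tuple such as
    ("not", f), ("and", f, g), ("or", f, g), or ("implies", f, g).
    Connectives follow Lukasiewicz infinite-valued logic (an assumption).
    """
    if isinstance(formula, str):
        return doc.get(formula, 0.0)  # absent terms are treated as fully false
    op, *args = formula
    values = [evaluate(arg, doc) for arg in args]
    if op == "not":
        return 1.0 - values[0]
    if op == "and":
        return min(values)
    if op == "or":
        return max(values)
    if op == "implies":
        return min(1.0, 1.0 - values[0] + values[1])
    raise ValueError(f"unknown connective: {op}")


# Hypothetical document: degree to which each subject identifier applies.
doc = {"retrieval": 0.75, "feedback": 0.25, "boolean": 0.5}

# Query: retrieval AND (feedback OR NOT boolean)
query = ("and", "retrieval", ("or", "feedback", ("not", "boolean")))
retrieval_value = evaluate(query, doc)
print(retrieval_value)
```

Because every formula evaluates to a degree rather than to true or false, documents can be ranked by the returned value instead of being split into matching and non-matching sets.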

Journal Article•DOI•

Record block allocation for retrieval on secondary keys

Chung-Shu Yang

01 May 1978

TL;DR: This paper studies the problem of record address allocation in disk-like devices so as to facilitate the fast retrieval of a set of records which are jointly accessed by a query.

Abstract: Query retrieval based on secondary keys is an important operation in retrieval systems. Such a query generally retrieves more than one data record which satisfies the query criterion. This paper studies the problem of record address allocation in disk-like devices so as to facilitate the fast retrieval of a set of records which are jointly accessed by a query. A heuristic scheme, using the proposed minimal access retrieval property, is designed to assign records to blocks. Some experimental results are also presented.
