Showing papers presented at "International ACM SIGIR Conference on Research and Development in Information Retrieval in 1979"


Journal ArticleDOI
01 Sep 1979
TL;DR: Experimental results show that it is possible to devise clustering strategies based on the principles of adaptation in natural systems that are both effective and efficient.
Abstract: Given a set of objects each of which is represented by a finite number of attributes or features and a clustering criterion that associates a value of utility to any classification, the objective of a clustering method is to identify that classification of the objects which optimizes the criterion. A new strategy to solve this problem is developed. The approach is, in essence, a modification of the reproductive plan, a type of adaptive procedure devised by Holland [2], which embodies many principles found in the adaptation of natural systems through evolution. The proposed approach differs from conventional methods in the sense that the search through the space of possible solutions proceeds in a parallel fashion. The adaptive clustering strategy requires the specification of methods for the generation of an initial population of classifications, the parent selection, the modifications and the replacement of current classifications with new ones. The effects of changing several of these features are investigated. Experimental results show that it is possible to devise clustering strategies based on the principles of adaptation in natural systems that are both effective and efficient.

77 citations


Journal ArticleDOI
S. H. Jamieson1
01 Sep 1979
TL;DR: A strategy is described for implementing high-powered retrieval techniques: an intelligent terminal is used in conjunction with an existing commercial I. R. system, and the added processing capability permits the implementation of more sophisticated retrieval techniques very cheaply.
Abstract: The results and achievements of research in information retrieval have had little influence on the types of retrieval mechanism implemented in the large commercial on-line retrieval systems. Commercial systems still use simple Boolean techniques while experiments have shown that other techniques, such as those making use of relevance information, perform better. Reasons for this are suggested. A strategy is described for the implementation of high-powered techniques which overcomes this problem - an intelligent terminal is used in conjunction with an existing commercial I. R. system. The added processing capability permits the implementation of more sophisticated retrieval techniques very cheaply. Retrieval methods that can be implemented on such a terminal used in this way are described. Particular emphasis is placed upon the practical implementation of a term weighting scheme based on relevance feedback information which generates a ranked list of documents in answer to a query.

12 citations


Journal ArticleDOI
W. D. Dominick, W. D. Penniman1
01 Sep 1979
TL;DR: This paper is based on, and extracted in part from, a much more expanded and detailed manuscript entitled "Monitoring and Evaluation of On-Line Information System Usage", accepted for publication in Information Processing and Management.
Abstract: This paper is based on, and extracted in part from, a much more expanded and detailed manuscript entitled "Monitoring and Evaluation of On-Line Information System Usage", accepted for publication in Information Processing and Management.

6 citations


Journal ArticleDOI
01 Jul 1979
TL;DR: The author's objectives are accomplished, the most outstanding of which is an "illustrative...critical, and quantitative" presentation of representative samples of the network, hierarchic, inverted tree, and relational approaches to generalized data base management system (GDBMS).
Abstract: The author has been unquestionably successful in accomplishing his objectives, the most outstanding of which is an "illustrative...critical, and quantitative" presentation of representative samples (chapters 5 through 9) of the network, hierarchic, inverted tree, and relational approaches to generalized data base management system (GDBMS). The common example carried through these chapters was a well chosen and effective vehicle for showing the commonalities, differences, strengths, and weaknesses of the approaches. The fundamentals of file organization and data base concepts (chapters 2 and 3), built upon some of the author's previously published work, precede a short economic and architectural tabulation of marketed systems (chapter 4). The presentation of the last generalized data base management approach, IBM's experimental System R, is followed by a one-chapter treatment of supporting concepts, functional dependencies, and normalization.

5 citations


Journal ArticleDOI
01 Sep 1979
TL;DR: An update procedure that draws as much as possible on intermediate results from previous updates and a user interface that provides for control over the process of retrieval without calling for knowledge of how that process works are proposed.
Abstract: METER is a text analysis and retrieval system for non-expert computer users to exploit statistical associations between index terms of documents. It will run on a DEC PDP-11/45 minicomputer with continually changing collections of up to 20,000 documents at a time. A scaled version of METER with all major features of the full system has been implemented on a DEC PDP-11/70 as an experimental test bed for evaluation and comparison of associative retrieval algorithms. Although the basic structure of METER is similar to earlier statistical systems for retrospective document searches, the severe requirements of frequent updates of a document collection, of running on a small processor, and of meeting needs of users with little technical training have led to some novel developments. Among these are an update procedure that draws as much as possible on intermediate results from previous updates and a user interface that provides for control over the process of retrieval without calling for knowledge of how that process works.

4 citations


Journal ArticleDOI
01 Sep 1979
TL;DR: Document retrieval system models are presented and measures to rank the closeness of documents to a query are given.
Abstract: Document retrieval system models are presented. Measures to rank the closeness of documents to a query are given. Algorithms to calculate the measures for graph and partition models are provided.

2 citations


Journal ArticleDOI
01 Sep 1979
TL;DR: The main conclusion is that models which concentrate on improving the effectiveness of the search process are not rendered redundant by the availability of new hardware; however, the efficiency of their implementation would be improved.
Abstract: Recently several models of the search process in a document retrieval system have been proposed and retrieval experiments have shown that they will improve system performance. These include models which use relevance judgements to rank documents in order of probability of relevance and models of retrieval from clusters of documents. In this paper various models are compared in terms of the ease with which they could be implemented. An important consideration is how this implementation would be affected by the introduction of new hardware such as content-addressable memories. The main conclusion is that models which concentrate on improving the effectiveness of the search process are not rendered redundant by the availability of new hardware. However, the efficiency of their implementation would be improved.

2 citations


Journal ArticleDOI
01 Sep 1979
TL;DR: The specific technologies; the specific major office functions; how they interrelate; and how they are making such radical productivity increases possible are discussed.
Abstract: It is well known that American productivity has advanced very little in the last ten years in contrast to many other countries' rapidly rising productivity. It is becoming evident that major productivity gains can be made, particularly in the office workforce, which constitutes the majority of the American workforce today. Rapidly developing information technologies are making it possible to achieve radically increased productivity in the office. This paper will discuss the specific technologies; the specific major office functions; how they interrelate; and how they are making such radical productivity increases possible. "The Paperless Office", created by Micronet, Inc. in Washington, D.C., will be described. The current project to automate the office activities of the American Productivity Center in Houston, Texas, will be described. Some projections of the significance of these projects will be given.

2 citations


Journal ArticleDOI
08 Jan 1979
TL;DR: It is argued that, although modern on-line systems have more than achieved the technological aims of the original workers in the field, there is still a need for research in automatic information retrieval.
Abstract: Automatic information retrieval, that is document retrieval, was an early concern in computing. It might, however, be thought that since modern on-line systems have more than achieved the technological aims of the original workers in the field, there is no further need for research. I shall argue that this is not the case.

2 citations


Journal ArticleDOI
08 Jan 1979
TL;DR: Professor Maurer's book, translated into English by C. C. Price from the original Datenstrukturen und Programmierverfahren, covers a model for the manipulation of data structures, lists, trees, and complex data structures; the translator had each segment of a program tested on a computer using a subset of PL/I.
Abstract: Professor Maurer's book has been translated into English by C. C. Price from the original Datenstrukturen und Programmierverfahren. The text is based on the author's revised and expanded lecture notes. The four major topics are: A Model for the Manipulation of Data Structures, Lists, Trees, and Complex Data Structures. It concludes with an Index of Symbols, a Bibliography of 67 references, of which 79 percent are in English, and a Subject Index. In addition, the translator had each segment of a program tested on a computer using a subset of PL/I.

1 citation



Journal ArticleDOI
01 Sep 1979
TL;DR: The Norris Cotton Cancer Center (NCCC) On-Line Personal Bibliographic Retrieval System was developed to assist researchers at the Center in managing personal or project-related collections of reference materials.
Abstract: The Norris Cotton Cancer Center (NCCC) On-Line Personal Bibliographic Retrieval System was developed to assist researchers at the Center in managing personal or project-related collections of reference materials. The system supports on-line entry, storage, and retrieval of bibliographic citations for collections of books, journal articles, reprints, reports, manuscripts, other documents, and audio-visual materials. The NCCC system is intended for relatively small collections of materials (under 10,000 items) and is not intended to duplicate or compete in any way with MEDLARS, CANCERLINE, or any of the other on-line bibliographic retrieval services. It is similar in concept to Mitre's SHOEBOX (1), but is much less complex and requires far fewer computer resources. The major advantage of the NCCC system is that it is simple, small, and easy to use. None of the techniques used in developing the NCCC system is particularly new or innovative. Rather, several well-known approaches to bibliographic retrieval were combined to produce a system that was easy and inexpensive to develop and is easy and inexpensive to use. The system is now running on the Dartmouth College Computing system's equivalent of a Honeywell 66/DPS-3. The programs are all written in BASIC.

Journal ArticleDOI
01 Sep 1979
TL;DR: This paper empirically compares placing symbolic information into both trees and lattices (a lattice may be thought of as a unidirectional network with single initiating and terminating nodes).
Abstract: Unidirectional trees and lattices may both be used to hold a symbolic data base consisting of lexes, lexemes or other symbol strings. This paper empirically compares placing symbolic information into both trees and lattices (a lattice may be thought of as a unidirectional network with single initiating and terminating nodes).

Journal ArticleDOI
Robert T. Dattola1
01 Sep 1979
TL;DR: It is shown that regular discrimination values are too costly to compute after every update to the data base and dynamic discrimination values that are easy to update are defined for use as approximations to regular values.
Abstract: The use of discrimination values as a term weighting function in document retrieval systems is examined. It is shown that regular discrimination values are too costly to compute after every update to the data base. Dynamic discrimination values that are easy to update are defined for use as approximations to regular values. Experiments are performed comparing regular vs. dynamic discrimination values. Actual user queries from an operational data base are used to evaluate dynamic discrimination values in a production environment. Generalized forms of normalized recall and precision are used as evaluation measures. Retrieval results indicate statistically significant improvements using dynamic discrimination weighting.

Journal ArticleDOI
01 Sep 1979
TL;DR: Computing and Information Retrieval professionals have the opportunity to apply information retrieval techniques within the second computer revolution to foster a new potential revolution in education, brought about by the advent of the personal computer.
Abstract: Information retrieval is conceptually fundamental in human communication as well as in man-computer communication. Computing and Information Retrieval professionals have the opportunity to apply information retrieval techniques within the second computer revolution to foster a new potential revolution in education, brought about by the advent of the personal computer.