Showing papers on "Document retrieval" published in 1985

Journal Article•DOI•

An evaluation of retrieval effectiveness for a full-text document-retrieval system

David C. Blair¹, M. E. Maron²•Institutions (2)

University of Michigan¹, University of California, Berkeley²

01 Mar 1985-Communications of The ACM

TL;DR: An evaluation of a large, operational full-text document-retrieval system shows the system to be retrieving less than 20 percent of the documents relevant to a particular search.

Abstract: An evaluation of a large, operational full-text document-retrieval system (containing roughly 350,000 pages of text) shows the system to be retrieving less than 20 percent of the documents relevant to a particular search. The findings are discussed in terms of the theory and practice of full-text document retrieval.

871 citations
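
The headline figure in this entry is a recall measurement: the share of all relevant documents that the system actually returned. A minimal sketch of that arithmetic follows. The document identifiers and counts are invented for illustration and are not taken from the study.

```python
def recall(retrieved: set[str], relevant: set[str]) -> float:
    """Fraction of the relevant documents that were retrieved."""
    if not relevant:
        raise ValueError("recall is undefined when no documents are relevant")
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved: set[str], relevant: set[str]) -> float:
    """Fraction of the retrieved documents that are relevant."""
    if not retrieved:
        raise ValueError("precision is undefined when nothing is retrieved")
    return len(retrieved & relevant) / len(retrieved)


# Hypothetical search: 10 relevant documents exist, 4 documents come back,
# and only 2 of those 4 are relevant.
relevant_docs = {f"d{i}" for i in range(10)}
retrieved_docs = {"d0", "d1", "x0", "x1"}

print(recall(retrieved_docs, relevant_docs))
print(precision(retrieved_docs, relevant_docs))
```

A searcher sees only the retrieved set, so precision is easy to judge from the output, while recall requires knowing how many relevant documents were never returned.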

Journal Article•DOI•

Access methods for text

Chris Faloutsos¹•Institutions (1)

University of Toronto¹

01 Mar 1985-ACM Computing Surveys

TL;DR: This paper compares text retrieval methods intended for office systems with methods from database systems and from information retrieval systems, and examines the most interesting representatives of each class.

Abstract: This paper compares text retrieval methods intended for office systems. The operational requirements of the office environment are discussed, and retrieval methods from database systems and from information retrieval systems are examined. We classify these methods and examine the most interesting representatives of each class. Attempts to speed up retrieval with special purpose hardware are also presented, and issues such as approximate string matching and compression are discussed. A qualitative comparison of the examined methods is presented. The signature file method is discussed in more detail.

375 citations
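
The signature file method mentioned in this abstract can be illustrated with superimposed coding: each word hashes to a few bit positions, and a block's signature is the bitwise OR of its word signatures. The sketch below is only an illustration of that general idea. The signature width, the number of bits per word, and the use of SHA-256 are assumptions and are not parameters from the paper.

```python
import hashlib

SIGNATURE_BITS = 64  # assumed signature width
BITS_PER_WORD = 3    # assumed number of hash positions per word


def word_signature(word: str) -> int:
    """Hash a word to a bit pattern with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIGNATURE_BITS)
    return sig


def block_signature(words: list[str]) -> int:
    """Superimpose (bitwise OR) the word signatures of one text block."""
    sig = 0
    for word in words:
        sig |= word_signature(word)
    return sig


def may_contain(block_sig: int, query_word: str) -> bool:
    """True if every bit of the query word is set in the block signature.

    False means the word is certainly absent. True can be a false drop,
    so the block's text still has to be checked.
    """
    query_sig = word_signature(query_word)
    return (block_sig & query_sig) == query_sig


sig = block_signature(["signature", "file", "method"])
print(may_contain(sig, "file"))
```

The filter never misses a word that is present, which is why a signature scan can safely discard non-matching blocks before the slower text comparison.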

Journal Article•DOI•

RUBRIC: A System for Rule-Based Information Retrieval

B.P. McCune, R.M. Tong, Jeffrey S. Dean, Daniel G. Shapiro

01 Sep 1985-IEEE Transactions on Software Engineering

TL;DR: Initial experiments indicate that a RUBRIC rule set better matches human retrieval judgment than a standard Boolean keyword expression, given equal amounts of effort in defining each.

Abstract: A research prototype software system for conceptual information retrieval has been developed. The goal of the system, called RUBRIC, is to provide more automated and relevant access to unformatted textual databases. The approach is to use production rules from artificial intelligence to define a hierarchy of retrieval subtopics, with fuzzy context expressions and specific word phrases at the bottom. RUBRIC allows the definition of detailed queries starting at a conceptual level, partial matching of a query and a document, selection of only the highest ranked documents for presentation to the user, and detailed explanation of how and why a particular document was selected. Initial experiments indicate that a RUBRIC rule set better matches human retrieval judgment than a standard Boolean keyword expression, given equal amounts of effort in defining each. The techniques presented may be useful in stand-alone retrieval systems, front-ends to existing information retrieval systems, or real-time document filtering and routing.

149 citations

Journal Article•DOI•

Moves in online searching

Raya Fidel¹•Institutions (1)

University of Washington¹

01 Jan 1985-Online Information Review

TL;DR: Observation and analysis of about ninety searches resulted in a list of eighteen operational moves, or modifications of query formulation, that keep the meaning of query components unchanged, and twelve conceptual moves which change the meaning of query components.

Abstract: Moves, or changes in query formulation, are made to resolve three problem situations: (1) when retrieved sets are too large; (2) when they are too small; or (3) when retrieved sets are off‐target. Observation and analysis of about ninety searches resulted in a list of eighteen operational moves, or modifications of query formulation, that keep the meaning of query components unchanged, and twelve conceptual moves which change the meaning of query components. All these moves are explained and then related to search tactics and strategies.

141 citations

The Effectiveness and Efficiency of Agglomerative Hierarchic Clustering in Document Retrieval

Ellen Marie Voorhees

01 Oct 1985

TL;DR: The main goal of this thesis is to compare clustered file searches and inverted file searches in order to determine under what circumstances one search is to be preferred over the other.

Abstract: The major component of a document retrieval system is the component that searches the document collection and selects the documents to be returned in response to a query. Since users wait for the results of the search, the component must be efficient as well as effective. The main goal of this thesis is to compare clustered file searches and inverted file searches in order to determine under what circumstances one search is to be preferred over the other. A preliminary goal is to define a good cluster search. Three types of agglomerative clustering strategies, the single link, the complete link, and the group average link methods, are investigated. Searches of the single link hierarchy, the cluster hierarchy used extensively in previous research, are shown to be inferior to searches of the other hierarchy types. Searches of the group average link and complete link hierarchies perform similarly for small collections; for larger collections, searches of the complete link hierarchy are more effective. A top-down search of the group average link hierarchy is the most time efficient search asymptotically. The experimental evidence suggests that the difference in the efficiency and effectiveness of the complete link and group average link searches is due to the restricted depth of the complete link hierarchy. The depth of the group average link hierarchy increases as the size of the collection increases, but the depth of the complete link hierarchy does not. Thus the largest clusters in the complete link hierarchy are not very large, and the clusters can be accurately represented by centroids. Since the depth of the hierarchy does not increase with collection size, searches of the complete link hierarchy should remain effective for larger collections. The top-down search of the complete link hierarchy is somewhat more effective than the inverted file search. The relative efficiency of the two searches depends on the relative efficiency of accessing a page and computing a similarity, since the cluster search accesses many more pages but computes fewer similarities than the inverted file search. For an inexpensive similarity measure, the inverted file search is much more efficient.

131 citations
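
The three agglomerative strategies named in this abstract differ only in how they score the similarity between two clusters. The sketch below shows one common way to define each criterion. The Jaccard term-overlap measure and the tiny clusters are assumptions for illustration, and the thesis may use a different similarity measure or a different formulation of group average.

```python
from itertools import product
from statistics import mean


def jaccard(x: frozenset, y: frozenset) -> float:
    """Term-overlap similarity between two documents (assumed measure)."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def pair_sims(a, b, sim=jaccard):
    """Similarity of every cross-cluster document pair."""
    return [sim(x, y) for x, y in product(a, b)]


def single_link(a, b, sim=jaccard):
    """Cluster similarity = the most similar cross-cluster pair."""
    return max(pair_sims(a, b, sim))


def complete_link(a, b, sim=jaccard):
    """Cluster similarity = the least similar cross-cluster pair."""
    return min(pair_sims(a, b, sim))


def group_average(a, b, sim=jaccard):
    """Cluster similarity = the mean over all cross-cluster pairs."""
    return mean(pair_sims(a, b, sim))


# Hypothetical clusters of documents, each document a set of index terms.
cluster_a = [frozenset({"text", "retrieval"}), frozenset({"text", "search"})]
cluster_b = [frozenset({"text", "retrieval"}), frozenset({"clustering"})]

print(single_link(cluster_a, cluster_b))
print(complete_link(cluster_a, cluster_b))
print(group_average(cluster_a, cluster_b))
```

Single link merges on one close pair and so tends to form long chains, while complete link requires every pair to be similar and so tends to keep clusters small and tight.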

Book•

Interaction in information systems: A review of research from document retrieval to knowledge-based systems

Nicholas J. Belkin, Alina Vickery

01 Jan 1985

96 citations

Journal Article•DOI•

Full text database retrieval performance

Carol Tenopir¹•Institutions (1)

University of Hawaii at Manoa¹

01 Feb 1985-Online Information Review

TL;DR: Minor adjustments have been made for the display of full text databases, allowing words resulting in retrieval to be displayed in context; but changes have not been made in retrieval techniques.

Abstract: Complete texts of many journals are now available for online searching. Most of these full text databases have been made available on the same or similar search systems that provide access to bibliographic information. The systems use inverted files that retain limited context information (e.g., paragraphs and location of words within paragraphs). The retrieval techniques used are simply those that were developed earlier for bibliographic databases. Retrieval relies on Boolean logic, word stem searching with truncation, and word proximity specification. Minor adjustments have been made for the display of full text databases, allowing words resulting in retrieval to be displayed in context; but changes have not been made in retrieval techniques. This is due to the reliance on search systems that provide access to many types of databases, all of which are by‐products of improved techniques for creating printed publications.

67 citations
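
The retrieval techniques listed in this abstract (Boolean logic, truncation, and word proximity over an inverted file that keeps word positions) can be sketched in a few lines. This is an illustrative toy, not the design of any system the paper discusses. The function names, the `*` truncation symbol, and the sample texts are all assumptions.

```python
from collections import defaultdict


def build_index(docs: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Inverted file: map each word to {doc_id: [word positions]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def docs_with(index, term: str) -> set[str]:
    """Documents containing `term`; a trailing '*' means right truncation."""
    if term.endswith("*"):
        stem = term[:-1]
        return {d for w, postings in index.items() if w.startswith(stem) for d in postings}
    return set(index.get(term, {}))


def within(index, first: str, second: str, distance: int) -> set[str]:
    """Documents where the two words occur at most `distance` words apart."""
    common = set(index.get(first, {})) & set(index.get(second, {}))
    return {
        d for d in common
        if any(abs(p - q) <= distance for p in index[first][d] for q in index[second][d])
    }


# Hypothetical three-document collection.
sample_docs = {
    "d1": "full text retrieval of journal articles",
    "d2": "retrieving bibliographic records online",
    "d3": "text of the full journal",
}
index = build_index(sample_docs)

print(docs_with(index, "full") & docs_with(index, "text"))  # Boolean AND
print(docs_with(index, "retriev*"))                         # truncation
print(within(index, "full", "text", 1))                     # proximity
```

Boolean operators map directly onto set operations (`&` for AND, `|` for OR, `-` for NOT), while proximity is the only query type here that needs the stored word positions.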

Journal Article•DOI•

Principles, procedures and rules in an expert system for information retrieval

Peretz Shoval¹•Institutions (1)

Ben-Gurion University of the Negev¹

01 Dec 1985-Information Processing and Management

TL;DR: This article presents the principle, procedures and rules which are utilized in the expert system, which assists users in selecting the right vocabulary terms for a database search.

Abstract: An expert system was developed in the area of information retrieval, with the objective of performing the job of an information specialist, who assists users in selecting the right vocabulary terms for a database search. The system is composed of two components: One is the knowledge base, represented as a semantic network, in which the nodes are words, concepts, phrases, comprising a vocabulary of the application area and the links express semantic relationships between those nodes. The second component is the rules, or procedures, which operate upon the knowledge-base, analogous to the decision rules or work patterns of the information specialist. Two major stages comprise the consulting process of the system: During the “search” stage relevant knowledge in the semantic network is activated, and search and evaluation rules are applied in order to find appropriate vocabulary terms to represent the user's problem. During the “suggest” stage those terms are further evaluated, dynamically rank-ordered according to relevancy, and suggested to the user. Explanations to the findings can be provided by the system and backtracking is possible in order to find alternatives in case some suggested term is rejected by the user. This article presents the principle, procedures and rules which are utilized in the expert system.

62 citations

Proceedings Article•DOI•

RUBRIC: an environment for full text information retrieval

Richard Tong, Victor N. Askman, James F. Cunningham, Carl J. Tollander

05 Jun 1985

TL;DR: The prototype system, called RUBRIC, is designed to help IR professionals gain easy access to large unformatted full text databases and can give significant improvements over commercially available Boolean keyword systems such as DIALOG, LEXIS, and MEDLARS.

Abstract: This paper describes an ongoing investigation into the application of ideas from Artificial Intelligence (AI) in the development of a computer-based aid for Information Retrieval (IR). The prototype system, called RUBRIC, is designed to help IR professionals gain easy access to large unformatted full text databases. Knowledge about retrieval requests is encoded in RUBRIC as a collection of rules with attached uncertainty values. This representation provides an appropriately expressive query language that can represent partial relevance and which is easily understood and modified. When coupled with an effective user interface, the rule-based approach can, we believe, give significant improvements over commercially available Boolean keyword systems such as DIALOG, LEXIS, and MEDLARS. At the same time, it avoids the theoretical and computational problems associated with full scale natural language processing of documents (e.g., as proposed by Lebowitz [1]), and the difficulties users have in understanding the mechanisms used in statistical approaches (e.g., Salton's SMART system

35 citations

Proceedings Article•

The type concept in office document retrieval

Federico Barbic, Fausto Rabitti

21 Aug 1985

TL;DR: A modeling approach based on the type definition and the use of types in query formulation and processing is presented, allowing the definition of types at different levels of detail (type hierarchies).

Abstract: The problem of the retrieval by content of office documents is addressed here. However, the retrieval by content is greatly enhanced if the semantic role of document objects can be described. For this reason we introduce a conceptual level of modeling resulting in the definition of conceptual structures of documents. Type definition is essential for the retrieval, but since office document structures tend to greatly differ from instance to instance, we introduce the concept of weak type, allowing the definition of types at different levels of detail (type hierarchies). In this paper a modeling approach based on these ideas is presented. Particular emphasis is put on the type definition and the use of types in query formulation and processing.

33 citations

Journal Article•DOI•

Full-text medical literature retrieval by computer. A pilot test.

Morris F. Collen, Charles D. Flagle

15 Nov 1985-JAMA

TL;DR: A pilot test of a full-text, medical literature retrieval service demonstrated its capabilities for on-line search and retrieval of references, abstracts, and/or full-text journal articles for medical practice, medical education, and research.

Abstract: A pilot test of a full-text, medical literature retrieval service demonstrated its capabilities for on-line search and retrieval of references, abstracts, and/or full-text journal articles. During a three-month test period, more than 500 health care professionals conducted 9,377 searches using computer terminals located in seven different health care sites. Searches were initiated for purposes of patient care, medical education, research, or for browsing. The majority of responders to a questionnaire given during the test period said they would continue to use the service during the pilot test, and only about 1% reported the search process difficult to use or not "user-friendly". It is predictable that with a comprehensive data base, full-text medical literature retrieval can be very useful for medical practice, medical education, and research (JAMA 1985;254:2768-2774).

Journal Article•DOI•

A comparison of a network structure and a database system used for document retrieval

W. B. Croft¹, Thomas J. Parenty¹•Institutions (1)

University of Massachusetts Amherst¹

01 Dec 1985-Information Systems

TL;DR: The comparison shows that certain features of a database system can have a significant effect on the efficiency of the implementation, and it appears that a database implementation of a sophisticated document retrieval system can be competitive with a stand-alone implementation.

Journal Article•DOI•

A new approach to information retrieval systems using fuzzy expressions

Rembrand B. R. C. Zenner¹, Rita De Caluwe¹, Etienne Kerre¹•Institutions (1)

Ghent University¹

01 Sep 1985-Fuzzy Sets and Systems

TL;DR: The method described solves the problems that Radecki, who uses lambda-level fuzzy sets, met with while trying to reduce time, and gives rise to a good working Information Retrieval System, as the examples show.

Journal Article•DOI•

Performance of a multi-key access method based on descriptors and superimposed coding techniques

Ron Sacks-Davis¹•Institutions (1)

RMIT University¹

01 Dec 1985-Information Systems

TL;DR: It is shown that the method performs well on queries and is efficient in storage, and experimental results based on the use of this method are presented.

Proceedings Article•DOI•

Composite document extended retrieval: an overview

Edward A. Fox¹•Institutions (1)

Virginia Tech¹

05 Jun 1985

TL;DR: It is of interest to consider how Boolean logic systems can be extended to give better performance, especially with composite documents, and to integrate those approaches with vector methods.

Abstract: Experimental information retrieval (IR) systems, some dating back to the sixties, have demonstrated the viability of fully automatic document storage and retrieval methodologies with small to medium size bibliographic collections [72]. Many of these experimental systems utilize the vector space model in which each important term (such as a word stem) identifies a different dimension in a space, so that matrix methods and vector operations can be defined on queries and documents. Statistical techniques have been very effective, and probabilistic enhancements have given additional improvements [84]. However, the basic vector space model is oriented towards recording the essential information in the text of a title/abstract combination rather than describing more complex document structures. It is necessary to extend the model in order to handle composite documents. On the other hand, commonly available retrieval systems that employ Boolean logic queries and utilize inverted file storage schemes can without modification accommodate such documents, albeit with somewhat less effectiveness than is possible with more sophisticated systems. Hence, it is also of interest to consider how Boolean logic systems can be extended to give better performance, especially with composite documents, and to integrate those approaches with vector methods.
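
The basic vector space model described in this abstract treats each term as a dimension and compares queries with documents by vector operations. The sketch below ranks documents by cosine similarity over raw term counts. It is an assumption-laden toy: the weighting is plain term frequency, the sample texts are invented, and nothing here reflects the extensions for composite documents that the paper proposes.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Represent a text as term -> count (one dimension per term)."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine of the angle between two term vectors."""
    dot = sum(weight * v[term] for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def rank(query: str, docs: dict[str, str]) -> list[tuple[str, float]]:
    """Return (doc_id, score) pairs, best match first."""
    q = term_vector(query)
    scored = [(doc_id, cosine(q, term_vector(text))) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical collection of title-like texts.
sample_docs = {
    "d1": "boolean query retrieval",
    "d2": "vector space retrieval model",
    "d3": "office filing",
}
ranking = rank("vector retrieval", sample_docs)
print(ranking)
```

Unlike a Boolean query, which splits the collection into matching and non-matching sets, this scoring yields a graded ranking, so a document that matches only some of the query terms still appears, just lower in the list.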

Book Chapter•DOI•

Text Retrieval Machines

Dik Lun Lee, Frederick H. Lochovsky

01 Jan 1985

TL;DR: Various approaches to text retrieval machines for large text database are surveyed and designs for multiple response resolution, an important but often ignored issue in associative memory and processors, are reviewed.

Abstract: Various approaches to text retrieval machines for large text database are surveyed. Signature processors for supporting superimposed coding are first described. Text processors for pattern matching are then categorized and discussed. Finally, various designs for multiple response resolution, an important but often ignored issue in associative memory and processors, are reviewed.

Journal Article•DOI•

Interfaces and expert systems for online retrieval

Cynthia A. Kehoe

01 Jun 1985-Online Information Review

TL;DR: The history of separate online system interfaces, leading to efforts to develop expert systems for searching databases, particularly for end users, is reviewed, and the research in such expert systems is introduced.

Abstract: This paper reviews the history of separate online system interfaces, leading to efforts to develop expert systems for searching databases, particularly for end users, and introduces the research in such expert systems. Appended is a bibliography of sources on interfaces and expert systems for online retrieval.

A Knowledge-Based System Approach to Document Retrieval.

Gautam Biswas, Viswanath Subramanian, James C. Bezdek

01 Jan 1985

An expert assistant for document retrieval system.

W. Bruce Croft

01 Jan 1985

Journal Article•DOI•

Computer-aided searching of bibliographic data bases: Online estimation of the value of information

David R. Morehead¹, William B. Rouse²•Institutions (2)

Georgia Institute of Technology¹, Bell Labs²

01 Jan 1985-Information Processing and Management

TL;DR: Results of a series of five experiments on human information seeking behavior in three different information seeking environments are presented and a conceptual model of how humans value information is presented.

Abstract: The paper presents and synthesizes results of a series of five experiments on human information seeking behavior in three different information seeking environments. The first three experiments utilized a highly-controlled, simulated information seeking task developed to study human search strategies in citation networks. Emphasis in the fourth and fifth experiments was placed on assessing the value of information for humans in realistic search environments. Subjects search on a topic of their own choice in a data base of fiction in Experiment Four and a data base of technical literature in Experiment Five. After summarizing the experimental results, a conceptual model of how humans value information is presented. The model is then used as a basis for a broad interpretation of the empirical results. Implications of both the empirical and modeling results are considered for the areas of information retrieval logic, system flexibility, retrieval methods, types of aiding, online estimation of information value, and computerizing versus computer-aiding.

Patent•

Document filing system

Hiromichi Fujisawa, Toshihiro Hananoi, Atsushi Hatakeyama, Yasuaki Nakano, Junichi Tono, and 1 more

23 Aug 1985

TL;DR: A character filing system is described, consisting of a control subsystem 100 for providing control of the whole system and a database function, an input subsystem 200, a document recognizing device 300 for recognizing a document, a text search system 400 for executing high-speed text search, and a terminal subsystem 800 for executing retrieval.

Abstract: PURPOSE: To provide the titled system with a full text searching function while directly referring the text of a document to retrieve the document by simultaneously executing the storage of text data and the retrieval of a character string from the read text data. CONSTITUTION: The character filing system consists of a control subsystem 100 for providing the control of the whole system and a data base function, an input subsystem 200, a document recognizing device 300 for recognizing a document, a text search system 400 for executing high speed text search, and a terminal subsystem 800 for executing retrieval. The text data read out from text files 451 to 453 are applied to the device 300 and the parallel search of character strings is executed. COPYRIGHT: (C)1987,JPO&Japio

Journal Article•DOI•

A common interface for accessing document retrieval systems and dbms for retrieval of bibliographic data

Michael A. Shepherd¹, Carolyn Watters¹•Institutions (1)

Technical University of Nova Scotia¹

01 Jun 1985-Information Processing and Management

TL;DR: PSI currently provides a common command language for access to multiple document retrieval systems and it is shown that PSI could be extended to provide this same command language to access DBMS, whether the DBMS are relational or network.

Abstract: Due to their ready availability, database management systems are being applied to bibliographic databases with increasing frequency. This is being done in spite of the fact that although DBMS query languages tend to be very powerful, they are far too complex for the casual user. It is proposed that PSI, an existing virtual-system intermediary for document retrieval systems, be extended to include access to DBMS containing bibliographic data in order to circumvent the complexity problem for the casual user. PSI currently provides a common command language for access to multiple document retrieval systems. It is shown that PSI could be extended to provide this same command language to access DBMS, whether the DBMS are relational or network.

AN EVALUATION OF RETRIEVAL EFFECTIVENESS FOR A FULL-TEXT DOCUMENT-RETRIEVAL SYSTEM

David C. Blair, M. E. Maron

01 Jan 1985

TL;DR: Two elements make the idea of automatic full-text retrieval even more attractive: digital technology continues to provide computers that are larger, faster, cheaper, more reliable, and easier to use; and, on the other hand, full-text retrieval avoids the risks of human error.

Abstract: Document retrieval is the problem of finding stored documents that contain useful information. There exist a set of documents on a range of topics, written by different authors, at different times, and at varying levels of depth, detail, clarity, and precision, and a set of individuals who, at different times and for different reasons, search for recorded information that may be contained in some of the documents in this set. In each instance in which an individual seeks information, he or she will find some documents of the set useful and other documents not useful; the documents found useful are, we say, relevant; the others, not relevant. How should a collection of documents be organized so that a person can find all and only the relevant items? One answer is automatic full-text retrieval, which on its surface is disarmingly simple: Store the full text of all documents in the collection on a computer so that every character of every word in every sentence of every document can be located by the machine. Then, when a person wants information from that stored collection, the computer is instructed to search for all documents containing certain specified words and word combinations, which the user has specified. Two elements make the idea of automatic full-text retrieval even more attractive. On the one hand, digital technology continues to provide computers that are larger, faster, cheaper, more reliable, and easier to use; and, on the other hand, full-text retrieval avoids the

Journal Article•DOI•

Document Storage and Retrieval in the Electronic Office.

John Ashford

01 Apr 1985-Information Development

TL;DR: Proposals are made for practical approaches to the design of electronic office systems to provide for the effective storage and retrieval of the documents which they generate.

Abstract: Electronic office systems involving the creation, transmission and storage of documents are now being installed with fifty or more user stations. Little provision is yet being made for the filing and retrieval of documents held within the systems, and problems common in conventional filing and registry practice will be at least as difficult in the electronic office. Proposals are made for practical approaches to the design of electronic office systems to provide for the effective storage and retrieval of the documents which they generate.

Journal Article•

What's in a Name? Use of Names and Titles in Subject Searching.

Anne B. Piternick

01 Jan 1985-Database

TL;DR: Some databases allow searching by the names of individuals, organizations, systems, etc.; examples of this type of search and of the advantages it offers are given.

Abstract: Some databases allow searching by the names of individuals, organizations, systems, etc. Examples of this type of search, and of the advantage it offers, are given.

Proceedings Article•DOI•

A testbed for information retrieval research: the Utah retrieval system architecture

Lee A. Hollaar¹•Institutions (1)

University of Utah¹

05 Jun 1985

TL;DR: The background of URSA and its structure is discussed, with particular emphasis on the features that make it a good testbed for information retrieval techniques.

Abstract: The Utah Retrieval System Architecture provides an excellent testbed for the development and testing of new algorithms or techniques for information retrieval. URSA™ is a message-based structure capable of running on a variety of system configurations, ranging from a single mainframe processor to a system distributed across a number of dissimilar processors. It can readily support a variety of specialized backend processors, such as high-speed search engines. The architecture divides the components of a text retrieval system into two classes: servers and clients. A triple of servers (index, search, and document access) for each database provide the capabilities normally associated with a retrieval system. Possible clients for these servers include a window-based user interface, whose query language can be easily modified, a connection to a mainframe host processor, or AI-based query modification programs that wish to use the database. Any module in the system can be replaced by a new module using a different algorithm as long as the new module complies with the message formats for that function. In fact, with some care this module switch can occur while the system is running, without affecting the users. A monitor program collects statistics on all system messages, giving information regarding query complexity, processing time for each module, queueing times, and bandwidths between every module. This paper discusses the background of URSA and its structure, with particular emphasis on the features that make it a good testbed for information retrieval techniques.

Journal Article•

BACS: evolution of an integrated library system toward information management.

E A Kelly, B Halbrook, S Igielnik, C Rueby

01 Jan 1985-Bulletin of The Medical Library Association

TL;DR: The evolution of the Washington University School of Medicine BACS integrated library system toward information management functions is outlined and it is argued that libraries are flexible institutions that are likely to enlarge rather than to diminish.

Abstract: The evolution of the Washington University School of Medicine BACS integrated library system toward information management functions is outlined. The creation of a machine-readable database and its extension through telecommunications have consequences that reach beyond the functions of the library as we have perceived them. It is argued that libraries are flexible institutions that, with automation, are likely to enlarge rather than to diminish.

One Use of Computerized Instructional Gaming in Legal Education: To Better Understand the Rich Logical Structure of Legal Rules and Improve Legal Writing

Layman E. Allen¹, Charles S. Saxon¹•Institutions (1)

University of Michigan¹

01 Jan 1985

Three approaches to information retrieval.

Ian A. Macleod

01 Jan 1985

TL;DR: A driving device for a piezoelectric element, which electrically drives the element to obtain a predetermined mechanical displacement, is provided.

Abstract: A driving device for a piezoelectric element for electrically driving a piezoelectric element to obtain a predetermined mechanical displacement, the piezoelectric element driving device being provided with a transformer having a primary winding and a secondary winding, the secondary winding being connected to the piezoelectric element; a switching element connected to the primary winding and controlling the amount of energy stored in an air-gap of the core of the transformer; and an energy control means for driving the switching element so that the amount of energy stored in the air-gap of the core of the transformer becomes a set value, the displacement of the piezoelectric element being controlled by the amount of energy stored in the air-gap of the core of the transformer.

Using information Retrieval Techniques in an Expert System.

Sheila G. Winett, Edward A. Fox

01 Jan 1985