Showing papers on "Human–computer information retrieval" published in 1986


Book
01 May 1986
TL;DR: Thank you very much for downloading machine learning applications in expert systems and information retrieval. People have looked numerous times for their chosen books like this one, but ended up with malicious downloads.
Abstract: Thank you very much for downloading machine learning applications in expert systems and information retrieval. As you may know, people have looked numerous times for their chosen books like this machine learning applications in expert systems and information retrieval, but ended up with malicious downloads. Rather than enjoying a good book with a cup of tea in the afternoon, they instead cope with malicious bugs inside their computer.

115 citations


Proceedings ArticleDOI
Gerard Salton1
01 Sep 1986
TL;DR: An attempt is made to outline the information retrieval environment of the future and to assess the usefulness of some of the currently proposed search and retrieval methods.
Abstract: Substantial successes were achieved in the early years in automatic indexing and retrieval using single term indexing theories with term weight assignments based on frequency considerations. The development of more refined indexing systems using thesaurus aids and automatically constructed term association maps changed the retrieval effectiveness only slightly. The recent introduction of the relevance concept in the form of probabilistic retrieval models provided a firm basis for term weighting and document ranking practices. However, the probabilistic methods were not helpful in substantially enhancing the retrieval effectiveness. At the present time, attempts are made to add artificial intelligence concepts to the document retrieval environment in the form of fancy graphics interfaces, learning systems for query and document indexing and for collection searching, extended logic models relating documents and information requests, and analysis methods based on the use of semantic maps and other kinds of knowledge structures. Using the earlier developments and evaluation results as guidelines, an attempt is made to outline the information retrieval environment of the future and to assess the usefulness of some of the currently proposed search and retrieval methods.

52 citations


Proceedings ArticleDOI
01 Sep 1986
TL;DR: The definition of a retrieval strategy which incorporates parsing of query text and a more “shallow” parsing of document texts, and whose retrieval effectiveness is investigated and described.
Abstract: This paper deals with mechanisms for performing text retrieval which incorporate a degree of linguistic processing into the overall strategy. We have performed some experiments using parsing of text on a test collection of documents and queries to try to find out exactly if and how parsing could contribute to an overall improvement in retrieval effectiveness. Investigating this topic has led us to the definition of a retrieval strategy which incorporates parsing of query text and a more “shallow” parsing of document texts, whose retrieval effectiveness is investigated and described. Our results indicate that significant improvements in retrieval effectiveness can be obtained by incorporating such linguistic processing into an overall retrieval strategy.

38 citations


Proceedings ArticleDOI
01 Sep 1986
TL;DR: A project which attempts to classify representations of the anomalous states of knowledge of users of document retrieval systems on the basis of structural characteristics of the representations, and which specifies different retrieval strategies and ranking mechanisms for each ASK class.
Abstract: We report on a project which attempts to classify representations of the anomalous states of knowledge (ASKs) of users of document retrieval systems on the basis of structural characteristics of the representations, and which specifies different retrieval strategies and ranking mechanisms for each ASK class. The classification and retrieval strategy specification is based on 53 real problem statements, 35 of which have a total of 250 evaluated documents. Four facets of the ASK structures have been tentatively identified, whose combinations determine the method and order of application of five basic ranking strategies. This work is still in progress, so results presented here are incomplete.

38 citations


Proceedings Article
01 Jan 1986
TL;DR: In this article, a soft matching function is provided by the vector-based [2, 3, 4] and the fuzzy set [5, 6] models to rank documents with respect to the degree of similarity between a surrogate document and the query.
Abstract: The fundamental problem in Information Retrieval (IR) is to identify the relevant documents from the nonrelevant ones in a collection of documents according to a particular user's information needs. One of the major difficulties in modelling information retrieval is to choose an appropriate (knowledge) representation of the content of an individual document. For example, it is common to describe each document by a set of (weighted) index terms or keywords obtained from an automatic indexing scheme [1, 2, 3]. Since these index terms (or some other similar "constructs") provide us only with partial knowledge about the contents of the documents, it is unrealistic to expect that the system would identify without uncertainty only those documents the user needs. Thus, any relevance judgment based on the surrogate documents and some highly model dependent retrieval strategy is bound to be uncertain. In this regard, the search strategy adopted in the standard Boolean model, for example, used in most commercial systems is generally considered to be too restrictive. On the other hand, a soft matching function is provided by the vector-based [2, 3, 4] and the fuzzy set [5, 6] models to rank documents with respect to the degree of similarity between a surrogate document and the query. In these approaches, it is believed that the query formulated by the user (using whatever query language provided by a particular IR model) can in fact accurately reflect a user's information requirements. However, in practice, more often than not a user may not be able to describe in a precise way the characteristics of the relevant information items even with a query language of sufficient expressive power [7]. This drawback, to some extent, can be remedied by incorporating some intuitive relevance feedback procedure [8] to improve the query formulation.

26 citations
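
The soft matching described in the abstract above can be illustrated with a minimal sketch. It assumes cosine similarity as the matching function and uses made-up term weights; the abstract does not say which similarity measure the vector-based models use, and the names `cosine`, `rank`, `docs`, and `query` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    # An empty vector has no direction, so treat its similarity as zero.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, docs):
    """Return (doc_id, score) pairs ordered from most to least similar."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical surrogate documents: index terms with illustrative weights.
docs = {
    "d1": {"retrieval": 1.0, "fuzzy": 1.0},
    "d2": {"retrieval": 1.0},
    "d3": {"parsing": 1.0},
}
query = {"retrieval": 1.0}
ranking = rank(query, docs)
```

Unlike a Boolean match, every document receives a graded score, so partially matching surrogates such as `d1` are ranked rather than rejected.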



Journal ArticleDOI
TL;DR: A faceted hierarchical thesaurus organization has been designed to accomplish this goal and leads novice end-users to browse subject areas before retrieval and yet provides control and coverage of terms in a domain.
Abstract: Direct end-user data entry and retrieval is a major factor in achieving an economical information retrieval system. To be effective, such a system would have to provide a thesaurus structure which leads novice end-users to browse subject areas before retrieval and yet provides control and coverage of terms in a domain. A faceted hierarchical thesaurus organization has been designed to accomplish this goal.

24 citations


Book ChapterDOI
30 Jun 1986
TL;DR: A new approach to the indexation of documents by keywords is proposed, taking into account to what extent a given keyword may and must appear in an acceptable description of a considered document.
Abstract: This paper proposes a new approach to the indexation of documents by keywords, taking into account to what extent a given keyword may and must appear in an acceptable description of a considered document. Possibility (resp. necessity) measures are used to estimate the possible (resp. certain) relevance of a document with respect to a query.

23 citations
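
As a rough illustration of the entry above, the sketch below stores, for each keyword of a document, a pair of degrees: how far the keyword may appear in an acceptable description (possibility) and how far it must appear (necessity). Combining the degrees of a conjunctive query with `min` is an assumption made here for illustration, not a formula taken from the paper; `relevance` and `doc_index` are hypothetical names, and the degrees are made up.

```python
def relevance(query_terms, index):
    """Return (possible, certain) relevance of one document to a conjunctive query.

    index maps keyword -> (possibility, necessity), with necessity <= possibility.
    Keywords absent from the index are treated as neither possible nor necessary.
    query_terms must be non-empty.
    """
    degrees = [index.get(term, (0.0, 0.0)) for term in query_terms]
    possible = min(p for p, _ in degrees)  # assumed conjunction: weakest "may appear"
    certain = min(n for _, n in degrees)   # assumed conjunction: weakest "must appear"
    return possible, certain

# Hypothetical index entry for a single document.
doc_index = {
    "fuzzy": (1.0, 0.8),
    "retrieval": (0.9, 0.5),
    "parsing": (0.3, 0.0),
}

both_strong = relevance(["fuzzy", "retrieval"], doc_index)
one_weak = relevance(["fuzzy", "parsing"], doc_index)
missing = relevance(["fuzzy", "thesaurus"], doc_index)
```

The two numbers bracket the judgment: the first says how far the document could be relevant, the second how far it is certainly relevant.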


Proceedings ArticleDOI
01 Sep 1986
TL;DR: An assessment of the applicability of existing multivariate data graphical techniques to the vector space model is presented and an overview of the graphical techniques used in the representation of information in a document collection environment is given.
Abstract: This paper gives an overview of the graphical techniques which have been used in the representation of information in a document collection environment. An assessment of the applicability of existing multivariate data graphical techniques to the vector space model is presented.

23 citations


Proceedings ArticleDOI
01 Dec 1986
TL;DR: A model for such an application of AI techniques for profile development and maintenance of an information retrieval system that can be personalized to each user is presented.
Abstract: Development of an information retrieval system that can be personalized to each user requires maintaining and continually updating an interest profile for each individual user. Since people tend to be poor at self-description, it is suggested that profile development and maintenance is an area in which machine learning and knowledge based techniques can be profitably employed. This paper presents a model for such an application of AI techniques.

18 citations


Proceedings ArticleDOI
01 Sep 1986
TL;DR: In this chapter, a soft matching function is provided by the vector-based and the fuzzy set models to rank documents with respect to the degree of similarity between a surrogate document and the query.

Book ChapterDOI
30 Jun 1986
TL;DR: The role of Information Retrieval techniques in the construction of Knowledge-Based Systems is outlined and the theory of fuzzy relational products developed by Bandler and Kohout qualifies as an especially adequate tool to handle fuzziness of selection as well as fuzziness of information contents.
Abstract: The paper outlines the role of Information Retrieval techniques in the construction of Knowledge-Based Systems. A Functional Communication Structure selects and communicates the relevant information by means of fuzzy logical and fuzzy relational requests between the individual knowledge acceptors and knowledge donors. The theory of fuzzy relational products developed by Bandler and Kohout qualifies as an especially adequate tool to handle fuzziness of selection as well as fuzziness of information contents.



01 Jan 1986
TL;DR: It is shown that certain properties of the p-norm model that one would expect to hold, given the topological origin of the model, do not in fact hold.
Abstract: There are three topics discussed in this work. The first topic is an investigation of the topological properties of the p-norm model of Salton, Fox, and Wu. It is shown that certain properties of the p-norm model that one would expect to hold, given the topological origin of the model, do not in fact hold. These properties include the ability to change the query by changing p, and the ability to adequately separate documents. Since these properties do hold in the model as actually constructed, it must be that the properties do not follow from the topological origin of the model. The second topic is a search for a usable model with an adequate theoretical basis. In order to construct such a model, the topological paradigm is defined. This paradigm establishes a minimal set of requirements that any system with a topological foundation should have. A particular example of the paradigm, the Topological Information Retrieval System (TIRS), is constructed. It is shown that all of the desired properties of the p-norm model hold for the TIRS model. A discussion of the various query systems that may be used with TIRS is given. These query systems include a natural language interface and a weighted boolean query system, as well as two specialized interfaces. The weighted boolean query system has the property that pairs, when treated as units, have all of the properties of the non-weighted boolean lattice. The run time of the system is estimated, once for an inverted file implementation, and once for an implementation using kd-trees. These run times are much better than for traditional systems. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. The third topic is a reexamination of the standard models of information retrieval, considered as cases of the topological paradigm. 
The paradigm is shown to be a unifying model, in that all of the standard models, i.e., the boolean, vector space, fuzzy set theoretic, and probabilistic models, as well as a hierarchical model, are shown to be instances of the paradigm. An appendix contains a review of relevant topics from topology and abstract algebra. Chapter One: Introduction. The problem of information retrieval can be stated quite simply: given a collection of data, determine which of the data possess a given set of properties. There are several methods currently utilized to make these determinations. If the data can be placed into a fixed, relatively unchanging format, then it is possible to construct a data base system around the data. If it is possible to determine which properties form the major distinctions in the data, then it may be possible to construct a fixed or slowly changing classification scheme for the data. Many expert systems have this basic structure. A different problem arises if the data are of varied format, are constantly changing in number and value, and if the major properties of the collection are impossible to determine in advance. It is this last problem that is the subject area of information retrieval, as it will be considered in this work. There are at least two instances of this problem which must be addressed. The first, and more important, is the question and answer system. In this system, there is a large collection of data, which is constantly being updated, and one is allowed to ask questions of the system, which the system must answer, using the available data. There are no working question and answer systems, as defined above. Indeed, it may be that the first working question and answer system will pass the Turing test. An easier problem is the document retrieval problem. 
This problem is important, both as an adjunct to bibliographic research, and as a small domain for a question and answer system. Hence, this work will interpret information retrieval as document retrieval. It is possible to formally define the information retrieval problem. Several definitions exist in the literature [Kra85, Rad83a]. The definition to be used here follows [1]. Definition: An information retrieval system is an ordered octuple $IR = \langle DOC, QY, DS, QS, RS, f_i, f_q, f \rangle$, where DOC is a finite set of documents to be used as data in the information retrieval system, QY is the set of all possible natural language queries that may be asked of the system, DS is the document space, the abstract document representation used in the matching of documents to queries, QS is the query space, the abstract query representation used in the matching process, RS is the relevance space, the set of values that indicates the degree of correspondence between documents and queries, $f_i : DOC \to DS$ is the indexing or abstraction function, which maps the documents into the document space, $f_q : QY \to QS$ is the translation function, which maps the natural language queries into the query space, and

Journal ArticleDOI
TL;DR: The presented retrieval rules may be viewed as the logical approach in implementing a physical distributed retrieval system that consists of n local retrieval systems.
Abstract: This paper describes how the operations on the local inverted files are to be modified in order to use them in the distributed information retrieval system based on thesauri. The global system consists of n local retrieval systems. The presented retrieval rules may be viewed as the logical approach in implementing a physical distributed retrieval system.

Book ChapterDOI
09 Jul 1986
TL;DR: In this article, the authors discuss a number of differences between traditional database retrieval and the retrieval problems that arise from the large knowledge bases of rules that occur in many expert systems applications.
Abstract: This chapter discusses a number of differences between traditional database retrieval and the retrieval problems that arise from the large knowledge bases of rules that occur in many expert systems applications. It outlines several principles and techniques applicable to these problems, including a basic principle of factoring. In particular, conceptually factored, taxonomic representation systems such as KL-ONE appear to be well suited to such applications. Such knowledge structures can be used to perform a kind of abstract “parsing” of a situation, using the patterns and schemata of the knowledge base as a “grammar.” The way in which elements of the knowledge base are accessed in this process differs substantially from the way in which elements of a traditional database are accessed. To handle large knowledge bases of rule-like information, it will be necessary to combine insights of knowledge representation research with those of database organization and retrieval.

Journal ArticleDOI
TL;DR: Software previously developed for mainframe computers, laser disk applications, information retrieval of textual files on IBM‐PCs, and other functions, is being modified to meet the needs of CD‐ROM applications.
Abstract: The tremendous storage capacity of the CD‐ROM has generated the need for sophisticated search software capable of handling large files. Software previously developed for mainframe computers, laser disk applications, information retrieval of textual files on IBM‐PCs, and other functions, is being modified to meet these needs. Other software is being specifically written for CD‐ROM applications. Vendors of significant information retrieval products are identified, and the characteristics of twelve packages are compared.

Proceedings ArticleDOI
01 Sep 1986
TL;DR: User Friendly Online Searching is examined in the context of Natural Language Processing in Information Retrieval and Artificial Intelligence.
Abstract: User Friendly Online Searching is examined in the context of Natural Language Processing in Information Retrieval and Artificial Intelligence. Opportunities for synergetic R & D are identified as the basis for Intelligent Information Retrieval and Artificial Retrieval Intelligence.

Journal Article
01 Nov 1986-Online
TL;DR: Presents InfoMaster, a service that acts as an intermediary for access to more than 700 databases.
Abstract: Presentation of the InfoMaster service, which acts as an intermediary for access to more than 700 databases.

Journal ArticleDOI
01 May 1986
TL;DR: Modern technology allows natural language processing mechanisms to begin to be incorporated in the sense of matching terms found in the free text specification of the query and the free text within the document.
Abstract: Modern computerized information retrieval systems consist of mechanisms to acquire, describe (e.g., index), and store "documents", and to receive, analyze, and respond to queries for information for users. A key element is the index language, by which the users (or user intermediaries) and indexers can communicate. Modern technology allows natural language processing mechanisms to begin to be incorporated in the sense of matching terms found in the free text specification of the query and the free text within the document.



01 Oct 1986
TL;DR: This research project was based on the hypothesis that the within document frequency (WDF) of a term, and ultimately the retrieval performance of a system using WDF, would be affected by the resolution of anaphora (replacement of an anaphor with its referent) within the text of a document.
Abstract: Anaphora is the linguistic device of abbreviated subsequent reference to a concept. This research project was based on the hypothesis that the within document frequency (WDF) of a term, and ultimately the retrieval performance of a system using WDF, would be affected by the resolution of anaphora (replacement of an anaphor with its referent) within the text of a document. In order to test the hypothesis, a two-phase investigation was implemented. In the first phase, all potential anaphors in a random sample of 300 abstracts from each of two databases were identified. Each occurrence of anaphora was then examined in order to determine if the term actually functioned anaphorically. From these observations, patterns emerged which were then developed into rules that captured the systematic regularities of functional anaphors. The rules were tested by at least three people to determine whether the rules accurately distinguished functioning anaphors from potential anaphors. In the second phase of the project, 24 queries, abstracts retrieved from computerized searches on the queries, and relevance judgments on each retrieved document were selected from a previous research project. All functioning anaphors within the abstracts were resolved by hand. Twelve term weighting schemes were used as the basis for determining the relevance of each document to its corresponding query. Two statistical relationships were then compared: 1) between the user's relevance judgment and the system's judgment based on the unresolved abstracts, and 2) between the user's relevance judgment and the system's judgment based on the resolved abstracts. If the latter relation is stronger than the former, then a formal treatment of anaphora in bibliographic retrieval positively affects system performance. Results of the comparisons were mixed. In some instances, the resolved documents produced a significantly better correlation between user's judgments and system's judgments, while in other instances, the opposite occurred. The findings that resolution of anaphora may increase retrieval performance are far from conclusive. It is clear that future studies of anaphora in information retrieval must be treated in a more complex manner than was attempted here.
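
The hypothesis in the entry above rests on the fact that replacing an anaphor with its referent changes how often the referent term occurs in a document. A minimal sketch of that effect follows, assuming a tokenized abstract and a hand-made resolution map (the study resolved anaphors by hand). The names `term_frequencies` and `resolve`, and the sample sentence, are hypothetical.

```python
from collections import Counter

def term_frequencies(tokens):
    """Count occurrences of each (lowercased) token within one document."""
    return Counter(token.lower() for token in tokens)

def resolve(tokens, resolutions):
    """Replace each anaphor with its referent.

    resolutions maps a token position to the referent that replaces it,
    standing in for a manual resolution pass.
    """
    return [resolutions.get(i, token) for i, token in enumerate(tokens)]

# Hypothetical abstract: "It" at position 5 refers back to "thesaurus".
tokens = "The thesaurus is faceted . It guides novice users".split()
resolved = resolve(tokens, {5: "thesaurus"})

before = term_frequencies(tokens)["thesaurus"]
after = term_frequencies(resolved)["thesaurus"]
```

Any term weighting scheme that depends on these counts would see a different weight for "thesaurus" after resolution, which is the effect the study set out to measure.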