Showing papers in "Information Processing and Management in 1998"



Journal ArticleDOI
TL;DR: Evidence is provided that a finite range of criteria exists and that these criteria are applied consistently across types of information users, problem situations, and source environments.
Abstract: This article takes a cognitive approach toward understanding the behaviors of end-users by focusing on the values or criteria they employ in making relevance judgments, or decisions about whether to obtain and use information. It compares and contrasts the results of two empirical studies in which criteria were elicited directly from individuals who were seeking information to resolve their own information problems. In one study, respondents were faculty and students in an academic environment examining print documents from traditional text-based information retrieval systems. In the other study, respondents were occupational users of weather-related information in a multimedia environment in which sources included interpersonal communication, mass media, weather instruments, and computerized weather systems. The results of the studies, taken together, provide evidence that a finite range of criteria exists and that these criteria are applied consistently across types of information users, problem situations, and source environments.

303 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed a concept of relevance as a relationship and an effect on the movement of a user through the iterative stages of their information seeking process and suggested that partially relevant items may play an important role in the early stages of a user's information-seeking process over time.
Abstract: User relevance judgments are central to both the systems and user-oriented approaches to information retrieval (IR) systems research and development. User-oriented relevance research has also operated on two largely unconnected tracks: first, a relevance level track that examines users' criteria for relevance judgments; second, a regions of relevance track that examines the measurement of users' relevance judgments. Users' judgments and criteria for highly relevant items have been central issues for much of the relevance research. Findings are presented from four separate studies of relevance judgments by 55 users conducting their initial online search on a particular information problem. In three studies, the number of items judged “partially” relevant (on a scale of relevant, partially relevant or not relevant) was positively correlated with different aspects of change for users, including: (1) information problem definition, (2) search intermediaries' perceptions that a user's question and information problem has changed during the mediated search interaction, (3) personal knowledge due to the search interaction, and (4) criteria for making relevance judgments. Users with high knowledge and topic levels were more likely to judge items as highly relevant. Differences between users' criteria for highly, partially and non-relevant items are also identified. Findings suggest the need to expand the framework for relevance research and further identify the characteristics of the middle region of relevance or partial relevance as: (1) partially relevant items may play an important role in the early stages of a user's information seeking process over time for a particular information problem and (2) a relationship may exist between partially relevant items retrieved and changes in users' information problems during an information seeking process. Results also suggest that partially relevant items may be useful at the early stages of users' information seeking processes. We propose a useful concept of relevance as a relationship and an effect on the movement of a user through the iterative stages of their information seeking process. Users' relevance judgments can also be plotted on a three-dimensional spatial model of relevance level, region and time. Implications for the development of IR systems, searching practice and relevance research are also discussed.

278 citations


Journal ArticleDOI
TL;DR: Investigation of image attributes typically noted by participants in a series of describing tasks involving activities such as viewing images, describing them for a retrieval system, and describing them from memory suggests that access to a wide range of attributes is needed to address all facets of interest to those using pictorial images.
Abstract: With the current rapid expansion in imaging technologies, access to collections of images is a subject of major interest. This exploratory research investigated image attributes typically noted by participants in a series of describing tasks involving activities such as viewing images, describing them for a retrieval system, and describing them from memory. Content analysis and descriptive statistics were used to characterize textual statements generated by participants; this analysis produced forty-seven image attributes which were grouped conceptually into twelve higher level classes of attributes. The data suggest that access to a wide range of attributes is needed to address all facets of interest to those using pictorial images. They further suggest that certain classes of attributes may appear more frequently in a set of tasks relating to the description of images, including literal objects, the human form and associated attributes, and color and location terms. More unexpectedly, terms describing the ‘story’ within the image also appeared frequently in this research. The disjunction between these results and those attributes typically addressed in traditional image indexing systems suggests revisiting assumptions upon which image indexing and retrieval systems are being created.

182 citations


Journal ArticleDOI
TL;DR: The subject's deselection of items and associated application of judgment criteria provide specific insights into how relevance judgment occurs, and the findings help untangle the relevance judgment process for one individual and one situation.
Abstract: Research on relevance has established a conceptual consensus that stresses the importance of studying relevance judgments from a perspective that takes the users of retrieval systems into account. Yet little research has investigated how actual system users make relevance judgments. Theoretical claims pertaining to the nature of the relevance judgment process, thus, remain untested. This study is a step in the empirical exploration of the evolutionary nature of relevance judgments. The study intensively focuses on a single person with a real information problem. She was observed during both her online searching and document retrieval. The subject made her relevance evaluations first using bibliographic records and then using full-text documents, as is typical of search processes that rely on bibliographic retrieval tools. The data consist of printouts of records and full-texts containing the subject's evaluation markings as well as transcripts of think-aloud protocols and her responses to questions during post-search interview sessions. The mental model concept is employed for analysis purposes and is operationalized as the research subject's changing perception of the information that she needs for her purposes as expressed in her relevance judgments. Specifically, the think-aloud protocols and markings of texts provide indications of the state of the subject's mental model and its change as she interacted with the materials that she retrieved and selected as relevant or possibly relevant. Frequencies of terms marked at the stage of record evaluation and the topical categories highlighted at the stage of document review were computed to provide a more concrete indication of the topical change in the subject's mental model of the needed information. The study also identified the set of judgment criteria that the subject applied during her evaluation of online records. Special attention during the analysis was paid to anomalous judgment behaviors demonstrated by the subject, such as her deselection process in record evaluation and topic reformulation in document evaluation. Overall, the findings help untangle the relevance judgment process for one individual and one situation. In doing so, the findings provide a preliminary anchor for understanding the nature of the relevance judgment process of people engaged in an information search process. The subject's deselection of items and associated application of judgment criteria provide specific insights into how relevance judgment occurs.

122 citations


Journal ArticleDOI
TL;DR: The analysis of phenomena seen during the implementation of a GA for IR has led to a new crossover operation, which is introduced and compared with other learning methods.
Abstract: Genetic algorithms (GAs) search for good solutions to a problem by operations inspired by the natural selection of living beings. Among their many uses, we can count information retrieval (IR). In this field, the aim of the GA is to help an IR system to find, in a huge collection of text documents, a good reply to a query expressed by the user. The analysis of phenomena seen during the implementation of a GA for IR has led us to a new crossover operation. This article introduces this new operation and compares it with other learning methods.

97 citations


Journal ArticleDOI
TL;DR: Tools to aid users and librarians in overviewing collections, previewing objects and gathering results were created and serve as the beginnings of a digital librarian toolkit.
Abstract: This paper describes a collaborative effort to explore user needs in a digital library, develop interface prototypes for a digital library and suggest and prototype tools for digital librarians and users at the Library of Congress (LC). Interfaces were guided by an assessment of user needs and aimed to maximize interaction with primary resources and support both browsing and analytical search strategies. Tools to aid users and librarians in overviewing collections, previewing objects and gathering results were created and serve as the beginnings of a digital librarian toolkit. The design process and results are described and suggestions for future work are offered.

96 citations


Journal ArticleDOI
TL;DR: The formalisms used in logical models for information retrieval are introduced, the use of logic to build the models is shown, and a brief overview of some of the current logical models in information retrieval is presented.
Abstract: The use of logic to model the information retrieval process has become an established research area. Nevertheless, many people in the information retrieval community do not yet appreciate the work performed in this area, mainly because they do not understand logical formalisms, and hence cannot see the connection between logic and information retrieval. This paper aims at resolving the problem. It introduces the formalisms used in logical models for information retrieval, shows the use of logic to build the models, and presents a brief overview of some of the current logical models in information retrieval.

86 citations


Journal ArticleDOI
TL;DR: Database Tomography (DT) was used to derive technical intelligence from a near-earth space (NES) database built from the Science Citation Index and the Engineering Compendex, identifying the pervasive technical themes of the space database.
Abstract: Database Tomography (DT) is a system which includes algorithms for extracting multi-word phrase frequencies and performing phrase proximity analyses (relating physical closeness of the multi-word technical phrases to thematic relationships) on any type of large textual database. As an illustration of the DT process applied to the published literature, DT was used to derive technical intelligence from a near-earth space (NES) database derived from the Science Citation Index and the Engineering Compendex. Phrase frequency analysis (the occurrence frequency of multi-word technical phrases) provided the pervasive technical themes of the space database, and the phrase proximity analysis provided the relationships among the pervasive technical themes. Bibliometric analysis of the NES literature supplemented the DT results by identifying: the recent most prolific NES authors; the journals which contain numerous NES papers; the institutions which produce numerous NES papers; the keywords most frequently specified by the NES authors; the authors whose works are cited most frequently in the NES papers; and the particular papers and journals cited most frequently in the NES papers.
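The two analyses named in this abstract, phrase frequency and phrase proximity, can be illustrated with a minimal Python sketch. This is not the DT system's algorithm, which the abstract does not specify. It assumes that a "phrase" is a fixed-length word n-gram and that "proximity" means starting within a fixed token window of a theme phrase. The names `tokenize`, `phrase_frequencies`, `phrases_near` and the `window` parameter are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-word phrases of a token list, in order."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def phrase_frequencies(docs, n=2):
    """Count how often each n-word phrase occurs across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(ngrams(tokenize(doc), n))
    return counts


def phrases_near(docs, theme, n=2, window=5):
    """Count n-word phrases that start within `window` positions of `theme`.

    Assumption: `theme` is itself an n-word phrase, written in lowercase.
    """
    near = Counter()
    for doc in docs:
        grams = ngrams(tokenize(doc), n)
        for i, gram in enumerate(grams):
            if gram != theme:
                continue
            lo, hi = max(0, i - window), min(len(grams), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    near[grams[j]] += 1
    return near


if __name__ == "__main__":
    docs = [
        "Remote sensing of earth orbit debris",
        "Earth orbit debris tracking by remote sensing",
    ]
    print(phrase_frequencies(docs).most_common(3))
    print(phrases_near(docs, "remote sensing", window=3))
```

Frequent phrases stand in for the "pervasive themes", and the phrases counted near a chosen theme stand in for its thematic relationships. A real system would also need stop-word handling and variable phrase lengths.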

85 citations


Journal ArticleDOI
TL;DR: A composite feature measure which combines the shape and color features of an image based on a clustering technique and a similarity measure to compute the degree of match between a given pair of images is proposed.
Abstract: We have proposed a composite feature measure which combines the shape and color features of an image based on a clustering technique. We have also developed a similarity measure to compute the degree of match between a given pair of images. This technique can be used for content-based image retrieval of images using shape and/or color. We have tested our technique on two image databases: one consisting of 100 synthetic images, and another database consisting of 500 actual trademark images. Test results of the proposed scheme for retrieval of images using only shape, only color, and a weighted combination of the two are presented. The efficiency of retrieval is found to be very high and the experimental results are promising for practical applications.

75 citations


Journal ArticleDOI
TL;DR: The durability or “staying power” of accounting research in representative scholarly journals is investigated by evaluating the extent and usage of previous literature in current literature by using the generalized gamma distribution and its four nested models.
Abstract: The first objective of this study is to investigate the durability or “staying power” of accounting research in representative scholarly journals by evaluating the extent and usage of previous literature in current literature. The value or durability of research can be represented by the pattern of citation vintages that typifies a body of literature. We use the generalized gamma distribution and its four nested models (exponential, Weibull, gamma, and log-normal) to determine a mean, median, and mode for citation age. A second and significant motivation of the study is to objectively rank the relative influence of journals on the accounting literature. Three variations of an impact factor are used to make this analysis. The first impact factor is based upon simple citation count using the proportional method, while the other two impact factors use the results of the time analysis of the data to improve the method of ranking through the emphasis of current publications.

Journal Article
TL;DR: Arampatzis et al. describe a retrieval schema which goes beyond the classical information retrieval keyword hypothesis and also takes linguistic variation into account. Guided by the failures and successes of other state-of-the-art approaches, as well as their own experience with the Irena system, their approach is based on phrases and incorporates linguistic resources and processors.
Abstract: In this article we describe a retrieval schema which goes beyond the classical information retrieval keyword hypothesis and takes into account also linguistic variation. Guided by the failures and successes of other state-of-the-art approaches, as well as our own experience with the Irena system, our approach is based on phrases and incorporates linguistic resources and processors. In this respect, we introduce the Phrase Retrieval Hypothesis to replace the Keyword Retrieval Hypothesis. We suggest a representation of phrases suitable for indexing, and an architecture for such a retrieval system. Syntactical normalization is introduced to improve retrieval effectiveness. Morphological and lexico-semantical normalizations are adjusted to fit in this model. A previous version of this work was presented as a paper at the RIAO conference in Quebec, Canada, and included in the RIAO proceedings (Arampatzis et al.).

Journal ArticleDOI
TL;DR: The data model was developed for juvenile justice and medical patient records and can be extended to a runtime model for a compact visualization using graphical timelines, which is usable in other application domains such as personal resumes, financial histories, or customer support.
Abstract: This paper proposes an information architecture for personal history data and describes how the data model can be extended to a runtime model for a compact visualization using graphical timelines. Our information architecture was developed for juvenile justice and medical patient records, but is usable in other application domains such as personal resumes, financial histories, or customer support. Our model groups personal history events into aggregates that are contained in facets (e.g., doctor visits, hospitalizations, or lab tests). Crosslinks enable representation of arbitrary relationships across events and aggregates. Data attributes, such as severity, can be mapped by data administrators to visual attributes such as color and line thickness. End-users have powerful controls over the display contents, and they can modify the mapping to fit their tasks.

Journal ArticleDOI
TL;DR: The results indicate that the chosen program is progressive in terms of empirical support and precision of the theories, and that the growth pattern is elaborative.
Abstract: The aim of this article is to analyse the growth of a theoretical research program in the field of information needs and seeking studies. The program consists of a set of interrelated studies on the effects of task complexity on information source use. The growth is assessed by reconstructing the logical structure of the theories within the program and by comparing those reconstructions in terms of their conceptual and factual similarity. The growth pattern is then analysed by using Wagner's and Berger's model of theory growth from sociology. The analysis reveals the growth pattern of the program. The results indicate that the chosen program is progressive in terms of empirical support and precision of the theories, and that the growth pattern is elaborative. Moreover, based on the analysis, consequences for further studies are presented.

Journal ArticleDOI
TL;DR: The main motivation of this paper is to discuss some central issues in the application of logic to IR, and analyse the different implications of models based on truth, validity or logical consequentiality.
Abstract: The logical approach to information retrieval has recently been the object of active research. It is our contention that researchers have put a lot of effort in trying to address some difficult problems of IR within this framework, but little effort in checking that the resulting models satisfy those well-formedness criteria that, in the field of mathematical logic, are considered essential and conducive to effective modelling of a real-world phenomenon. The main motivation of this paper is not to propose a new logical model of IR, but to discuss some central issues in the application of logic to IR. The first issue we touch upon is the logical relationship we might want to enforce between formulae d, representing a document, and n, representing an information need; we analyse the different implications of models based on truth, validity or logical consequentiality. The relationship between this issue and the issue of partiality vs. totality of information is subsequently analysed, in the context of a broader discussion of the role of denotational semantics in IR modelling. Finally, the relationship between the paradoxes of material implication and the (in)adequacy of classical logic for IR modelling purposes is discussed.

Journal ArticleDOI
TL;DR: A retrieval schema is described which goes beyond the classical information retrieval keyword hypothesis and also takes linguistic variation into account; morphological and lexico-semantical normalizations are adjusted to fit in this model.
Abstract: In this article we describe a retrieval schema which goes beyond the classical information retrieval keyword hypothesis and takes into account also linguistic variation. Guided by the failures and successes of other state-of-the-art approaches, as well as our own experience with the Irena system, our approach is based on phrases and incorporates linguistic resources and processors. In this respect, we introduce the phrase retrieval hypothesis to replace the keyword retrieval hypothesis. We suggest a representation of phrases suitable for indexing, and an architecture for such a retrieval system. Syntactical normalization is introduced to improve retrieval effectiveness. Morphological and lexico-semantical normalizations are adjusted to fit in this model.

Journal ArticleDOI
TL;DR: A method of full-text scanning for matches in a large dictionary is described, suitable for SDI systems, accommodating large dictionaries and typical digital data rates.
Abstract: A method of full-text scanning for matches in a large dictionary is described. The method is suitable for SDI (selective dissemination of information) systems, accommodating large dictionaries (10^4–10^5 entries) and typical digital data rates (tens of megabytes per second or more). It can be implemented on a single commercially-available board hosted by a personal computer or entirely in software. The preferred approach employs a hardware primary test, followed by a software secondary test. The algorithm is described in detail, the implementation is sketched, and simulation results are presented.
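The primary-test and secondary-test structure mentioned in this abstract can be illustrated entirely in software. The Python sketch below is not the paper's algorithm, which the abstract does not detail. It assumes the primary test is a cheap lookup on a fixed-length prefix at every text position, and the secondary test is an exact comparison of the few dictionary entries sharing that prefix. The names `build_index` and `scan` and the prefix length `k` are hypothetical.

```python
from collections import defaultdict


def build_index(dictionary, k=3):
    """Group dictionary entries by their first k characters.

    Assumption: every entry has at least k characters.
    """
    index = defaultdict(list)
    for entry in dictionary:
        if len(entry) < k:
            raise ValueError(f"entry shorter than k={k}: {entry!r}")
        index[entry[:k]].append(entry)
    return index


def scan(text, index, k=3):
    """Return (position, entry) pairs for every dictionary match in text."""
    matches = []
    for pos in range(len(text) - k + 1):
        # Primary test: a cheap prefix lookup that rejects most positions.
        candidates = index.get(text[pos:pos + k])
        if not candidates:
            continue
        # Secondary test: exact comparison for the surviving candidates.
        for entry in candidates:
            if text.startswith(entry, pos):
                matches.append((pos, entry))
    return matches


if __name__ == "__main__":
    index = build_index(["retrieval", "relevance", "index"])
    print(scan("relevance feedback improves retrieval; reindexing helps", index))
```

The point of the split is that the expensive exact comparison runs only at the small fraction of positions that pass the cheap filter. That is also what would make a hardware primary stage plausible.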

Journal ArticleDOI
Louise T. Su
TL;DR: Value of search results as a whole, a utility measure, was found to be the best single measure of interactive information performance (success) among the 20 measures selected for study; the study suggests that it provides a simple way to compare systems and eliminates problems of IR evaluation with multiple measures.
Abstract: Value of search results as a whole, a utility measure, was found to be the best single measure of interactive information performance (success) among the 20 measures selected for study (Su, 1991). The study suggests that this measure provides a simple way for system comparison and eliminates problems of IR evaluation with multiple measures. Value of search results as a whole is a measure which asks for a user's rating on the usefulness of a set of search results based on a Likert 7-point scale. This measure gives a numeric basis for comparing information retrieval performance but it does not indicate why one set of search results is rated more useful than others or how search results or systems can be improved to be more useful or successful. To further our understanding of the measure and to enhance its usefulness, the current paper explores two research questions: (1) What are the conceptual categories or dimensions of the users' reasons for assigning particular ratings on the value of search results? (2) What are the relationships between these dimensions of value and the dimensions of success identified in the earlier study (Su, 1991)? Verbal data collected by the previous study will be analyzed to answer the research questions. The current paper presents results from this analysis and discusses implications for theory and practice. It also compares current findings with those from other user criteria studies.

Journal ArticleDOI
Joon Ho Lee
TL;DR: Experimental results show that combining the evidence of different relevance feedback methods can lead to substantial improvements of retrieval effectiveness.
Abstract: It has been known that retrieval effectiveness can be significantly improved by combining multiple evidence from different query or document representations, or multiple retrieval techniques. In this paper, we combine multiple evidence from different relevance feedback methods, and investigate various aspects of the combination. We first generate multiple query vectors for a given information problem in a fully automatic way by expanding an initial query vector with various relevance feedback methods. We then perform retrieval runs for the multiple query vectors, and combine the retrieval results. Experimental results show that combining the evidence of different relevance feedback methods can lead to substantial improvements of retrieval effectiveness.
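The final step described in this abstract, combining the results of several retrieval runs, can be sketched in Python. The abstract does not state which combination function the paper uses. The sketch assumes one common choice: min-max normalization of each run's scores, followed by summation per document. The names `min_max_normalize` and `combine_runs` and the sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one run's scores to [0, 1] so that runs become comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All documents tied: treat every score as the maximum.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_runs(runs):
    """Sum normalized scores per document and return documents best-first.

    Assumption: a document missing from a run contributes 0 for that run.
    """
    fused = {}
    for run in runs:
        for doc, score in min_max_normalize(run).items():
            fused[doc] = fused.get(doc, 0.0) + score
    # Sort by descending fused score; break ties by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


if __name__ == "__main__":
    # Two runs, e.g. produced by two differently expanded query vectors.
    run_a = {"d1": 0.9, "d2": 0.5, "d3": 0.1}
    run_b = {"d2": 8.0, "d3": 6.0, "d4": 2.0}
    print(combine_runs([run_a, run_b]))
```

Summation rewards documents that several runs agree on, which is one intuition for why combining evidence can help. Other fusion rules, such as taking the maximum or weighting the runs, fit the same skeleton.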

Journal ArticleDOI
TL;DR: A probabilistic technique for retrieving passages from texts having a large size or heterogeneous semantic content is presented, and it is suggested that text organization and query generality may have an impact on the difference in effectiveness between the two techniques compared.
Abstract: This paper presents a probabilistic technique to retrieve passages from texts having a large size or heterogeneous semantic content. The proposed technique is independent of any supporting auxiliary data, such as text structure, topic organization, or pre-defined text segments. A Bayesian framework implements the probabilistic technique. We carried out experiments to compare the probabilistic technique to one based on a text segmentation algorithm. In particular, the probabilistic technique is more effective than, or as effective as, the one based on text segmentation at retrieving small passages. Results show that passage size affects passage retrieval performance. Results also suggest that text organization and query generality may have an impact on the difference in effectiveness between the two techniques.

Journal ArticleDOI
TL;DR: The elicitation purposes of search intermediaries included requests for information on search terms and strategies, database selection, search procedures, system's outputs and relevance of retrieved items, and users' knowledge and previous information-seeking.
Abstract: What elicitations or requests for information do search intermediaries make of users with information requests during an information retrieval (IR) interaction (including prior to and during an IR interaction), and for what purpose? These issues were investigated during a study of elicitations during 40 mediated IR interactions. A total of 1557 search intermediary elicitations were identified within 15 purpose categories. The elicitation purposes of search intermediaries included requests for information on search terms and strategies, database selection, search procedures, system's outputs and relevance of retrieved items, and users' knowledge and previous information-seeking. The transition sequences from one type of search intermediary elicitation to another were also investigated. These findings are compared with results from a study of end-user questions [Nahl D. & Tenopir C. (1996) Affective and cognitive searching behavior of novice and end-users of a full-text database. Journal of the American Society for Information Science, 47(4), 276–286] and a study of user elicitations of search intermediaries [Wu, Mei Mei (1993) Information interaction dialog: A study of patron elicitation in the information retrieval interaction. Ph.D. Dissertation. Rutgers University, New Brunswick. UMI Order Number 9320541] to develop an Information Retrieval Elicitation Task Model. Implications of the findings for the development and design of IR systems are also discussed.

Journal ArticleDOI
TL;DR: The indirect method of locating for indexing the likely explicit and implicit captions of photographs using multimodal clues including the specific words used, the syntax, the surrounding layout of the Web page, and the general appearance of the associated image shows a surprising degree of success.
Abstract: A variety of software tools index text of the World Wide Web, but little attention has been paid to the many photographs. We explore the indirect method of locating for indexing the likely explicit and implicit captions of photographs. We use multimodal clues including the specific words used, the syntax, the surrounding layout of the Web page, and the general appearance of the associated image. Our MARIE-3 system thus avoids full image processing and full natural-language processing, but shows a surprising degree of success. Experiments with a semi-random set of Web pages showed 41% recall with 41% precision for the task of distinguishing captions from other text, and 70% recall with 30% precision. This is much better than chance since actual captions were only 1.4% of the text on pages with photographs.

Journal ArticleDOI
TL;DR: It is shown that if the similarity function of a retrieval system leads to a (pseudo-) metric, the retrieval, the similarity and the Everett-Cater metric topology coincide and are generally different from the discrete topology.
Abstract: We show that if the similarity function of a retrieval system leads to a (pseudo-) metric, the retrieval, the similarity and the Everett-Cater metric topology coincide and are generally different from the discrete topology. This is the case if we represent documents by lists and use the Jaccard similarity measure. The corresponding metric is then the Marczewski-Steinhaus metric. We further study the special case of a one-element query space consisting of a single-item query.
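The Jaccard similarity and its corresponding distance, named in this abstract as the Marczewski-Steinhaus metric, can be written as J(A, B) = |A ∩ B| / |A ∪ B| and d(A, B) = 1 − J(A, B). The Python sketch below computes both and checks the triangle inequality on a few sample documents. It assumes each document is treated as a set of terms, and the function names and sample data are hypothetical.

```python
from itertools import permutations


def jaccard(a, b):
    """Jaccard similarity of two term collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty documents are identical.
        return 1.0
    return len(a & b) / len(union)


def ms_distance(a, b):
    """Distance 1 - J(a, b), i.e. |symmetric difference| / |union|."""
    return 1.0 - jaccard(a, b)


def satisfies_triangle_inequality(docs, tol=1e-12):
    """Check d(x, z) <= d(x, y) + d(y, z) for all ordered triples of docs."""
    return all(
        ms_distance(x, z) <= ms_distance(x, y) + ms_distance(y, z) + tol
        for x, y, z in permutations(docs, 3)
    )


if __name__ == "__main__":
    d1 = ["fuzzy", "image", "retrieval"]
    d2 = ["image", "retrieval", "metric"]
    d3 = ["topology"]
    print(jaccard(d1, d2), ms_distance(d1, d3))
    print(satisfies_triangle_inequality([d1, d2, d3]))
```

The check only spot-tests the metric property on sample data. It is not a proof, and it says nothing about the topological results of the paper.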

Journal ArticleDOI
TL;DR: A fuzzy image database model and a concept of fuzzy space are proposed and fuzzy query processing in fuzzy space and fuzzy indexing on complex fuzzy vectors are described.
Abstract: Image data are inherently visual. Descriptions of the visual characteristics of images are imprecise. Fuzzy retrieval of images stored in a feature-based image database is a natural means to access the data. Unfortunately, to the authors' knowledge, little work has been done on fuzzy image database models and fuzzy retrieval of feature-based image databases. In this paper, a fuzzy image database model and a concept of fuzzy space are proposed and fuzzy query processing in fuzzy space and fuzzy indexing on complex fuzzy vectors are described. An example image database, the computer-aided facial image inference and retrieval system (CAFIIR), is used for explanation throughout the paper.

Journal ArticleDOI
TL;DR: In this paper, the authors examined the notion that the principles underlying the procedure used by doctors to diagnose a patient's disease are useful in the design of intelligent information retrieval systems because the task of the doctor is conceptually similar to the computer (or human) intermediary's task in "intelligent information retrieval": to draw out, through interaction with the IR system, the user's query/information need.
Abstract: The research examines the notion that the principles underlying the procedure used by doctors to diagnose a patient's disease are useful in the design of “intelligent” IR systems because the task of the doctor is conceptually similar to the computer (or human) intermediary's task in “intelligent information retrieval”: to draw out, through interaction with the IR system, the user's query/information need. The research is reported in two parts. In Part II, an information retrieval tool is described which is based on “intelligent information retrieval” assumptions about the information user. In Part I, presented here, the theoretical framework for the tool is set out. This framework is borrowed from the diagnostic procedure currently used in medicine, called “differential diagnosis”. Because of the severe consequences that attend misdiagnosis, the operating principle in differential diagnosis is (1) to expand the uncertainty in the diagnosis situation so that all possible hypotheses and evidence are considered, then (2) to contract the uncertainty in a step by step fashion (from an examination of the patient's symptoms, through the patient's history and a physical (signs), to laboratory tests). The IR theories of Taylor, Kuhlthau and Belkin are used to demonstrate that these medical diagnosis procedures are already present in IR and that it is a viable model with which to design “intelligent” IR tools and systems.

Journal ArticleDOI
TL;DR: Although Boolean searching has been the standard model for commercial information retrieval systems for the past three decades, natural language input and partial-match weighted retrieval have recently emerged from the laboratories to become a searching option in several well-known online systems.
Abstract: Although Boolean searching has been the standard model for commercial information retrieval systems for the past three decades, natural language input and partial-match weighted retrieval have recently emerged from the laboratories to become a searching option in several well-known online systems. The purpose of this investigation is to compare the performance of one of these partial match options, LEXIS/NEXIS's Freestyle, with that of traditional Boolean retrieval. To create a context for the investigation, the definition of natural language and the natural language search engines currently available are discussed. Although the Boolean searches had better results more often than the Freestyle searches, neither mechanism demonstrated superior performance for every query. These results do not in any way prove the superiority of partial match techniques or exact match techniques, but they do suggest that different queries demand different techniques. Further study and analysis are needed to determine which elements of a query make it best suited for partial match or exact match retrieval.

Journal ArticleDOI
TL;DR: The objective is to create a tool that will draw out the undergraduate's query to the information system by taking the student through the task of doing the term paper and diagnose the student's information need by measuring his or her degree of topic integration.
Abstract: This article is part II of a three-part series. In part I we described the theoretical framework for developing an “intelligent information retrieval” tool, based on three principles from medical diagnosis theory. In part II, the present article, we outline a prototype of an “intelligent” IR tool, whose purpose is to facilitate information access for an undergraduate seeking information for a history term paper. Our objective is to create a tool that will (i) draw out the undergraduate's query to the information system by taking the student through the task of doing the term paper and (ii) diagnose the student's information need by measuring his or her degree of topic integration. The degree of integration indicates a class of information need. The classes of information need are based on Kuhlthau's six-stage information search process (ISP) model (each stage is a separate information need, demanding different information to satisfy it). The measurement instrument is based on (i) principles from Shannon's mathematical theory of communication and (ii) principles of uncertainty expansion and reduction from differential diagnosis theory.

Journal ArticleDOI
TL;DR: Noetica represents knowledge using a strongly-typed semantic network that allows for a consistency of representation that is not often found in “free” semantic networks and gives the ability to easily extend a knowledge model while retaining its semantics.
Abstract: Noetica is a tool for structuring knowledge about concepts and the relationships between them. It differs from typical information systems in that the knowledge it represents is abstract, highly connected and includes meta-knowledge (knowledge about knowledge). Noetica represents knowledge using a strongly-typed semantic network. By providing a rich type system it is possible to represent conceptual information using formalised structures. A class hierarchy provides a basic classification for all objects. This allows for a consistency of representation that is not often found in “free” semantic networks and gives the ability to easily extend a knowledge model while retaining its semantics. We also provide visualisation and query tools for this data model. Visualisation can be used to explore complete sets of link-classes, show paths while navigating through the database, or visualise the results of queries. Noetica supports goal-directed queries (a series of user-supplied goals that the system attempts to satisfy in sequence) and path-finding queries (where the system finds relationships between objects in the database by following links).

Journal ArticleDOI
TL;DR: In this article it is shown how recall and precision can be expressed using only retrievals, and different types of retrieval systems are investigated: both threshold systems and “close match” systems, and both “optimal” and "non-optimal" retrieval.
Abstract: Topologies for retrieval systems are generated by certain subsets, called retrievals. In this article we show how recall and precision can be expressed using only retrievals. Different types of retrieval systems are investigated: both threshold systems and “close match” systems, and both “optimal” and “non-optimal” retrieval. The relation with the hypergeometric and some “non-standard” distributions is highlighted.
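
For readers unfamiliar with the two measures named in this abstract, the standard set-based definitions can be sketched as follows. This is only an illustration under stated assumptions, not the article's topological construction: the function names, the score dictionary, and the threshold value are all hypothetical, and a "threshold system" is read here simply as one that retrieves every document whose score is at least a given cutoff.

```python
def threshold_retrieval(scores, threshold):
    """Return the ids of all documents whose score meets the threshold.

    `scores` maps a document id to a similarity score (hypothetical data).
    """
    return {doc for doc, score in scores.items() if score >= threshold}


def recall_precision(retrieved, relevant):
    """Compute (recall, precision) from a retrieved set and a relevant set.

    recall    = |retrieved AND relevant| / |relevant|
    precision = |retrieved AND relevant| / |retrieved|
    An empty denominator yields 0.0 by convention in this sketch.
    """
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: four scored documents, two of them relevant.
scores = {"d1": 0.9, "d2": 0.7, "d3": 0.4, "d4": 0.2}
relevant = {"d1", "d3"}

retrieved = threshold_retrieval(scores, threshold=0.5)
recall, precision = recall_precision(retrieved, relevant)
print(sorted(retrieved), recall, precision)
```

Lowering the threshold enlarges the retrieved set, which can raise recall while typically lowering precision; the sketch makes that trade-off easy to try with different cutoffs.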

Journal ArticleDOI
TL;DR: The design and construction of features of an automated query system which will assist pharmacologists who are not information specialists to access the Derwent Drug File (DDF) pharmacological database are reported.
Abstract: We report on the design and construction of features of an automated query system which will assist pharmacologists who are not information specialists to access the Derwent Drug File (DDF) pharmacological database. Our approach was to first elucidate those search skills of the search intermediary which might prove tractable to automation. Modules were then produced which assist in the three important subtasks of search statement generation, namely vocabulary selection, the choice of context indicators and query reformulation. Vocabulary selection is facilitated by approximate string matching, morphological analysis, browsing and menu searching. The context of the study, such as treatment or metabolism, is determined using a system of advisory menus. The task of query reformulation is performed using user feedback on retrieved documents, thesaurus relations between document index terms and term postings data. Use is made of diverse information sources, including electronic forms of printed search aids, a thesaurus and a medical dictionary. The system will be of use both to semicasual users and experienced intermediaries. Many of the ideas developed should prove transportable to domains other than pharmacology: the techniques for thesaurus manipulation are designed for use with any hierarchical thesaurus.