
Showing papers in "Journal of the Association for Information Science and Technology in 1995"


Journal ArticleDOI
TL;DR: A new approach to information science (IS): domain‐analysis, which states that the most fruitful horizon for IS is to study the knowledge‐domains as thought or discourse communities, which are parts of society's division of labor.
Abstract: This article is a programmatic article, which formulates a new approach to information science (IS): domain-analysis. This approach states that the most fruitful horizon for IS is to study the knowledge-domains as thought or discourse communities, which are parts of society's division of labor. The article is also a review article, providing a multidisciplinary description of research, illuminating this theoretical view. The first section presents contemporary research in IS, sharing the fundamental viewpoint that IS should be seen as a social rather than as a purely mental discipline. In addition, important predecessors to this view are mentioned and the possibilities as well as the limitations of their approaches are discussed. The second section describes recent transdisciplinary tendencies in the understanding of knowledge. In bordering disciplines to IS, such as educational research, psychology, linguistics, and the philosophy of science, an important new view of knowledge is appearing in the 1990s. This new view of knowledge stresses the social, ecological, and content-oriented nature of knowledge. This is opposed to the more formal, computer-like approaches that dominated in the 1980s. The third section compares domain-analysis to other major approaches in IS, such as the cognitive approach. The final section outlines important problems to be investigated, such as how different knowledge-domains affect the informational value of different subject access points in data bases. © 1995 John Wiley & Sons, Inc.

637 citations


Journal ArticleDOI
TL;DR: The author develops these events into complex frames and scripts, and describes in some detail the representation of an isa hierarchy using the statements in typicality logic.
Abstract: Causal statements are written e1 -> e2, with the intended interpretation that whenever e1 occurs, so does e2 (that is, e1 causes e2). However, the typicality operator can be used to represent the fact that "typically" e1 causes e2, but not always. For example, if e1 = strike-the-match and e2 = light-the-match, then typically e1 causes e2, but not when additional information such as e3 = match-is-wet is added. Thus, the preference criteria developed to deal with declarative statements can be used to retract implausible chains of causal effects given new information. The author then develops these events into complex frames and scripts, and describes in some detail the representation of an isa hierarchy using the statements in typicality logic. It should be noted here that this effort does not differ substantially from similar work on formalizing semantic networks except in the identification of statements in a given frame. For example, while every slot-value pair in a frame can be represented by a wff in predicate logic, a number of these formulas will be monotonic, while some will be typicality statements.

369 citations
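
To make the contrast concrete, here is a schematic rendering in default-logic style; the notation is ours, not necessarily the reviewed text's exact symbols.

```latex
\[
  e_1 \rightarrow e_2
  \qquad \text{(strict: whenever } e_1 \text{ occurs, } e_2 \text{ occurs)}
\]
\[
  e_1 \Rightarrow_t e_2
  \qquad \text{(typicality: } e_1 \text{ normally causes } e_2\text{)}
\]
\[
  \text{strike-the-match} \Rightarrow_t \text{light-the-match},
  \qquad
  \text{strike-the-match} \wedge \text{match-is-wet}
  \not\Rightarrow_t \text{light-the-match}
\]
```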


Journal ArticleDOI
TL;DR: This article presents three popular methods: the connectionist Hopfield network; the symbolic ID3/ID5R; and evolution-based genetic algorithms, which are promising in their ability to analyze user queries, identify users' information needs, and suggest alternatives for search.
Abstract: Information retrieval using probabilistic techniques has attracted significant attention on the part of researchers in information and computer science over the past few decades. In the 1980s, knowledge-based techniques also made an impressive contribution to “intelligent” information retrieval and indexing. More recently, information science researchers have turned to other newer artificial-intelligence-based inductive learning techniques including neural networks, symbolic learning, and genetic algorithms. These newer techniques, which are grounded on diverse paradigms, have provided great opportunities for researchers to enhance the information processing and retrieval capabilities of current information storage and retrieval systems. In this article, we first provide an overview of these newer techniques and their use in information science research. To familiarize readers with these techniques, we present three popular methods: the connectionist Hopfield network; the symbolic ID3/ID5R; and evolution-based genetic algorithms. We discuss their knowledge representations and algorithms in the context of information retrieval. Sample implementation and testing results from our own research are also provided for each technique. We believe these techniques are promising in their ability to analyze user queries, identify users' information needs, and suggest alternatives for search. With proper user-system interactions, these methods can greatly complement the prevailing full-text, keyword-based, probabilistic, and knowledge-based techniques. © 1995 John Wiley & Sons, Inc.

283 citations
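
To give a concrete flavor of the connectionist method, here is a minimal sketch of Hopfield-net spreading activation for suggesting related search terms. The term list, association weights, and sigmoid parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hopfield-style spreading activation over a term network (sketch).
# Terms are nodes; symmetric weights encode association strength
# (e.g., from co-occurrence analysis). All values here are toy data.

terms = ["information retrieval", "query expansion", "thesaurus", "indexing"]
W = np.array([
    [0.0, 0.7, 0.2, 0.1],
    [0.7, 0.0, 0.6, 0.1],
    [0.2, 0.6, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
])

def sigmoid(x, theta=0.5, slope=10.0):
    """Continuous threshold function applied at each relaxation step."""
    return 1.0 / (1.0 + np.exp(-slope * (x - theta)))

def suggest(seed, iters=50, eps=1e-4):
    mu = np.zeros(len(terms))
    mu[terms.index(seed)] = 1.0           # clamp the user's query term
    for _ in range(iters):
        nxt = sigmoid(W @ mu)             # parallel update of all nodes
        nxt[terms.index(seed)] = 1.0
        delta = np.abs(nxt - mu).sum()
        mu = nxt
        if delta < eps:                   # convergence: activations stable
            break
    return sorted(zip(terms, mu), key=lambda t: -t[1])

print(suggest("query expansion"))         # ranked list of related terms
```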


Journal ArticleDOI
TL;DR: Relationships between search topic and system structure are found, such that the most difficult topics on the SLC were those hard to locate in the hierarchy, and those most difficult on the keyword OPACs were hard to spell or required children to generate their own search terms.
Abstract: As we seek both to improve public school education in high technology areas and to link libraries and classrooms on the “information superhighway,” we need to understand more about children’s information searching abilities. We present results of four experiments conducted on four versions of the Science Library Catalog (SLC), a Dewey decimal-based hierarchical browsing system implemented in HyperCard without a keyboard. The experiments were conducted over a 3-year period at three sites, with four databases, and with comparisons to two different keyword online catalogs. Subjects were ethnically and culturally diverse children aged 9 through 12, with 32 to 34 children participating in each experiment. Children were provided explicit instruction and reference materials for the keyword systems but not for the SLC. The number of search topics matched was comparable across all systems and all experiments; search times were comparable, though they varied among the four SLC versions and between the two keyword online public access catalogs (OPACs). The SLC overall was robust to differences in age, sex, and computer experience. One of the keyword OPACs was subject to minor effects of age and computer experience; the other was not. We found relationships between search topic and system structure, such that the most difficult topics on the SLC were those hard to locate in the hierarchy, and those most difficult on the keyword OPACs were hard to spell or required children to generate their own search terms. The SLC approach overcomes problems with several searching features that are difficult for children in typical keyword OPAC systems: typing skills, spelling, vocabulary, and Boolean logic. Results have general implications for the design of information retrieval systems for children.

245 citations


Journal ArticleDOI
TL;DR: This article offers a framework for explicitly defining the types of information that clinicians use and the various states of information need on which different studies have focused, and moves beyond measures of the relevance of retrieved information to assessing the extent to which information systems help practitioners solve the clinical problems they face in practice.
Abstract: Quantitative estimates of physician information need reported in the literature vary by orders of magnitude. This article offers a framework for explicitly defining the types of information that clinicians use and the various states of information need on which different studies have focused. Published reports seem to be in agreement that physicians have many clinical questions in the course of patient care, but most of their questions are never answered. Examination of the clinical questions themselves reveals that they tend to be highly complex, embedded in the context of a unique patient's story. The heavy reliance of physicians on human sources of information has implications for the nature of their information needs, including the narrative structure of their knowledge and the need for more than information alone when solving clinical problems. Evaluation of clinical information systems must move beyond measures of the relevance of retrieved information to assessing the extent to which information systems help practitioners solve the clinical problems they face in practice. © 1995 John Wiley & Sons, Inc.

228 citations


Journal Article
TL;DR: In this article, the authors present evidence that for a considerable number of journals the values of the impact factors published in ISI's Journal Citation Reports (JCR) are inaccurate, particularly for several journals having a high impact factor.
Abstract: The Institute for Scientific Information (ISI) publishes annually listings of impact factors of scientific journals, based upon data extracted from the Science Citation Index (SCI). The impact factor of a journal is defined as the average number of citations given in a specific year to documents published in that journal in the two preceding years, divided by the number of “citable” documents published in that journal in those 2 years. This article presents evidence that for a considerable number of journals the values of the impact factors published in ISI's Journal Citation Reports (JCR) are inaccurate, particularly for several journals having a high impact factor. The inaccuracies are due to an inappropriate definition of citable documents. Document types not defined by ISI as citable (particularly letters and editorials) are actually cited and do contribute to the citation counts of a journal. We present empirical data in order to assess the degree of inaccuracy due to this phenomenon. For several journals the results are striking. We propose to calculate for a journal impact factors per type of document rather than one single impact factor as given currently in the JCR. © 1995 John Wiley & Sons, Inc.

219 citations
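
In symbols, the impact factor definition summarized above can be written as follows (notation ours):

```latex
\[
  \mathrm{IF}_{y}(j)
  \;=\;
  \frac{C_{y}(j;\,y\!-\!1) \;+\; C_{y}(j;\,y\!-\!2)}
       {N_{y-1}(j) \;+\; N_{y-2}(j)}
\]
% C_y(j; y-k): citations received in year y by documents published in journal j in year y-k
% N_{y-k}(j): number of "citable" documents published in journal j in year y-k
```

The inaccuracy documented in the article arises because the numerator counts citations to all document types, including letters and editorials, while the denominator counts only the document types ISI defines as citable.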


Journal ArticleDOI
TL;DR: It is suggested that the factors which influence classification decisions in an electronic environment were consistent with the factors that Kwasnik observed for physical documents in an office and may impact personal as well as organizational efficiency.
Abstract: Personal information management (PIM) systems are information systems developed by individuals for use in a work environment. Seven managers were interviewed to observe how their electronic documents were organized, stored, and retrieved. The purpose of the study was to investigate classification behavior both to identify the features of a PIM system and to suggest whether the factors which influence classification decisions in an electronic environment were consistent with the factors that Kwasnik observed for physical documents in an office. It is suggested that these behaviors may be influenced by the hardware and software environment and may impact personal as well as organizational efficiency. © 1995 John Wiley & Sons, Inc.

218 citations


Journal ArticleDOI
Yiyu Yao
TL;DR: A new measure of system performance is suggested based on the distance between user ranking and system ranking; it uses only the relative order of documents and therefore conforms to the valid use of an ordinal scale for measuring relevance.
Abstract: The notion of user preference is adopted for the representation, interpretation, and measurement of the relevance or usefulness of documents. User judgments on documents may be formally described by a weak order (i.e., user ranking) and measured using an ordinal scale. Within this framework, a new measure of system performance is suggested based on the distance between user ranking and system ranking. It uses only the relative order of documents and therefore conforms to the valid use of an ordinal scale for measuring relevance. It is also applicable to multilevel relevance judgments and ranked system output. The appropriateness of the proposed measure is demonstrated through an axiomatic approach. The inherent relationships between the new measure and many existing measures provide further supporting evidence.

187 citations
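
A minimal sketch of the idea of an ordinal, distance-based performance measure: compare the system's ranking with the user's multilevel judgments using only concordant versus discordant document pairs. The normalization below is an illustrative choice of ours, not the paper's exact measure.

```python
from itertools import combinations

# Ordinal comparison of a system ranking with user judgments (sketch).
# Only relative order matters: count document pairs the system orders
# against the user's preference. Ties in the user's weak order impose
# no constraint.

def ordinal_distance(user_level, system_rank):
    """user_level: doc -> relevance level (higher is better).
    system_rank: list of docs, best first. Returns the fraction of
    user-ordered pairs that the system ranks discordantly."""
    pos = {d: i for i, d in enumerate(system_rank)}
    concordant = discordant = 0
    for a, b in combinations(user_level, 2):
        if user_level[a] == user_level[b]:
            continue
        user_prefers_a = user_level[a] > user_level[b]
        system_prefers_a = pos[a] < pos[b]
        if user_prefers_a == system_prefers_a:
            concordant += 1
        else:
            discordant += 1
    total = concordant + discordant
    return discordant / total if total else 0.0   # 0.0 = perfect agreement

user = {"d1": 2, "d2": 2, "d3": 1, "d4": 0}       # multilevel judgments
print(ordinal_distance(user, ["d2", "d1", "d4", "d3"]))   # 0.2
```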


Journal ArticleDOI
TL;DR: An algorithmic approach to the automatic generation of thesauri for electronic community systems; experiments showed that the resulting worm thesaurus was an excellent “memory-jogging” device and that it supported learning and serendipitous browsing.
Abstract: This research reports an algorithmic approach to the automatic generation of thesauri for electronic community systems. The techniques used included term filtering, automatic indexing, and cluster analysis. The testbed for our research was the Worm Community System, which contains a comprehensive library of specialized community data and literature, currently in use by molecular biologists who study the nematode worm C. elegans. The resulting worm thesaurus included 2709 researchers' names, 798 gene names, 20 experimental methods, and 4302 subject descriptors. On average, each term had about 90 weighted neighboring terms indicating relevant concepts. The thesaurus was developed as an online search aid. We tested the worm thesaurus in an experiment with six worm researchers of varying degrees of expertise and background. The experiment showed that the thesaurus was an excellent “memory-jogging” device and that it supported learning and serendipitous browsing. Despite some occurrences of obvious noise, the system was useful in suggesting relevant concepts for the researchers' queries and it helped improve concept recall. With a simple browsing interface, an automatic thesaurus can become a useful tool for online search and can assist researchers in exploring and traversing a dynamic and complex electronic community system. © 1995 John Wiley & Sons, Inc.
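
A minimal sketch of the co-occurrence analysis at the heart of automatic thesaurus generation: terms that appear together in documents become weighted "neighboring terms". The toy documents and the Jaccard-style weighting are illustrative assumptions, not the exact formulas used for the worm thesaurus.

```python
from collections import Counter
from itertools import combinations

# Co-occurrence-based thesaurus construction (sketch).

docs = [
    ["unc-22", "muscle", "twitching", "c-elegans"],
    ["unc-22", "muscle", "myosin"],
    ["c-elegans", "lineage", "development"],
]

df = Counter(t for d in docs for t in set(d))     # document frequency
co = Counter()                                    # pair co-occurrence
for d in docs:
    for a, b in combinations(sorted(set(d)), 2):
        co[(a, b)] += 1

def neighbors(term, k=5):
    """Rank co-occurring terms by a Jaccard-style association weight."""
    scores = {}
    for (a, b), n in co.items():
        if term in (a, b):
            other = b if a == term else a
            scores[other] = n / (df[term] + df[other] - n)
    return sorted(scores.items(), key=lambda kv: -kv[1])[:k]

print(neighbors("unc-22"))   # e.g., [('muscle', 1.0), ('myosin', 0.5), ...]
```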

Journal ArticleDOI
TL;DR: A method of drawing index terms from text using n‐gram counts, achieving a function similar to, but more general than, a stemmer.
Abstract: A method of drawing index terms from text is presented. The approach uses no stop list, stemmer, or other language- and domain-specific component, allowing operation in any language or domain with only trivial modification. The method uses n-gram counts, achieving a function similar to, but more general than, a stemmer. The generated index terms, which the author calls “highlights,” are suitable for identifying the topic for perusal and selection. An extension is also described and demonstrated which selects index terms to represent a subset of documents, distinguishing them from the corpus. Some experimental results are presented, showing operation in English, Spanish, German, Georgian, Russian, and Japanese. © 1995 John Wiley & Sons, Inc.
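
A minimal sketch of the underlying idea, assuming a simple scoring rule of our own devising: words whose character n-grams are rare in the corpus make good "highlights", and morphological variants score alike because they share most of their n-grams, which is what gives the stemmer-like effect.

```python
import math
from collections import Counter

# Language-independent index-term extraction (sketch). Score each word
# by the average rarity of its character n-grams in the corpus; no stop
# list or stemmer is needed.

def ngrams(text, n=4):
    text = f" {text.lower()} "
    return [text[i:i + n] for i in range(max(len(text) - n + 1, 1))]

def highlights(doc, corpus_docs, n=4, k=3):
    corpus = Counter(g for d in corpus_docs for g in ngrams(d, n))
    total = sum(corpus.values())
    def rarity(word):
        gs = ngrams(word, n)
        # Add-one smoothing so unseen n-grams get a large finite score.
        return sum(-math.log((corpus[g] + 1) / (total + 1)) for g in gs) / len(gs)
    return sorted(set(doc.lower().split()), key=rarity, reverse=True)[:k]

corpus = ["the cat sat on the mat", "the dog sat on the log"]
print(highlights("the spectrometer sat on the bench", corpus))
```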

Journal ArticleDOI
TL;DR: The proposed algorithmic approach presents a viable option for efficiently traversing large‐scale, multiple thesauri (knowledge network) and can be adopted for automatic, multiple‐thesauri consultation.
Abstract: This paper presents a framework for knowledge discovery and concept exploration. In order to enhance the concept exploration capability of knowledge-based systems and to alleviate the limitations of the manual browsing approach, we have developed two spreading activation-based algorithms for concept exploration in large, heterogeneous networks of concepts (e.g., multiple thesauri). One algorithm, which is based on the symbolic AI paradigm, performs a conventional branch-and-bound search on a semantic net representation to identify other highly relevant concepts (a serial, optimal search process). The second algorithm, which is based on the neural network approach, executes the Hopfield net parallel relaxation and convergence process to identify “convergent” concepts for some initial queries (a parallel, heuristic search process). Both algorithms can be adopted for automatic, multiple-thesauri consultation. We tested these two algorithms on a large text-based knowledge network of about 13,000 nodes (terms) and 80,000 directed links in the area of computing technologies. This knowledge network was created from two external thesauri and one automatically generated thesaurus. We conducted experiments to compare the behaviors and performances of the two algorithms with the hypertext-like browsing process. Our experiment revealed that manual browsing achieved higher term recall but lower term precision in comparison to the algorithmic systems. However, it was also a much more laborious and cognitively demanding process. In document retrieval, there were no statistically significant differences in document recall and precision between the algorithms and the manual browsing process. In light of the effort required by the manual browsing process, our proposed algorithmic approach presents a viable option for efficiently traversing large-scale, multiple thesauri (knowledge network). © 1995 John Wiley & Sons, Inc.
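
A minimal sketch of the serial, branch-and-bound style search over a weighted concept network; the toy network, the multiplicative decay rule, and the pruning threshold are illustrative assumptions.

```python
import heapq

# Serial concept exploration over a weighted knowledge network (sketch):
# best-first expansion of the most activated concept, pruning paths whose
# accumulated activation drops below a bound.

links = {   # concept -> [(neighbor, link weight in (0, 1])]
    "neural networks": [("hopfield net", 0.9), ("machine learning", 0.8)],
    "machine learning": [("genetic algorithms", 0.7), ("ID3", 0.6)],
    "hopfield net": [("optimization", 0.5)],
}

def concept_search(seed, bound=0.3, k=5):
    best = {seed: 1.0}
    heap = [(-1.0, seed)]
    while heap:
        neg_act, node = heapq.heappop(heap)
        act = -neg_act
        for nbr, w in links.get(node, []):
            a = act * w                              # activation decays per hop
            if a > best.get(nbr, 0.0) and a > bound: # bound: prune weak paths
                best[nbr] = a
                heapq.heappush(heap, (-a, nbr))
    del best[seed]
    return sorted(best.items(), key=lambda kv: -kv[1])[:k]

print(concept_search("neural networks"))
```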

Journal ArticleDOI
TL;DR: Relevant information known to be available may go unused in research and development because of information overload or because its use is excluded by deliberate policy.
Abstract: Relevant information known to be available may go unused in research and development because of information overload or because its use is excluded by deliberate policy. Exclusion by policy shows that R&D is not, and does not aim at always being, efficient in the sense of fully reflecting all available relevant information. It may still be efficient relative to chosen strategies of information use and non-use. Overload may be a sign of strategic error, or may be accepted as routine and normal. © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: A series of experiments conducted using a specific implementation of an inference network based probabilistic retrieval model to study the retrieval effectiveness of combining manual and automatic index representations in queries and documents indicates that significant benefits in retrieval effectiveness can be obtained through combined representations.
Abstract: Results from research in information retrieval suggest that significant improvements in retrieval effectiveness could be obtained by combining results from multiple index representations and query strategies. Recently, an inference network based probabilistic retrieval model has been proposed, which views information retrieval as an evidential reasoning process in which multiple sources of evidence about document and query content are combined to estimate relevance probabilities. In this paper we report a series of experiments we conducted using a specific implementation of this model to study the retrieval effectiveness of combining manual and automatic index representations in queries and documents. The results indicate that significant benefits in retrieval effectiveness can be obtained through combined representations.
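
The combination step can be illustrated with the weighted-sum operator, one of the simple evidence-combination rules used in inference network retrieval; the belief values and weights below are illustrative.

```python
# Weighted-sum combination of beliefs from two representations of the
# same document (sketch). Belief values and weights are toy numbers.

def wsum(beliefs, weights):
    """Belief that the document satisfies the query, given per-
    representation beliefs combined by a weighted-sum link matrix."""
    return sum(b * w for b, w in zip(beliefs, weights)) / sum(weights)

manual_belief = 0.7      # from controlled-vocabulary (manual) indexing
automatic_belief = 0.5   # from automatic full-text indexing
print(wsum([manual_belief, automatic_belief], [1.0, 2.0]))  # ≈ 0.567
```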

Journal ArticleDOI
TL;DR: This is the first in a two‐part series on topical relevance relationships, where conceptual background is presented and empirical research is needed to determine the subset that actually account for topical relevance.
Abstract: This is the first in a two-part series on topical relevance relationships. Part I presents conceptual background; Part II reports on a related empirical study. Since topicality is a major factor in relevance, it is crucial to identify the range of relationship types that occur between the topics of user needs and the topics of texts relevant to those needs. We have generally assumed—without particular warrant—that a single relationship type obtains, i.e., that the two topics match. Evidence from the analysis of recall failures, citation analysis, and knowledge synthesis suggests otherwise: topical relevance relationships are not limited to topic matching relationships; to the contrary, in certain circumstances they are quite likely not to be matching relationships. Relationships are one of the two fundamental components of human conceptual systems. Attempts to classify them usually accept a distinction between relationships that occur by virtue of the combination of component units (syntagmatic relationships) and relationships that are built into the language system (paradigmatic relationships). Given the variety of relationship types previously identified, empirical research is needed to determine the subset that actually accounts for topical relevance. © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
M. Carl Drott
TL;DR: In the most widely accepted model of the growth of scientific literature, papers presented at conferences are seen as precursors leading to the creation of journal articles as discussed by the authors, but it is not the case that information science as a discipline has different publication patterns from other scholarly areas.
Abstract: In the most widely accepted model of the growth of scientific literature, papers presented at conferences are seen as precursors leading to the creation of journal articles. A sample of papers presented at an annual meeting of the American Society for Information Science led to journal articles at a rate much lower than would be expected from studies of other disciplines. On the other hand, a sample of articles from the Journal of the American Society for Information Science had rates of follow-up publication similar to values reported in the literature. This suggests that it is not the case that information science as a discipline has different publication patterns from other scholarly areas. A more complex model of the growth of scientific literature is proposed. Among the features of this model are the recognition that many new findings can be conveyed with relatively small amounts of information, the view that in complex systems novelty may not be as important as generalizability, and the emergence of new forms of dissemination, including electronic communication, self-publishing, and “group monographs.” © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: Investigations that entailed logging the behavior of thesaurus users and testing the effect of thesaurus-based query enhancement in an IR system using term weighting; the results call into question many of the assumptions made by previous researchers in this area.
Abstract: We discuss whether it is feasible to build intelligent rule- or weight-based algorithms into general-purpose software for interactive thesaurus navigation. We survey some approaches to the problem reported in the literature, particularly those involving the assignment of “link weights” in a thesaurus network, and point out some problems of both principle and practice. We then describe investigations which entailed logging the behavior of thesaurus users and testing the effect of thesaurus-based query enhancement in an IR system using term weighting, in an attempt to identify successful strategies to incorporate into automatic procedures. The results cause us to question many of the assumptions made by previous researchers in this area. © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: This article summarizes the evaluation studies that have been done with SAPHIRE, highlighting the lessons learned and laying out the challenges ahead to all medical information retrieval efforts.
Abstract: Information retrieval systems are being used increasingly in biomedical settings, but many problems still exist in indexing, retrieval, and evaluation. The SAPHIRE Project was undertaken to seek solutions for these problems. This article summarizes the evaluation studies that have been done with SAPHIRE, highlighting the lessons learned and laying out the challenges ahead to all medical information retrieval efforts. © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: This paper measured word-profile similarities between citing and cited publications, as well as between publications citing specific highly cited papers, and found that publications with a citation relationship are significantly more content-related than other publications.
Abstract: In this study we have measured word-profile similarities between citing and cited publications, as well as between publications citing specific highly cited papers. This «cognitive resemblance» was operationalized by different similarity measures using various kinds of terms and classification types. This study focuses on publications of internationally recognized chemical engineering scientists for the year 1982 as «source» publications, and subsequent publications (of other scientists) citing these publications. This study empirically shows that publications with a citation relationship are significantly more content-related than other publications. It also shows that highly cited documents are mainly cited within their own research area. Thus, at least in chemical engineering, publications sharing citations to the same highly cited article represent work of the same subject-matter research area. This is certainly not caused by the «narrowness» of the field, as we also show that there is a clear distribution of publications over many (sub)fields, so that chemical engineering can be characterized as a broad, interdisciplinary research field. A weak relationship between word-profiles and type of classification was found, and this relationship differs between various types of classification. Mapping based on correspondence analysis clearly visualizes content-related groups of citing and cited publications. Our findings are contrary to the results of some earlier studies and to opinions in circles of sociologists of science that authors refer to publications in a rather arbitrary way, mainly for adornment of their claims. These differences can be explained simply with statistical arguments.
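
Word-profile similarity of the kind measured here can be illustrated with plain cosine similarity over term-frequency profiles; the study compares several measures and term types, so this is one simple variant, with made-up text standing in for real profiles.

```python
import math
from collections import Counter

# Cosine similarity between the word profiles of a cited and a citing
# publication (sketch).

def profile(text):
    return Counter(text.lower().split())

def cosine(p, q):
    num = sum(p[t] * q[t] for t in set(p) & set(q))
    den = math.sqrt(sum(v * v for v in p.values())) \
        * math.sqrt(sum(v * v for v in q.values()))
    return num / den if den else 0.0

cited = profile("fluidized bed reactor heat transfer model")
citing = profile("heat transfer in a fluidized bed combustor")
print(round(cosine(cited, citing), 3))   # higher = more content-related
```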

Journal ArticleDOI
TL;DR: The state of electronic medical records, their advantage over existing paper records, the problems impeding their implementation, and concerns over their security and confidentiality are described.
Abstract: Despite the growth of computer technology in medicine, most medical encounters are still documented on paper medical records. The electronic medical record has numerous documented benefits, yet its use is still sparse. This article describes the state of electronic medical records, their advantage over existing paper records, the problems impeding their implementation, and concerns over their security and confidentiality. As noted in the introduction to this issue, the provision of medical care is an information-intensive activity. Yet in an era when most commercial transactions are automated for reasons of efficiency and accuracy, it is somewhat ironic that most recording of medical events is still done on paper. Despite a wealth of evidence that the electronic medical record (EMR) can save time and cost as well as lead to improved clinical outcomes and data security, most patient-related information is still recorded manually. This article describes efforts to computerize the medical record. Purpose of the Medical Record: The major goal of the medical record is to serve as a repository of the clinician's observations and analysis of the patient. Any clinician's recorded interactions with a patient usually begin with the history and physical examination. The history typically contains the patient's chief complaint (i.e., chest pain, skin rash), history of the present illness (other pertinent symptoms related to the chief complaint), past medical history, social history, family history, and review of systems (other symptoms unrelated to the present illness). The physical examination contains an inventory of physical findings, such as abdominal tenderness or an enlarged lymph node. The history and physical are usually followed by an assessment which usually adheres to the problem-oriented approach advocated by Weed (1969), with each problem analyzed and given a plan for diagnosis and/or treatment.

Journal ArticleDOI
TL;DR: Current standards and criteria for evaluating electronic publications in the context of the promotion and tenure (P&T) process are reviewed.
Abstract: The prodigious growth in electronic publishing, from listservs to peer-reviewed e-journals, may create a structural tension in the academy's evaluation system, even if only temporarily, as established norms and behaviors from the print era are challenged by new modalities. In view of the prevailing speculation, we decided to review current standards and criteria for evaluating electronic publications in the context of the promotion and tenure (P&T) process.

Journal ArticleDOI
TL;DR: This paper describes a new indexing algorithm designed to create large compressed inverted indexes in situ that makes use of simple compression codes for the positive integers and an in-place external multi-way mergesort.
Abstract: An inverted index stores, for each term that appears in a collection of documents, a list of document numbers containing that term. Such an index is indispensable when Boolean or informal ranked queries are to be answered. Construction of the index is, however, a nontrivial task. Simple methods using in-memory data structures cannot be used for large collections because they require too much random access storage, and traditional disk-based methods require large amounts of temporary file space. This paper describes a new indexing algorithm designed to create large compressed inverted indexes in situ. It makes use of simple compression codes for the positive integers and an in-place external multi-way mergesort. The new technique has been used to invert a two-gigabyte text collection in under 4 hours, using less than 40 megabytes of temporary disk space, and less than 20 megabytes of main memory. © 1995 John Wiley & Sons, Inc.
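
The compression side of the algorithm can be illustrated with Elias gamma coding of document gaps, a representative simple code for the positive integers of the kind the technique relies on (the in-place external multi-way mergesort is not sketched here).

```python
# Elias gamma coding of document gaps in an inverted list (sketch).
# Small gaps, which dominate for frequent terms, get short codes.

def gamma_encode(n):
    """Unary length prefix followed by the binary form of n (n >= 1)."""
    assert n >= 1
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary

def encode_postings(doc_ids):
    """Store the first document number, then successive gaps."""
    gaps = [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]
    return "".join(gamma_encode(g) for g in gaps)

postings = [3, 7, 8, 15, 16]   # documents containing some term
bits = encode_postings(postings)
print(bits, f"({len(bits)} bits vs. {32 * len(postings)} bits uncompressed)")
```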

Journal ArticleDOI
TL;DR: A four-round Delphi study with a panel of 25 library media specialists from 22 secondary schools across the United States was aimed at identifying high school students' most significant difficulties in using online and CD-ROM databases; suggesting design elements and curricular and instructional strategies for making these tools more valuable as learning resources; and determining the most significant policy issues related to the use of electronic information resources in schools as mentioned in this paper.
Abstract: A four-round Delphi study with a panel of 25 library media specialists (LMSs) from 22 secondary schools across the United States was aimed at: (1) identifying high school students' most significant difficulties in using online and CD-ROM databases; (2) suggesting design elements and curricular and instructional strategies for making these tools more valuable as learning resources; and (3) determining the most significant policy issues related to the use of electronic information resources in schools. Findings are based on (1) panelists' ratings of 234 items on Likert-type scales and (2) panelists' selections and rankings of a subset of items from that larger set. The conceptual framework for the study was derived from instructional systems design (ISD), a discipline outside the traditional focus of information science research, but one that has considerable potential for offering additional insights to the field. The results confirm that the major issues related to schools' use of online and CD-ROM databases involve their role in students' development of the higher-order thinking skills necessary to plan, design, and conduct competent and credible research in the electronic information age. © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: ACTS is an automatic Chinese text segmentation prototype for Chinese full text retrieval that applies partial syntactic analysis—the analysis of morphemes, words, and phrases.
Abstract: Text segmentation is a prerequisite for text retrieval systems. Chinese texts cannot be readily segmented into words because they do not contain word boundaries. ACTS is an automatic Chinese text segmentation prototype for Chinese full text retrieval. It applies partial syntactic analysis—the analysis of morphemes, words, and phrases. The idea was originally largely inspired by experiments on English morpheme and phrase-analysis-based text retrieval, which are particularly germane to Chinese, because neither Chinese nor English texts have morpheme and phrase boundaries. ACTS is built on the hypothesis that Chinese words and phrases exceeding two characters can be characterized by a grammar that describes the concatenation behavior of the morphological and syntactic categories of their formatives. This is examined through three procedures: (1) Segmentation—texts are divided into one and two character segments by matching against a dictionary; (2) Category disambiguation—the syntactic categories of segments are determined according to context; (3) Parsing—the segments are analyzed based on the grammar, and subsequently combined into compound and complex words for indexing and retrieval. The experimental results, based on a small sample of 30 texts, show that most significant words and phrases in these texts can be extracted with a high degree of accuracy. © 1995 John Wiley & Sons, Inc.
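
The dictionary-matching step can be illustrated with greedy forward maximum matching, a standard segmentation technique; ACTS's actual procedure (one- and two-character segmentation plus category disambiguation and parsing) is richer, and the tiny dictionary here is an illustrative assumption.

```python
# Greedy forward maximum matching against a word dictionary (sketch).
# Prefer the longest dictionary entry starting at the current position;
# fall back to a single character when nothing matches.

DICT = {"中国", "科学", "科学院", "全文", "检索"}

def forward_max_match(text, max_len=3):
    segments, i = [], 0
    while i < len(text):
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if piece in DICT or n == 1:
                segments.append(piece)
                i += n
                break
    return segments

print(forward_max_match("中国科学院全文检索"))   # ['中国', '科学院', '全文', '检索']
```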

Journal ArticleDOI
TL;DR: Search results, selection of search terms, and efficiency were found to be related to database‐assisted problem‐solving performance.
Abstract: The relationship between personal knowledge in a domain and searching proficiency in that domain, and the relationship between searching proficiency and database-assisted problem-solving performance were the foci of this study. On four assessment occasions over a 2-year period, 36 medical students solved problems in three biomedical domains (bacteriology, pharmacology, and toxicology) with and without assistance from a factual database in the relevant domain. There was little evidence of any relationship between personal domain knowledge and searching proficiency (i.e., search results, selection of search terms, improvement in selection of search terms over the course of the search, and efficiency). Search results, selection of search terms, and efficiency were found to be related to database-assisted problem-solving performance. © 1995 John Wiley & Sons, Inc.


Journal ArticleDOI
TL;DR: The children in the text plus animation and captions group were more successful at identifying the major steps in the procedure and at enacting that procedure whereas the children who read the text only experienced the most difficulty in performing the procedure.
Abstract: We report the results from the second phase of a cognitive study of multimedia and its effect on children's learning. A sample of 71 children (12-year-olds) drawn from three primary schools viewed a procedural text that included a four-sequence animation with captions on how to find south using the sun's shadow. This multimedia sequence was adapted from a section within Compton's Multimedia Encyclopedia using Apple QuickTime. The children were divided into four groups, each of which viewed different media combinations: text only; text plus animation; text plus captions plus animation; and captions with animation. Shortly afterwards the children were asked to undertake two tasks: To recall in their own words what they had learned, and also to enact how they would find south using a model specially designed for this purpose. No significant differences were found among the groups regarding literal recall of what they had read and seen, or in their ability to draw inferences from it. The children in the text plus animation and captions group, however, were more successful at identifying the major steps in the procedure and at enacting that procedure whereas the children who read the text only experienced the most difficulty in performing the procedure. © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: An empirical investigation of the question, “What relationship types actually account for topical relevance?” reveals that topical relevance relationships include a large variety of relationships, only some of which are matching relationships.
Abstract: The first part of this two-part series argues that the assumption of topic matching between user needs and texts topically relevant to those needs is often erroneous. This second part reports an empirical investigation of the question, “What relationship types actually account for topical relevance?” In order to avoid the bias of topic-matching search strategies, user needs are back-generated from a randomly selected subset of the subject headings employed in a user-oriented topical concordance. The corresponding relevant texts are those indicated in the concordance under the subject heading. The study compares the topics of the user needs with the topics of the relevant texts to determine the relationships between them. This examination reveals that topical relevance relationships include a large variety of relationships, only some of which are matching relationships. Others are examples of paradigmatic relationships or syntagmatic relationships. Indeed, there appear to be no constraints on the kinds of relationships that can function as topical relevance relationships. They are distinguishable from other types of relationships only on functional grounds. © 1995 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: The research used Hulme's concept of literary warrant and Kernan's description of the interactive processes of literature and literary scholarship to justify quantifying existing subject indexing in existing bibliographic records as a first step in the domain analysis of a field.
Abstract: This article reports research that used descriptor subfields in MLA Bibliography online to quantify literary warrant in the domain of scholarly work about fiction (i.e., “fiction studies”). The research used Hulme's concept of literary warrant and Kernan's description of the interactive processes of literature and literary scholarship to justify quantifying existing subject indexing in existing bibliographic records as a first step in the domain analysis of a field. It was found that certain of the MLA Bibliography online's descriptor subfields and certain of the descriptor terms within those subfields occurred more often than would occur by chance. The techniques used in the research might be extended to domain analysis of other fields. Use of the methodology might improve the ability to evaluate existing, and to design future, subject access systems. © 1995 John Wiley & Sons, Inc.