
Showing papers in "Journal of the Association for Information Science and Technology in 1990"


Journal ArticleDOI
TL;DR: A new method for automatic indexing and retrieval that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Abstract: A new method for automatic indexing and retrieval is described. The approach is to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries. The particular technique used is singular-value decomposition, in which a large term-by-document matrix is decomposed into a set of ca. 100 orthogonal factors from which the original matrix can be approximated by linear combination. Documents are represented by ca. 100-item vectors of factor weights. Queries are represented as pseudo-document vectors formed from weighted combinations of terms, and documents with supra-threshold cosine values are returned. Initial tests find this completely automatic method for retrieval to be promising.

12,443 citations
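The decompose-and-fold-in procedure the abstract describes can be sketched in a few lines. This is a toy illustration only: the matrix, the query, and k = 2 are stand-ins for the ca. 100-factor setup used in the paper.

```python
import numpy as np

# Toy term-by-document matrix (rows = terms, columns = documents).
# Terms and documents are illustrative, not from the paper.
A = np.array([
    [1, 1, 0],   # "ship"   appears in docs 0, 1
    [1, 0, 0],   # "boat"   appears in doc 0
    [0, 1, 1],   # "ocean"  appears in docs 1, 2
    [0, 0, 1],   # "voyage" appears in doc 2
], dtype=float)

# Truncated SVD: keep k orthogonal factors (the paper keeps ca. 100).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# Documents as k-item vectors of factor weights.
doc_vecs = (np.diag(sk) @ Vtk).T              # shape (n_docs, k)

# A query is folded in as a pseudo-document vector: q_hat = q^T U_k S_k^-1.
q = np.array([1, 0, 1, 0], dtype=float)       # query: "ship ocean"
q_vec = q @ Uk @ np.diag(1.0 / sk)

# Rank documents by cosine similarity with the query vector.
cos = doc_vecs @ q_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_vec))
ranked = np.argsort(-cos)
```

Document 1 is the only one containing both query terms, so it comes out on top; in the full method, documents above a cosine threshold would be returned.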


Journal ArticleDOI
TL;DR: An overview of current data gathering and analytical techniques for ACA (author cocitation analysis) is presented, focusing primarily on a set of procedures that the research group at Drexel has found useful and discussing the range of choices possible at each step of the process.
Abstract: An overview of current data gathering and analytical techniques for ACA (author cocitation analysis) is presented. It focuses primarily on a set of procedures that the research group at Drexel has found useful and discusses the range of choices possible at each step of the process. For illustration, the results of clustering, mapping, and factor-analyzing cocited authors from the subdiscipline of macroeconomics, 1972–1977, are presented. The bibliography lists more detailed presentations of the methods and fuller examples of their uses.

863 citations



Journal ArticleDOI
TL;DR: Describes how control is generated and maintained at all levels of organization, including the roles of steam power in increasing the speed, volume, and force of shipments and the fatal train accidents that arose while control problems went unsolved.
Abstract: James Beniger traces how control is generated and maintained at all levels of organization as technology and machines advance. Many fascinating topics are covered: the roles of steam power in increasing the speed, volume, and force of shipments; the fatal train accidents that arose while the resulting control problems went unsolved; and the contrast with agricultural societies, which relied on natural processes. Throughout, the rise of the information society is linked with science and with business crises.

250 citations



Journal ArticleDOI
TL;DR: In this paper, the authors identify the subspecialties that constitute the foundations of current research in organizational behavior and organization theory, using citations from the Social Sciences Citation Index from 1972 to 1984.
Abstract: Using citations that appeared in published research from 1972 to 1984 (from the Social Sciences Citation Index), the study identifies the subspecialties that constitute the foundations of current research. It focuses on two fields: organizational behavior and organization theory. The matrix of raw cocitation counts was factor-analyzed using a principal components analysis with a varimax rotation.

151 citations


Journal ArticleDOI
TL;DR: To show the feasibility of statistically based ranked retrieval of records using keywords, research was done to produce very fast search techniques using these ranking algorithms and to test the results against large databases with many end users.
Abstract: Statistically based ranked retrieval of records using keywords provides many advantages over traditional Boolean retrieval methods, especially for end users. This approach to retrieval, however, has not seen widespread use in large operational retrieval systems. To show the feasibility of this retrieval methodology, research was done to produce very fast search techniques using these ranking algorithms, and then to test the results against large databases with many end users. The results show not only response times on the order of 1 and 1/2 seconds for 806 megabytes of text, but also very favorable user reaction. Novice users were able to consistently obtain good search results after 5 minutes of training. Additional work was done to devise new indexing techniques to create inverted files for large databases using a minicomputer. These techniques use no sorting, require a working space of only about 20% of the size of the input text, and produce indices that are about 14% of the input text size. © 1990 John Wiley & Sons, Inc.

142 citations
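The core of statistically based ranking over an inverted file can be sketched as follows. This is a generic tf-idf accumulator sketch, not the paper's actual algorithms or indexing techniques; the corpus and weighting formula are illustrative.

```python
import math
from collections import Counter, defaultdict

# Toy corpus; documents and the query are illustrative, not from the study.
docs = {
    0: "fast ranked retrieval of text records",
    1: "boolean retrieval with keywords",
    2: "indexing large text databases on a minicomputer",
}

# Build an inverted file: term -> list of (doc_id, term_frequency).
inverted = defaultdict(list)
for doc_id, text in docs.items():
    for term, tf in Counter(text.split()).items():
        inverted[term].append((doc_id, tf))

def search(query):
    """Rank documents with a simple tf-idf accumulator (one common
    statistical scheme; the paper's exact formula is not reproduced)."""
    n = len(docs)
    scores = defaultdict(float)
    for term in query.split():
        postings = inverted.get(term, [])
        if not postings:
            continue
        idf = math.log(n / len(postings))   # rarer terms weigh more
        for doc_id, tf in postings:
            scores[doc_id] += tf * idf
    return sorted(scores.items(), key=lambda kv: -kv[1])

results = search("ranked retrieval of text")
```

Because ranking only touches the postings lists of the query terms, response time scales with query length rather than database size, which is what makes sub-two-second searches over hundreds of megabytes plausible.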


Journal ArticleDOI
TL;DR: In this article, a test of the utility of author cocitation analysis for empirically deriving groupings of intellectual leaders in a scientific subspecialty is provided, together with interpretive caveats.
Abstract: This article provides a test of the utility of author cocitation analysis for empirically deriving groupings of intellectual leaders in a scientific subspecialty. Methodologically, it partially replicates the procedure employed by White and Griffith (1981). It provides a partial test of the underlying assumptions of the method, its veracity, and its interpretive and substantive application. Interpretive caveats are given, and additional methodological and statistical manipulations are proposed to enhance the future utility of author cocitation analysis to map prevailing intellectual structures within science.

111 citations



Journal ArticleDOI
TL;DR: In this article, a model is proposed that makes plausible the possibility that, in spite of marked differences in their appearance, these distributions are variants of a single distribution; heuristic arguments are then given that this is indeed the case.
Abstract: This article is the first of a two-part series on the informetric distributions, a family of regularities found to describe a wide range of phenomena both within and outside of the information sciences. This article introduces the basic forms these regularities take. A model is proposed that makes plausible the possibility that, in spite of marked differences in their appearance, these distributions are variants of a single distribution; heuristic arguments are then given that this is indeed the case. That a single distribution should describe such a wide range of phenomena, often in areas where the existence of any simple description is surprising, suggests that one should look for explanations not in terms of causal models, but in terms of the properties of the single informetric distribution. Some of the consequences of this conclusion are broached in this article, and explored more carefully in Part II. © 1990 John Wiley & Sons, Inc.

80 citations


Journal ArticleDOI
TL;DR: The results demonstrate that searcher success is markedly improved by greatly increasing the number of names per object.
Abstract: The implications of index‐word selection strategies for user success in interactive searching were investigated in two experiments. People were asked to find target information objects using a simple interactive keyword information retrieval system in which the number of referent terms assigned to each object was systematically varied. The results demonstrate that searcher success is markedly improved by greatly increasing the number of names per object. © 1990 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: This study is based on the notions of user preference and an acceptable ranking strategy, which enable the adoption of a gradient descent algorithm that formulates the query vector by an inductive process; the method has the added advantages of being applicable both to nonbinary document representations and to user preference relations inducing more than two classes.
Abstract: The subject of query formulation is analyzed within the framework of adaptive linear models. Our study is based on the notions of user preference and an acceptable ranking strategy. Such an approach enables us to adopt a gradient descent algorithm to formulate the query vector by an inductive process. We also present a critical analysis of the existing relevance feedback and the probabilistic approaches. It is shown that Rocchio's method is a special case of our linear model and the independence assumption may be stronger than required for a linear system. Our method has the added advantages that it is applicable to both nonbinary document representation and a user preference relation inducing more than two classes. © 1990 John Wiley & Sons, Inc.
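The idea of inductively formulating a query vector from a user preference relation can be sketched with a perceptron-style gradient step: whenever a preferred document does not outscore a non-preferred one, the query moves toward their difference. The update rule, documents, and preferences below are illustrative; the article's acceptable-ranking formulation is not reproduced.

```python
# Learn a query vector q so that documents the user prefers score
# higher (by dot product) than those they do not.
docs = [
    [1.0, 0.0, 1.0],   # doc 0
    [0.0, 1.0, 1.0],   # doc 1
    [1.0, 1.0, 0.0],   # doc 2
]
# User preference relation: doc 0 should rank above docs 1 and 2.
prefs = [(0, 1), (0, 2)]   # (preferred, non-preferred) pairs

q = [0.0, 0.0, 0.0]
lr = 0.5                   # gradient step size
for _ in range(100):
    for better, worse in prefs:
        # Margin by which the preferred document outscores the other.
        margin = sum(qi * (b - w) for qi, b, w in
                     zip(q, docs[better], docs[worse]))
        if margin <= 0:    # ranking violated: step q toward the difference
            q = [qi + lr * (b - w) for qi, b, w in
                 zip(q, docs[better], docs[worse])]

scores = [sum(qi * di for qi, di in zip(q, d)) for d in docs]
```

Note that nothing here requires the document vectors to be binary, and the same pairwise updates work for preference relations with more than two classes, which is the generality the abstract claims over Rocchio-style feedback.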

Journal ArticleDOI
TL;DR: Egghe as mentioned in this paper further developed the theory of Bradford's law by deriving a theoretical formula for the Bradford multiplier and for the number of items, produced by the most productive source in every Bradford group.
Abstract: In a previous article (L. Egghe, JASIS 37(4), p. 246–255, 1986) we further developed the theory of Bradford's law by deriving a theoretical formula for the Bradford multiplier and for the number of items produced by the most productive source in every Bradford group. In this article we apply these results to some classical bibliographies, for which we determine the underlying law of Leimkuhler and also different Bradford groupings. We also extend the above-mentioned theory so that it applies to incomplete bibliographies (such as citation tables or bibliographies truncated before the Groos droop). Finally, this extension also has an application in determining the size and other properties of the complete, unknown bibliography, based on the incomplete one. © 1990 John Wiley & Sons, Inc.
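A Bradford grouping itself is easy to sketch: journals ranked by productivity are split into p zones each yielding roughly the same number of articles, and Bradford's law says the zone sizes then grow geometrically by the multiplier k. The data below are synthetic, built to follow the law exactly; Egghe's theoretical formula for k is not reproduced here.

```python
# Articles per journal, ranked by productivity. One core journal with 8
# articles, two with 4, four with 2: each zone yields 8 articles, and
# zone sizes are 1, 2, 4 (multiplier k = 2).
productions = [8, 4, 4, 2, 2, 2, 2]

p = 3                                  # number of Bradford groups
total = sum(productions)
per_zone = total / p                   # articles each zone should yield

zones, current, count = [], 0, 0
for articles in productions:
    current += articles
    count += 1
    if current >= per_zone:            # zone has yielded its share
        zones.append(count)
        current, count = 0, 0
if count:                              # any leftover journals
    zones.append(count)

# Empirical Bradford multipliers between consecutive zones.
multipliers = [zones[i + 1] / zones[i] for i in range(len(zones) - 1)]
```

With real bibliographies the zone boundaries rarely fall exactly on a journal, so the empirical multipliers only approximate a constant k; handling that (and incomplete bibliographies) is what the theory in the article addresses.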

Journal ArticleDOI
TL;DR: A study was conducted to determine the bibliographic characteristics of uncited papers in the biomedical literature; it was found that certain bibliographic characteristics differentiate cited from uncited papers.
Abstract: The reception of a scientific article is measured in part by its citedness in the subsequent literature. Although most papers are eventually cited, a number of papers in a variety of scientific disciplines have never been cited. A study was conducted to determine the bibliographic characteristics of uncited papers in the biomedical literature. It was found that certain bibliographic characteristics differentiate cited from uncited papers.

Journal ArticleDOI
TL;DR: In this paper, the notion of resilience to ambiguity is made precise and a number of simple examples of resilience, taken from the social sciences, are discussed, and applied to the informetric distributions themselves.
Abstract: This article continues the discussion of the informetric distributions begun in a companion paper. In the earlier paper, the informetric distributions were introduced and found to be variants of a single distribution. It was suggested that this might be explained in terms of that distribution being unusually resilient to ambiguity. In this paper the notion of resilience to ambiguity is made precise. By way of introduction, a number of simple examples of resilience, taken from the social sciences, are discussed. This approach is then applied to the informetric distributions themselves. It is argued that the form taken by the informetric regularities does indeed make them insensitive to the wide range of ambiguities that occur when measuring the output of social activity, and that this ubiquitous form is unusual in having this property. © 1990 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: Both computational results based on a mathematical model and experimental results using a library database show that the multiorganizational scheme, proposed for indexing very large databases containing hundreds of thousands or possibly millions of records, provides effective access to large text databases.
Abstract: A new signature file method for accessing information from large databases containing both formatted and free text data is presented. The new method, called the multiorganizational scheme, is proposed for indexing very large databases containing hundreds of thousands or possibly millions of records. With this method, records are grouped into blocks and signatures are formed for each block of records. These signatures are stored in a block descriptor file using a storage device called the bit slice organization. By forming multiple block descriptor files, each based on a possibly different grouping of records into blocks, it is possible to efficiently determine record matches on query. Both computational results based on a mathematical model and experimental results using a library database are presented. These results show that the method provides effective access to large text databases. © 1990 John Wiley & Sons, Inc.
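The block-signature idea can be sketched with superimposed coding: each word hashes to a few bit positions, a block's signature is the OR of its words' signatures, and a query word can rule out any block whose signature lacks its bits (false positives are possible, false drops are not). Signature width, bits per word, and the data are illustrative; the bit-slice storage and multiple block descriptor files of the scheme are not shown.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (illustrative)
BITS_PER_WORD = 3   # bits set per word (illustrative)

def word_signature(word):
    """Hash a word to BITS_PER_WORD bit positions, OR-ed together."""
    sig = 0
    for i in range(BITS_PER_WORD):
        h = hashlib.md5(f"{word}:{i}".encode()).digest()
        sig |= 1 << (int.from_bytes(h[:4], "big") % SIG_BITS)
    return sig

def block_signature(records):
    """A block's signature superimposes the signatures of all its words."""
    sig = 0
    for record in records:
        for word in record.split():
            sig |= word_signature(word)
    return sig

blocks = [
    ["signature file methods", "free text data"],
    ["library database records", "formatted fields"],
]
block_sigs = [block_signature(b) for b in blocks]

def candidate_blocks(query_word):
    """Blocks that may contain the word: all of its bits must be set."""
    qsig = word_signature(query_word)
    return [i for i, sig in enumerate(block_sigs) if sig & qsig == qsig]
```

Only the candidate blocks need to be read and verified against the actual records, which is the source of the method's efficiency on large files.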

Journal ArticleDOI
TL;DR: The test results indicate that, with some reservations, this theory of scaling is applicable to documents, and this finding is further applied to the construction of test collections for Information Retrieval research that could more sensitively measure retrieval system alterations through the use of documents scaled not merely by relevance, but rather, by preference.
Abstract: The relationship between scaling practice and scaling theory remains a controversial problem in Information Retrieval research and experimentation. This article reports a test of a general theory of scaling, i.e., Simple Scalability, applied to the stimulus domain of documents represented as abstracts. The significance of Simple Scalability is that it implies three important properties of scales: transitivity, substitutability, and independence. The test results indicate that, with some reservations, this theory of scaling is applicable to documents. This finding is further applied to the construction of test collections for Information Retrieval research that could more sensitively measure retrieval system alterations through the use of documents scaled not merely by relevance, but rather, by preference. © 1990 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: Two programs are described, INDEX and INDEXD, which locate repeated phrases in a document, gather statistical information about them, and rank them according to their value as index phrases, showing promise as the basis for a sophisticated conceptual indexing system.
Abstract: In recent years researchers have become increasingly convinced that the performance of information retrieval systems can be greatly enhanced by the use of key phrases for automatic conceptual document indexing and retrieval. In this article we describe two programs, INDEX and INDEXD, which locate repeated phrases in a document, gather statistical information about them, and rank them according to their value as index phrases. The programs show promise as the basis for a sophisticated conceptual indexing system. The simpler program, INDEX, ranks phrases in such a way that frequently occurring phrases which contain several frequently occurring words are given a high ranking. INDEXD is an extension of INDEX which incorporates a dictionary for stemming, weighting of words and validation of syntax of output phrases. Sample output of both programs is included, and we discuss plans to combine INDEXD with linguistic and artificial intelligence techniques to provide a general conceptual phrase-indexing system that can incorporate expert knowledge about a given application area. © 1990 John Wiley & Sons, Inc.
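The heart of the simpler program, ranking frequently occurring phrases built from frequently occurring words, can be sketched as below. The scoring formula, text, and phrase length are illustrative; INDEX's actual statistics and INDEXD's dictionary-based stemming are not reproduced.

```python
from collections import Counter

# Illustrative document text, not from the article.
text = ("information retrieval systems index documents and "
        "information retrieval systems rank index phrases for "
        "retrieval systems")

words = text.split()
word_freq = Counter(words)                 # frequency of each word
phrases = Counter(zip(words, words[1:]))   # adjacent two-word phrases

def score(phrase, freq):
    # Phrase frequency weighted by the frequencies of its words, so
    # frequent phrases of frequent words rank highest.
    return freq * sum(word_freq[w] for w in phrase)

ranked = sorted(
    ((p, f) for p, f in phrases.items() if f > 1),   # repeated phrases only
    key=lambda pf: -score(*pf))

top_phrase = " ".join(ranked[0][0])
```

A dictionary-backed extension in the spirit of INDEXD would stem words before counting and filter output phrases by syntax, leaving the ranking logic unchanged.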

Journal ArticleDOI
TL;DR: In this paper, it is argued that science is a self-correcting system and errors that are inadvertent or deliberate will be corrected over time, and that two important checks on quality control (peer review and the replication of results) are more difficult to accomplish effectively.
Abstract: Misrepresentation in research is clearly a problem today. In the environment of big science, with accelerating competition, increased rewards for discovery and uncertainties of long-range outcome, two important checks on quality control—peer review and the replication of results—are more difficult to accomplish effectively. Additionally, the new information technology now enables scientists to communicate outside established channels where their work is judged. However, it is argued that science is a self-correcting system and errors that are inadvertent or deliberate will be corrected over time. © 1990 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: This study was designed to test hypotheses derived from a psychological theory of remembering known as retrieval by reformulation, and to observe behavioral differences while searching the two catalogs.
Abstract: Twenty subjects were assigned information problems to solve through searching a university card catalog and twenty were assigned the same problems to solve in a comparable online catalog. The study was designed to test hypotheses derived from a psychological theory of remembering known as retrieval by reformulation, and to observe behavioral differences while searching the two catalogs. Verbal protocols were used to identify reformulations and to operationalize further the theoretical construct “reformulation.” Greater perseverance and more frequent search reformulations were associated with the online catalog, while larger retrieval sets and more favorable search assessments were associated with the card catalog. No significant differences were found on most attitudinal measures. Post hoc analyses examined include overlap of sets of retrieved items and variance associated with the use of test questions. © 1990 John Wiley & Sons, Inc.

Journal ArticleDOI
Christoph Schwarz
TL;DR: The system called COPSY (context operator syntax), which uses natural language processing techniques during fully automatic syntactic analysis of free text documents is described, which is being tested by the U.S. Department of Commerce for patent search and indexing.
Abstract: Problems encountered in the syntactic analysis of free text documents are discussed. The system called COPSY (context operator syntax), which uses natural language processing techniques during fully automatic syntactic analysis (indexing and search) of free text documents, is described. Applications under real-world conditions are mentioned, as well as evaluation and technical aspects. Further developments in the field of thesaurus building and full-text analysis using the linguistic algorithms of the syntactic retrieval system are outlined. COPSY was developed as part of a text processing project at Siemens, called TINA (Text-Inhalts-Analyse: text content analysis).




Journal ArticleDOI
TL;DR: The role of peer review and the mechanisms for evaluating scientific manuscripts are presented from the perspective of 23 years as editor of Science as mentioned in this paper, and it is recommended that a more realistic approach be taken to evaluate research productivity.
Abstract: The role of peer review and the mechanisms for evaluating scientific manuscripts are presented from the perspective of 23 years as editor of Science. Reproducibility is important in science and its feasibility varies greatly among the natural, medical, and behavioral sciences. The “publish or perish” syndrome has led to deleterious effects on scientific communication and it is recommended that a more realistic approach be taken to evaluate research productivity. Recent examples of fraud (Darsee and Slutsky) illustrate some weaknesses of the present system and have led to proposals for reform. It is maintained, however, that fraud, as distinguished from unintended error, is not common in science.

Journal ArticleDOI
TL;DR: A synchronous citation study of 15 leading physics journals has been performed to determine the obsolescence of Physical Review articles with age as discussed by the authors, and the density of citations to Physical Review has been found to decrease exponentially with a half-life of 4.9 years.
Abstract: A synchronous citation study of 15 leading physics journals has been performed to determine the obsolescence of Physical Review articles with age. The density of citations to Physical Review has been found to decrease exponentially with a half-life of 4.9 years, which is the first conclusive evidence of the exponential decrease. © 1990 John Wiley & Sons, Inc.
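The reported decay is simple to state: with a half-life of 4.9 years, the citation density to articles of age t falls as c(t) = c0 · 2^(−t/4.9). A minimal sketch, using only the half-life reported in the abstract (the initial density c0 is illustrative):

```python
import math

HALF_LIFE = 4.9                        # years, from the article
decay_rate = math.log(2) / HALF_LIFE   # equivalent exponential rate

def citation_density(c0, t):
    """Citation density at article age t, given initial density c0."""
    return c0 * math.exp(-decay_rate * t)

# After one half-life the density is half the initial value.
ratio = citation_density(100.0, HALF_LIFE) / 100.0
```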

Journal ArticleDOI
TL;DR: The purpose of this article is to argue that, for all its perceived defects, the gamma mixture of Poisson processes can be used to make predictions regarding future circulations of a quality adequate for general management requirements.
Abstract: Recent work has questioned the appropriateness of the gamma mixture of Poisson processes to model the circulation of books in a library. The purpose of this article is to argue that, for all its perceived defects, the model can be used to make predictions regarding future circulations of a quality adequate for general management requirements. The precise mathematical form of the model allows the consideration of any number of possible future developments. The use of the model is extensively illustrated with data from the University of Saskatchewan, Canada, and the University of Sussex, England. © 1990 John Wiley & Sons, Inc.
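The model in question is easy to simulate: each book's mean loan rate is drawn from a gamma distribution, and its circulation count over a period is Poisson with that rate (marginally, a negative binomial). The parameters below are illustrative, not fitted to the Saskatchewan or Sussex data.

```python
import math
import random

random.seed(42)

def poisson_draw(lam):
    """Poisson sample by inversion; adequate for the small means here."""
    p = math.exp(-lam)
    cum, k, u = p, 0, random.random()
    while u > cum:
        k += 1
        p *= lam / k
        cum += p
    return k

def simulate_circulations(n_books, shape, scale):
    """Yearly loan counts under the gamma mixture of Poisson processes."""
    counts = []
    for _ in range(n_books):
        rate = random.gammavariate(shape, scale)  # this book's loan rate
        counts.append(poisson_draw(rate))
    return counts

counts = simulate_circulations(10_000, shape=1.0, scale=2.0)
mean = sum(counts) / len(counts)   # expected close to shape * scale = 2.0
```

For management purposes, the closed form of the model lets one condition on a book's past counts to predict its future circulation, which is the use the article defends despite the model's known imperfections.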

Journal ArticleDOI
TL;DR: In this article, it was shown that the multiplier k that appears in the law of Bradford is neither the average production of articles per author nor the average number of articles per journal, contradicting some earlier statements of Goffman and Warren and of Yablonsky.
Abstract: In this note we show that the multiplier k that appears in the law of Bradford is neither the average production of articles per author nor the average number μ of articles per journal, contradicting some earlier statements of Goffman and Warren and of Yablonsky. We remark, however, that the Bradford multiplier might be close to μ in many cases, this being merely a coincidence of the special functional relation between μ and k, which we develop in full detail. We finally show that K = kp/A (p = number of Bradford groups, A = total number of articles) is a universal constant for the bibliography. Furthermore, K is the Bradford multiplier of a group-free formulation of Bradford's law, introduced in an earlier article of the author. © 1990 John Wiley & Sons, Inc.

Journal ArticleDOI
TL;DR: In this paper, it was shown that by adding a third hidden variable to the two parameters in Lotka's law, this law becomes equivalent, in a strict logical sense, with Mandelbrot's and Leimkuhler's.
Abstract: This article will show how, by adding a third “hidden” variable to the two parameters in Lotka's law, this law becomes equivalent, in a strict logical sense, with Mandelbrot's. Similarly, Lotka's inverse square law becomes equivalent with Leimkuhler's. We will also show how Pareto's law fits into this framework.

Journal ArticleDOI
TL;DR: An experiment to test the hypothesis that order of presentation has an effect on relevance judgments is described; the results indicate that users are not influenced by this factor when the set of retrieved document citations has fewer than 15 members.
Abstract: User evaluation of the relevance of documents retrieved by an information system can be a useful measure of the performance of the system. The meaningfulness of this measure is, in part, determined by the extent to which it is influenced by external factors. This article examines the question of whether the order of presentation of the document citations influences the relevance judgment of the user. An experiment to test the hypothesis that order of presentation does have an effect is described. The results indicate that users are not influenced by this factor when the set of retrieved document citations has fewer than 15 members. © 1990 John Wiley & Sons, Inc.