Showing papers in "Journal of the Association for Information Science and Technology in 1975"


Journal ArticleDOI
TL;DR: A mixture of two Poisson distributions is examined in detail as a model of specialty word distribution and a measure intended to identify specialty words, consistent with the 2-Poisson model, is proposed and evaluated.
Abstract: The problem studied in this research is that of developing a set of formal statistical rules for the purpose of identifying the keywords of a document, that is, words likely to be useful as index terms for that document. The research was prompted by the observation, made by a number of writers, that non-specialty words, words which possess little value for indexing purposes, tend to be distributed at random in a collection of documents. In contrast, specialty words are not so distributed. In Part I of the study, a mixture of two Poisson distributions is examined in detail as a model of specialty word distribution, and formulas expressing the three parameters of the model in terms of empirical frequency statistics are derived. The fit of the model is tested on an experimental document collection and found to be acceptable for the purposes of the study. A measure intended to identify specialty words, consistent with the 2-Poisson model, is proposed and evaluated.

200 citations
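
As a rough illustration of the model named in the abstract above, a mixture of two Poisson distributions can be written as a short function. This is only a sketch: the parameter names `pi`, `lam1` and `lam2` are assumptions chosen here for the three parameters, and the paper's formulas for estimating them from empirical frequency statistics are not reproduced.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k occurrences under a Poisson law with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def two_poisson_pmf(k: int, pi: float, lam1: float, lam2: float) -> float:
    """Mixture of two Poisson laws (parameter names are assumptions).

    pi   -- share of documents drawn from the first component
    lam1 -- mean occurrence count in the first component
    lam2 -- mean occurrence count in the second component
    """
    return pi * poisson_pmf(k, lam1) + (1.0 - pi) * poisson_pmf(k, lam2)


def two_poisson_mean(pi: float, lam1: float, lam2: float) -> float:
    """Mean of the mixture: the weighted average of the component means."""
    return pi * lam1 + (1.0 - pi) * lam2
```

A word whose counts are fitted well with `pi` equal to 0 or 1 behaves like a single Poisson process, which is the "distributed at random" case the abstract associates with non-specialty words.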


Journal ArticleDOI
TL;DR: An algorithm defining a measure of indexability, intended to reflect the relative significance of words in documents, is developed; the measure is found to consistently produce indexes superior to those produced by another measure which had previously been identified in the literature as producing the best results.
Abstract: In Part I of this study, a mixture of two Poisson distributions was examined as a model of specialty word distribution. Formulas expressing the three parameters of the model in terms of empirical frequency statistics were derived, and a statistical measure intended to identify specialty words, consistent with the model, was proposed. In the present paper, Part II of the study, a probabilistic model of keyword indexing is outlined, and some of the consequences of the model are examined. An algorithm defining a measure of indexability is developed, a measure intended to reflect the relative significance of words in documents. The measure is evaluated and is found to consistently produce indexes superior to those produced by another measure which had previously been identified in the literature as producing the best results.

124 citations


Journal ArticleDOI
TL;DR: Though the main purpose of this paper is to provide insights into a very complex process, formulae are developed that may prove to be of value for an automated operating system.
Abstract: The indexing of a document is among the most crucial steps in preparing that document for retrieval. The adequacy of the indexing determines the ability of the system to respond to patron requests. This paper discusses this process, and document retrieval in general, on the basis of formal decision theory. The basic theoretical approach taken is illustrated by means of a model of word occurrences in documents in the context of a model information system; both models are fully defined in this paper. Though the main purpose of this paper is to provide insights into a very complex process, formulae are developed that may prove to be of value for an automated operating system. The paper concludes with an interpretation of recall and precision curves as seen from the point of view of decision theory.

94 citations


Journal ArticleDOI
TL;DR: The publication counts indicate clear first rank position for the U.S. in scientific publications, followed at a significant distance by the Soviet Union, and the national publication rankings vary widely from discipline to discipline, with Soviet chemistry high, physics moderate and biology low.
Abstract: Indicators of national scientific activity have been derived from counts of 500,000 publications, and millions of citations, in 492 large and heavily cited scientific journals in seven major disciplines, six major countries, and a time span from 1965 to 1971. For 1972 the publication and citation counts covered 2143 journals. All of the citation data, and much of the publication data, were from journals covered by the Science Citation Index (SCI). The publication counts indicate clear first rank position for the U.S. in scientific publications, followed at a significant distance by the Soviet Union. Far below the U.S. and U.S.S.R. are the United Kingdom and Germany, followed by Japan and France. The national publication rankings vary widely from discipline to discipline, with Soviet chemistry high, physics moderate and biology low. The U.S. is the most highly cited country, followed by the U.K.; Germany and Japan are at a middle level, with French and Soviet publications the least heavily cited.

74 citations


Journal ArticleDOI
TL;DR: A test of this general law for the field of map librarianship is discussed, illustrating that Lotka's Law is valid for this discipline.
Abstract: Lotka's Law states that given the number of authors who have written one article, the number writing multiple articles can be predicted. Though Lotka's Law was based on a study of the chemistry and physics literature, interest has recently developed as to its possible application to the humanities. A test of this general law for the field of map librarianship is discussed, illustrating that Lotka's Law is valid for this discipline.

43 citations
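
The prediction described in the abstract above can be sketched in a few lines. The sketch assumes the classic inverse-square form of Lotka's Law, in which the number of authors with n articles is roughly the number of single-article authors divided by n squared; the abstract does not state the exponent, so `exponent=2.0` is an assumption, and the function name is hypothetical.

```python
def lotka_expected_authors(single_article_authors: float, n: int,
                           exponent: float = 2.0) -> float:
    """Expected number of authors who wrote exactly n articles.

    Assumes the inverse-power form of Lotka's Law; the default exponent
    of 2 is the classic value and is an assumption, not a figure taken
    from the paper listed above.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return single_article_authors / n ** exponent


# Example: with 100 single-article authors, the classic form predicts
# 25 authors with two articles and 1 author with ten articles.
predicted = {n: lotka_expected_authors(100, n) for n in (1, 2, 10)}
```

Testing the law on a given literature then amounts to comparing such predicted counts with the observed author counts.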


Journal ArticleDOI
TL;DR: The mixture of scholarly and nonscholarly work which constitutes the periodical literature of information science is discussed, primarily in terms of the ten journals isolated as the core.
Abstract: The journal citations in A Bibliography on Information Science and Technology (13) provide the data for this Bradford graphical analysis. The mixture of scholarly and nonscholarly work which constitutes the periodical literature of information science is discussed, primarily in terms of the ten journals isolated as the core. The analysis is also directed toward an assessment of the completeness of the bibliography. It is inferred that the bibliography is 72 percent complete in terms of journals and 83 percent complete in terms of articles. Two further uses that have been proposed for the Bradford distribution are ranking subjects in order of breadth and determining policy for collections development. The limited results available in these two areas are considered and problems for future research suggested.

39 citations


Journal ArticleDOI
TL;DR: It was concluded that readability measurement provides one useful technique for the evaluation of abstracts, and thus of one phase of an information system: the system/user interface.
Abstract: Documents may be accessed by increasingly efficient retrieval of abstracts, but information will not be transferred unless the abstracts are read. It is suggested that the measurement of the readability of abstracts can provide an assessment of one phase of an information system: the system/user interface. Controlled reading levels for abstracts could result in more rapid processing of abstracts and a wider use of the information system. It was hypothesized that the use of readability principles in guidelines for abstracting would result in abstracts of lower reading levels than the source documents upon which they were based. Abstracts and their source documents were selected randomly from the information system supported by the Educational Resources Information Center (ERIC); readability scores were calculated using the Flesch Reading Ease formula. Comparisons among reading levels were made using analysis of variance for correlated data and Tukey's Honestly Significant Difference (HSD) test for post hoc comparisons. Results indicated that the reading level of abstracts was significantly higher than the reading level of source documents, but not higher than the reading ability of the intended audience. It was concluded that readability measurement provides one useful technique for the evaluation of abstracts.

26 citations
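
For reference, the Flesch Reading Ease formula mentioned in the abstract above can be sketched as follows. The function takes pre-computed counts; how the study counted words, sentences and syllables is not stated in the abstract, so the counting step is left out, and the function name is hypothetical.

```python
def flesch_reading_ease(total_words: int, total_sentences: int,
                        total_syllables: int) -> float:
    """Flesch Reading Ease score from pre-computed text counts.

    Standard published form of the formula:
        206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    Higher scores mean easier text, so a higher reading level
    corresponds to a lower score.
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Example: 100 words in 5 sentences with 150 syllables.
score = flesch_reading_ease(100, 5, 150)
```

Comparing the score of an abstract with the score of its source document gives the kind of paired measurement the study analyzed.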


Journal ArticleDOI
TL;DR: Proof is given of the equivalence of Boolean and weighted search methods, this proof being based on the ability to convert any Boolean search request to weighted form and, similarly, any weighted request to Boolean form.
Abstract: Consideration is given to the nature of the retrieval process with emphasis on the selection algorithm employed, and its relation to document set and query form. Proof is given of the equivalence of Boolean and weighted search methods, this proof being based on the ability to convert any Boolean search request to weighted form and, similarly, any weighted request to Boolean form.

20 citations
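
One direction of the conversion described in the abstract above can be illustrated for the two simplest cases. This is not the paper's proof, only a sketch under the assumption that a weighted request is a set of term weights plus a threshold: a pure AND of n terms becomes unit weights with threshold n, and a pure OR becomes unit weights with threshold 1. All function names are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

WeightedQuery = Tuple[Dict[str, int], int]  # (term weights, threshold)


def weighted_match(doc_terms: Set[str], weights: Dict[str, int],
                   threshold: int) -> bool:
    """Retrieve a document when the summed weight of its matching terms
    reaches the threshold."""
    score = sum(w for term, w in weights.items() if term in doc_terms)
    return score >= threshold


def and_query(terms: Iterable[str]) -> WeightedQuery:
    """Weighted form of 't1 AND t2 AND ...': every term must match."""
    terms = list(terms)
    return {t: 1 for t in terms}, len(terms)


def or_query(terms: Iterable[str]) -> WeightedQuery:
    """Weighted form of 't1 OR t2 OR ...': one matching term is enough."""
    return {t: 1 for t in terms}, 1


# Example: the document {"recall", "precision"} satisfies both forms.
doc = {"recall", "precision"}
hit_and = weighted_match(doc, *and_query(["recall", "precision"]))
hit_or = weighted_match(doc, *or_query(["recall", "ranking"]))
```

Nested Boolean expressions need more than a single threshold per request, which is where a general argument such as the paper's would be required.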


Journal ArticleDOI
TL;DR: The results obtained from the experiments indicate that titles alone are not satisfactory for efficient retrieval, and the combination of titles and abstracts came the closest to 100% retrieval, with searching of abstracts alone doing almost as well.
Abstract: We have investigated the relative merits of searching on titles, subject headings, abstracts, free-language terms, and combinations of these elements. The COMPENDEX data base was used for this study since it contained all of the data elements of interest. In general, the results obtained from the experiments indicate that, as expected, titles alone are not satisfactory for efficient retrieval. The combination of titles and abstracts came the closest to 100% retrieval, with searching of abstracts alone doing almost as well. Indexer input, although necessary for 100% retrieval in almost all cases, was found to be relatively unimportant.

20 citations


Journal ArticleDOI
TL;DR: Some basic aspects of this research are discussed, some significant new approaches to determination of information concepts are reviewed, and a selected bibliography is appended to serve as a guide to the Soviet literature of information for information science.
Abstract: There is increasing realization of the importance of information concepts in the development of information science. Although some research in this field has been carried out in English-speaking countries, a much larger body of work has developed in the USSR, most of which is unknown to information scientists in the non-socialist countries. The present paper discusses some basic aspects of this research, reviews some significant new approaches to determination of information concepts, and appends a selected bibliography to serve as a guide to the Soviet literature of information for information science.

17 citations


Journal ArticleDOI
A. Sandison
TL;DR: Citations in Physical Review by the staff of the Physics Department of M.I.T. in 1970-71 were analyzed as references per meter of shelf, and there was no evidence that exponential models are to be preferred over linear models or that citation density patterns reflect library use.
Abstract: Citations in Physical Review by the staff of the Physics Department of M.I.T. in 1970-71 were analyzed as references per meter of shelf. These citation densities were independent of age for pre-1962 volumes, but the mean density for the 1961-55 volumes in the main library was twice that for less accessible 1954-40 volumes in the basements. For the 1969-62 volumes, the densities fell with age, perhaps partly due to the greater accessibility of the most recent personal copies, and partly to a difference between the relatively recent literature on the precise topics being discussed, compared with the less age-dependent literatures on theory and methodology. There was no evidence that exponential models are to be preferred over linear models or that citation density patterns reflect library use.

Journal ArticleDOI
TL;DR: A decision model for book acquisitions has been developed to simulate the intellectual processes used in acquiring these materials in academic libraries and it consists of a flow chart, weighted inputs and an equation that indicates whether a library should add the title to its collection.
Abstract: A decision model for book acquisitions has been developed to simulate the intellectual processes used in acquiring these materials in academic libraries. It consists of a flow chart, weighted inputs and an equation which, when solved, indicates whether a library should add the title to its collection, refer it to a cooperative group, defer the decision or drop it altogether. Inputs to the model need further study and development, but the model is a step in defining and quantifying the decision process.

Journal ArticleDOI
Naomi Sager
TL;DR: The results of an investigation into information structures in natural language science texts show that the literature of a science subfield has characteristic restrictions on language usage which can be used to develop information formats for text sentences in the subfield.
Abstract: This paper presents the results of an investigation into information structures in natural language science texts. A novel hypothesis was tested; namely, that the literature of a science subfield has characteristic restrictions on language usage which can be used to develop information formats for text sentences in the subfield. The formats provide a standard representation of the specific types of information found in sentences of subfield articles, though a priori semantic categories are not used. The method of sublanguage grammars for obtaining information formats is described. Illustrations are drawn from a sublanguage grammar written for a subfield of pharmacology. Parts of the procedure are computerized or are being implemented.


Journal ArticleDOI
TL;DR: A procedure is developed for optimal allocation of resources among the many processes of a library system that maximizes the expected value of the decision‐maker's utility function.
Abstract: A procedure is developed for optimal allocation of resources among the many processes of a library system. Queueing theory is used to model processes as either waiting or balking processes. The optimal allocation of resources to these processes is defined as that which maximizes the expected value of the decision-maker's utility function. An application of the procedure to a specific library system is discussed.

Journal ArticleDOI
TL;DR: The author contends that the use of computer-assisted instruction in conjunction with the on-line information retrieval system is the most promising form of instruction in that the medium itself, as well as the message, may be used to acquaint the novice searcher with an interactive user/system interface.
Abstract: The early 1970's have clearly shown a trend toward the use of on-line systems as the ideal medium for information retrieval. The emphasis placed on direct access by the practitioners in the field, rather than delegated searches through information specialists, leads to the growing need for an efficient design in training transient user groups. Printed manuals, live help, audiovisual presentations and on-line instruction have all been used with varying degrees of success. The author contends that the use of computer-assisted instruction in conjunction with the on-line information retrieval system is the most promising form of instruction in that the medium itself, as well as the message, may be used to acquaint the novice searcher with an interactive user/system interface.

Journal ArticleDOI
Michael Bommer
TL;DR: The major reasons which seem to be preventing operations research from achieving its potential and fulfilling the expectations of its proponents in library management are explored.
Abstract: In the past decade a multitude of operations research models have appeared in the literature, each promising to help library managers in making better plans and decisions. To date, few of these models are being employed by libraries. The major reasons which seem to be preventing operations research from achieving its potential and fulfilling the expectations of its proponents in library management are explored. Finally, a prescription is formulated to guide library managers in working more effectively with operations researchers.

Journal ArticleDOI
TL;DR: The judgmental process of evaluation and the scientific nature of evaluation study in the context of purpose statements; criteria; the selection of variables and data collection and analysis techniques; and requirements of validity, reproducibility and reliability are discussed.
Abstract: This paper considers conceptual and methodological components of information science evaluation studies. The paper discusses the judgmental process of evaluation and the scientific nature of evaluation study in the context of purpose statements; criteria; the selection of variables and data collection and analysis techniques; and requirements of validity, reproducibility and reliability. Industrial value analysis/engineering methodology is described and related to assessments of information products and services. The state-of-the-art of evaluation study in information science is analyzed with respect to 1. the scope of evaluation studies; 2. the use of laboratory-type environments; 3. the use of surrogate judges; 4. selection of variables; 5. frequency of study; and 6. comparability of study results. Evaluation study is seen as essential to the management of information centers and systems and as having appreciable growth potential.

Journal ArticleDOI
TL;DR: A comparison of 1968 and 1972 information science curricula indicates a shift from traditional librarianship to an emphasis on computerization and automation, a trend that appears to encompass theory as well as technology.
Abstract: This is our second study of curricula in information science. It provides a basis for comparison of the 1968 curricula with those of 1972, observing trends in the educational system in information science. Since this study solicited information on all three educational levels, the statistics describing all programs are given; comparisons are made at the MS level only, using the 1968 data. These indicate a curriculum shift from traditional librarianship to an emphasis on computerization and automation. This trend appears to encompass theory as well as technology. The most frequently offered course, “Introduction to Information Science,” exposes students to a new way of looking at library and information problems. Programming, theories of information content identification, library automation and some basic mathematics have increased. If the trend continues, libraries may be turning into Community Information Centers utilizing telecommunication for their information needs. Deans, faculty, professional society and industry representatives reviewed the questionnaire analysis results in Workshop III and made recommendations for educational goals and curricula on three levels, i.e., the baccalaureate, masters and doctorate.

Journal ArticleDOI
TL;DR: This paper attempts to evaluate two articles; one recently read to the ASIS Special Interest Group on Foundations of Information Science (SIG/FIS) by Heilprin, and the other published in the Journal of the American Society for Information Science by Artandi, which deal with the theoretical problems of defining “information” and/or “information science.”
Abstract: This paper attempts to evaluate two articles; one recently read to the ASIS Special Interest Group on Foundations of Information Science (SIG/FIS) by Heilprin, and the other published in the Journal of the American Society for Information Science by Artandi, which deal with the theoretical problems of defining “information” and/or “information science.” To that end, definitions, types, functions, forms of presentation and validation criteria of a theory are discussed with relation to science in general and to information science in particular. Arguments are made that: 1. information science is at present a practice-oriented discipline, thus, good practice should be based on sound theory; 2. in the field of information more emphasis should be given to theory development; and 3. more precise and formal methods should be employed for presenting a theory so that it may be properly understood by practitioners and theoreticians.


Journal ArticleDOI
TL;DR: The possibility is suggested that all variables affecting the performance of retrieval systems can be decomposed into M‐factors, and the implications of these ideas for laboratory tests on retrieval systems are discussed.
Abstract: Cleverdon's “inverse relationship between recall and precision” and other relationships between the measures of performance of retrieval systems, arise from variations in other (independent) system variables. This paper explores the properties of these independent variables, and defines a class of basic variables named M-factors. The “inverse relationship” is generated by variations in a single M-factor, but does not always obtain if more than one M-factor is allowed to vary. A number of examples are discussed in the light of this analysis. The possibility is suggested that all variables affecting the performance of retrieval systems can be decomposed into M-factors. The implications of these ideas for laboratory tests on retrieval systems are discussed. A mathematical treatment of the ideas is also given.

Journal ArticleDOI
TL;DR: An existing model of a library's circulation subsystem is modified to explicitly include renewals and to describe a semester loan policy; its validity is demonstrated against experimental observation, and it is used to simulate various alternative loan policies and their effect on book availability.
Abstract: An existing model of the circulation subsystem of a library has been modified to explicitly include renewals and to describe a semester loan policy. An extensive amount of experimental data have been obtained by recording the history of all books which circulated from Sears Library at Case Western Reserve University during the fall semester, 1973. These data, which characterize various aspects of user behavior, in combination with a numerical statement of the loan policy of the library serve as input parameters for the model which simulates the events that occur in the circulation subsystem. The results of the simulation provide information about the availability of books and the delay associated with recalls. These results have been compared to experimental observation. The agreement between the predicted values and the experimental observation is good. Thus the validity of the model is demonstrated, and the model is used to simulate various alternative loan policies. The effect of these alternatives on book availability is also given.

Journal ArticleDOI
TL;DR: The history of thesaurus development, its present status, its planned future, and its costs vs. its benefits are summarized.
Abstract: With respect to information storage and retrieval, a thesaurus is a means to various ends. This paper describes these desired ends (and the utility of a thesaurus with respect thereto) and provides a case history of symbiotic development of an information system (for financial information) and its thesaurus. In this connection, the DISCLOSURE data base publication system and thesaurus are described, and the applications of the thesaurus both to the data base system and to retrieval applications are described. The history of thesaurus development, its present status, its planned future, and its costs vs. its benefits are summarized.

Journal Article
TL;DR: The likelihood that the development of theoretical foundations for information science is a distant goal suggests that the near-term objectives of the information research community should be formulated more realistically, since the task of science is increasingly being determined by urgent, and changing, social concerns.
Abstract: The likelihood that the development of theoretical foundations for information science is a distant goal suggests that the near‐term objectives of the information research community should be formulated more realistically. Since the task of science is increasingly being determined by urgent, and changing, social concerns, these objectives should be formulated within a socially utilitarian framework and within a problem domain different from that which has sustained information research in the past 15 years. The imminent societal imperative, management of knowledge as a social resource, appears to be a viable new framework for defining major roles and tasks for theory‐oriented work in information science.

Journal ArticleDOI
TL;DR: While Canada has a proportion of journals to population higher than most countries, there are comparatively few journals in engineering and applied sciences; Canadian journals rank fairly low on an international scale based on citations, with journals of chemistry, physics and zoology leading the list.
Abstract: Most studies of scientific journals have been discipline-oriented, and thus transnational. Journals are here considered from the point of view of the Canadian national information system. While Canada has a proportion of journals to population higher than most countries, there are comparatively few journals in engineering and applied sciences. Canadian journals rank fairly low on an international scale based on citations, with journals of chemistry, physics and zoology leading the list.

Journal ArticleDOI
Robert O. Stanton
TL;DR: An experimental “management‐by‐objectives” performance system was operated by the Libraries and Information Systems Center of Bell Laboratories during 1973 and it was found that, though the system was very effective for work planning and the development of people, difficulties were encountered in applying it to certain classes of employees.
Abstract: An experimental “management-by-objectives” performance system was operated by the Libraries and Information Systems Center of Bell Laboratories during 1973. It was found that, though the system was very effective for work planning and the development of people, difficulties were encountered in applying it to certain classes of employees, particularly those doing public service or routine work.

Journal ArticleDOI
Jean Rafsnider
TL;DR: The lower utilization of recent citations by social scientists may be shown to reflect the greater time required for scientific work and subsequent publication of a social science research contribution as opposed to one from the physical sciences.
Abstract: Comparisons of hard and soft science in user (Garvey) and citation (Price) studies were shown to exhibit a previously unnoticed complementarity. In particular, the lower utilization of recent citations by social scientists may be shown to reflect the greater time required for scientific work and subsequent publication of a social science research contribution as opposed to one from the physical sciences. This analysis offers an alternative to the view of hard and soft science offered by Price and raises several research questions regarding the relation between use and citation of scientific literatures.

Journal ArticleDOI
TL;DR: The degree of interdisciplinarity in Canadian science, as measured by citation interaction among significant Canadian journals, is very low.
Abstract: Interdisciplinarity in Canadian science is considered from the point of view of the interaction of significant Canadian journals. Most of these journals cite themselves primarily, and journals in other or related sciences receive few citations. As a result, it can be concluded that the degree of interdisciplinarity, as determined by this measure, is very low in Canadian science.