
Showing papers in "Journal of the Association for Information Science and Technology in 1980"


Journal Article
TL;DR: This article addresses the problem of how to permit a patron to represent the relative importance of various index terms in a Boolean request while retaining the desirable properties of a Boolean system.
Abstract: This article concerns the problem of how to permit a patron to represent the relative importance of various index terms in a Boolean request while retaining the desirable properties of a Boolean system. The character of classical Boolean systems is reviewed and related to the notion of fuzzy sets. The fuzzy set concept then forms the basis of the concept of a fuzzy request in which weights are assigned to index terms. The properties of such a system are discussed, and it is shown that such systems retain the manipulability of traditional Boolean requests.

191 citations
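
The abstract above does not say which operators the article uses or how request weights are combined with document index terms. As a minimal sketch only, assuming the standard fuzzy-set operators (min for AND, max for OR, 1 - x for NOT) and hypothetical term weights, the following shows how a weighted request can be scored and why Boolean-style manipulation such as De Morgan's law still holds:

```python
# Sketch only: the min / max / complement operators are an assumption,
# not something stated in the abstract.
def fuzzy_and(a, b):
    return min(a, b)

def fuzzy_or(a, b):
    return max(a, b)

def fuzzy_not(a):
    return 1.0 - a

# Hypothetical degrees (0..1) to which one document is about each term.
doc = {"retrieval": 0.8, "fuzzy": 0.3, "hardware": 0.1}

def score(d):
    # Request: retrieval AND (fuzzy OR NOT hardware)
    return fuzzy_and(d["retrieval"],
                     fuzzy_or(d["fuzzy"], fuzzy_not(d["hardware"])))

def de_morgan_holds(a, b):
    # NOT (a AND b) must equal (NOT a) OR (NOT b)
    return fuzzy_not(fuzzy_and(a, b)) == fuzzy_or(fuzzy_not(a), fuzzy_not(b))

print(score(doc))                 # 0.8
print(de_morgan_holds(0.8, 0.3))  # True
```

With weights restricted to 0 and 1, these operators reduce to ordinary Boolean AND, OR, and NOT.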


Journal Article
TL;DR: The results of these tests suggest that environmental and situational constraints play a major part in determining information behavior and that interventions aimed at improving information flow within organizations must be carefully tailored to the specific situation if they are to have maximum impact.
Abstract: A management-oriented model for describing and studying information behavior is proposed. The model focuses on variables which can be manipulated by managers—primarily environmental and situational variables—rather than on variables describing individual attributes. Several hypotheses derived from the model are tested using a database describing the information-related attitudes and behaviors of some 560 scientists and engineers working in a variety of settings and roles. All but one of the hypotheses were confirmed, adding support to the model. The results of these tests suggest that environmental and situational constraints play a major part in determining information behavior. They suggest that interventions aimed at improving information flow within organizations must be carefully tailored to the specific situation if they are to have maximum impact.

107 citations


Journal Article
TL;DR: A linkage similarity measure which takes into account both the bibliographic coupling of documents and their cocitations produced improved document retrieval over a measure based only on bibliographic coupling.
Abstract: A linkage similarity measure which takes into account both the bibliographic coupling of documents and their cocitations (both cited and citing papers) produced improved document retrieval over a measure based only on bibliographic coupling. The test collection consisted of 1712 papers whose relevance to specific queries had been judged by users. To evaluate the effect of using cocitation data, we calculated for each query two measures of similarity between each relevant paper and every other paper retrieved. Papers were then sorted by the similarity measures, producing two ordered lists. We then compared the resulting predictions of relevance, partial relevance, and non-relevance to the user's evaluations of the same papers. Over-all, the change from the bibliographic coupling measure to the linkage similarity measure, representing the introduction of cocitation data, resulted in better retrieval performance.

67 citations
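
The abstract above does not give the formula for the linkage similarity measure. The sketch below only illustrates the two ingredients it names, using a hypothetical citation table; combining them by a plain sum is an assumption made for illustration, not the authors' measure.

```python
# Hypothetical citation data: paper -> set of papers it cites.
cites = {
    "A": {"X", "Y", "Z"},
    "B": {"Y", "Z"},
    "C": {"A", "B"},
    "D": {"A", "B", "X"},
}

def coupling(p, q):
    """Bibliographic coupling: number of references p and q share."""
    return len(cites.get(p, set()) & cites.get(q, set()))

def cocitation(p, q):
    """Cocitation: number of papers that cite both p and q."""
    return sum(1 for refs in cites.values() if p in refs and q in refs)

def linkage(p, q):
    # Assumed combination (plain sum); the real measure may weight
    # or normalize the two counts differently.
    return coupling(p, q) + cocitation(p, q)

print(coupling("A", "B"), cocitation("A", "B"), linkage("A", "B"))  # 2 2 4
```

Papers A and B share two references and are cited together by C and D, so adding cocitation data raises their similarity above what coupling alone would give.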


Journal Article
TL;DR: It is found that a negative binomial distribution fits scientific productivity data (by the chi‐squared goodness‐of‐fit test) better than many other distributions such as geometric, logarithmic, zeta, cumulative advantage, etc.
Abstract: Results in the literature concerning the probability that an author publishes r articles in time t are reexamined, and it is found that a negative binomial distribution fits scientific productivity data (by the chi-squared goodness-of-fit test) better than many other distributions such as geometric, logarithmic, zeta, cumulative advantage, etc. It is shown analytically that the negative binomial distribution describes a pattern of scientific productivity under the “success-breeds-success” condition in a wide variety of social circumstances.

65 citations


Journal Article
TL;DR: Systematic biases cause inquirers on a large interactive document retrieval system to construct and modify queries inefficiently; a searching algorithm is suggested that helps the inquirer avoid the effect of these biases.
Abstract: The way that individuals construct and modify search queries on a large interactive document retrieval system is subject to systematic biases similar to those that have been demonstrated in experiments on judgments under uncertainty. These biases are shared by both naive and sophisticated subjects and cause the inquirer searching for documents on a large interactive system to construct and modify queries inefficiently. A searching algorithm is suggested that helps the inquirer to avoid the effect of these biases.

53 citations


Journal Article
John O'Connor
TL;DR: The present experiment involved a greater variety of forms of retrieval question, with search words selected independently by two different people for each question; average recall ratios were 72 and 67%, respectively, for the two selectors.
Abstract: Passage retrieval (already operational for lawyers) has advantages in output form over reference retrieval and is economically feasible. Previous experiments in passage retrieval for scientists have demonstrated recall and false retrieval rates as good as or better than those of present reference retrieval services. The present experiment involved a greater variety of forms of retrieval question. In addition, search words were selected independently by two different people for each retrieval question. The search words selected, in combination with the computer procedures used for passage retrieval, produced average recall ratios of 72 and 67%, respectively, for the two selectors. The false retrieval rates were (except for one predictably difficult question) respectively 13 and 10 falsely retrieved sentences per answer-paper retrieved.

52 citations


Journal Article
TL;DR: A strong positive relationship was found to exist between the scientists' assessment of journal influence and the citation influence ratings, with product‐moment and Spearman rank correlations in the 0.7–0.9 range for seven of the ten fields.
Abstract: A survey was undertaken to ascertain the extent of agreement between scientists' subjective assessment of the average influence per article for articles in 58 different scientific journals, when compared with corresponding citation influence ratings for articles in the same journals. The scientists' assessments were derived from questionnaires sent to faculty at 97 American universities covering journals in ten different research fields. A strong positive relationship was found to exist between the scientists' assessment of journal influence and the citation influence ratings, with product-moment and Spearman rank correlations in the 0.7–0.9 range for seven of the ten fields.

50 citations


Journal Article
TL;DR: The motivating reasons for a systematic approach to the specification, design, and development of information systems are described, and some of the techniques that have been developed to assist the software specification and design activity are surveyed.
Abstract: There is a great need for a systematic approach to the specification, design, and development of information systems. This article describes the motivating reasons for such an approach and surveys some of the techniques that have been developed to assist the software specification and design activity. A methodology is seen as a combination of tools and techniques employed within an organizational and managerial framework that can be consistently applied to successive information system development projects. The ways that information system development organizations can create and use such methodologies are emphasized.

48 citations


Journal Article
TL;DR: The development of on-line systems in science and technology, beginning with the first demonstrations of batch searching through the development of multinational on-line services, is outlined and the great effort and expenditures underlying the production of services are examined.
Abstract: The development of on-line systems in science and technology, beginning with the first demonstrations of batch searching through the development of multinational on-line services, is outlined. The key technological components are described: computers that operate in a time-shared mode and an efficient, low-cost telecommunications system. Insight is provided on how on-line systems are put together and made accessible to users. Finally, on the economics of on-line systems, the great effort and expenditures underlying the production of services are examined.

40 citations


Journal Article
TL;DR: An experimental computer program has been developed to classify documents according to the 80 sections and five major section groupings of Chemical Abstracts (CA) using pattern recognition techniques supplemented by heuristics.
Abstract: An experimental computer program has been developed to classify documents according to the 80 sections and five major section groupings of Chemical Abstracts (CA). The program uses pattern recognition techniques supplemented by heuristics. During the “training” phase, words from pre-classified documents are selected, and the probability of occurrence of each word in each section of CA is computed and stored in a reference dictionary. The “classification” phase matches each word of a document title against the dictionary and assigns a section number to the document using weights derived from the probabilities in the dictionary. Heuristic techniques are used to normalize word variants such as plurals, past tenses, and gerunds in both the training phase and the classification phase. The dictionary lookup technique is supplemented by the analysis of chemical nomenclature terms into their component word roots to influence the section to which the documents are assigned. Program performance and human consistency have been evaluated by comparing the program results against the published sections of CA and by conducting an experiment with people experienced in the assignment of documents to CA sections. The program assigned approximately 78% of the documents to the correct major section groupings of CA and 67% to the correct sections or cross-references at a rate of 100 documents per second.

35 citations
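
The training and classification phases described in the abstract above can be sketched roughly as follows. This is an illustrative reading, not the CA program: the section labels and titles are hypothetical, the weight used is the share of a word's occurrences that fall in each section (one possible interpretation of the stored probabilities), and the word-variant normalization and nomenclature analysis are omitted.

```python
from collections import Counter, defaultdict

def train(labeled_titles):
    """Build a dictionary: word -> {section: share of the word's occurrences}."""
    counts = defaultdict(Counter)
    for title, section in labeled_titles:
        for word in title.lower().split():
            counts[word][section] += 1
    return {
        word: {s: n / sum(c.values()) for s, n in c.items()}
        for word, c in counts.items()
    }

def classify(title, dictionary):
    """Sum the per-section weights of the title's words; pick the best section."""
    scores = Counter()
    for word in title.lower().split():
        for section, weight in dictionary.get(word, {}).items():
            scores[section] += weight
    return max(scores, key=scores.get) if scores else None

# Hypothetical pre-classified titles with made-up section labels.
training = [
    ("polymer synthesis methods", "S1"),
    ("polymer degradation kinetics", "S1"),
    ("enzyme kinetics assay", "S2"),
    ("enzyme inhibition assay", "S2"),
]
dictionary = train(training)
print(classify("polymer kinetics", dictionary))  # S1
```

A title whose words are all missing from the dictionary returns `None`, which a fuller system would presumably route to a fallback or to manual review.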


Journal Article
TL;DR: NLM's development of a document delivery system to complement its bibliographic retrieval system is discussed, and MEDLINE is presented as a prototype for on-line bibliographic search systems.
Abstract: MEDLINE is presented as a prototype for on-line bibliographic search systems. Creation of the data base, indexing language, and file organization are reviewed. On accessing the files, search logic is illustrated with a sample MEDLINE search. NLM's development of a document delivery system to complement its bibliographic retrieval system is discussed.

Journal Article
TL;DR: It is argued that in information science one has to distinguish physical, objective, or document space from perspective, subjective, or information space because each is a systematic distortion of the other.
Abstract: It is argued that in information science we have to distinguish physical, objective, or document space from perspective, subjective, or information space. These two spaces are like maps and landscapes: each is a systematic distortion of the other. However, transformations can be easily made once the two spaces are distinguished. If the transformations are omitted we only get unhelpful physical solutions to information problems.

Journal Article
TL;DR: The techniques used to detect and correct spelling errors in the data base of Chemical Abstracts Service are described, which achieves a high level of performance using hashing techniques for dictionary look-up and compression.
Abstract: On-line bibliographic search systems tend to increase the visibility of spelling errors through the use of indexes of unique terms; even low error rates in a data base can result in large numbers of misspelled terms in these indexes. This article describes the techniques used to detect and correct spelling errors in the data base of Chemical Abstracts Service. A computer program for spelling error detection achieves a high level of performance using hashing techniques for dictionary look-up and compression. Heuristic procedures extend the dictionary and increase the proportion of misspelled words in the words flagged. Automatic correction procedures are applied only to words which are known to be misspelled; other corrections are performed manually during the normal editorial cycle. The constraints imposed on the selection of a spelling error detection technique by a complex data base, human factors, and high-volume production are discussed.

Journal Article
TL;DR: Results are discussed relative to biomedical literature, and these findings are shown to be subsumed under a more general hypothesis regarding the structure of scientific literature.
Abstract: Citation databases can be readily partitioned by examining the extent to which journals “feed back” to one another. A method has been developed to divide large files into clusters of journals without requiring the use of arbitrary starting points. Results are discussed relative to biomedical literature, and these findings are shown to be subsumed under a more general hypothesis regarding the structure of scientific literature.

Journal Article
TL;DR: This communication shows that the approach will effect reductions in the number of interdocument comparisons only if the documents are each indexed by a limited number of indexing terms; if exhaustive indexing is used, many document pairs will be compared several times over and the computation will be greater than when conventional approaches are used to generate the similarity matrix.
Abstract: Some of the automatic classification procedures used in information retrieval derive clusters of documents from an intermediate similarity matrix, the computation of which involves comparing each of the documents in the collection with all of the others. It has recently been suggested that many of these comparisons, specifically those between documents having no terms in common, may be avoided by means of the use of an inverted file to the document collection. This communication shows that the approach will effect reductions in the number of interdocument comparisons only if the documents are each indexed by a limited number of indexing terms; if exhaustive indexing is used, many document pairs will be compared several times over and the computation will be greater than when conventional approaches are used to generate the similarity matrix.
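
The counting argument in the abstract above can be illustrated with a small sketch. It assumes the simplest inverted-file procedure, in which every pair of documents on a term's posting list is compared once per shared term; the function names and toy collections are hypothetical.

```python
def conventional_comparisons(docs):
    """Every document is compared with every other one exactly once."""
    n = len(docs)
    return n * (n - 1) // 2

def inverted_file_comparisons(docs):
    """Pairs are generated per posting list, so a pair sharing k terms is met k times."""
    postings = {}
    for doc_id, terms in docs.items():
        for term in set(terms):
            postings.setdefault(term, []).append(doc_id)
    return sum(len(p) * (len(p) - 1) // 2 for p in postings.values())

# Few terms per document: the inverted file avoids most comparisons.
sparse = {1: {"a"}, 2: {"a"}, 3: {"b"}, 4: {"b"}}
# Exhaustive indexing: every pair shares three terms and is compared three times.
exhaustive = {i: {"a", "b", "c"} for i in range(1, 5)}

print(conventional_comparisons(sparse), inverted_file_comparisons(sparse))          # 6 2
print(conventional_comparisons(exhaustive), inverted_file_comparisons(exhaustive))  # 6 18
```

In the sparse case the inverted file needs 2 comparisons instead of 6; in the exhaustive case it needs 18, three times the conventional count.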

Journal Article
TL;DR: The average citation per article in chemistry by age of citation is related to the Poisson frequency distribution, which provides a formula for calculation of a half‐life for chemistry which depends on the structure of the research front exhibited in the literature.
Abstract: The average citation per article in chemistry by age of citation is related to the Poisson frequency distribution. This provides a formula for calculation of a half-life for chemistry which depends on the structure of the research front exhibited in the literature.

Journal Article
TL;DR: Action to alleviate information inequity should be guided by the principles of contextualism, incrementalism, motivation of information users, and more knowledge of the absorptive process that is unique to each cultural group.
Abstract: This article suggests that action to alleviate information inequity should be guided by the principles of contextualism, incrementalism, motivation of information users, and more knowledge of the absorptive process that is unique to each cultural group. To do this, information services should recognize cultural pluralism and the need to eliminate information poverty as viewed by the members of the groups being served. For some it may mean provision of information services to help in their assimilation into the mainstream, while for others it may mean provision of information services for greater cultural cohesion. Instead of the term “minority” or “disadvantaged,” the idea of a “cultural community” is advanced as the proper unit of analysis to identify significant numbers of potential users who may have distinct values, beliefs, and attitudes toward external information services. To grasp effectively the behavioral and cultural dimensions of information inequity, a national inventory of “cultural communities” is suggested using social mapping. This technique would include: a delineation of cultural groups, a description of the indigenous social/information organization, a plotted pattern of the movement of information within the group, the information values of the group, and an analytical description of the information poor within each group. These groups can then be visualized as a series of different information constituencies, with different information needs and different capacities for absorbing information.

Journal Article
TL;DR: The nonbibliographic area encompasses a number of different types of data bases, including referral, numeric, textual-numeric, chemical and physical properties, and full-text, which need to be studied for their possible applications in improving and extending traditional reference and other library and information services.
Abstract: Over 400 data bases are available on-line, and the majority of these are nonbibliographic. The nonbibliographic area encompasses a number of different types of data bases, including referral, numeric, textual-numeric, chemical and physical properties, and full-text. The growth in nonbibliographic data base services has been not nearly as visible to information specialists as the growth in bibliographic on-line data base services. One reason for this is that the services are primarily being marketed to and used by end users, with libraries and information centers largely being bypassed. Some nonbibliographic data base systems are end-user oriented, but there are many others that need to be studied for their possible applications in improving and extending traditional reference and other library and information services.

Journal Article
Gerard Salton
TL;DR: A number of recent law cases are examined to illustrate how privacy cases are currently being adjudicated in the United States and to identify the limits of currently available privacy protection.
Abstract: The role and importance of information privacy in the modern society are briefly described. A number of recent law cases are then examined to illustrate how privacy cases are currently being adjudicated in the United States and to identify the limits of currently available privacy protection. Finally, certain issues are raised regarding the available techniques for insuring data confidentiality and security.

Journal Article
Jerry Specht
TL;DR: Patron use of the University of Illinois' online circulation system was studied by interviewing and observing sample patrons at public terminals and it was found that the greatest increase in user success per dollar expended is likely to come from the addition of online user aids.
Abstract: Patron use of the University of Illinois' online circulation system (LCS) was studied by interviewing and observing sample patrons at public terminals. It was found that 56% of the “original known-item searches” (searches in which the user had not obtained any information from the card catalog) were successful—with 16% failing as a result of error in using LCS—and that 86% of the “location searches” (searches in which the user had obtained the call number of the item from the card catalog and was looking for the location) were successful—with 8% failing as a result of error in using LCS. It was also found that the way in which graduate students use the system is significantly different from the way in which undergraduates use it. It is suggested that the greatest increase in user success per dollar expended is likely to come from the addition of online user aids. It is also suggested that the evaluation of patron use of online catalogs will continue to rely on sample interviews.

Journal Article
TL;DR: Over 130 on‐line bibliographic data bases in science and technology, available through common vendors as of December 1979, are identified and reviewed.
Abstract: Over 130 on-line bibliographic data bases in science and technology, available through common vendors as of December 1979, are identified and reviewed. These have been classified as (1) discipline-wide, transdisciplinary, or multidisciplinary and (2) specialty or problem-oriented. Selected data bases of both types are discussed, including areas covered and functions served.

Journal Article
TL;DR: The article assesses the roles of libraries and of information specialists in the world of electronic publication, suggesting that while the library as an institution may decline in importance, the information specialist of the future is likely to provide information support services much richer and more varied than those offered by the librarian of today.
Abstract: The current place of on-line systems within the communication process in science and technology is defined. On-line systems can be termed “value-added information sources,” as illustrated by examples of available bibliographic data bases, numeric data banks, and referral data bases. Limitations of existing patterns of data base production and distribution are described, and the management considerations in operating on-line search services in libraries are outlined, including financing, facilities, staffing, service promotion, document delivery, and evaluation. The role of library schools in preparing searchers of on-line systems is also reviewed. A number of future developments in on-line systems are predicted based on available technological forecasts. The ways in which professionals can be expected to use future on-line systems are described, highlighting important differences from the present information-seeking environment. The article concludes with an assessment of the roles of libraries and of information specialists in the world of electronic publication, suggesting that while the library as an institution may decline in importance, the information specialist of the future is likely to provide information support services much richer and more varied than those offered by the librarian of today.

Journal Article
TL;DR: The journal literature on a subject area in psychology—operant conditioning—as indexed in the 1978 issues of Psychological Abstracts and Index Medicus is compared and considerable differences between PA and IM in their recency of coverage of the literature are found.
Abstract: The journal literature on a subject area in psychology—operant conditioning—as indexed in the 1978 issues of Psychological Abstracts (PA) and Index Medicus (IM) is compared. Considerable overlap is found for this subject in the coverage of journal titles by the two indexing tools, but use of both is necessary to assure a comprehensive search. The extent of overlap corresponds closely with the findings of the Bearman-Kunberger overlap study. Considerable differences between PA and IM in their recency of coverage of the literature are found. Core journals for articles on operant conditioning are identified. Scattering of the articles among journal titles is found to conform to Bradford's law.

Journal Article
TL;DR: A method for measuring the retrieval performance of book indexes is described; a preliminary application to the subject indexing of two major encyclopedias showed one encyclopedia apparently superior in both the finding and discrimination abilities of retrieval performance.
Abstract: The retrieval performance of book indexes can be measured in terms of their ability to direct a user selectively to text material whose identity but not location is known. The method requires human searchers to base their searching strategies on actual passages from the book rather than on test queries, natural or contrived. It circumvents the need for relevance judgment, but still yields performance indicators that correspond approximately to the recall and precision ratios of large document retrieval system evaluation. A preliminary application of the method to the subject indexing of two major encyclopedias showed one encyclopedia apparently superior in both the finding and discrimination abilities of retrieval performance. The method is presently best suited for comparative testing since its ability to yield absolute or reproducible measures is as yet not established.


Journal Article
TL;DR: Questions regarding training for computer-based reference services, including who is to be trained and who is responsible for training, are raised, and the training provided to date is summarized.
Abstract: This article raises several questions regarding training for computer-based reference services, including who is to be trained and who is responsible for training. It discusses these issues and then provides a summary of the training provided to date by search service suppliers, database suppliers, library schools and extension programs, library cooperatives, and professional organizations. The available training materials are also discussed. Some projections are made of likely future activities.

Journal Article
TL;DR: It is shown that as a control system, BC is subject to the laws of cybernetics and only the descriptive, transcriptive, and ordering functions of a BC system can be subjected to full control governed by generally applicable rules.
Abstract: The concept of bibliographic control (BC) is explored from its origin to its development into Universal Bibliographic Control (UBC). It is analyzed as to its functions and operations, namely (a) the form-oriented or descriptive function, (b) the transcription of descriptive data onto a document surrogate, (c) the sequential ordering of these surrogates, and (d) the content-oriented or exploitative function. It is shown that as a control system, BC is subject to the laws of cybernetics. Only the descriptive, transcriptive, and ordering functions of a BC system can be subjected to full control governed by generally applicable rules, while the content-oriented retrieval function, being based on subjective judgments of relevance by indexers and ultimate users, is not completely controllable. The attainable limits of BC and UBC can thus be established.

Journal Article
TL;DR: The authors point out that while technology has led toward centralization of automated library services, new developments are now pushing toward decentralization and coordination is a requirement to avoid fragmentation in this new environment.
Abstract: Bibliographic control before and after MARC is reviewed. The capability of keying into online systems has brought an interdependence among libraries, the service centers that mediate between them, and the large utilities that process and distribute data. From this has developed the basic network structure among libraries in the United States. The independent development of major networks has brought problems in standardization and coordination. The authors point out that while technology has led toward centralization of automated library services, new developments are now pushing toward decentralization. Coordination is a requirement to avoid fragmentation in this new environment.

Journal Article
TL;DR: Texts of American English varying in subject and style are found to be similar enough in microstructure for variety-generation text compression to work well; the best results are obtained using a symbol set generated from a sample of the complete data base, although results from subsets of the data base are almost as good.
Abstract: The use of variety-generation techniques for text compression depends on the selection of symbol sets, or sets of variable-length character strings occurring approximately equifrequently in the text in question. In order that the method perform efficiently in a variety of situations, the symbol set must be reasonably independent of the particular text used in its generation. Hence, texts of different origins must be similar in their microstructure for the technique to work well. Texts of American English varying in subject and style have been found to fulfill this condition. On average the texts can be represented with a space saving of just over 50% on the space used by a fixed-length 8-bit representation of the characters, and the best results are obtained using a symbol set generated from a sample of the complete data base, although results from subsets of the data base are almost as good.

Journal Article
TL;DR: In this article, the problem of forecasting monthly demands for library network services is considered, especially in terms of using forecasts as inputs to policy analysis models and the use of forecasts as an aid to budgeting and staffing decisions.
Abstract: The problem of forecasting monthly demands for library network services is considered, especially in terms of using forecasts as inputs to policy analysis models and in terms of the use of forecasts as an aid to budgeting and staffing decisions. Forecasting methods considered include Box-Jenkins time-series methodology, adaptive filtering, and linear regression. Using demand data from the Illinois Library and Information Network for 1971–1978, it is shown that fading-memory regression is the most appropriate method, in terms of both accuracy and ease of use.
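
The abstract above does not define fading-memory regression. One common reading is weighted least squares in which older observations receive geometrically smaller weights; the sketch below follows that reading. The discount factor `beta`, the function names, and the demand series are assumptions for illustration, not values from the study.

```python
def fading_memory_fit(y, beta=0.9):
    """Fit y[t] ~ intercept + slope * t, weighting observation t by beta**(age)."""
    n = len(y)
    w = [beta ** (n - 1 - t) for t in range(n)]  # newest point has weight 1
    total = sum(w)
    t_bar = sum(w[t] * t for t in range(n)) / total
    y_bar = sum(w[t] * y[t] for t in range(n)) / total
    sxx = sum(w[t] * (t - t_bar) ** 2 for t in range(n))
    sxy = sum(w[t] * (t - t_bar) * (y[t] - y_bar) for t in range(n))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

def forecast(y, steps=1, beta=0.9):
    """Extrapolate the fitted line `steps` periods past the last observation."""
    intercept, slope = fading_memory_fit(y, beta)
    return intercept + slope * (len(y) - 1 + steps)

# Hypothetical monthly demand with a steady upward trend.
demand = [10 + 2 * t for t in range(12)]
print(round(forecast(demand), 6))  # 34.0
```

Setting `beta=1.0` gives ordinary least squares; smaller values let the fit track recent changes in level or trend more quickly, at the cost of noisier estimates.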