
Showing papers on "Cataloging published in 1970"


Book
01 Jan 1970

56 citations


Book
01 Jan 1970

52 citations


01 Jun 1970
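TL;DR: Kilgour's truncation algorithm, which he had found to give about 90% recall in identifying duplicate book orders in the in-process list of Yale University Library, achieved 70% recall when tested by manual simulation on 126 actual catalog searches at the same library.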
Abstract: F. G. Kilgour's truncation algorithm for machine retrieval from large bibliographic files (1) was tested for performance in matching user-supplied, unedited search clues to bibliographic data contained in a library catalog. Kilgour had previously tested the algorithm to identify duplicate book orders in the in-process list of Yale University Library, and found recall to be about 90%. We have now tested the algorithm, by manual simulation, on data derived from 126 case studies of actual searches of the catalog at Yale University Library. The algorithm achieved 70% recall when compared to results of conventional manual searching. Precision was not determined. Frederick G. Kilgour (1) has proposed an algorithm for machine retrieval of bibliographic entries from very large files, including library catalogs. The algorithm is designed to cope with misspellings and other discrepancies in the user's input when searching a file that contains entries of high editorial quality. The algorithm truncates and matches the user's version and the file's version of author-and-title data in a bibliographic entry. Kilgour reported on a test of his algorithm in which it was used to check for duplicate book orders in a 20,000-entry in-process (acquisition) list at Yale University Library. In this paper we report on a test of his algorithm as applied to a library catalog, rather than an in-process list. The opportunity to test Kilgour's method when applied to retrieval from a library catalog was provided by the ready availability of data derived from a current study (2) of catalog use at Sterling Memorial Library (3.5 million books) at Yale University. This study collects, from a rigidly randomized sample of catalog users, precise information on the clues available to them at the moment of initiating a search. Search clues are recorded exactly as known to the catalog user, employing his own spelling, right or wrong.
For each catalog user studied, the outcome of the search is ascertained; complete catalog information is recorded for documents identified as pertinent in successful searches. In our test, search clues known to catalog users who seek specific documents, and catalog data corresponding to documents identified by these users, were truncated and matched, by manual simulation, according to Kilgour's algorithm. We were thus able to test its recall performance with real catalog searches. A test of the method's precision was not immediately feasible, because it would require comparison of input data with the entire catalog or a substantial portion of it. However, it is felt that the determination of recall performance should at least indicate whether the method shows sufficient promise in catalog searching to warrant evaluation of its precision in such an application. Data used in our evaluation came from 126 searches in which the catalog user was successful in locating the specific document he was seeking. The two most successful versions of Kilgour's truncation algorithm were tested, those with formulae 3-3-1 and 5-5-1 (where the three figures stand for the number of initial characters to be retained from the author's last name, the title's first word, and the title's second word). Both user data and catalog data were truncated; where truncated versions matched, the entry was considered retrieved. It should be noted that certain allowances which favored the algorithm were made in our test. Kilgour applied his method to only those entries in the file having a personal or corporate name main entry, thus excluding title main entries. Some title main entries were included in our sample of 126 catalog searches, and all but two were considered retrieved, since the user's clue corresponded perfectly to catalog data; thus any algorithm would have retrieved them.
In two title main entries the user's clue did not match perfectly, so we eliminated them from our test, reducing the sample to 124. Further, in our test, all cases where a user had information on any name entry (not just the main entry) in the catalog, that information was considered as though it were a main entry. Thus a user's clue which matched only a joint author and title was still considered retrieved by us, although in Kilgour's test it could not have been, since his test was performed on a single-entry file. Finally, where the only difference was one of punctuation, or where there was a difference because translated or transliterated data were supplied by the user, full credit was given and the item was considered retrieved. In his test on the 20,000-entry in-process list, Kilgour found that his algorithm produced a precision of 97.3%; that is, 97.3% of the "duplicate" references retrieved by the algorithm were indeed duplicates. (It should be noted, however. *This work was supported in part by a grant from the U. S. Office of Education.
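
The truncate-and-match step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not Kilgour's published procedure: the function names `truncation_key` and `is_retrieved` are hypothetical, the case and punctuation normalization is an assumption (the abstract says only that punctuation differences were forgiven), and the sample clue and catalog entry are invented.

```python
import re

def truncation_key(author_last_name, title, formula=(3, 3, 1)):
    """Build a key from the leading characters of the author's last name,
    the title's first word, and the title's second word.

    `formula` holds the three retention counts, e.g. (3, 3, 1) or (5, 5, 1).
    """
    n_author, n_first, n_second = formula

    def clean(text):
        # Assumption: drop punctuation and ignore case before truncating.
        return re.sub(r"[^A-Za-z0-9 ]", "", text).upper()

    words = clean(title).split()
    first = words[0] if len(words) > 0 else ""
    second = words[1] if len(words) > 1 else ""
    surname = clean(author_last_name).replace(" ", "")
    return (surname[:n_author], first[:n_first], second[:n_second])

def is_retrieved(user_clue, catalog_entry, formula=(3, 3, 1)):
    """An entry counts as retrieved when both truncated keys are equal."""
    return truncation_key(*user_clue, formula) == truncation_key(*catalog_entry, formula)

# Invented example: the user misspells the surname and drops a final "s".
user_clue = ("Dickins", "Great Expectation")
catalog_entry = ("Dickens", "Great Expectations")

print(is_retrieved(user_clue, catalog_entry, (3, 3, 1)))  # True
print(is_retrieved(user_clue, catalog_entry, (5, 5, 1)))  # False
```

In this invented case the shorter 3-3-1 key tolerates the misspelling while the 5-5-1 key does not; the example says nothing about how either formula affects precision.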

12 citations



Journal Article
TL;DR: Joseph Z. Nitecki. Refiling by the Second. Lucy A. Poucher and Richard E. Moore. The Mechanization of the Filing Rules for Library Catalogs: Dictionary or Divided.
Abstract: Joseph Z. Nitecki. Refiling by the Second. Lucy A. Poucher and Richard E. Moore. The Mechanization of the Filing Rules for Library Catalogs: Dictionary or Divided. Jessica L. Harris and Theodore C. Hines. More on DC Numbers on LC Cards: Quantity and Quality. John McKinlay. Reply to John McKinlay. Benjamin A. Custer. Searching MARC/DPS Records for Area Studies: Results Using Keywords, LC and DC Class Judith A. Hudson. Indexing a Classified Catalog. Cl,at"a Hotme. The Indexing of "The Reference Shelf." John B. White. Worn Book Checklist for Academic Libraries. Les Mattison. Library Services to University Branch Campuses: The Ohio State Experience. C. James Schmidt, Elaine K. Rast, and John Linford. Dewey and Religion. Robert N. Broadus. Schrettinger on Class and the Subject Heading: A Note on Early Nineteenth-Century Thinking. Sidney L. Jackson. Dr. S. R. Ranganathan. Pauline Atherton. John B. Corbin. Mary Pound. Regional Groups Report. Marian Sanner. Resources and Technical Services Division: Annual Reports, 1969/1970. President's Report. W. Carl Jackson. Acquisitions Section Report. Connie R. Dunlap. Cataloging and Classification Section Report. Esther D. Koch. Reproduction of Library Materials Section Report. Comparative Numbers.

10 citations


Journal ArticleDOI
TL;DR: A concept for mechanized descriptive cataloging is presented, together with four areas of research programs to be undertaken.
Abstract: A concept for mechanized descriptive cataloging is presented, together with four areas of research programs to be undertaken.

5 citations










Journal Article
TL;DR: An attempt has been made to use the 701 Calculator as a tool in the task of searching library files for documents referring to special subjects, but the present system includes only reports which have been written in certain agencies throughout the country and does not include periodicals or books.
Abstract: At the U. S. Naval Ordnance Test Station, an attempt has been made to use the 701 Calculator as a tool in the task of searching library files for documents referring to special subjects. The present system includes only reports which have been written in certain agencies throughout the country and does not include periodicals or books. Furthermore, the subjects are for the most part related to the development and testing of items of naval ordnance. In any organization that includes research and development in its functions, it is economical in both time and money to be able to determine what has been done in a field before new programs are started. Scientists and engineers, therefore, are anxious to learn what is in the literature prior to starting some new task. Frequently, however, the labor of searching library files is so great or so unprofitable that it is either not done, or done very incompletely. One of the reasons for the difficulty in searching is that the cataloging of reports may be such that important aspects of their contents are obscured. For example, the following report, Equilibrium Composition and Thermodynamic Properties of Combustion Gases, could logically be cataloged under one or more of several subject headings, which might or might not be appropriate, depending somewhat upon the technical skill of the cataloger. This particular report was filed in the China Lake Technical Library under two subjects: Gases and Physics. Both of these are standard Library of Congress subject headings, and are more or less descriptive of the report. However, under each subject heading there were found to be several hundred other reports filed, in itself a situation that could discourage searching. More serious, however, was the fact that scientists interested in such a category of ordnance development might be equally likely to search under the subjects of Combustion or Physical Chemistry.
Most serious, however, was the fact that there was no indication in the cataloging process that one of the main contributions of the report was to describe a numerical method by means of which the thermodynamic properties were computed. As a result, for one reason or another, the report was, in certain respects, lost, as far as many interested individuals were concerned. To avoid some of the difficulty of cataloging documents by subject heading, a system can be used that depends upon a document being described by several single terms called descriptors.(1) In the library application of this system, there is a card for each descriptor. As a document comes to the library it is given an acquisition serial number and this number is entered upon as many different descriptor cards as seem necessary to describe the document. In the example above, if the serial number of the report had been 1234, this number might have been entered on the following cards: Thermodynamics; Combustion; Gases; Computation; Fuel; Impulse; Pressure; Temperature; Entropy; Enthalpy; Adiabatic. Some descriptors do not seem related to the title, but could have been assigned after a brief inspection of the contents by the cataloger. To use such a system when information of a certain type is desired, an individual would list descriptors that would, in his opinion, describe his needs. These descriptor cards would then be pulled from the files and be visually compared for numbers that matched on the several cards. Reports corresponding to these matching serial numbers would then be withdrawn. The original purpose of the 701 program to be described was to mechanize the above procedure with a view to the possible establishment of a daily schedule for library searching. In designing the 701 system, attention was given to the current size of the file and the expected growth during the next five years. 
The two quantities considered were the expected total number of serial numbers and the total number of descriptors. …
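
The descriptor-card procedure described in this abstract (enter a serial number on each relevant card, then compare the pulled cards for numbers they share) can be sketched as below. This is a minimal modern illustration, not the 701 program itself: the names `descriptor_cards`, `enter_document`, and `search` are hypothetical, matching descriptors without regard to case is an assumption, and report 5678 is invented for contrast.

```python
from collections import defaultdict

# One "card" per descriptor; each card holds the serial numbers entered on it.
descriptor_cards = defaultdict(set)

def enter_document(serial_number, descriptors):
    """Enter a document's serial number on every descriptor card given."""
    for descriptor in descriptors:
        # Assumption: descriptors are matched without regard to case.
        descriptor_cards[descriptor.lower()].add(serial_number)

def search(descriptors):
    """Return the sorted serial numbers that appear on all requested cards."""
    cards = [descriptor_cards.get(d.lower(), set()) for d in descriptors]
    if not cards:
        return []
    return sorted(set.intersection(*cards))

# The report from the abstract, with the descriptors listed there.
enter_document(1234, [
    "Thermodynamics", "Combustion", "Gases", "Computation", "Fuel", "Impulse",
    "Pressure", "Temperature", "Entropy", "Enthalpy", "Adiabatic",
])
# Invented second report, filed only under the two subject headings.
enter_document(5678, ["Gases", "Physics"])

print(search(["Combustion", "Computation"]))  # [1234]
print(search(["Gases"]))                      # [1234, 5678]
```

Intersecting the sets plays the role of visually comparing the pulled cards for matching numbers.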


Journal ArticleDOI
TL;DR: Numerical estimates of the present situation indicate that s=100% is optimal now, meaning that everything should be cataloged centrally, and it is suggested that this potential savings of over $0.1 million per day, due to decreased duplication of effort by local libraries, could easily justify the use of a computer system to allow the value of s=100% to be reached.
Abstract: How should the library community organize its efforts to keep up with a cataloging volume that grows at about 5% per year? The decision is modeled as the problem of selecting the optimal value of a single variable, s, which is the fraction of all cataloging done centrally. It is assumed that the library community has B titles to catalog, and that the G (where G=s·B) most widely acquired titles will be cataloged centrally (e.g., at the Library of Congress), the cataloging records then being distributed to all libraries that have acquired each of the G titles. The other variables used in the model are the distribution of “popularity” among titles and the costs of cataloging books centrally or locally. The optimal s value is first determined in the general case. Then numerical estimates of the present situation are made, which indicate that s=100% is optimal now, meaning that everything (820 titles per day) should be cataloged centrally. At that rate, it is estimated that all cataloging could be done for $0.88 million per day, compared to the $1.07 million that is now being spent with s=40%. It is suggested that this potential savings of over $0.1 million per day, due to decreased duplication of effort by local libraries, could easily justify the use of a computer system to allow the value of s=100% to be reached.
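
The figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above and the relation G=s·B; the abstract does not give the model's cost function, so none is attempted here. Treating the 820 titles per day as B, and the name `centrally_cataloged`, are assumptions made for illustration.

```python
TITLES_PER_DAY = 820      # B, read from "everything (820 titles per day)"
COST_AT_S_40 = 1.07       # million dollars per day, stated for s = 40%
COST_AT_S_100 = 0.88      # million dollars per day, estimated for s = 100%

def centrally_cataloged(s, total_titles=TITLES_PER_DAY):
    """G = s * B: titles cataloged centrally when a fraction s is central."""
    return round(s * total_titles)

# Daily savings from moving from s = 40% to s = 100%.
savings = round(COST_AT_S_40 - COST_AT_S_100, 2)

print(centrally_cataloged(0.40))  # 328
print(centrally_cataloged(1.00))  # 820
print(savings)                    # 0.19
```

The difference of $0.19 million per day is consistent with the abstract's "over $0.1 million per day".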


Journal Article
TL;DR: Some of the problems in cataloging and classification are pointed out, to arouse an interest in self-analysis on the part of the small library, and to offer some suggestions as to how the small institution can streamline techniques and economize on meager resources with no loss of value to the card catalog.
Abstract: This article is based on experience with small library collections. It is an effort to point out some of the problems in cataloging and classification, to arouse an interest in self-analysis on the part of the small library, and to offer some suggestions as to how the small institution can streamline techniques and economize on meager resources with no loss of value to the card catalog. It is recognized that the catalogs in many small libraries are unsuited for their tasks as a result of adhering to philosophies of larger institutions. The small institution has neither the need nor resources for such completeness in cataloging and classification. Deviation from standard rules is not advocated. However, consistency in treatment is advised and adherence in depth to standard rules is questioned.



Journal Article
TL;DR: Lists photographic and printing equipment, including microfilm, photo-offset printing, sequential card cameras, Xerox printing, and tabulation equipment.
Abstract: ing camera Microfilm f printing Photo * offset Sequential card camera Xerox * printing Tabulation equipment P Pa