Showing papers on "Document retrieval" published in 1970

Open Access

Journal Article•DOI•

Automatic Processing of Foreign Language Documents.

Gerard Salton¹•Institutions (1)

01 May 1970-Journal of the Association for Information Science and Technology

TL;DR: The methods are evaluated and it is shown that the effectiveness of the mixed language processing is approximately equivalent to that of the standard process operating within a single language only.

Abstract: Experiments conducted over the last few years with the SMART document retrieval system have shown that fully automatic text processing methods using relatively simple English language analysis tools are as effective for document indexing, classification, search, and retrieval as the more elaborate manual methods normally used. The present study describes an extension of the SMART procedures to German language materials. A multilingual thesaurus is used for the analysis of documents and search requests, and tools are provided which make it possible to process English documents against German queries, and vice versa. The methods are evaluated and it is shown that the effectiveness of the mixed language processing is approximately equivalent to that of the standard process operating within a single language only.
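
The cross-language matching described in this abstract can be illustrated with a minimal sketch. It assumes a thesaurus that maps words from both languages to shared concept numbers, so that an English document and a German query are compared on concepts rather than on words. The thesaurus entries, function names, and the cosine measure below are illustrative assumptions, not the SMART system's actual data or procedures.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy thesaurus: English and German words share a concept ID.
THESAURUS = {
    "library": 1, "bibliothek": 1,
    "document": 2, "dokument": 2,
    "retrieval": 3, "wiedergewinnung": 3,
    "computer": 4, "rechner": 4,
}

def concept_vector(text):
    """Count the concept IDs of all words that appear in the thesaurus."""
    words = text.lower().split()
    return Counter(THESAURUS[w] for w in words if w in THESAURUS)

def cosine(a, b):
    """Cosine similarity between two concept-count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

english_doc = concept_vector("document retrieval computer")
german_query = concept_vector("Dokument Wiedergewinnung Rechner")
similarity = cosine(english_doc, german_query)
```

Because both texts reduce to the same concept IDs, the similarity is high even though they share no words.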

137 citations

Journal Article•DOI•

On Deriving Design Equations for Information Retrieval Systems.

William S. Cooper¹•Institutions (1)

University of California, Berkeley¹

01 Nov 1970-Journal of the Association for Information Science and Technology

TL;DR: Because design equations, when and if they become available, will be the keystones of retrieval system theory, the question of how they can be derived is investigated in three case studies corresponding to three different types of retrieval systems.

Abstract: In recent research on document retrieval systems, considerable attention has been devoted to the problem of defining appropriate performance measures, but very little has been done to derive design equations that make use of them. Design equations show the relationships that obtain between retrieval performance and the design characteristics of the system under analysis. Because design equations, when and if they become available, will be the keystones of retrieval system theory, the question of how they can be derived is an important one. In this paper, the question is investigated in three case studies corresponding to three different types of retrieval systems. A design equation is derived for each of the three system types, the equation in each case showing the relationship between expected search length, used as a performance measure, and certain system characteristics having to do with the distribution of index terms over the document collection and the number of errors in the search requests.

11 citations

Journal Article•DOI•

A Highly Associative Document Retrieval System.

Carl Cagan¹•Institutions (1)

Washington State University¹

01 Sep 1970-Journal of the Association for Information Science and Technology

TL;DR: This paper describes a document retrieval system implemented with a subset of the medical literature and introduces methods for computation of term-term association factors, indexing, assignment of term-document relevance values, and computations for recall and relevance.

Abstract: This paper describes a document retrieval system implemented with a subset of the medical literature. With the exception of the development of a negative dictionary, all system operations are completely automatic. Introduced are methods for computation of term-term association factors, indexing, assignment of term-document relevance values, and computations for recall and relevance. High weights are provided for low-frequency terms, and retrieval is performed directly from highly connected term-document files without elaboration. Recall and relevance are based on quantitative internal system computations, and results are compared with user evaluations.
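
The abstract does not give the weighting formula, so the following is only a sketch of the general idea that low-frequency terms receive high weights. It swaps in the familiar inverse-document-frequency weight, log(N / df), as a stand-in; the function name and sample documents are hypothetical, and nothing here should be read as the paper's own computation.

```python
from math import log

def rarity_weights(docs):
    """Return {term: log(N / document_frequency)} for a list of term lists."""
    n_docs = len(docs)
    doc_freq = {}
    for terms in docs:
        for term in set(terms):  # count each term once per document
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: log(n_docs / df) for term, df in doc_freq.items()}

# Hypothetical indexed documents from a medical collection.
docs = [
    ["patient", "insulin", "diabetes"],
    ["patient", "fracture"],
    ["patient", "insulin"],
]
weights = rarity_weights(docs)
```

A term that occurs in every document gets weight zero, while a term that occurs in only one document gets the largest weight.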

10 citations

Posted Content•DOI•

Named Entity Recognition and Normalization Applied to Large-Scale Information Extraction from the Materials Science Literature

Leigh Weston¹, Vahe Tshitoyan, John Dagdelen, Olga Kononova, Kristin A. Persson, Gerbrand Ceder, Anubhav Jain•Institutions (1)

Lawrence Berkeley National Laboratory¹

01 Jan 1970-ChemRxiv

TL;DR: This paper applied text mining with named entity recognition (NER), along with entity normalization, for large-scale information extraction from the published materials science literature and achieved an overall accuracy of 87% on a test set.

Abstract: Over the past decades, the number of published materials science articles has increased manyfold. Now, a major bottleneck in the materials discovery pipeline arises in connecting new results with the previously established literature. A potential solution to this problem is to map the unstructured raw-text of published articles onto a structured database entry that allows for programmatic querying. To this end, we apply text-mining with named entity recognition (NER), along with entity normalization, for large-scale information extraction from the published materials science literature. The NER is based on supervised machine learning with a recurrent neural network architecture, and the model is trained to extract summary-level information from materials science documents, including: inorganic material mentions, sample descriptors, phase labels, material properties and applications, as well as any synthesis and characterization methods used. Our classifier, with an overall accuracy (f1) of 87% on a test set, is applied to information extraction from 3.27 million materials science abstracts - the most information-dense section of published articles. Overall, we extract more than 80 million materials-science-related named entities, and the content of each abstract is represented as a database entry in a structured format. Our database shows far greater recall in document retrieval when compared to traditional text-based searches due to an entity normalization procedure that recognizes synonyms. We demonstrate that simple database queries can be used to answer complex "meta-questions" of the published literature that would have previously required laborious, manual literature searches to answer.
All of our data has been made freely available for bulk download; we have also made a public facing application programming interface (https://github.com/materialsintelligence/matscholar) and website http://matscholar.herokuapp.com/search for easy interfacing with the data, trained models and functionality described in this paper. These results will allow researchers to access targeted information on a scale and with a speed that has not been previously available, and can be expected to accelerate the pace of future materials science discovery.
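
The recall gain from entity normalization mentioned in this abstract can be shown with a small sketch. It assumes a hand-written synonym table and four invented abstract titles; the paper itself extracts entities with a trained neural model, which is not reproduced here, and none of the names below come from the matscholar code.

```python
# Hypothetical synonym table mapping surface forms to one canonical entity.
CANONICAL = {
    "tio2": "TiO2",
    "titania": "TiO2",
    "titanium dioxide": "TiO2",
}

# Hypothetical abstracts, reduced to titles for brevity.
ABSTRACTS = {
    "a1": "Photocatalysis on TiO2 thin films",
    "a2": "Sol-gel synthesis of titania nanoparticles",
    "a3": "Band gap of titanium dioxide under strain",
    "a4": "Magnetic ordering in iron oxide",
}

def text_search(query):
    """Plain substring search: finds only the literal query string."""
    q = query.lower()
    return sorted(doc_id for doc_id, text in ABSTRACTS.items() if q in text.lower())

def build_entity_index():
    """Index each abstract under the canonical form of every entity it mentions."""
    index = {}
    for doc_id, text in ABSTRACTS.items():
        lowered = text.lower()
        for surface, canonical in CANONICAL.items():
            if surface in lowered:
                index.setdefault(canonical, set()).add(doc_id)
    return index

def entity_search(query, index):
    """Normalize the query to its canonical entity, then look it up."""
    canonical = CANONICAL.get(query.lower(), query)
    return sorted(index.get(canonical, set()))

index = build_entity_index()
raw_hits = text_search("titania")
normalized_hits = entity_search("titania", index)
```

The plain text search returns only the abstract that literally contains "titania", while the normalized search also returns the abstracts that use "TiO2" or "titanium dioxide".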

3 citations

Patent•

Random information retrieval system

James E Young¹•Institutions (1)

Xerox¹

14 Jul 1970

2 citations

A Relational Structure for Document Retrieval in Coding Theory

Nicholas Matthew Esser

01 May 1970

2 citations

Design considerations of on-line document retrieval systems

J. T. Cordaro, R. T. Chien

01 Feb 1970

TL;DR: Response times of on-line document retrieval systems are analyzed; linear and inverted file organizations are considered and their response times are evaluated.

Abstract: Response times of on-line document retrieval systems are analyzed. Linear and inverted file organizations are considered and their response times are evaluated.
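
The two file organizations named in this abstract can be contrasted with a minimal sketch. It assumes a tiny in-memory collection and uses the number of records touched as a crude proxy for response time; the report's own timing model is not given in the abstract, and all names and data below are hypothetical.

```python
# Hypothetical collection: document ID -> set of index terms.
DOCS = {
    1: {"retrieval", "file"},
    2: {"inverted", "file"},
    3: {"retrieval", "inverted"},
    4: {"response", "time"},
}

def linear_search(term):
    """Linear file: scan every record, so cost grows with collection size."""
    examined = 0
    hits = []
    for doc_id, terms in DOCS.items():
        examined += 1
        if term in terms:
            hits.append(doc_id)
    return hits, examined

def build_inverted_file():
    """Inverted file: term -> list of document IDs containing that term."""
    inverted = {}
    for doc_id, terms in DOCS.items():
        for term in terms:
            inverted.setdefault(term, []).append(doc_id)
    return inverted

def inverted_search(term, inverted):
    """Look the term up directly; only the matching postings are touched."""
    hits = inverted.get(term, [])
    return hits, len(hits)

inverted = build_inverted_file()
linear_hits, linear_cost = linear_search("inverted")
inverted_hits, inverted_cost = inverted_search("inverted", inverted)
```

Both organizations return the same documents, but the linear scan touches all four records while the inverted lookup touches only the two matching postings. The trade-off is that the inverted file must be built and maintained in addition to the documents themselves.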

1 citation

Journal Article•DOI•

Information Retrieval From the Management Point of View

Louis Kaplan

01 May 1970-College & Research Libraries

1 citation