Topic

Library classification

About: Library classification is a research topic. Over its lifetime, 2,991 publications have been published within this topic, receiving 26,781 citations.


Papers
BookDOI
25 Jul 2014
TL;DR: Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data.
Abstract: Comprehensive coverage of the entire area of classification. Research on the problem of classification tends to be fragmented across such areas as pattern recognition, database, data mining, and machine learning. Addressing the work of these different communities in a unified way, Data Classification: Algorithms and Applications explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data. This comprehensive book focuses on three primary aspects of data classification. Methods: The book first describes common techniques used for classification, including probabilistic methods, decision trees, rule-based methods, instance-based methods, support vector machine methods, and neural networks. Domains: The book then examines specific methods used for data domains such as multimedia, text, time-series, network, discrete sequence, and uncertain data. It also covers large data sets and data streams due to the recent importance of the big data paradigm. Variations: The book concludes with insight on variations of the classification process. It discusses ensembles, rare-class learning, distance function learning, active learning, visual learning, transfer learning, and semi-supervised learning as well as evaluation aspects of classifiers.

514 citations

Journal ArticleDOI
TL;DR: This work introduces a new methodology for constructing classification systems at the level of individual publications, and presents an application in which a classification system is produced that includes almost 10 million publications.
Abstract: Classifying journals or publications into research areas is an essential element of many bibliometric analyses. Classification usually takes place at the level of journals, where the Web of Science subject categories are the most popular classification system. However, journal-level classification systems have two important limitations: They offer only a limited amount of detail, and they have difficulties with multidisciplinary journals. To avoid these limitations, we introduce a new methodology for constructing classification systems at the level of individual publications. In the proposed methodology, publications are clustered into research areas based on citation relations. The methodology is able to deal with very large numbers of publications. We present an application in which a classification system is produced that includes almost 10 million publications. Based on an extensive analysis of this classification system, we discuss the strengths and the limitations of the proposed methodology. Important strengths are the transparency and relative simplicity of the methodology and its fairly modest computing and memory requirements. The main limitation of the methodology is its exclusive reliance on direct citation relations between publications. The accuracy of the methodology can probably be increased by also taking into account other types of relations, for instance, those based on bibliographic coupling. © 2012 Wiley Periodicals, Inc.

490 citations
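
As a rough illustration of grouping publications by direct citation relations, the sketch below treats each citation as an undirected link and collects publications into connected components using union-find. This is a deliberately crude stand-in and not the clustering methodology of the paper above, whose details are not given in the abstract; the function name `cluster_by_direct_citation` and the toy publication IDs are assumptions made for the example.

```python
from collections import defaultdict


def cluster_by_direct_citation(citations):
    """Group publications into connected components of the citation graph.

    `citations` is an iterable of (citing_id, cited_id) pairs. Direction is
    ignored: a citation simply links two publications. Hypothetical helper,
    not the paper's algorithm.
    """
    parent = {}

    def find(x):
        # Register unseen publications as their own root, then walk to the
        # root with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for citing, cited in citations:
        union(citing, cited)

    groups = defaultdict(set)
    for pub in list(parent):
        groups[find(pub)].add(pub)
    # Sort members and groups so the output is deterministic.
    return sorted(sorted(members) for members in groups.values())


# Toy data (made up): A cites B, B cites C, D cites E.
toy_citations = [("A", "B"), ("B", "C"), ("D", "E")]
clusters = cluster_by_direct_citation(toy_citations)
```

Connected components merge everything that is linked at all, so on a real citation network they would tend to produce one very large group; a practical system would need a clustering criterion that can split densely connected areas from each other.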

Proceedings ArticleDOI
29 Nov 2001
TL;DR: This work proposes a top-down, level-based hierarchical classification method that can assign documents to both leaf and internal categories, together with category-similarity and distance-based measures that take the degree of misclassification into account when measuring classification performance.
Abstract: Hierarchical classification refers to the assignment of one or more suitable categories from a hierarchical category space to a document. While previous work in hierarchical classification focused on virtual category trees where documents are assigned only to the leaf categories, we propose a top-down level-based classification method that can classify documents to both leaf and internal categories. As the standard performance measures assume independence between categories, they have not considered the documents incorrectly classified into categories that are similar to or not far from correct ones in the category tree. We therefore propose category-similarity measures and distance-based measures to consider the degree of misclassification in measuring the classification performance. An experiment has been carried out to measure the performance of our proposed hierarchical classification method. The results showed that our method performs well for a Reuters text collection when enough training documents are given and the new measures have indeed considered the contributions of misclassified documents.

418 citations
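
To make the idea of a distance-based measure concrete, the sketch below gives partial credit to a prediction according to how far it lies from the correct category in the category tree. It is only an illustration under stated assumptions: the abstract above does not give the paper's formulas, so the linear credit rule, the `max_distance` cutoff, the function names, and the toy category tree are all hypothetical.

```python
def path_to_root(category, parent):
    """Return the list of categories from `category` up to the tree root."""
    path = [category]
    while category in parent:
        category = parent[category]
        path.append(category)
    return path


def tree_distance(a, b, parent):
    """Number of edges on the path between categories a and b."""
    steps_from_a = {node: i for i, node in enumerate(path_to_root(a, parent))}
    for steps_from_b, node in enumerate(path_to_root(b, parent)):
        if node in steps_from_a:
            # First shared ancestor found while walking up from b.
            return steps_from_a[node] + steps_from_b
    raise ValueError("categories are not in the same tree")


def partial_credit(predicted, actual, parent, max_distance=4):
    """Assumed scoring rule: 1.0 for an exact match, falling linearly to 0.0
    once the prediction is `max_distance` or more edges away."""
    d = tree_distance(predicted, actual, parent)
    return max(0.0, 1.0 - d / max_distance)


# Toy category tree (made up), stored as child -> parent.
parent = {
    "grain": "commodities",
    "metals": "commodities",
    "corn": "grain",
    "wheat": "grain",
    "gold": "metals",
}

sibling_credit = partial_credit("wheat", "corn", parent)  # near miss
distant_credit = partial_credit("gold", "corn", parent)   # far miss
```

Under this rule a document about corn that is filed under wheat scores higher than one filed under gold, which is the kind of distinction that measures assuming independent categories cannot make.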

Journal ArticleDOI
TL;DR: It is suggested that external variables that affect perceived ease of use and usefulness need to be considered as important factors in the process of designing, implementing, and operating digital library systems to help decrease the mismatch between system design and local users' realities, and further facilitate the successful adoption of digital library systems in developing countries.

279 citations


Network Information
Related Topics (5)
Metadata
43.9K papers, 642.7K citations
80% related
Information technology
53.9K papers, 894.1K citations
79% related
Search engine indexing
20.9K papers, 516.9K citations
79% related
Personal knowledge management
14.5K papers, 498.2K citations
77% related
Information system
107.5K papers, 1.8M citations
77% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    14
2022    33
2021    43
2020    45
2019    70
2018    61