scispace - formally typeset
Journal

arXiv: Digital Libraries 

About: arXiv: Digital Libraries is an academic journal that publishes mainly in the area(s) of Citation & Metadata. Over its lifetime, it has published 2,516 papers, which have received 29,745 citations.


Papers
Posted Content
TL;DR: The authors proposed a content-based book recommendation system that utilizes information extraction and a machine-learning algorithm for text categorization, which has the advantage of being able to recommend previously unrated items to users with unique interests and to provide explanations for its recommendations.
Abstract: Recommender systems improve access to relevant products and information by making personalized suggestions based on previous examples of a user's likes and dislikes. Most existing recommender systems use social filtering methods that base recommendations on other users' preferences. By contrast, content-based methods use information about an item itself to make suggestions. This approach has the advantage of being able to recommend previously unrated items to users with unique interests and to provide explanations for its recommendations. We describe a content-based book recommending system that utilizes information extraction and a machine-learning algorithm for text categorization. Initial experimental results demonstrate that this approach can produce accurate recommendations.

1,268 citations
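
The content-based idea in the abstract above can be sketched with a small text classifier. The abstract only says "a machine-learning algorithm for text categorization", so the choice of a Laplace-smoothed naive Bayes model here is an assumption, and the class name `NaiveBayesRecommender`, its methods, and the sample book descriptions are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesRecommender:
    """Illustrative like/dislike classifier over item descriptions (assumed model)."""

    LABELS = ("like", "dislike")

    def __init__(self) -> None:
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocabulary: set[str] = set()

    def rate(self, description: str, label: str) -> None:
        """Record one rated item; its own text is the only evidence used."""
        words = tokenize(description)
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def _word_logp(self, word: str, label: str) -> float:
        # Laplace smoothing so unseen (word, label) pairs never get probability 0.
        total = sum(self.word_counts[label].values())
        return math.log((self.word_counts[label][word] + 1) / (total + len(self.vocabulary)))

    def _contribution(self, word: str) -> float:
        return self._word_logp(word, "like") - self._word_logp(word, "dislike")

    def score(self, description: str) -> float:
        """Log-odds that the user likes the item; a positive value suggests recommending it."""
        total_docs = sum(self.doc_counts.values())
        prior = math.log((self.doc_counts["like"] + 1) / (total_docs + 2)) - math.log(
            (self.doc_counts["dislike"] + 1) / (total_docs + 2)
        )
        known = [w for w in tokenize(description) if w in self.vocabulary]
        return prior + sum(self._contribution(w) for w in known)

    def explain(self, description: str, top: int = 3) -> list[str]:
        """Known words that push the score most strongly towards 'like'."""
        known = set(tokenize(description)) & self.vocabulary
        return sorted(known, key=self._contribution, reverse=True)[:top]


# Hypothetical ratings, for illustration only.
rec = NaiveBayesRecommender()
rec.rate("A detective solves a murder mystery in London", "like")
rec.rate("A gripping mystery thriller with a clever detective", "like")
rec.rate("A cookbook of quick vegetarian recipes", "dislike")

unrated = "An unrated mystery novel about a detective"
print(rec.score(unrated), rec.explain(unrated, top=2))
```

Because the score depends only on an item's own description, an item nobody has rated can still be scored, and the highest-contributing words double as a simple explanation.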

Posted Content
TL;DR: A dynamical model of collaborative tagging is presented that predicts regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given url.
Abstract: Collaborative tagging describes the process by which many users add metadata in the form of keywords to shared content. Recently, collaborative tagging has grown in popularity on the web, on sites that allow users to tag bookmarks, photographs and other content. In this paper we analyze the structure of collaborative tagging systems as well as their dynamical aspects. Specifically, we discovered regularities in user activity, tag frequencies, kinds of tags used, bursts of popularity in bookmarking and a remarkable stability in the relative proportions of tags within a given url. We also present a dynamical model of collaborative tagging that predicts these stable patterns and relates them to imitation and shared knowledge.

997 citations
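
To make the link between imitation and stable tag proportions more concrete, here is a minimal simulation sketch. The abstract does not specify the model, so the reinforcement process below (each new tag copies an existing one in proportion to its current use, in the style of a Pólya urn) is an assumption for illustration; `simulate_tagging`, `proportions`, and the starting tags are hypothetical.

```python
import random
from collections import Counter


def simulate_tagging(initial_tags: list[str], steps: int, seed: int = 0) -> Counter:
    """Grow a bookmark's tag list by imitation: each new tag copies a past one."""
    if not initial_tags:
        raise ValueError("need at least one initial tag to imitate")
    rng = random.Random(seed)  # seeded so a run can be reproduced
    urn = list(initial_tags)
    for _ in range(steps):
        # Picking uniformly from all past tag uses selects each distinct tag
        # with probability proportional to how often it has been used so far.
        urn.append(rng.choice(urn))
    return Counter(urn)


def proportions(counts: Counter) -> dict[str, float]:
    """Convert raw tag counts into relative shares."""
    total = sum(counts.values())
    return {tag: count / total for tag, count in counts.items()}


# Hypothetical starting tags for a single URL.
for steps in (100, 1_000, 10_000):
    counts = simulate_tagging(["web", "design", "css"], steps, seed=42)
    print(steps, {tag: round(share, 3) for tag, share in sorted(proportions(counts).items())})
```

In a process like this, each additional tag shifts the shares less and less as the total grows, which is one way a stable mix of tags could emerge without any central coordination.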

Posted Content
TL;DR: This paper uses a large test corpus to evaluate Kea’s effectiveness in terms of how many author-assigned keyphrases are correctly identified, and describes the system, which is simple, robust, and publicly available.
Abstract: Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea identifies candidate keyphrases using lexical methods, calculates feature values for each candidate, and uses a machine-learning algorithm to predict which candidates are good keyphrases. The machine learning scheme first builds a prediction model using training documents with known keyphrases, and then uses the model to find keyphrases in new documents. We use a large test corpus to evaluate Kea's effectiveness in terms of how many author-assigned keyphrases are correctly identified. The system is simple, robust, and publicly available.

898 citations
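
The first two stages described in the abstract above (lexical candidate identification, then per-candidate feature values) might look roughly like the sketch below. The abstract does not list the features, so the two used here (a TF×IDF score and the relative position of the first occurrence) are assumptions for illustration, as are the stopword list, the function names, and the sample texts. The learning stage is omitted.

```python
import math
import re

# Tiny illustrative stopword list (an assumption, not Kea's actual list).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "are", "on", "with", "from"}


def tokenize(text: str) -> list[str]:
    """Lowercase alphabetic tokens; punctuation and sentence breaks are dropped."""
    return re.findall(r"[a-z]+", text.lower())


def candidate_stats(tokens: list[str], max_len: int = 3) -> dict[str, tuple[int, int]]:
    """Map each candidate phrase to (occurrence count, index of first occurrence)."""
    stats: dict[str, tuple[int, int]] = {}
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            gram = tokens[start:start + length]
            if len(gram) < length:
                break  # ran off the end of the document
            # Lexical filter: a candidate may not begin or end with a stopword.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            phrase = " ".join(gram)
            count, first = stats.get(phrase, (0, start))
            stats[phrase] = (count + 1, first)
    return stats


def features(document: str, corpus: list[str]) -> dict[str, tuple[float, float]]:
    """Return {phrase: (tf_idf, relative_first_position)} for one document."""
    tokens = tokenize(document)
    corpus_phrases = [set(candidate_stats(tokenize(text))) for text in corpus]
    result: dict[str, tuple[float, float]] = {}
    for phrase, (count, first) in candidate_stats(tokens).items():
        doc_freq = sum(phrase in phrases for phrases in corpus_phrases)
        tf = count / len(tokens)
        idf = math.log((1 + len(corpus)) / (1 + doc_freq))  # add-one smoothing
        result[phrase] = (tf * idf, first / len(tokens))
    return result


# Hypothetical background corpus and document.
corpus = [
    "Recommender systems suggest books to users.",
    "Citation counts differ between databases.",
]
doc = "Keyphrases summarize documents. Kea extracts keyphrases from text automatically."
feats = features(doc, corpus)
for phrase, (tf_idf, position) in sorted(feats.items(), key=lambda kv: -kv[1][0])[:3]:
    print(f"{phrase!r}: tf_idf={tf_idf:.3f}, first_position={position:.2f}")
```

In a full pipeline, feature vectors like these would be paired with known author-assigned keyphrases to train the prediction model, which is then applied to candidates from new documents.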

Journal Article
TL;DR: Alberto Martin-Martin is funded by a four-year doctoral fellowship from the Ministerio de Educacion, Cultura, y Deportes (Spain), and an international mobility grant from Universidad de Granada and CEI BioTic Granada funded a research stay at the University of Wolverhampton.
Abstract: Despite citation counts from Google Scholar (GS), Web of Science (WoS), and Scopus being widely consulted by researchers and sometimes used in research evaluations, there is no recent or systematic evidence about the differences between them. In response, this paper investigates 2,448,055 citations to 2,299 English-language highly-cited documents from 252 GS subject categories published in 2006, comparing GS, the WoS Core Collection, and Scopus. GS consistently found the largest percentage of citations across all areas (93%-96%), far ahead of Scopus (35%-77%) and WoS (27%-73%). GS found nearly all the WoS (95%) and Scopus (92%) citations. Most citations found only by GS were from non-journal sources (48%-65%), including theses, books, conference papers, and unpublished materials. Many were non-English (19%-38%), and they tended to be much less cited than citing sources that were also in Scopus or WoS. Despite the many unique GS citing sources, Spearman correlations between citation counts in GS and WoS or Scopus are high (0.78-0.99). They are lower in the Humanities, and lower between GS and WoS than between GS and Scopus. The results suggest that in all areas GS citation data is essentially a superset of WoS and Scopus, with substantial extra coverage.

669 citations
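
The abstract above reports Spearman correlations between citation counts from different sources. As a reminder of what that statistic measures, the sketch below computes it from scratch as the Pearson correlation of ranks, with tied values sharing their average rank. The citation counts are made up for illustration and are not data from the paper; the function names are hypothetical.

```python
import math
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up citation counts for five documents in two sources.
source_a = [120, 85, 300, 42, 60]
source_b = [70, 75, 140, 12, 30]
print(f"Spearman rho = {spearman(source_a, source_b):.2f}")
```

Because only the ordering matters, two sources can disagree widely on absolute counts (as one would if it indexed many more citing documents) and still correlate highly, provided they rank documents similarly.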

Posted Content
TL;DR: In this paper, the authors re-examine the question of the growth of science and analyse it across all disciplines and also separately for the natural sciences and for the medical and health sciences.
Abstract: Many studies in information science have looked at the growth of science. In this study, we re-examine the question of the growth of science. To do this we (i) use current data up to publication year 2012 and (ii) analyse it across all disciplines and also separately for the natural sciences and for the medical and health sciences. Furthermore, the data are analysed with an advanced statistical technique - segmented regression analysis - which can identify specific segments with similar growth rates in the history of science. The study is based on two different sets of bibliometric data: (1) The number of publications held as source items in the Web of Science (WoS, Thomson Reuters) per publication year and (2) the number of cited references in the publications of the source items per cited reference year. We have looked at the rate at which science has grown since the mid-1600s. In our analysis of cited references we identified three growth phases in the development of science, which each led to growth rates tripling in comparison with the previous phase: from less than 1% up to the middle of the 18th century, to 2 to 3% up to the period between the two world wars and 8 to 9% to 2012.

617 citations
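
As a rough aid to reading the growth rates quoted in the abstract above, the sketch below converts an annual growth rate into a doubling time. It assumes constant compound growth within each phase, which is a simplification introduced here; the function name `doubling_time` is hypothetical. Note that the abstract gives the first phase as "less than 1%", so the 1% figure is an upper bound on the rate and the corresponding doubling time is a lower bound.

```python
import math


def doubling_time(annual_growth_rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    if annual_growth_rate <= 0:
        raise ValueError("growth rate must be positive")
    # Solve (1 + r) ** t == 2 for t.
    return math.log(2) / math.log(1 + annual_growth_rate)


# Range endpoints of the three growth phases quoted in the abstract.
for rate in (0.01, 0.02, 0.03, 0.08, 0.09):
    print(f"{rate:.0%} per year: doubles in about {doubling_time(rate):.1f} years")
```

Under this assumption, the three phases correspond to doubling times of roughly 70 years or more, about 23 to 35 years, and about 8 to 9 years.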

Network Information
Related Journals (5)
Scientometrics: 7K papers, 234.4K citations, 90% related
arXiv: Physics and Society: 7K papers, 103.4K citations, 84% related
arXiv: Social and Information Networks: 7.2K papers, 85.6K citations, 84% related
arXiv: Computers and Society: 6.4K papers, 55.4K citations, 83% related
Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2021    321
2020    294
2019    208
2018    320
2017    188
2016    175