Showing papers in "arXiv: Digital Libraries in 2016"


Posted Content
TL;DR: It is concluded that for many purposes the fractional counting approach is preferable over the full counting one and that different approaches can be taken to construct a bibliometric network.
Abstract: The analysis of bibliometric networks, such as co-authorship, bibliographic coupling, and co-citation networks, has received a considerable amount of attention. Much less attention has been paid to the construction of these networks. We point out that different approaches can be taken to construct a bibliometric network. Normally the full counting approach is used, but we propose an alternative fractional counting approach. The basic idea of the fractional counting approach is that each action, such as co-authoring or citing a publication, should have equal weight, regardless of for instance the number of authors, citations, or references of a publication. We present two empirical analyses in which the full and fractional counting approaches yield very different results. These analyses deal with co-authorship networks of universities and bibliographic coupling networks of journals. Based on theoretical considerations and on the empirical analyses, we conclude that for many purposes the fractional counting approach is preferable over the full counting one.

240 citations
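
The difference between the two counting approaches in the abstract above can be illustrated with a minimal sketch for co-authorship links. The abstract does not give a formula, so the weighting here is an assumption: each of a publication's n authors spreads a total weight of 1 over its n - 1 co-authors, so every link from that publication gets weight 1/(n - 1). The function name `coauthorship_links` and the toy publication list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_links(publications, fractional=False):
    """Return {(author_a, author_b): link strength} for a list of author lists.

    Full counting: every co-authored publication adds 1 to a link.
    Fractional counting (assumed variant): a publication with n authors
    adds 1 / (n - 1) to each of its links.
    """
    links = defaultdict(float)
    for authors in publications:
        authors = sorted(set(authors))  # ignore duplicate names within one paper
        n = len(authors)
        if n < 2:
            continue  # single-author papers create no co-authorship links
        weight = 1.0 / (n - 1) if fractional else 1.0
        for a, b in combinations(authors, 2):
            links[(a, b)] += weight
    return dict(links)


# Toy data: one two-author paper and one three-author paper.
pubs = [["A", "B"], ["A", "B", "C"]]
full = coauthorship_links(pubs)
frac = coauthorship_links(pubs, fractional=True)
```

Under full counting the three-author paper contributes as much to each link as the two-author paper; under the assumed fractional weighting its contribution per link is halved, which is how large author lists stop dominating the network.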


Posted Content
TL;DR: A review of the state-of-the-art in both scholarly use of social media and altmetrics can be found in this article, where the authors examine the role of these platforms in the scholarly communication process and the factors that affect this use.
Abstract: Social media has become integrated into the fabric of the scholarly communication system in fundamental ways: principally through scholarly use of social media platforms and the promotion of new indicators on the basis of interactions with these platforms. Research and scholarship in this area has accelerated since the coining of, and subsequent advocacy for, altmetrics, that is, research indicators based on social media activity. This review provides an extensive account of the state of the art in both scholarly use of social media and altmetrics. The review consists of two main parts: the first examines the use of social media in academia, examining the various functions these platforms have in the scholarly communication process and the factors that affect this use. The second part reviews empirical studies of altmetrics, discussing the various interpretations of altmetrics, data collection and methodological limitations, and differences according to platform. The review ends with a critical discussion of the implications of this transformation in the scholarly communication system.

232 citations


Posted Content
TL;DR: It is concluded that no one indicator is superior but that the h-index (which includes the productivity of a journal) and SNIP (which aims to normalise for field effects) may be the most effective at the moment.
Abstract: Evaluating the quality of academic journals is becoming increasingly important within the context of research performance evaluation. Traditionally, journals have been ranked by peer review lists such as that of the Association of Business Schools (UK) or through their journal impact factor (JIF). However, several new indicators have been developed, such as the h-index, SJR, SNIP and the Eigenfactor, which take into account different factors and therefore have their own particular biases. In this paper we evaluate these metrics both theoretically and through an empirical study of a large set of business and management journals. We show that even though the indicators appear highly correlated, in fact they lead to large differences in journal rankings. We contextualize our results in terms of the UK's large scale research assessment exercise (the RAE/REF) and particularly the ABS journal ranking list. We conclude that no one indicator is superior but that the h-index (which includes the productivity of a journal) and SNIP (which aims to normalize for field effects) may be the most effective at the moment.

93 citations


Posted Content
TL;DR: It is found that it is feasible to depict an accurate representation of the current state of the Bibliometrics community using data from GSC (the most influential authors, documents, journals, and publishers), and a taxonomy of all the errors that may affect the reliability of the data contained in each of these platforms is presented.
Abstract: Following in the footsteps of the model of scientific communication, which has recently gone through a metamorphosis (from the Gutenberg galaxy to the Web galaxy), a change in the model and methods of scientific evaluation is also taking place. A set of new scientific tools are now providing a variety of indicators which measure all actions and interactions among scientists in the digital space, making new aspects of scientific communication emerge. In this work we present a method for "capturing" the structure of an entire scientific community (the Bibliometrics, Scientometrics, Informetrics, Webometrics, and Altmetrics community) and the main agents that are part of it (scientists, documents, and sources) through the lens of Google Scholar Citations (GSC). Additionally, we compare these author "portraits" to the ones offered by other profile or social platforms currently used by academics (ResearcherID, ResearchGate, Mendeley, and Twitter), in order to test their degree of use, completeness, reliability, and the validity of the information they provide. A sample of 814 authors (researchers in Bibliometrics with a public profile created in GSC) was subsequently searched in the other platforms, collecting the main indicators computed by each of them. The data collection was carried out in September 2015. The Spearman correlation (α = 0.05) was applied to these indicators (a total of 31), and a Principal Component Analysis was carried out in order to reveal the relationships among metrics and platforms as well as the possible existence of metric clusters. We found that it is feasible to depict an accurate representation of the current state of the Bibliometrics community using data from GSC (the most influential authors, documents, journals, and publishers). Regarding the number of authors found in each platform, GSC takes the first place (814 authors), followed at a distance by ResearchGate (543), which is currently growing at a vertiginous speed.
The number of Mendeley profiles is high, although 17.1% of them are basically empty. ResearcherID is also affected by this issue (34.45% of the profiles are empty), as is Twitter (47% of the Twitter accounts have published fewer than 100 tweets). Only 11% of our sample (93 authors) have created a profile in all the platforms analyzed in this study. From the PCA, we found two kinds of impact on the Web: first, all metrics related to academic impact. This first group can further be divided into usage metrics (views and downloads) and citation metrics. Second, all metrics related to connectivity and popularity (followers). ResearchGate indicators, as well as Mendeley readers, present a high correlation to all the indicators from GSC, but only a moderate correlation to the indicators in ResearcherID. Twitter indicators achieve only low correlations to the rest of the indicators, the highest of these being to GSC (0.42-0.46), and to Mendeley (0.41-0.46). Lastly, we present a taxonomy of all the errors that may affect the reliability of the data contained in each of these platforms, with a special emphasis on GSC, since it has been our main source of data. These errors alert us to the danger of blindly using any of these platforms for the assessment of individuals, without verifying the veracity and exhaustiveness of the data. In addition to this working paper, we have also made available a website where all the data obtained for each author and the results of the analysis of the most cited documents can be found: Scholar Mirrors.

93 citations


Posted Content
TL;DR: In this article, a comparative analysis is conducted of five ranking systems: ARWU, Leiden, THE, QS, and U-Multirank, and four secondary analyses are presented investigating national academic systems and selected pairs of indicators, providing more insight into how their institutional coverage, rating methods, the selection of indicators and their normalization influence the ranking positions of given institutions.
Abstract: To provide users insight into the value and limits of world university rankings, a comparative analysis is conducted of 5 ranking systems: ARWU, Leiden, THE, QS and U-Multirank. It links these systems with one another at the level of individual institutions, and analyses the overlap in institutional coverage, geographical coverage, how indicators are calculated from raw data, the skewness of indicator distributions, and statistical correlations between indicators. Four secondary analyses are presented investigating national academic systems and selected pairs of indicators. It is argued that current systems are still one-dimensional in the sense that they provide finalized, seemingly unrelated indicator values rather than offering a data set and tools to observe patterns in multi-faceted data. By systematically comparing different systems, more insight is provided into how their institutional coverage, rating methods, the selection of indicators and their normalizations influence the ranking positions of given institutions.

75 citations


Journal Article
TL;DR: A model to predict authors' future h-indices based on their current scientific impact is developed and an online tool is developed that allows users to generate informed h-index predictions.
Abstract: A widely used measure of scientific impact is citations. However, due to their heavy-tailed distribution, citations are fundamentally difficult to predict. Instead, to characterize scientific impact, we address two analogous questions asked by many scientific researchers: "How will my h-index evolve over time, and which of my previously or newly published papers will contribute to it?" To answer these questions, we perform two related tasks. First, we develop a model to predict authors' future h-indices based on their current scientific impact. Second, we examine the factors that drive papers, either previously or newly published, to increase their authors' predicted future h-indices. By leveraging relevant factors, we can predict an author's h-index in five years with an R² value of 0.92 and whether a previously (newly) published paper will contribute to this future h-index with an F1 score of 0.99 (0.77). We find that topical authority and publication venue are crucial to these effective predictions, while topic popularity is surprisingly inconsequential. Further, we develop an online tool that allows users to generate informed h-index predictions. Our work demonstrates the predictability of scientific impact, and can help scholars to effectively leverage their position of "standing on the shoulders of giants."

67 citations


Journal Article
TL;DR: A method that uses altmetric data to analyse researchers' interactions, as a way of mapping the contexts of potential societal impact, and suggests that this mapping method can be used as an input within broader methodologies in case studies of societal impact assessment.
Abstract: In this article, we develop a method that uses altmetric data to analyse researchers' interactions, as a way of mapping the contexts of potential societal impact. In the face of an increasing policy demand for quantitative methodologies to assess societal impact, social media data (altmetrics) has been presented as a potential method to capture broader forms of impact. However, current altmetric indicators were extrapolated from traditional citation approaches and are seen as problematic for assessing societal impact. In contrast, established qualitative methodologies for societal impact assessment are based on interaction approaches. These argue that assessment should focus on mapping the contexts in which engagement among researchers and stakeholders takes place, as a means to understand the pathways to societal impact. Following these approaches, we propose to shift the use of altmetric data towards network analysis of researchers and stakeholders. We carry out two case studies, analysing researchers' networks with Twitter data. The comparison illustrates the potential of Twitter networks to capture disparate degrees of policy engagement. We propose that this mapping method can be used as an input within broader methodologies in case studies of societal impact assessment.

62 citations


Posted Content
TL;DR: In this paper, the authors argue that short-term citations can be considered as currency at the research front, whereas long-term citations contribute to the codification of knowledge claims into concept symbols.
Abstract: We argue that citation is a composed indicator: short-term citations can be considered as currency at the research front, whereas long-term citations can contribute to the codification of knowledge claims into concept symbols. Knowledge claims at the research front are more likely to be transitory and are therefore problematic as indicators of quality. Citation impact studies focus on short-term citation, and therefore tend to measure not epistemic quality, but involvement in current discourses in which contributions are positioned by referencing. We explore this argument using three case studies: (1) citations of the journal Soziale Welt as an example of a venue that tends not to publish papers at a research front, unlike, for example, JACS; (2) Robert Merton as a concept symbol across theories of citation; and (3) the Multi-RPYS ("Multi-Referenced Publication Year Spectroscopy") of the journals Scientometrics, Gene, and Soziale Welt. We show empirically that the measurement of "quality" in terms of citations can further be qualified: short-term citation currency at the research front can be distinguished from longer-term processes of incorporation and codification of knowledge claims into bodies of knowledge. The recently introduced Multi-RPYS can be used to distinguish between short-term and long-term impacts.

57 citations


Journal Article
TL;DR: In this paper, the authors explore if and how Microsoft Academic (MA) could be used for bibliometric analyses, and they find that MA offers structured and rich metadata, which facilitates data retrieval, handling and processing.
Abstract: We explore if and how Microsoft Academic (MA) could be used for bibliometric analyses. First, we examine the Academic Knowledge API (AK API), an interface to access MA data, and compare it to Google Scholar (GS). Second, we perform a comparative citation analysis of researchers by normalizing data from MA and Scopus. We find that MA offers structured and rich metadata, which facilitates data retrieval, handling and processing. In addition, the AK API allows retrieving frequency distributions of citations. We consider these features to be a major advantage of MA over GS. However, we identify four main limitations regarding the available metadata. First, MA does not provide the document type of a publication. Second, the 'fields of study' are dynamic, too specific and field hierarchies are incoherent. Third, some publications are assigned to incorrect years. Fourth, the metadata of some publications did not include all authors. Nevertheless, we show that an average-based indicator (i.e. the journal normalized citation score; JNCS) as well as a distribution-based indicator (i.e. percentile rank classes; PR classes) can be calculated with relative ease using MA. Hence, normalization of citation counts is feasible with MA. The citation analyses in MA and Scopus yield uniform results. The JNCS and the PR classes are similar in both databases, and, as a consequence, the evaluation of the researchers' publication impact is congruent in MA and Scopus. Given the fast development in the last year, we postulate that MA has the potential to be used for full-fledged bibliometric analyses.

50 citations


Journal Article
TL;DR: The results show that in terms of both the quantity of papers produced and their scientific impact, the concentration of research funding in the hands of the so-called ‘elite’ of researchers generally produces diminishing marginal returns.
Abstract: In most countries, basic research is supported by research councils that select, after peer review, the individuals or teams that are to receive funding. Unfortunately, the number of grants these research councils can allocate is not infinite and, in most cases, a minority of the researchers receive the majority of the funds. However, evidence as to whether this is an optimal way of distributing available funds is mixed. The purpose of this study is to measure the relation between the amount of funding provided to 12,720 researchers in Quebec over a fifteen year period (1998-2012) and their scientific output and impact from 2000 to 2013. Our results show that both in terms of the quantity of papers produced and of their scientific impact, the concentration of research funding in the hands of a so-called "elite" of researchers generally produces diminishing marginal returns. Also, we find that the most funded researchers do not stand out in terms of output and scientific impact.

50 citations


Journal Article
TL;DR: Different RPYS approaches in this study were able to identify the complete range of works of the celebrated icons as well as many less known works relevant for the history of climate change research, confirming the potential of the RPYS method for historical studies.
Abstract: This bibliometric analysis focuses on the general history of climate change research and, more specifically, on the discovery of the greenhouse effect. First, the Reference Publication Year Spectroscopy (RPYS) is applied to a large publication set on climate change of 222,060 papers published between 1980 and 2014. The references cited therein were extracted and analyzed with regard to publications, which are cited most frequently. Second, a new method for establishing a more subject-specific publication set for applying RPYS (based on the co-citations of a marker reference) is proposed (RPYS-CO). The RPYS of the climate change literature focuses on the history of climate change research in total. We identified 35 highly-cited publications across all disciplines, which include fundamental early scientific works of the 19th century (with a weak connection to climate change) and some cornerstones of science with a stronger connection to climate change. By using the Arrhenius (1896) paper as a RPYS-CO marker paper, we selected only publications specifically discussing the discovery of the greenhouse effect and the role of carbon dioxide. Also, we focused on the time period 1800-1850 to reveal the contributions of J.B.J Fourier in terms of cited references. Using different RPYS approaches in this study, we were able to identify the complete range of works of the celebrated icons as well as many less known works relevant for the history of climate change research. The analyses confirmed the potential of the RPYS method for historical studies: Seminal papers are detected on the basis of the references cited by the overall community without any further assumptions.

Journal Article
TL;DR: It is argued that, despite its great value, bibliometric analysis of funding acknowledgments (FA) should be used with caution because of the coverage limitations and potential biases noted in each analysis.
Abstract: Thomson Reuters' Web of Science (WoS) began systematically collecting acknowledgment information in August 2008. Since then, bibliometric analysis of funding acknowledgment (FA) has been growing and has aroused intense interest and attention from both academia and policy makers. Examining the distribution of FA by citation index database, by language, and by acknowledgment type, we noted coverage limitations and potential biases in each analysis. We argue that in spite of its great value, bibliometric analysis of FA should be used with caution.

Posted Content
TL;DR: The comparison of the effects of the different FICs on citation impact shows that the JIF indeed has the strongest correlations with the citation scores, but the correlation between FICs and citation impact is lower if citations are normalized instead of using raw citation counts.
Abstract: Using percentile shares, one can visualize and analyze the skewness in bibliometric data across disciplines and over time. The resulting figures can be intuitively interpreted and are more suitable for detailed analysis of the effects of independent and control variables on distributions than regression analysis. We show this by using percentile shares to analyze so-called "factors influencing citation impact" (FICs; e.g., the impact factor of the publishing journal) across year and disciplines. All articles (n= 2,961,789) covered by WoS in 1990 (n= 637,301), 2000 (n= 919,485), and 2010 (n= 1,405,003) are used. In 2010, nearly half of the citation impact is accounted for by the 10% most-frequently cited papers; the skewness is largest in the humanities (68.5% in the top-10% layer) and lowest in agricultural sciences (40.6%). The comparison of the effects of the different FICs (the number of cited references, number of authors, number of pages, and JIF) on citation impact shows that JIF has indeed the strongest correlations with the citation scores. However, the correlation between FICs and citation impact is lower, if citations are normalized instead of using raw citation counts.
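
As a rough illustration of the percentile-share idea in the abstract above, the following sketch computes the proportion of all citations that is accounted for by the most frequently cited fraction of papers. The function name `top_share`, the way the group size is rounded, and the toy citation counts are assumptions made for this example; the abstract does not specify how ties or group boundaries are handled.

```python
def top_share(citations, top_fraction=0.10):
    """Share of total citations held by the top `top_fraction` of papers.

    Assumption: the group size is the rounded product of the paper count
    and the fraction, with at least one paper in the group.
    """
    if not citations:
        raise ValueError("need at least one paper")
    ranked = sorted(citations, reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("no citations to distribute")
    k = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:k]) / total


# Made-up citation counts for ten papers (100 citations in total).
toy_citations = [50, 10, 10, 5, 5, 5, 5, 5, 3, 2]
share = top_share(toy_citations)
```

In this toy set the single most cited paper forms the top-10% layer and holds half of all citations, which is the kind of skewness the percentile-share figures are meant to make visible.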

Journal Article
TL;DR: An overview of a relatively newly provided source of altmetrics data which could possibly be used for societal impact measurements in scientometrics, and recommends that the analysis of Web of Science publications with at least one policy-related mention is repeated regularly in order to check the usefulness of the data.
Abstract: In this short communication, we provide an overview of a relatively newly provided source of altmetrics data which could possibly be used for societal impact measurements in scientometrics. Recently, Altmetric - a start-up providing publication level metrics - started to make data for publications available which have been mentioned in policy-related documents. Using data from Altmetric, we study how many papers indexed in the Web of Science (WoS) are mentioned in policy-related documents. We find that less than 0.5% of the papers published in different subject categories are mentioned at least once in policy-related documents. Based on our results, we recommend that the analysis of (WoS) publications with at least one policy-related mention is repeated regularly (annually). Mentions in policy-related documents should not be used for impact measurement until new policy-related sites are tracked.

Journal Article
TL;DR: This article analyzed disciplinary differences in researchers' credit attribution practices in collaborative contexts and found that the important differences traditionally observed between disciplines in terms of team size are greatly reduced when acknowledgees are taken into account.
Abstract: Acknowledgments are one of many conventions by which researchers publicly bestow recognition towards individuals, organizations and institutions that contributed in some way to the work that led to publication. Combining data on both co-authors and acknowledged individuals, the present study analyses disciplinary differences in researchers' credit attribution practices in collaborative contexts. Our results show that the important differences traditionally observed between disciplines in terms of team size are greatly reduced when acknowledgees are taken into account. Broadening the measurement of collaboration beyond co-authorship by including individuals credited in the acknowledgements allows for an assessment of collaboration practices and team work that might be closer to the reality of contemporary research, especially in the social sciences and humanities.

Journal Article
TL;DR: In this article, the authors investigated the relationship between the rank of authors and their contributions and found that the regularity in the authorship contributions decreases with the number of authors in a paper.
Abstract: Science is becoming increasingly interdisciplinary, giving rise to more diversity in the areas of expertise within research labs and groups. This has also brought changes to the role of researchers in scientific works. As a consequence, multi-authored scientific papers have now become the norm for high quality research. Unfortunately, such a phenomenon induces bias in existing metrics employed to evaluate the productivity and success of researchers. While some metrics were adapted to account for the rank of authors in a paper, many journals are now requiring a description of the specific roles of each author in a publication. Surprisingly, the investigation of the relationship between the rank of authors and their contributions has been limited to a few studies. By analyzing this kind of data, here we show, quantitatively, that the regularity in the authorship contributions decreases with the number of authors in a paper. Furthermore, we found that the rank of authors and their roles in papers follow three general patterns according to the nature of their contributions, such as writing, data analysis, and the conduction of experiments. This was accomplished by collecting and analyzing the data retrieved from PLoS ONE and by devising an entropy-based measurement to quantify the effective number of authors in a paper according to their contributions. The analysis of such patterns confirms that some aspects of the author ranking are in accordance with the expected convention, such as the fact that the first and last authors are more likely to contribute more in a scientific work. Conversely, such analysis also revealed that authors in the intermediary positions of the rank contribute more in certain specific roles, such as the task of collecting data. This indicates that an unbiased evaluation of researchers must take into account the distinct types of scientific contributions.
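
The abstract above mentions an entropy-based measurement of the effective number of authors but does not state its formula. A minimal sketch of one common construction, assumed here rather than taken from the paper, is the exponential of the Shannon entropy of the authors' contribution shares. The function name `effective_number_of_authors` and the example contribution counts are hypothetical.

```python
import math


def effective_number_of_authors(contributions):
    """Exponential of the Shannon entropy of contribution shares.

    `contributions` holds one non-negative number per author, for example
    the number of declared roles. Equal shares among n authors give n;
    a single contributing author gives 1. This formula is an assumption.
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("at least one author must have a positive contribution")
    # Authors with zero contribution add nothing to the entropy.
    shares = [c / total for c in contributions if c > 0]
    entropy = -sum(p * math.log(p) for p in shares)
    return math.exp(entropy)


equal = effective_number_of_authors([1, 1, 1, 1])
skewed = effective_number_of_authors([6, 1, 1])
```

With four equally contributing authors the measure is 4, while a paper in which one of three authors carries most of the work yields a value between 1 and 3.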

Posted Content
TL;DR: This short communication provides background on the journal mapping/clustering and an explanation about and instructions for the routine, and compares journal maps for 2015 with those for 2014 and shows the delineations among fields and subfields to be sensitive to fluctuations.
Abstract: Journal maps and classifications for 11,359 journals listed in the combined Journal Citation Reports 2015 of the Science and Social Sciences Citation Indexes are provided at this http URL. A routine using VOSviewer for integrating the journal mapping and their hierarchical clustering is also made available. In this short communication, we provide background on the journal mapping/clustering and an explanation and instructions about the routine. We compare 2015 journal maps with those for 2014 and show the delineations among fields and subfields to be sensitive to fluctuations. Labels for fields and sub-fields are not provided by the routine, but can be added by an analyst for pragmatic or intellectual reasons. The routine provides a means for testing one's assumptions against a baseline without claiming authority; clusters of related journals can be visualized to understand communities. The routine is generic and can be used for any 1-mode network.

Posted Content
TL;DR: This article conducted a comparative study of pre-print papers and their final published counterparts and found that the text contents of the scientific papers generally changed very little from their preprint to final published versions.
Abstract: Academic publishers claim that they add value to scholarly communications by coordinating reviews and contributing and enhancing text during publication. These contributions come at a considerable cost: U.S. academic libraries paid $1.7 billion for serial subscriptions in 2008 alone. Library budgets, in contrast, are flat and not able to keep pace with serial price inflation. We have investigated the publishers' value proposition by conducting a comparative study of pre-print papers and their final published counterparts. This comparison had two working assumptions: 1) if the publishers' argument is valid, the text of a pre-print paper should vary measurably from its corresponding final published version, and 2) by applying standard similarity measures, we should be able to detect and quantify such differences. Our analysis revealed that the text contents of the scientific papers generally changed very little from their pre-print to final published versions. These findings contribute empirical indicators to discussions of the added value of commercial publishers and therefore should influence libraries' economic decisions regarding access to scholarly publications.

Posted Content
TL;DR: A longitudinal bibliometric analysis of publications indexed in Thomson Reuters' Incites and Elsevier's Scopus, and published from Persian Gulf States and neighbouring Middle East countries, shows clear effects of major political events during the past 35 years.
Abstract: A longitudinal bibliometric analysis of publications indexed in Thomson Reuters' Incites and Elsevier's Scopus, and published from Persian Gulf States and neighbouring Middle East countries, shows clear effects of major political events during the past 35 years. Predictions made in 2006 by the US diplomat Richard N. Haass on political changes in the Middle East have come true in the Gulf States' national scientific research systems, to the extent that Iran has become in 2015 by far the leading country in the Persian Gulf, and South-East Asian countries including China, Malaysia and South Korea have become major scientific collaborators, displacing the USA and other large Western countries. But collaboration patterns among Persian Gulf States show no apparent relationship with differences in Islamic denominations.

Journal Article
TL;DR: A new field-normalized indicator is introduced, which is rooted in early insights in bibliometrics, and is compared with several established field-normalized indicators, and confirms the ability of established indicators to field-normalize citations.
Abstract: In this paper, a new field-normalized indicator is introduced, which is rooted in early insights in bibliometrics, and is compared with several established field-normalized indicators (e.g. the mean normalized citation score, MNCS, and indicators based on percentile approaches). Garfield (1979) emphasizes that bare citation counts from different fields cannot be compared for evaluative purposes, because the "citation potential" can vary significantly between the fields. Garfield (1979) suggests that "the most accurate measure of citation potential is the average number of references per paper published in a given field". Based on this suggestion, the new indicator is basically defined as follows: the citation count of a focal paper is divided by the mean number of cited references in a field to normalize citations. The new indicator is called citation score normalized by cited references (CSNCR). The theoretical analysis of the CSNCR shows that it has the properties of consistency and homogeneous normalization. The close relation of the new indicator to the MNCS is discussed. The empirical comparison of the CSNCR with other field-normalized indicators shows that it is slightly less able to field-normalize citation counts than other cited-side normalized indicators (e.g. the MNCS), but its results are favorable compared to two citing-side indicator variants (SNCS indicators). Taken as a whole, the results of this study confirm the ability of established indicators to field-normalize citations.
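
The basic definition given in the abstract above (citation count of a focal paper divided by the mean number of cited references in its field) can be sketched as follows. The function name `csncr`, the way a field is represented as a plain list of reference counts, and the example numbers are assumptions for illustration; the paper's actual field delineation and any further refinements are not described in the abstract.

```python
from statistics import mean


def csncr(citation_count, field_reference_counts):
    """Citation score normalized by cited references, per the basic definition.

    `field_reference_counts` is assumed to hold the number of cited
    references of each paper in the focal paper's field.
    """
    mean_references = mean(field_reference_counts)
    if mean_references == 0:
        raise ValueError("mean number of cited references in the field is zero")
    return citation_count / mean_references


# Hypothetical field whose papers cite 10, 20 and 30 references (mean 20).
score = csncr(30, [10, 20, 30])
```

A paper with 30 citations in this hypothetical field scores 1.5; the same 30 citations in a field that averages 60 references per paper would score 0.5, reflecting the lower citation potential of the first field.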

Posted Content
TL;DR: This contest was used as an opportunity to test the Article Level Eigenfactor (ALEF), a novel citation-based ranking algorithm, and evaluate its performance against competing algorithms that drew upon multiple facets of the data from a large, real world dataset.
Abstract: Microsoft Research hosted the 2016 WSDM Cup Challenge based on the Microsoft Academic Graph. The goal was to provide static rankings for the articles that make up the graph, with the rankings to be evaluated against those of human judges. While the Microsoft Academic Graph provided metadata about many aspects of each scholarly document, we focused more narrowly on citation data and used this contest as an opportunity to test the Article Level Eigenfactor (ALEF), a novel citation-based ranking algorithm, and evaluate its performance against competing algorithms that drew upon multiple facets of the data from a large, real-world dataset (122M papers and 757M citations). Our final submission to this contest was scored at 0.676, earning second place.

Journal ArticleDOI
TL;DR: In this article, the author identifies arguments for counting methods in a sample of 32 bibliometric studies published in 2016 and compares the result with discussions of arguments for counting methods in three older studies.
Abstract: Most publication and citation indicators are based on datasets with multi-authored publications and thus a change in counting method will often change the value of an indicator. Therefore it is important to know why a specific counting method has been applied. I have identified arguments for counting methods in a sample of 32 bibliometric studies published in 2016 and compared the result with discussions of arguments for counting methods in three older studies. Based on the underlying logics of the arguments I have arranged the arguments in four groups. Group 1 focuses on arguments related to what an indicator measures, Group 2 on the additivity of a counting method, Group 3 on pragmatic reasons for the choice of counting method, and Group 4 on an indicator's influence on the research community or how it is perceived by researchers. This categorization can be used to describe and discuss how bibliometric studies with publication and citation indicators argue for counting methods.

Posted Content
TL;DR: In this article, the evolution of new fields of technology can be traced by adopting a citation-based recursive ranking method for patents and it is demonstrated that the laser / inkjet printer technology emerged from the recombination of two existing technologies: sequential printing and static image production.
Abstract: By adopting a citation-based recursive ranking method for patents, the evolution of new fields of technology can be traced. Specifically, it is demonstrated that the laser / inkjet printer technology emerged from the recombination of two existing technologies: sequential printing and static image production. The dynamics of the citations coming from the different "precursor" classes illuminate the mechanism of the emergence of new fields and make it possible to predict future technological development. For the patent network the optimal value of the PageRank damping factor is close to 0.5; the application of d=0.85 leads to unacceptable ranking results.
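
To make the role of the damping factor d concrete, here is a minimal sketch of standard PageRank by power iteration, run with d=0.5 and d=0.85 on an invented four-patent citation graph. This is the textbook formulation, not necessarily the exact recursive ranking method used in the paper, and the graph and function name are assumptions for illustration only.

```python
def pagerank(links, d=0.85, iterations=100):
    """Standard PageRank by power iteration.

    links maps each node to the list of nodes it cites (no duplicates).
    Rank held by nodes that cite nothing is spread evenly over all nodes.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {node: (1 - d) / n + d * dangling / n for node in nodes}
        for source, targets in links.items():
            if targets:
                share = d * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Invented toy graph: newer patents cite older ones.
citations = {
    "A": [],
    "B": ["A"],
    "C": ["A", "B"],
    "D": ["A", "C"],
}

ranks_low = pagerank(citations, d=0.5)
ranks_high = pagerank(citations, d=0.85)

for name, ranks in (("d=0.5", ranks_low), ("d=0.85", ranks_high)):
    ordered = sorted(ranks, key=ranks.get, reverse=True)
    print(name, [(node, round(ranks[node], 3)) for node in ordered])
```

A lower d gives more weight to the uniform term (1 - d) / n and less to rank passed along long citation chains, which is the trade-off the abstract refers to.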

Posted Content
TL;DR: This paper discusses three different evaluation frameworks and proposes a methodology to operationalize them and capture societal interactions between social sciences and humanities (SSH) researchers and their local context.
Abstract: Current evaluation frameworks in research policy were designed to address: 1) life and natural sciences, 2) global research communities, and; 3) scientific impact. This is problematic, as they do not adapt well to SSH scholarship, to local interests, or to consider broader societal impacts. This paper discusses three different evaluation frameworks and proposes a methodology to operationalize them and capture societal interactions between social sciences and humanities (SSH) researchers and their local context. Here we propose a network approach for identifying societal contributions in local contexts. The goal of this approach is not to develop indicators for benchmarking, but to map interactions for strategic assessment. The absolute 'value' or 'weight' of the interactions cannot be captured, but we hope the method can identify the hot spots where they are taking place.

Journal ArticleDOI
Chaomei Chen
TL;DR: Several grand challenges concerning the creation, adaptation, and diffusion of scholarly knowledge, and advanced quantitative and qualitative approaches to the study of scholarly knowledge, are identified.
Abstract: The constantly growing body of scholarly knowledge of science, technology, and humanities is an asset of mankind. While new discoveries expand the existing knowledge, they may simultaneously render some of it obsolete. It is crucial for scientists and other stakeholders to keep their knowledge up to date. Policy makers, decision makers, and the general public also need an efficient communication of scientific knowledge. Several grand challenges concerning the creation, adaptation, and diffusion of scholarly knowledge, and advanced quantitative and qualitative approaches to the study of scholarly knowledge, are identified.

Posted Content
TL;DR: The CitedReferencesExplorer (CRExplorer) is a tool that can be used to identify those publications which have been frequently cited by the researchers in a field and thereby to study, for example, the historical roots of a research field or topic.
Abstract: We introduce a new tool - the CitedReferencesExplorer (CRExplorer) - which can be used to disambiguate and analyze the cited references (CRs) of a publication set downloaded from the Web of Science (WoS). The tool is especially suitable to identify those publications which have been frequently cited by the researchers in a field and thereby to study, for example, the historical roots of a research field or topic. CRExplorer simplifies the identification of key publications by enabling the user to work with both a graph for identifying the most frequently cited reference publication years (RPYs) and the list of references for the RPYs which have been most frequently cited. A further focus of the program is on the standardization of CRs. It is a serious problem in bibliometrics that there are several variants of the same CR in the WoS. In this study, CRExplorer is used to study the CRs of all papers published in the Journal of Informetrics. The analyses focus on the most important papers published between 1980 and 1990.
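
The two views described above (cited-reference counts per RPY, and the most cited references within a given RPY) can be sketched in a few lines of Python. This is not CRExplorer's code or file format: the reference labels, years, and function names are invented, and the sketch assumes the references have already been standardized, so variants of the same CR are not merged here.

```python
from collections import Counter


def rpy_counts(cited_references):
    """Number of cited references per reference publication year (RPY)."""
    return Counter(year for _, year in cited_references)


def most_cited_in_year(cited_references, year):
    """References from one RPY, ordered by how often they were cited."""
    counts = Counter(ref for ref in cited_references if ref[1] == year)
    return counts.most_common()


# Invented toy data: one (label, RPY) tuple per cited-reference occurrence.
cited_references = [
    ("Ref A", 1979), ("Ref A", 1979), ("Ref B", 1979),
    ("Ref C", 1985), ("Ref D", 1990), ("Ref C", 1985),
    ("Ref A", 1979),
]

per_year = rpy_counts(cited_references)
peak_year, peak_count = per_year.most_common(1)[0]
top_refs = most_cited_in_year(cited_references, peak_year)

print(sorted(per_year.items()))
print(peak_year, top_refs)
```

Plotting `per_year` against the year gives the kind of graph in which peaks point to heavily cited RPYs; `top_refs` then lists the references behind a peak.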

Journal ArticleDOI
TL;DR: It is hypothesized that the “first page results syndrome”, together with the fact that Google Scholar favours the most cited documents, suggests that the growing trend of citing old documents is partly caused by Google Scholar.
Abstract: A study released by the Google Scholar team found an apparently increasing fraction of citations to old articles from studies published in the last 24 years (1990-2013). To demonstrate this finding we conducted a complementary study using a different data source (Journal Citation Reports), metric (aggregate cited half-life), time span (2003-2013), and set of categories (53 Social Science subject categories and 167 Science subject categories). Although the results obtained confirm and reinforce the previous findings, the possible causes of this phenomenon remain unclear. We finally hypothesize that the "first page results syndrome", together with the fact that Google Scholar favours the most cited documents, suggests that the growing trend of citing old documents is partly caused by Google Scholar.

Proceedings ArticleDOI
TL;DR: This paper addresses the issues of enabling the collection of fresh and relevant Web and Social Web content for a topic of interest through seamless integration of Web and social Media in a novel integrated focused crawler.
Abstract: Researchers in the Digital Humanities and journalists need to monitor, collect and analyze fresh online content regarding current events such as the Ebola outbreak or the Ukraine crisis on demand. However, existing focused crawling approaches only consider topical aspects while ignoring temporal aspects and therefore cannot achieve thematically coherent and fresh Web collections. Social Media in particular provide a rich source of fresh content, which is not used by state-of-the-art focused crawlers. In this paper we address the issues of enabling the collection of fresh and relevant Web and Social Web content for a topic of interest through seamless integration of Web and Social Media in a novel integrated focused crawler. The crawler collects Web and Social Media content in a single system and exploits the stream of fresh Social Media content for guiding the crawler.

Posted Content
TL;DR: In this article, a comparison between full and fractional counting at the network level is made, and three counting schemes are compared analytically; routines for applying these approaches to bibliometric data are also provided.
Abstract: In their study entitled "Constructing bibliometric networks: A comparison between full and fractional counting," Perianes-Rodriguez, Waltman, & van Eck (2016; henceforth abbreviated as PWvE) provide arguments for the use of fractional counting at the network level as different from the level of publications. Whereas fractional counting in the latter case divides the credit among co-authors (countries, institutions, etc.), fractional counting at the network level can normalize the relative weights of links and thereby clarify the structures in the network. PWvE, however, propose a counting scheme for fractional counting that is one among other possible ones. Alternative schemes proposed by Batagelj and Cerinšek (2013) and Park, Yoon, & Leydesdorff (2016; henceforth abbreviated as PYL) are discussed in an appendix. However, our approach is not correctly identified as identical to their Equation A3. Below, we distinguish three approaches analytically; routines for applying these approaches to bibliometric data are also provided.
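
The difference between full and fractional counting at the network level can be illustrated with a small Python sketch. It uses one possible fractional scheme, assumed here for illustration: each publication with n authors gives every co-authorship link a weight of 1/(n-1), so each author's links from that publication sum to 1. It is not claimed to match any of the three schemes the paper distinguishes, and the data and function name are invented.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_links(publications, fractional=False):
    """Build weighted co-authorship links from lists of author names.

    Full counting adds 1 per co-authored publication to each link.
    Fractional counting (assumed scheme) adds 1 / (n - 1) instead,
    where n is the number of distinct authors of the publication.
    """
    links = defaultdict(float)
    for authors in publications:
        unique = sorted(set(authors))
        n = len(unique)
        if n < 2:
            continue  # single-author papers create no links
        weight = 1.0 / (n - 1) if fractional else 1.0
        for a, b in combinations(unique, 2):
            links[(a, b)] += weight
    return dict(links)


# Invented toy data: one two-author paper and one four-author paper.
publications = [["A", "B"], ["A", "B", "C", "D"]]

full = coauthorship_links(publications)
frac = coauthorship_links(publications, fractional=True)

print(full)
print({pair: round(w, 3) for pair, w in frac.items()})
```

Under full counting the four-author paper contributes six links of weight 1, so it dominates the network; under the fractional scheme its links weigh 1/3 each and the two-author paper keeps comparable influence.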

Proceedings ArticleDOI
TL;DR: In this paper, the authors explore the use of binary, archive-specific classifiers generated on the basis of the content cached by an Aggregator, to determine whether or not to query an archive for a given URI.
Abstract: The Memento protocol provides a uniform approach to query individual web archives. Soon after its emergence, the Memento Aggregator infrastructure was introduced, which supports querying across multiple archives simultaneously. An Aggregator generates a response by issuing the respective Memento request against each of the distributed archives it covers. As the number of archives grows, it becomes increasingly challenging to deliver aggregate responses while keeping response times and computational costs under control. Ad-hoc heuristic approaches have been introduced to address this challenge and research has been conducted aimed at optimizing query routing based on archive profiles. In this paper, we explore the use of binary, archive-specific classifiers generated on the basis of the content cached by an Aggregator, to determine whether or not to query an archive for a given URI. Our results turn out to be readily applicable and can help to significantly decrease both the number of requests and the overall response times without compromising on recall. Among other findings, classifiers can reduce the average number of requests by 77% compared to a brute force approach on all archives, and the overall response time by 42% while maintaining a recall of 0.847.
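
The routing idea can be sketched as follows: one binary predicate per archive decides whether that archive is queried for a URI, and the outcome is compared with querying every archive. Everything here is hypothetical: the archive names, holdings, and the hostname-suffix rules standing in for trained classifiers are invented, and no Memento requests are issued.

```python
from urllib.parse import urlparse


def suffix_classifier(suffix):
    """Stand-in for a trained classifier: predict a hit if the host ends with suffix."""
    def predict(uri):
        return (urlparse(uri).hostname or "").endswith(suffix)
    return predict


def route(uri, classifiers):
    """Archives whose classifier predicts that they hold the URI."""
    return [archive for archive, predict in classifiers.items() if predict(uri)]


def evaluate(uris, classifiers, holdings):
    """Compare classifier-based routing with brute-force querying of all archives."""
    requests = found = present = 0
    for uri in uris:
        selected = set(route(uri, classifiers))
        holders = {a for a, held in holdings.items() if uri in held}
        requests += len(selected)
        present += len(holders)
        found += len(holders & selected)
    brute_force = len(classifiers) * len(uris)
    return {
        "request_reduction": 1 - requests / brute_force,
        "recall": found / present if present else 1.0,
    }


# Hypothetical archives, classifiers, and holdings.
classifiers = {
    "archive_uk": suffix_classifier(".uk"),
    "archive_pt": suffix_classifier(".pt"),
    "archive_general": lambda uri: True,  # always queried
}
holdings = {
    "archive_uk": {"http://example.co.uk/"},
    "archive_pt": {"http://example.pt/"},
    "archive_general": {"http://example.co.uk/", "http://example.com/"},
}
uris = ["http://example.co.uk/", "http://example.pt/", "http://example.com/"]

metrics = evaluate(uris, classifiers, holdings)
print(metrics)
```

The two reported numbers mirror the trade-off in the abstract: fewer requests than brute force, measured against the share of existing archived copies that routing still finds.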