Showing papers in "Journal of the Association for Information Science and Technology in 2015"


Journal ArticleDOI
TL;DR: In this paper, an extensive analysis of the presence of different altmetric indicators provided by Altmetric.com across scientific fields is presented, particularly focusing on their relationship with citations.
Abstract: An extensive analysis of the presence of different altmetric indicators provided by Altmetric.com across scientific fields is presented, particularly focusing on their relationship with citations. Our results confirm that the presence and density of social media altmetric counts are still very low and not very frequent among scientific publications, with 15%-24% of the publications presenting some altmetric activity and concentrating in the most recent publications, although their presence is increasing over time. Publications from the social sciences, humanities and the medical and life sciences show the highest presence of altmetrics, indicating their potential value and interest for these fields. The analysis of the relationships between altmetrics and citations confirms previous claims of positive but relatively weak correlations, thus supporting the idea that altmetrics do not reflect the same concept of impact as citations. Also, altmetric counts do not always present a better filtering of highly cited publications than journal citation scores. Altmetric scores (particularly mentions in blogs) are able to identify highly cited publications with higher levels of precision than journal citation scores (JCS), but they have a lower level of recall. The value of altmetrics as a complementary tool of citation analysis is highlighted, although more research is suggested to disentangle the potential meaning and value of altmetric indicators for research evaluation.

501 citations
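
The precision/recall contrast in the abstract above can be illustrated with a minimal sketch. The function name, the paper identifiers, and the example sets are hypothetical and are not taken from the study; the sketch only shows how a filter can have high precision (most flagged papers are highly cited) but low recall (many highly cited papers are not flagged).

```python
def precision_recall(flagged, highly_cited):
    """Return (precision, recall) of a filter against a reference set.

    flagged      -- papers the indicator selects (e.g. papers with blog mentions)
    highly_cited -- papers that are actually highly cited
    Both arguments are hypothetical collections of paper identifiers.
    """
    flagged, highly_cited = set(flagged), set(highly_cited)
    hits = len(flagged & highly_cited)
    precision = hits / len(flagged) if flagged else 0.0
    recall = hits / len(highly_cited) if highly_cited else 0.0
    return precision, recall


# Illustrative data only: every flagged paper is highly cited,
# but only half of the highly cited papers are flagged.
p, r = precision_recall({"a", "b"}, {"a", "b", "c", "d"})
print(p, r)
```

In this toy case the filter is precise but misses half of the highly cited set, which is the pattern the abstract reports for blog mentions relative to journal citation scores.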


Journal ArticleDOI
TL;DR: The results show that rankings based on ResearchGate statistics correlate moderately well with other rankings of academic institutions, suggesting that ResearchGate use broadly reflects the traditional distribution of academic capital.
Abstract: ResearchGate is a social network site for academics to create their own profiles, list their publications, and interact with each other. Like Academia.edu, it provides a new way for scholars to disseminate their work and hence potentially changes the dynamics of informal scholarly communication. This article assesses whether ResearchGate usage and publication data broadly reflect existing academic hierarchies and whether individual countries are set to benefit or lose out from the site. The results show that rankings based on ResearchGate statistics correlate moderately well with other rankings of academic institutions, suggesting that ResearchGate use broadly reflects the traditional distribution of academic capital. Moreover, while Brazil, India, and some other countries seem to be disproportionately taking advantage of ResearchGate, academics in China, South Korea, and Russia may be missing opportunities to use ResearchGate to maximize the academic impact of their publications.

278 citations


Journal ArticleDOI
TL;DR: The relationship between collaboration and scientific impact using three indicators of collaboration (number of authors, number of addresses, and number of countries) derived from articles published between 1900 and 2011 is analyzed in this paper.
Abstract: This article provides the first historical analysis of the relationship between collaboration and scientific impact using three indicators of collaboration (number of authors, number of addresses, and number of countries) derived from articles published between 1900 and 2011. The results demonstrate that an increase in the number of authors leads to an increase in impact, from the beginning of the last century onward, and that this is not due simply to self‐citations. A similar trend is also observed for the number of addresses and number of countries represented in the byline of an article. However, the constant inflation of collaboration since 1900 has resulted in diminishing citation returns: Larger and more diverse (in terms of institutional and country affiliation) teams are necessary to realize higher impact. The article concludes with a discussion of the potential causes of the impact gain in citations of collaborative papers.

270 citations


Journal ArticleDOI
TL;DR: Using automatic feature selection with supervised machine learning, the authors find a model for predicting academic influence that achieves good performance using only four features. The best features, among those evaluated, are those based on the number of times a reference is mentioned in the body of a citing paper, which inspired the influence-primed h-index (hip-index).
Abstract: The importance of a research article is routinely measured by counting how many times it has been cited. However, treating all citations with equal weight ignores the wide variety of functions that citations perform. We want to automatically identify the subset of references in a bibliography that have a central academic influence on the citing paper. For this purpose, we examine the effectiveness of a variety of features for determining the academic influence of a citation. By asking authors to identify the key references in their own work, we created a data set in which citations were labeled according to their academic influence. Using automatic feature selection with supervised machine learning, we found a model for predicting academic influence that achieves good performance on this data set using only four features. The best features, among those we evaluated, were those based on the number of times a reference is mentioned in the body of a citing paper. The performance of these features inspired us to design an influence-primed h-index (the hip-index). Unlike the conventional h-index, it weights citations by how many times a reference is mentioned. According to our experiments, the hip-index is a better indicator of researcher performance than the conventional h-index.

196 citations
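
As a rough illustration of the idea in the abstract above, the sketch below compares a conventional h-index with a mention-weighted variant. The abstract says only that the hip-index "weights citations by how many times a reference is mentioned"; the specific weighting used here (each citation counts as its number of in-text mentions) and both function names are assumptions for illustration, not the authors' definition.

```python
def h_index(scores):
    """Largest h such that at least h papers have a score of h or more."""
    h = 0
    for rank, score in enumerate(sorted(scores, reverse=True), start=1):
        if score >= rank:
            h = rank
        else:
            break
    return h


def mention_weighted_h_index(mentions_per_paper):
    """Assumed weighting: a citation counts as its number of in-text mentions.

    mentions_per_paper -- one list per paper; each entry is how many times
    a citing paper mentions that paper in its body (hypothetical data).
    """
    weighted_scores = [sum(mentions) for mentions in mentions_per_paper]
    return h_index(weighted_scores)


# Hypothetical researcher with three papers cited 2, 2, and 1 times.
mentions = [[3, 2], [2, 2], [3]]
plain = h_index([len(m) for m in mentions])
weighted = mention_weighted_h_index(mentions)
print(plain, weighted)
```

With these made-up counts the unweighted index is 2 and the weighted one is 3, because repeated in-text mentions raise each paper's score.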


Journal ArticleDOI
TL;DR: This study systematically reviews 110 peer‐reviewed publications on Wikipedia content, summarizes the current findings, and highlights the major research trends, addressing the lack of consensus on many aspects of Wikipedia's content as an encyclopedic collection of human knowledge.
Abstract: Wikipedia might possibly be the best-developed attempt thus far of the enduring quest to gather all human knowledge in one place. Its accomplishments in this regard have made it an irresistible point of inquiry for researchers from various fields of knowledge. A decade of research has thrown light on many aspects of the Wikipedia community, its processes, and content. However, due to the variety of the fields inquiring about Wikipedia and the limited synthesis of the extensive research, there is little consensus on many aspects of Wikipedia’s content as an encyclopedic collection of human knowledge. This study addresses the issue by systematically reviewing 110 peer-reviewed publications on Wikipedia content, summarizing the current findings, and highlighting the major research trends. Two major streams of research are identified: the quality of Wikipedia content (including comprehensiveness, currency, readability and reliability) and the size of Wikipedia. Moreover, we present the key research trends in terms of the domains of inquiry, research design, data source, and data gathering methods. This review synthesizes scholarly understanding of Wikipedia content and paves the way for future studies.

174 citations


Journal ArticleDOI
TL;DR: Examination of data from the popular review platform Amazon indicates that review helpfulness is positively related to reviewer profile and review depth but is negatively related to review rating.
Abstract: This article examines review helpfulness as a function of reviewer reputation, review rating, and review depth. In drawing data from the popular review platform Amazon, results indicate that review helpfulness is positively related to reviewer profile and review depth but is negatively related to review rating. Users seem to have a proclivity for reviews contributed by reviewers with a positive track record. They also appreciate reviews with lambasting comments and those with adequate depth. By highlighting its implications for theory and practice, the article concludes with limitations and areas for further research.

136 citations


Journal ArticleDOI
Henk F. Moed, Gali Halevi
TL;DR: This paper introduces the Multidimensional Research Assessment Matrix of scientific output and gives a systematic account of the potential usefulness and limitations of a set of 10 important metrics, including altmetrics, applied at the level of individual articles, individual researchers, research groups, and institutions.
Abstract: This article introduces the Multidimensional Research Assessment Matrix of scientific output. Its base notion holds that the choice of metrics to be applied in a research assessment process depends on the unit of assessment, the research dimension to be assessed, and the purposes and policy context of the assessment. An indicator may be highly useful within one assessment process, but less so in another. For instance, publication counts are useful tools to help discriminate between those staff members who are research active, and those who are not, but are of little value if active scientists are to be compared with one another according to their research performance. This paper gives a systematic account of the potential usefulness and limitations of a set of 10 important metrics, including altmetrics, applied at the level of individual articles, individual researchers, research groups, and institutions. It presents a typology of research impact dimensions and indicates which metrics are the most appropriate to measure each dimension. It introduces the concept of a “meta-analysis” of the units under assessment in which metrics are not used as tools to evaluate individual units, but to reach policy inferences regarding the objectives and general setup of an assessment process.

124 citations


Journal ArticleDOI
TL;DR: This research validated and augmented prior research, consolidated previous models of serendipity into a single model of the process, and identified factors relating to the individual and their environment that may facilitate the main elements of serendipity and further influence its perception.
Abstract: Serendipity is not an easy word to define. Its meaning has been stretched to apply to experiences ranging from the mundane to the exceptional. Serendipity, however, is consistently associated with unexpected and positive personal, scholarly, scientific, organizational, and societal events and discoveries. Diverse serendipitous experiences share a conceptual space; therefore, what lessons can we draw from an exploration of how serendipity unfolds and what may influence it? This article describes an investigation of work-related serendipity. Twelve professionals and academics from a variety of fields were interviewed. The core of the semi-structured interviews focused on participants' own work-related experiences that could be recalled and discussed in depth. This research validated and augmented prior research while consolidating previous models of serendipity into a single model of the process of serendipity, consisting of: Trigger, Connection, Follow-up, and Valuable Outcome, and an Unexpected Thread that runs through 1 or more of the first 4 elements. Together, the elements influence the Perception of Serendipity. Furthermore, this research identified what factors relating to the individual and their environment may facilitate the main elements of serendipity and further influence its perception.

98 citations


Journal ArticleDOI
TL;DR: With the exception of Russia, the BRICS countries have increased their output in terms of most frequently cited papers at a higher rate than the top‐cited countries worldwide.
Abstract: The BRICS countries (Brazil, Russia, India, China, and South Africa) are notable for their increasing participation in science and technology. The governments of these countries have been boosting their investments in research and development to become part of the group of nations doing research at a world-class level. This study investigates the development of the BRICS countries in the domain of top-cited papers (top 10% and 1% most frequently cited papers) between 1990 and 2010. To assess the extent to which these countries have become important players at the top level, we compare the BRICS countries with the top-performing countries worldwide. As the analyses of the (annual) growth rates show, with the exception of Russia, the BRICS countries have increased their output in terms of most frequently cited papers at a higher rate than the top-cited countries worldwide. By way of additional analysis, we generate coauthorship networks among authors of highly cited papers for 4 time points to view changes in BRICS participation (1995, 2000, 2005, and 2010). Here, the results show that all BRICS countries succeeded in becoming part of this network, whereby the Chinese collaboration activities focus on the US.

96 citations


Journal ArticleDOI
TL;DR: It is found that data from both sources could be used to predict the quality of Chinese universities and companies, and the disadvantage of Google Trends in this regard was due to Google's smaller user base in China.
Abstract: Numerous studies have explored the possibility of uncovering information from web search queries but few have examined the factors that affect web query data sources. We conducted a study that investigated this issue by comparing Google Trends and Baidu Index. Data from these two services are based on queries entered by users into Google and Baidu, two of the largest search engines in the world. We first compared the features and functions of the two services based on documents and extensive testing. We then carried out an empirical study that collected query volume data from the two sources. We found that data from both sources could be used to predict the quality of Chinese universities and companies. Despite the differences between the two services in terms of technology, such as differing methods of language processing, the search volume data from the two were highly correlated and combining the two data sources did not improve the predictive power of the data. However, there was a major difference between the two in terms of data availability. Baidu Index was able to provide more search volume data than Google Trends did. Our analysis showed that the disadvantage of Google Trends in this regard was due to Google's smaller user base in China. The implication of this finding goes beyond China. Google's user bases in many countries are smaller than that in China, so the search volume data related to those countries could result in the same issue as that related to China.

93 citations


Journal ArticleDOI
TL;DR: In conclusion, research funders should not incentivize international collaboration on the basis that it is, in general, higher quality because its higher impact may be primarily due to its larger audience.
Abstract: International collaboration tends to result in more highly cited research and, partly as a result of this, many research funding schemes are specifically international in scope. Nevertheless, it is not clear whether this citation advantage is the result of higher quality research or due to other factors, such as a larger audience for the publications. To test whether the apparent advantage of internationally collaborative research may be due to additional interest in articles from the countries of the authors, this article assesses the extent to which the national affiliations of the authors of articles affect the national affiliations of their Mendeley readers. Based on English-language Web of Science articles in 10 fields from science, medicine, social science, and the humanities, the results of statistical models comparing author and reader affiliations suggest that, in most fields, Mendeley users are disproportionately readers of articles authored from within their own country. In addition, there are several cases in which Mendeley users from certain countries tend to ignore articles from specific other countries, although it is not clear whether this reflects national biases or different national specialisms within a field. In conclusion, research funders should not incentivize international collaboration on the basis that it is, in general, higher quality because its higher impact may be primarily due to its larger audience. Moreover, authors should guard against national biases in their reading to select only the best and most relevant publications to inform their research.

Journal ArticleDOI
TL;DR: This paper derives 7 dimensions of participation from the literature on participation and exemplifies those dimensions using a set of 102 cases of contemporary participation that include uses of the Internet and new media.
Abstract: Participation is today central to many kinds of research and design practice in information studies and beyond. From user-generated content to crowdsourcing to peer production to fan fiction to citizen science, the concept remains both unexamined and heterogeneous in its definition. Intuitions about participation are confirmed by some examples, but scandalized by others, and it is difficult to pinpoint why participation seems to be robust in some cases and partial in others. In this paper we offer an empirically based, comparative analysis of participation that demonstrates its multidimensionality and provides a framework that allows clear distinctions and better analyses of the role of participation. We derive 7 dimensions of participation from the literature on participation and exemplify those dimensions using a set of 102 cases of contemporary participation that include uses of the Internet and new media.

Journal ArticleDOI
TL;DR: It is found that users have more difficulty extracting information from search results pages on the smaller screens, although they exhibit less eye movement as a result of an infrequent use of the scroll function, and there is no significant difference between the 2 screens in time spent on search results pages and the accuracy of finding answers.
Abstract: In recent years, searching the web on mobile devices has become enormously popular. Because mobile devices have relatively small screens and show fewer search results, search behavior with mobile devices may be different from that with desktops or laptops. Therefore, examining these differences may suggest better, more efficient designs for mobile search engines. In this experiment, we use eye tracking to explore user behavior and performance. We analyze web searches with 2 task types on 2 differently sized screens: one for a desktop and the other for a mobile device. In addition, we examine the relationships between search performance and several search behaviors to allow further investigation of the differences engendered by the screens. We found that users have more difficulty extracting information from search results pages on the smaller screens, although they exhibit less eye movement as a result of an infrequent use of the scroll function. However, in terms of search performance, our findings suggest that there is no significant difference between the 2 screens in time spent on search results pages and the accuracy of finding answers. This suggests several possible ideas for the presentation design of search results pages on small devices.

Journal ArticleDOI
TL;DR: This paper provides a comprehensive taxonomy of the factors influencing the user adoption of EMR and classifies these factors into meaningful categories; the results have implications for researchers and practitioners.
Abstract: In the past three decades, several studies have extracted antecedents to the user adoption of health information systems (HIS). This study proposes a reflective pause on the HIS adoption literature to broaden our understanding of factors contributing to the user adoption of electronic medical records (EMR). This paper provides a comprehensive taxonomy of the factors influencing the user adoption of EMR and classifies these factors into meaningful categories. We searched the selected keywords on several academic databases and found an initial set of 9,684 studies. We excluded papers on the basis of their title, abstract, and full text (89 remaining papers). The effectiveness of adoption theories has been explored based on the empirical results identified in the EMR research. Furthermore, according to the conceptualization of the factors in the literature, a list of 78 factors affecting EMR adoption was identified. These factors were classified into eight categories: individual factors, psychological factors, behavioural factors, environmental factors, organizational factors, financial factors, legal factors, and technical factors. The results have implications for researchers and practitioners, including policymakers, marketers, information technology (IT) professionals, health information management (HIM) practitioners, health practice managers, and EMR system developers.

Journal ArticleDOI
TL;DR: This publisher ranking study uses a citation data grant from Elsevier and matching metadata from WorldCat to create a unique relational database designed to compare citation counts to books with international library holdings (libcitations) for scholarly book publishers.
Abstract: This is a publisher ranking study based on a citation data grant from Elsevier, specifically, book titles cited in Scopus history journals (2007-2011) and matching metadata from WorldCat® (i.e., OCLC numbers, ISBN codes, publisher records, and library holding counts). Using both resources, we have created a unique relational database designed to compare citation counts to books with international library holdings or libcitations for scholarly book publishers. First, we construct a ranking of the top 500 publishers and explore descriptive statistics at the level of publisher type (university, commercial, other) and country of origin. We then identify the top 50 university presses and commercial houses based on total citations and mean citations per book (CPB). In a third analysis, we present a map of directed citation links between journals and book publishers. American and British presses/publishing houses tend to dominate the work of library collection managers and citing scholars; however, a number of specialist publishers from Europe are included. Distinct clusters from the directed citation map indicate a certain degree of regionalism and subject specialization, where some journals produced in languages other than English tend to cite books published by the same parent press. Bibliometric rankings convey only a small part of how the actual structure of the publishing field has evolved; hence, challenges lie ahead for developers of new citation indices for books and bibliometricians interested in measuring book and publisher impacts.

Journal ArticleDOI
TL;DR: A study of software engineers was conducted to understand the role that contextual factors play in shaping their information‐seeking behavior and revealed a set of contextual factors and related information behaviors that may inform contextual approaches to information seeking in other professional domains.
Abstract: Information seeking in the workplace can vary substantially from one search to the next due to changes in the context of the search. Modeling these dynamic contextual effects is an important challenge facing the research community because it has the potential to lead to more responsive search systems. With this motivation, a study of software engineers was conducted to understand the role that contextual factors play in shaping their information-seeking behavior. Research was conducted in the field in a large technology company and comprised six unstructured interviews, a focus group, and 13 in-depth, semistructured interviews. Qualitative analysis revealed a set of contextual factors and related information behaviors. Results are formalized in the contextual model of source selection, the main contributions of which are the identification of two types of conditioning variables (requirements and constraints) that mediate between the contextual factors and source-selection decisions, and the articulation of dominant source-selection patterns. The study has implications for the design of context-sensitive search systems in this domain and may inform contextual approaches to information seeking in other professional domains.

Journal ArticleDOI
TL;DR: This work measures synergy for the Russian national, provincial, and regional innovation systems as reduction of uncertainty using mutual information among the 3 distributions of firm sizes, technological knowledge bases of firms, and geographical locations.
Abstract: We measure synergy for the Russian national, provincial, and regional innovation systems as reduction of uncertainty using mutual information among the 3 distributions of firm sizes, technological knowledge bases of firms, and geographical locations. Half a million units of data at firm level in 2011 were obtained from the Orbis™ database of Bureau Van Dijk. The firm level data were aggregated at the levels of 8 Federal Districts, the regional level of 83 Federal Subjects, and the single level of the Russian Federation. Not surprisingly, the knowledge base of the economy is concentrated in the Moscow region (22.8%) and Saint Petersburg (4.0%). Except in Moscow itself, high-tech manufacturing does not add synergy to any other unit at any of the various levels of geographical granularity; instead it disturbs regional coordination. Knowledge-intensive services (KIS; including laboratories) contribute to the synergy in all Federal Districts (except the North-Caucasian Federal District), but only in 30 of the 83 Federal Subjects. The synergy in KIS is concentrated in centers of administration. The knowledge-intensive services (which are often state affiliated) provide backbone to an emerging knowledge-based economy at the level of Federal Districts, but the economy is otherwise not knowledge based (except for the Moscow region).
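
The abstract above does not state the formula. A common form of mutual information among three dimensions, given here as an assumed sketch with hypothetical symbols G (geographical location), T (technological knowledge base), and O (organizational size), is:

```latex
% Assumed formulation; symbols G, T, O are illustrative labels, not from the abstract.
\begin{align}
  H_{X} &= -\sum_{x} p_{x} \log_{2} p_{x}, \\
  T_{GTO} &= H_{G} + H_{T} + H_{O} - H_{GT} - H_{GO} - H_{TO} + H_{GTO}.
\end{align}
```

Under this reading, a negative value of T_{GTO} (in bits) indicates a reduction of uncertainty, which is what the abstract interprets as synergy.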

Journal ArticleDOI
TL;DR: It is argued that social support and information were inextricably connected within participant interactions and that social support is, itself, a form of information that impacts actions and emotional experiences, contributing to participants being able to make sense of their experiences and to move forward both physically and emotionally.
Abstract: This research examines interactions among members of an online breast cancer community, focusing on how information and social support were exchanged, how these exchanges influenced health decisions, and how the community was integrated into participants' everyday lives. This article is the result of a 2-year ethnography comprising online archives analysis, participant observation, and 31 interviews. In the course of the research, the findings revealed that, not only did participants exchange valuable information and helpful social support, there was often little separation between the two, with each overlaying the other throughout most interactions. Expressions of support permeated many informational messages and at the same time served as information to participants. This article argues that social support and information were inextricably connected within participant interactions and that social support is, itself, a form of information that impacts actions and emotional experiences, contributing to participants being able to make sense of their experiences and to move forward both physically and emotionally. This research builds on work in information science that looks at the ways in which people exchange information in informal environments and extends that research by drawing on conceptualizations of social support to exhibit the connections between social support and information.

Journal ArticleDOI
TL;DR: The results suggest that health research bloggers rarely self‐cite and that the vast majority of their blog posts (90%) include a general discussion of the issue covered in the article, with more than one quarter providing health‐related advice based on the article(s) covered.
Abstract: Blogs that cite academic articles have emerged as a potential source of alternative impact metrics for the visibility of the blogged articles. Nevertheless, to evaluate more fully the value of blog citations, it is necessary to investigate whether research blogs focus on particular types of articles or give new perspectives on scientific discourse. Therefore, we studied the characteristics of peer-reviewed references in blogs and the typical content of blog posts to gain insight into bloggers' motivations. The sample consisted of 391 blog posts from 2010 to 2012 in Researchblogging.org's health category. The bloggers mostly cited recent research articles or reviews from top multidisciplinary and general medical journals. Using content analysis methods, we created a general classification scheme for blog post content with 10 major topic categories, each with several subcategories. The results suggest that health research bloggers rarely self-cite and that the vast majority of their blog posts (90%) include a general discussion of the issue covered in the article, with more than one quarter providing health-related advice based on the article(s) covered. These factors suggest a genuine attempt to engage with a wider, nonacademic audience. Nevertheless, almost 30% of the posts included some criticism of the issues being discussed.

Journal ArticleDOI
TL;DR: The results suggest that the United States showed overwhelming dominance in all bilateral UIG combinations with the exception of the government‐government network, and the university sector in English‐speaking wealthy countries and the government sector of non‐English‐speaking, less‐wealthy countries played a key role in international collaborations between OECD countries.
Abstract: Previous studies of international scientific collaboration have rarely gone beyond revealing the structural relationships between countries. Considering how scientific collaboration is actually initiated, this study focuses on the organization and sector levels of international coauthorship networks, going beyond a country-level description. Based on a network analysis of coauthorship networks between members of the Organisation for Economic Cooperation and Development (OECD), this study attempts to gain a better understanding of international scientific collaboration by exploring the structure of the coauthorship network in terms of university-industry-government (UIG) relationships, the mode of knowledge production, and the underlying dynamic of collaboration in terms of geographic, linguistic, and economic factors. The results suggest that the United States showed overwhelming dominance in all bilateral UIG combinations with the exception of the government-government (GG) network. Scientific collaboration within the industry sector was concentrated in a few players, whereas that between the university and industry sectors was relatively less concentrated. Despite the growing participation from other sectors, universities were still the main locus of knowledge production, with the exception of 5 countries. The university sector in English-speaking wealthy countries and the government sector of non–English-speaking, less-wealthy countries played a key role in international collaborations between OECD countries. The findings did not provide evidence supporting the institutional proximity argument.

Journal ArticleDOI
TL;DR: In this paper, the authors identify the methodological weaknesses of the 2004–2010 VQR (Research Quality Evaluation), Italy's second national research assessment exercise, completed in July 2013, and measure the distortions that result from them in the university performance rankings.
Abstract: The 2004–2010 VQR (Research Quality Evaluation), completed in July 2013, was Italy's second national research assessment exercise. The VQR performance evaluation followed a pattern also seen in other nations, as it was based on a selected subset of products. In this work, we identify the exercise's methodological weaknesses and measure the distortions that result from them in the university performance rankings. First, we create a scenario in which we assume the efficient selection of the products to be submitted by the universities and, from this, simulate a set of rankings applying the precise VQR rating criteria. Next, we compare these “VQR rankings” with those that would derive from the application of more-appropriate bibliometrics. Finally, we extend the comparison to university rankings based on the entire scientific production for the period, as indexed in the Web of Science.

Journal ArticleDOI
TL;DR: A micro‐sociological, symbolic interactionist approach is used to examine the use of one type of information—biomedical information—in the everyday life interactions of chronic illness patients and their families, demonstrating use of biomedical information in interactions that construct a valued self for the patient.
Abstract: Information use intrigues information behavior researchers, though many have struggled with how to conceptualize and study this phenomenon. Some work suggests that information may have social uses, hinting that information use is more complicated than previous frameworks suggest. Therefore, we use a micro-sociological, symbolic interactionist approach to examine the use of one type of information—biomedical information—in the everyday life interactions of chronic illness patients and their families. Based on a grounded theory analysis of 60 semi-structured interviews (30 individual patient interviews and 30 family group interviews) and observations within the family group interviews, we identify 4 categories of information use: (a) knowing my body; (b) mapping the social terrain; (c) asserting autonomy; and (d) puffing myself up. Extending previous research, the findings demonstrate use of biomedical information in interactions that construct a valued self for the patient: a person who holds authority, and who is unique and cared for. In so doing, we contribute novel insights regarding the use of information to manage social emotions such as shame, and to construct embodied knowledge that is mobilized in action to address disease-related challenges. We thus offer an expanded conceptualization of information use that provides new directions for research and practice.

Journal ArticleDOI
TL;DR: A global map of science based on aggregated journal–journal citations from 1996–2012 is constructed using Scopus data and can be compared with mappings based on the Journal Citation Reports at the Web of Science.
Abstract: Using Scopus data, we construct a global map of science based on aggregated journal-journal citations from 1996-2012 (N of journals = 20,554). This base map enables users to overlay downloads from Scop...

Journal ArticleDOI
TL;DR: A clustering and authorship attribution study over the State of the Union addresses from 1790 to 2014 shows that chronology tends to play a central role in forming clusters, a factor that is more important than political affiliation.
Abstract: This paper describes a clustering and authorship attribution study over the State of the Union addresses from 1790 to 2014 (224 speeches delivered by 41 presidents). To define the style of each presidency, we have applied a principal component analysis (PCA) based on the part-of-speech (POS) frequencies. From Roosevelt (1934) onward, each president tends to have a distinctive style, whereas earlier presidents usually share some stylistic aspects with others. Applying an automatic classification based on the frequencies of all content-bearing word-types, we show that chronology tends to play a central role in forming clusters, a factor that is more important than political affiliation. Using the 300 most frequent word-types, we generate another clustering representation based on the style of each president. This second view shares similarities with the first one, but usually with more numerous and smaller clusters. Finally, an authorship attribution approach for each speech can reach a success rate of around 95.7% under some constraints. When an incorrect assignment is detected, the proposed author often belongs to the same party and has lived during roughly the same time period as the presumed author. A deeper analysis of some incorrect assignments reveals interesting reasons justifying difficult attributions.

Journal ArticleDOI
TL;DR: The paper examines this claim and argues for the continued value of Boolean systems, and suggests two further considerations: the important role of human expertise in searching (expert searchers and “information literate” users) and the role of library and information science and knowledge organization (KO) in the design and use of classical databases.
Abstract: This paper considers classical bibliographic databases based on the Boolean retrieval model (such as MEDLINE and PsycInfo). This model is challenged by modern search engines and information retrieval (IR) researchers, who often consider Boolean retrieval a less efficient approach. The paper examines this claim and argues for the continued value of Boolean systems, and suggests two further considerations: (a) the important role of human expertise in searching (expert searchers and “information literate” users) and (b) the role of library and information science and knowledge organization (KO) in the design and use of classical databases. An underlying issue is the kind of retrieval system for which one should aim. Warner's (2010) differentiation between the computer science traditions and an older library-oriented tradition seems important; the former aim to transform queries automatically into (ranked) sets of relevant documents, whereas the latter aims to increase the “selection power” of users. The Boolean retrieval model is valuable in providing users with the power to make informed searches and have full control over what is found and what is not. These issues may have significant implications for the maintenance of information science and KO as research fields as well as for the information profession as a profession in its own right.

Journal ArticleDOI
TL;DR: Qualitative analyses of two sources of data regarding how and why people search Twitter reveal numerous characteristics of Twitter search that differentiate it from more commonly studied search domains, such as web search.
Abstract: Micro-blogging services such as Twitter represent constantly evolving, user-generated sources of information. Previous studies show that users search such content regularly but are often dissatisfied with current search facilities. We argue that an enhanced understanding of the motivations for search would aid the design of improved search systems, better reflecting what people need. Building on previous research, we present qualitative analyses of two sources of data regarding how and why people search Twitter. The first, a diary study (p = 68), provides descriptions of Twitter information needs (n = 117) and important meta-data from active study participants. The second data set was established by collecting first-person descriptions of search behavior (n = 388) tweeted by Twitter users themselves (p = 381) and complements the first data set by providing similar descriptions from a more plentiful source. The results of our analyses reveal numerous characteristics of Twitter search that differentiate it from more commonly studied search domains, such as web search. The findings also shed light on some of the difficulties users encounter. By highlighting examples that go beyond those previously published, this article adds to the understanding of how and why people search such content. Based on these new insights, we conclude with a discussion of possible design implications for search systems that index micro-blogging content.

Journal ArticleDOI
TL;DR: A method to automatically remove false and irrelevant matches from GB citation searches is introduced in addition to introducing refinements to a previous GB manual citation extraction method.
Abstract: Recent studies have shown that counting citations from books can help scholarly impact assessment and that Google Books (GB) is a useful source of such citation counts, despite its lack of a public citation index. Searching GB for citations produces approximate matches, however, and so its raw results need time-consuming human filtering. In response, this article introduces a method to automatically remove false and irrelevant matches from GB citation searches in addition to introducing refinements to a previous GB manual citation extraction method. The method was evaluated by manual checking of sampled GB results and comparing citations to about 14,500 monographs in the Thomson Reuters Book Citation Index (BKCI) against automatically extracted citations from GB across 24 subject areas. GB citations were 103% to 137% as numerous as BKCI citations in the humanities, except for tourism (72%) and linguistics (91%), 46% to 85% in social sciences, but only 8% to 53% in the sciences. In all cases, however, GB had substantially more citing books than did BKCI, with BKCI's results coming predominantly from journal articles. Moderate correlations between the GB and BKCI citation counts in social sciences and humanities, with most BKCI results coming from journal articles rather than books, suggest that they could measure different aspects of impact, however.

Journal ArticleDOI
TL;DR: The study resulted in a user‐generated framework for designing affordances on social media sites to counter acts of cyberbullying, and a typological analysis of the values present in the participants' design recommendations, applying Cheng and Fleischman's values framework.
Abstract: This study looks at mean and cruel online behavior through the lens of design, with the goal of developing positive technologies for youth. Narrative inquiry was used as a research method, allowing two focus groups—one composed of teens and the other of undergraduate students—to map out 4 cyberbullying stories. Each cyberbullying story revealed 2 subplots—the story that “is” (as perceived by these participants) and the story that “could be” (if the participants' design recommendations were embedded in social media). The study resulted in a user-generated framework for designing affordances on social media sites to counter acts of cyberbullying. Seven emergent design themes are evident in the participants' cyberbullying narratives: design for hesitation, design for consequence, design for empathy, design for personal empowerment, design for fear, design for attention, and design for control and suppression. We conclude with a typological analysis of the values present in the participants' design recommendations, applying Cheng and Fleischman's values framework (2010).

Journal ArticleDOI
TL;DR: This article revisits the causal relationship between research articles published and economic growth in Organisation for Economic Co‐operation and Development (OECD) countries for the period 1981–2011, using bootstrap panel causality analysis, which accounts for cross‐section dependency and heterogeneity across countries.
Abstract: The causal relation between research and economic growth is of particular importance for political support of science and technology as well as for academic purposes. This article revisits the causal relationship between research articles published and economic growth in Organisation for Economic Co-operation and Development (OECD) countries for the period 1981–2011, using bootstrap panel causality analysis, which accounts for cross-section dependency and heterogeneity across countries. The article, by the use of the specific method and the choice of the country group, makes a contribution to the existing literature. Our empirical results support unidirectional causality running from research output (in terms of total number of articles published) to economic growth for the US, Finland, Hungary, and Mexico; the opposite causality from economic growth to research articles published for Canada, France, Italy, New Zealand, the UK, Austria, Israel, and Poland; and no causality for the rest of the countries. Our findings provide important policy implications for research policies and strategies for OECD countries.

Journal ArticleDOI
TL;DR: The results show that task stage could help interpret certain types of dwell time as reliable indicators of document usefulness in certain task types, as was topic knowledge, and the latter played a more significant role when both were available.
Abstract: Personalization of information retrieval tailors search towards individual users to meet their particular information needs by taking into account information about users and their contexts, often through implicit sources of evidence such as user behaviors. This study looks at users' dwelling behavior on documents and several contextual factors: the stage of users' work tasks, task type, and users' knowledge of task topics, to explore whether or not taking into account contextual factors could help infer document usefulness from dwell time. A controlled laboratory experiment was conducted with 24 participants, each coming 3 times to work on 3 subtasks in a general work task. The results show that task stage could help interpret certain types of dwell time as reliable indicators of document usefulness in certain task types, as was topic knowledge, and the latter played a more significant role when both were available. This study contributes to a better understanding of how dwell time can be used as implicit evidence of document usefulness, as well as how contextual factors can help interpret dwell time as an indicator of usefulness. These findings have both theoretical and practical implications for using behaviors and contextual factors in the development of personalization systems.