Showing papers presented at "ACM international conference on Digital libraries in 2013"


Book Chapter
09 Dec 2013
TL;DR: It is proposed that the theory of Information Grounds (IG) may provide a valuable lens for understanding how social media fosters collaboration and social engagement among information professionals.
Abstract: Researchers are increasingly grappling with ways of theorizing social media and its use. This review essay proposes that the theory of Information Grounds (IG) may provide a valuable lens for understanding how social media fosters collaboration and social engagement among information professionals. The paper presents literature that helps us understand how social media can be seen as IG, and maps the characteristics of social media to the seven propositions of IG theory. This work is part of a wider study investigating the ways in which Information Technology (IT) professionals experience social media.

14 citations


Book Chapter
09 Dec 2013
TL;DR: The results show that individual users contributed to more than half of the information dissemination, and the common man played an active part in creating and facilitating this information flow.
Abstract: Twitter has become a critical force in generating and disseminating information pertaining to news events, public and media action, especially in situations such as protests, where public activism and media coverage form a symbiotic relationship. This study identifies different types of users or the "key actors", e.g., traditional media organizations, new media organizations, non-government organizations and individual users who posted on Twitter in the period before, during and after the mass protests pertaining to a gang-rape incident in the Indian capital city of New Delhi in December 2012. The study especially focuses on the role of ordinary citizens or The Common Man in creating and disseminating information. Our results show that individual users contributed to more than half of the information dissemination, and the common man played an active part in creating and facilitating this information flow. Our findings can be leveraged by digital libraries for customizing the library experience for individual users as well as virtual communities according to the new dynamic paradigms of information creation and consumption.

8 citations


Book Chapter
09 Dec 2013
TL;DR: A large-scale feasibility study is conducted on a real-world data set from one of the biggest mathematical digital libraries, Zentralblatt MATH, with special focus on practical applicability.
Abstract: The ever-increasing amount of digitally available information is a curse and a blessing at the same time. On the one hand, users have increasingly large amounts of information at their fingertips. On the other hand, the assessment and refinement of web search results becomes more and more tiresome and difficult for non-experts in a domain. Therefore, established digital libraries offer specialized collections with a certain degree of quality. This quality can largely be attributed to the great effort invested in semantic enrichment of the provided documents, e.g. by annotating them with respect to a domain-specific taxonomy. This process is still done manually in many domains, e.g. chemistry (CAS), medicine (MeSH), or mathematics (MSC). But due to the growing amount of data, this manual task is becoming more and more time-consuming and expensive. The only solution to this problem seems to be to employ automated classification algorithms, but it is difficult to draw conclusions for a real-world scenario from evaluations done in previous research. We therefore conducted a large-scale feasibility study on a real-world data set from one of the biggest mathematical digital libraries, Zentralblatt MATH, with special focus on its practical applicability.

8 citations


Book Chapter
09 Dec 2013
TL;DR: This paper proposes a method for extracting travel-related event information, such as an event name or a schedule, from automatically identified newspaper articles in which particular events are mentioned.
Abstract: In this paper, we propose a method for extracting travel-related event information, such as an event name or a schedule, from automatically identified newspaper articles in which particular events are mentioned. We analyze news corpora using our method, extracting venue names from them. We then find web pages that refer to event schedules for these venues. To confirm the effectiveness of our method, we conducted several experiments. From the experimental results, we obtained a precision of 91.5% and a recall of 75.9% for the automatic extraction of event information from news articles, and a precision of 90.8% and a recall of 52.8% for the automatic identification of event-related web pages.

8 citations


Book Chapter
09 Dec 2013
TL;DR: A topology-driven definition of vicarious learning is introduced and the first centrality method for ranking vicarious learners is proposed; results support the significance and uniqueness of the proposed approach.
Abstract: Despite being a topic of growing interest in social learning theory, vicarious learning has not been well studied so far in digital library related tasks. In this paper, we address a novel ranking problem in research collaboration networks, which focuses on the role of the vicarious learner. We introduce a topology-driven definition of vicarious learning and propose the first centrality method for ranking vicarious learners. Results obtained on DBLP networks support the significance and uniqueness of the proposed approach.

7 citations


Book Chapter
09 Dec 2013
TL;DR: This work proposes two additional notions of mixing due to negative edges, anti-assortativity and anti-disassortativity, which pertain to the show of distrust towards "similar" nodes and "dissimilar" nodes respectively, and classifies nodes based on a local measure of their trustworthiness, rather than on in-degrees, in order to study mixing patterns.
Abstract: In some online social media such as Slashdot, actors are allowed to explicitly show their trust or distrust towards each other. Such a network, called a signed network, contains positive and negative edges. Traditional notions of assortativity and disassortativity are not sufficient to study the mixing patterns of connections between actors in a signed network, owing to the presence of negative edges. Towards this end, we propose two additional notions of mixing due to negative edges — anti-assortativity and anti-disassortativity — which pertain to the show of distrust towards "similar" nodes and "dissimilar" nodes respectively. We classify nodes based on a local measure of their trustworthiness, rather than based on in-degrees, in order to study mixing patterns. We also use some simple techniques to quantify a node's bias towards assortativity, disassortativity, anti-assortativity and anti-disassortativity in a signed network. Our experiments with the Slashdot Zoo network suggest that: (i) "low-trust" nodes show varied forms of mixing — reasonable assortativity, high disassortativity, slight anti-assortativity and slight anti-disassortativity, and (ii) "high-trust" nodes mix highly assortatively while showing very little disassortativity, anti-assortativity or anti-disassortativity.
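The four mixing notions named in this abstract can be illustrated with a minimal sketch. The abstract does not define the local trustworthiness measure, so the code assumes one purely for illustration: a node's share of positive incoming edges, with a hypothetical 0.5 threshold separating "high-trust" from "low-trust" nodes. The function names and the toy edge list are likewise assumptions, not the authors' method.

```python
from collections import Counter, defaultdict


def trust_classes(edges, threshold=0.5):
    """Label each node 'high' or 'low' by its share of positive incoming
    edges. This measure and the threshold are assumptions for illustration."""
    positive = defaultdict(int)
    total = defaultdict(int)
    nodes = set()
    for src, dst, sign in edges:
        nodes.update((src, dst))
        total[dst] += 1
        if sign > 0:
            positive[dst] += 1
    labels = {}
    for node in nodes:
        share = positive[node] / total[node] if total[node] else 0.0
        labels[node] = "high" if share >= threshold else "low"
    return labels


def mixing_counts(edges, labels):
    """Count, per source trust class, how many outgoing edges fall into
    each of the four mixing categories."""
    counts = defaultdict(Counter)
    for src, dst, sign in edges:
        similar = labels[src] == labels[dst]
        if sign > 0:
            kind = "assortative" if similar else "disassortative"
        else:
            kind = "anti-assortative" if similar else "anti-disassortative"
        counts[labels[src]][kind] += 1
    return counts


# Toy signed network: (source, target, sign), +1 = trust, -1 = distrust.
edges = [
    ("a", "b", 1), ("b", "a", 1), ("a", "c", -1),
    ("c", "a", 1), ("c", "d", -1), ("d", "c", -1),
]
classes = trust_classes(edges)
counts = mixing_counts(edges, classes)
for trust_class in sorted(counts):
    print(trust_class, dict(counts[trust_class]))
```

In this toy network, nodes a and b come out as high-trust and c and d as low-trust; a positive edge between two nodes of the same class counts as assortative, and a negative edge between them as anti-assortative.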

6 citations


Book Chapter
09 Dec 2013
TL;DR: It is argued that the social sustainability of digital libraries can and should be studied at different levels — from the very broad issues of sustainable development of societies and communities to more specific issues of information seeking and retrieval, ICT infrastructure and tools for access to, and use of, digital libraries.
Abstract: Major factors related to the social sustainability of digital libraries have been discussed. Some indicators for the social sustainability of digital libraries have been proposed. A conceptual model and a theoretical research framework for study of the social sustainability of digital libraries is proposed. It is argued that the social sustainability of digital libraries can and should be studied at different levels — from the very broad issues of sustainable development of societies and communities to more specific issues of information seeking and retrieval, ICT infrastructure and tools for access to, and use of, digital libraries.

5 citations


Book Chapter
09 Dec 2013
TL;DR: This paper uses Wikipedia to semantically bridge the gap between query terms and textual content and shows significant improvements over query expansion approaches: the overall retrieval quality is increased up to 74.5% in mean average precision.
Abstract: Today, a vast amount of information is made available over the Web in the form of unstructured text indexed by Web search engines. But especially for searches on abstract concepts or context terms, a simple keyword-based Web search may compromise retrieval quality, because query terms may or may not directly occur in the texts (vocabulary problem). The respective state-of-the-art solution is query expansion leading to an increase in recall, although it often also leads to a steep decrease of retrieval precision. This decrease however is a severe problem for digital library providers: in libraries it is vital to ensure high quality retrieval meeting current standards. In this paper we present an approach allowing even for abstract context searches (conceptual queries) with high retrieval quality by using Wikipedia to semantically bridge the gap between query terms and textual content. We do not expand queries, but extract the most important terms from each text document in a focused Web collection and then enrich them with features gathered from Wikipedia. These enriched terms are further used to compute the relevance of a document with respect to a conceptual query. The evaluation shows significant improvements over query expansion approaches: the overall retrieval quality is increased up to 74.5% in mean average precision.

4 citations


Book Chapter
09 Dec 2013
TL;DR: TLIS VIVO is a prototype researcher networking system for the Library and Information Science field in Taiwan, built with ORCID, VIVO, and Linked Open Data technologies.
Abstract: This paper presents a prototype of TLIS VIVO, a researcher networking system for the Library and Information Science field based in Taiwan, built using ORCID, VIVO, and Linked Open Data technologies. It extends VIVO with the author identifier system ORCID, and integrates data harvested from Chinese Library & Information Science Abstracts (CLISA) and Scopus. The study demonstrates a practical approach to increasing the visibility and collaboration of local and global researchers.

4 citations


Book Chapter
09 Dec 2013
TL;DR: The evolving scholarly activities and needs of researchers in this relatively new research environment in Qatar are studied, and previous findings from a well-established environment in the United States are compared.
Abstract: Qatar has become an active research producer of data, publications, and other scholarly works. We studied the evolving scholarly activities and needs of researchers in this relatively new research environment, and compared them with previous findings from a well-established environment in the United States. The initial findings shed some light on the similarities and differences in information seeking behavior and information needs. We also highlighted some requirements and solutions appropriate for future international digital libraries and social reference management systems.

4 citations


Book Chapter
09 Dec 2013
TL;DR: A scalable architecture for a community-driven platform for preservation and curation of complex digital objects is proposed, together with a novel means of presenting preserved results, including technical metadata, that allows for public review and potentially further community-induced improvements.
Abstract: Preservation of complex, non-linear digital objects such as digital art or ancient computer environments has been a domain reserved for experts until now. Digital culture, however, is a broader phenomenon. With the introduction of the so-called Web 2.0, digital culture became a mass culture. New methods of content creation, publishing and cooperation lead to new cultural achievements. Therefore, novel tools and strategies are required, both for preservation and, in particular, for curation and presentation. We propose a scalable architecture suitable for creating a community-driven platform for preservation and curation of complex digital objects. Further, we provide a novel means of presenting preserved results, including technical metadata, thus allowing for public review and potentially further community-induced improvements.

Book Chapter
Yuqi Wang, Yin Zhang, Yanfei Yin, Deng Yi, Baogang Wei
09 Dec 2013
TL;DR: This work proposes an incremental, cluster-based algorithm on a Stream Processing Architecture, which is scalable and suitable for real-time environments; experimental results show the algorithm is efficient while still maintaining comparable accuracy and interpretability.
Abstract: By helping users discover books they may be interested in, recommender systems fully exploit the resources of digital libraries and better serve users' reading demands. Traditional memory-based collaborative filtering (CF) methods are effective and easy to interpret. However, when datasets become larger, the traditional approach becomes infeasible in both time and space. To address this challenge, we propose an incremental, cluster-based algorithm on a Stream Processing Architecture, which is scalable and suitable for real-time environments. Our experimental results on MovieLens datasets and CADAL user-chapter logs show that our algorithm is efficient while still maintaining comparable accuracy and interpretability.

Book Chapter
09 Dec 2013
TL;DR: This paper looks at how scholarly communication takes place through various social media and community networks across disciplines.
Abstract: This paper looks at how scholarly communication takes place through various social media and community networks across disciplines. Social and community networks have become an essential part of our lives, and have grown from a channel of communication with family and friends into professional and scholarly communication channels. The most popular scholarly networks now hold millions of academic and research articles. The technological features of these networks enable interoperability and openness, so content can be shared with other networks using various sharing applications. This paper presents an insight into scholarly communication through social networks.

Book Chapter
09 Dec 2013
TL;DR: Investigation of individuals' personality and perceptions of information quality on intention to play Human Computation Games for mobile information sharing revealed that personality traits of extraversion and openness, as well as perceived information accuracy and relevancy were significant in predicting intent to play.
Abstract: Applications that use gaming elements to harness human intelligence to tackle computational tasks are increasing in popularity and may be termed as Human Computation Games (HCGs). Recently, HCGs have been utilized to offer a more engaging experience for mobile information sharing. Yet, there is a lack of research that examines individuals' personality and behaviors related to HCGs. This understanding may be important in identifying HCG features that suit different personality orientations. Thus, this study aims to investigate the influence of individuals' personality and perceptions of information quality on intention to play HCG for mobile information sharing. In a study of 205 participants, results revealed that personality traits of extraversion and openness, as well as perceived information accuracy and relevancy were significant in predicting intention to play. Implications of our work are also discussed.

Book Chapter
09 Dec 2013
TL;DR: The results showed performance for IDDM-IS to be better than the in-degree and sentiment-values baseline approaches and could provide a fine-grained description of the influence diffusion paths using the bloggers' influence styles.
Abstract: Previous studies on detecting blogosphere influence diffusion had used blog features such as in-degree and sentiment links. The approaches in most of these studies assumed that influence increases with the number of links and largely ignored the possible effect of bloggers' influence style on the diffusion of influence between linked bloggers where influence could be further described through the engagement style, persuasion style, and the persona of the bloggers. In this paper, we propose an Influence Diffusion Detection Model — Influence Style (IDDM-IS) that includes the use of bloggers' influence styles to detect influence diffusion through the blogosphere. Our study analyzed 107 bloggers with varying influence styles to detect the influence diffusion path. The results showed performance for IDDM-IS to be better than the in-degree and sentiment-values baseline approaches. In addition, IDDM-IS could provide a fine-grained description of the influence diffusion paths using the bloggers' influence styles.

Book Chapter
09 Dec 2013
TL;DR: This article gives a brief introduction to the digital curation (DC) concept, emphasizing its role for users in the Knowledge Management (KM) and library communities.
Abstract: This article gives a brief introduction to the digital curation (DC) concept, emphasizing its role for users in the Knowledge Management (KM) and library communities.

Book Chapter
09 Dec 2013
TL;DR: This paper presents the creation of a metadata schema for describing Thai Lanna inscriptions and storing sets of character images, by reusing and extending the metadata schema from Thai Lanna archives.
Abstract: Digitization has been applied to the Thai Lanna script, also called the Fakkham script. Two major requirements for creating a digital collection of Thai Lanna inscriptions are to preserve them in digital format and to infer the engraving period of each inscription, supporting linguists and historians in interpreting the contents within an appropriate context. The digital inscription images can be processed by image processing techniques to create an important output: a set of images of each character. This output is not only used as a training data set in the preservation process, but also indicates the evolution of ancient characters, character relationships and stories in historical times. This paper presents two key issues. First, the creation of a metadata schema for the output, describing Thai Lanna inscriptions and storing a set of character images, by reusing and extending the metadata schema from Thai Lanna archives. Second, the architecture of utilization, showing the contributions of our metadata schema design, which serves as the foundation of the architecture.

Book Chapter
09 Dec 2013
TL;DR: Protege software was used to create the main components of the digital library ontology (individuals, properties, classes, etc.), which can be viewed as a knowledge map of digital library research.
Abstract: The paper reports a method for designing and engineering a digital library domain ontology. Based on the digital library knowledge map, Protege software was used to create the main components of the digital library ontology (individuals, properties, classes, etc.), which can be viewed as a knowledge map of digital library research.

Book Chapter
09 Dec 2013
TL;DR: Analysis of whether gender, field of study, and experience influenced user performance and preference in four different hierarchical layouts shows that, generally, the three factors did not show significant differences between layouts.
Abstract: Lately, more studies have started to look into adapting information visualization to individual users. This paper adds to those studies by analysing whether gender, field of study, and experience influenced user performance and preference in four different hierarchical layouts. The results show that, generally, the three factors did not show significant differences between layouts, but also revealed some interesting indications for further studies.

Book Chapter
09 Dec 2013
TL;DR: This work proposes fuzzy-matching-based indexing and retrieval algorithms inspired by prefix trees, and discusses a retrieval algorithm based on query term truncation and decayed scores for better retrieval of documents for a given query.
Abstract: Kannada is a phonetic language. In Kannada, the morphological forms of terms (especially of nouns and verbs) are formed by adding different morphological suffixes to their pure forms. Hence, when queried for morphological forms, search engines based on exact matching fail to identify other semantically similar but morphologically different terms, and thus reduce the quality of the search results. We observe that even though the morphological forms of a term look different, they can be grouped together based on their common prefixes. In this work we propose fuzzy-matching-based indexing and retrieval algorithms. We propose an indexing mechanism inspired by prefix trees. We also draw inspiration from the fact that Unicode encodes Kannada terms in a way very similar to how terms are generated using Kannada grammar. We also discuss a retrieval algorithm based on query term truncation and decayed scores for better retrieval of documents for a given query. The indexing and retrieval systems are still based on tf-idf indexing and retrieval. However, the novelty of the work lies in the way the algorithms bring similar terms together. This solution can be scaled to work for other South Indian languages with little or no modification, as their Unicode encoding and morphological behaviors are similar to Kannada.
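The idea of grouping morphological variants by shared prefix, then truncating the query term and decaying the score, can be sketched roughly as follows. This is not the authors' algorithm: the trie layout, the `min_len` and `decay` parameters, and the transliterated example terms are all assumptions made for illustration, and the tf-idf weighting mentioned in the abstract is left out.

```python
class PrefixIndex:
    """A small character trie mapping indexed terms to document ids."""

    DOCS = "$docs"  # multi-character key, so it cannot clash with a single character

    def __init__(self):
        self.root = {}

    def add(self, term, doc_id):
        node = self.root
        for ch in term:
            node = node.setdefault(ch, {})
        node.setdefault(self.DOCS, set()).add(doc_id)

    def docs_with_prefix(self, prefix):
        """Return ids of all documents containing a term that starts with prefix."""
        node = self.root
        for ch in prefix:
            if ch not in node:
                return set()
            node = node[ch]
        found = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for key, child in current.items():
                if key == self.DOCS:
                    found.update(child)
                else:
                    stack.append(child)
        return found


def search(index, query, min_len=3, decay=0.5):
    """Truncate the query one character at a time; each truncation multiplies
    the score by `decay`. A document keeps the score of its longest match."""
    scores = {}
    weight = 1.0
    for end in range(len(query), min_len - 1, -1):
        for doc_id in index.docs_with_prefix(query[:end]):
            scores.setdefault(doc_id, weight)
        weight *= decay
    return scores


# Illustrative transliterated word forms sharing the stem "mane".
index = PrefixIndex()
for doc_id, term in [(1, "mane"), (2, "maneya"), (3, "manege"), (4, "mara")]:
    index.add(term, doc_id)

results = search(index, "maneya", min_len=4)
print(sorted(results.items(), key=lambda item: -item[1]))
```

Here the exact form scores highest, forms sharing only the stem receive a decayed score, and the unrelated term is not retrieved. A real system would operate on Kannada Unicode strings and combine these scores with tf-idf.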

Book Chapter
09 Dec 2013
TL;DR: A case study of a semi-automated Content Alert System using mobile SMS alerts, implemented at two university libraries and later extended to a number of publishers on a larger scale, recording the project details, methods used and findings.
Abstract: This paper is a case study of a semi-automated Content Alert System using mobile SMS alerts, implemented at two university libraries: the University of Swaziland, Swaziland, Southern Africa, and Bundelkhand University Library in India. The project ran in two phases. First, a content alert system was tried and tested at the University of Swaziland with the help of Emerald Publishers, and a prototype was developed on its completion. The second phase used the prototype to create a similar content alert service with a larger heterogeneous user group and incorporated a number of publishers on a larger scale. This paper is a record of the project details, the methods used and the findings of the projects.

Book Chapter
09 Dec 2013
TL;DR: This work proposes a new duplicate detection and reconciliation technique called RefConcile, aimed specifically at bibliographic references, which uses dedicated blocking and matching techniques tailored to this type of data.
Abstract: Comprehensive bibliographies often rely on community contributions. In such settings, de-duplication is mandatory for the bibliography to be useful. Ideally, de-duplication works online, i.e., when adding new references, so the bibliography remains duplicate-free at all times. While de-duplication is well researched, generic approaches do not achieve the result quality required for automated reconciliation. To overcome this problem, we propose a new duplicate detection and reconciliation technique called RefConcile. Aiming specifically at bibliographic references, it uses dedicated blocking and matching techniques tailored to this type of data. Our evaluation based on a large real-world collection of bibliographic references shows that RefConcile scales well, and that it detects and reconciles duplicates highly accurately.
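The general blocking-then-matching pattern that this abstract refers to can be sketched as below. This is a generic illustration, not RefConcile's dedicated technique: the blocking key (year plus first title word), the `difflib`-based title similarity, the 0.9 threshold and the sample records are all assumptions.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title):
    """Lower-case a title and drop punctuation so trivial variants compare equal."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()


def blocking_key(ref):
    """Assumed blocking key: publication year plus first word of the title."""
    words = normalize(ref["title"]).split()
    return (ref["year"], words[0] if words else "")


def find_duplicates(refs, threshold=0.9):
    """Compare references only within the same block; return index pairs
    whose normalized titles are at least `threshold` similar."""
    blocks = defaultdict(list)
    for position, ref in enumerate(refs):
        blocks[blocking_key(ref)].append(position)
    pairs = []
    for members in blocks.values():
        for i, j in combinations(members, 2):
            ratio = SequenceMatcher(
                None, normalize(refs[i]["title"]), normalize(refs[j]["title"])
            ).ratio()
            if ratio >= threshold:
                pairs.append((i, j))
    return pairs


refs = [
    {"title": "Duplicate Detection in Bibliographies", "year": 2013},
    {"title": "Duplicate detection in bibliographies.", "year": 2013},
    {"title": "Duplicate Detection in Databases", "year": 2013},
    {"title": "Duplicate Detection in Bibliographies", "year": 2012},
]
duplicates = find_duplicates(refs)
print(duplicates)
```

Blocking keeps the number of pairwise comparisons small, but the choice of key matters: the last record has the same title as the first, yet it lands in a different block because of its year and is never compared.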

Book Chapter
09 Dec 2013
TL;DR: This policy analysis, along with data from interviews of scholars and archivists, is intended to serve as a basis for developing mobile applications that assist scholars in their research activities; an early prototype of such a mobile application, AMTracker, is introduced.
Abstract: Doing research in the archive is the cornerstone of humanities scholarship. Various archives institute policies regarding the use of technological devices, such as mobile phones, laptops, and cameras, in their reading rooms. Such policies directly affect scholars, as the devices mediate the nature of their interaction with the source materials in terms of capturing, organizing, note taking, and record keeping for future use of found materials. In this paper, we present our analysis of the policies of thirty archives regarding the use of technology in their reading rooms. This policy analysis, along with data from interviews of scholars and archivists, is intended to serve as a basis for developing mobile applications that assist scholars in their research activities. We also introduce an early prototype of such a mobile application, AMTracker.

Book Chapter
09 Dec 2013
TL;DR: This paper designs PuntStore, a plug-in system intended as a general solution for very large digital libraries, and a new index engine, pLSM, within PuntStore to meet the specific needs of digital libraries.
Abstract: In the era of Big Data, the demand for collecting large volumes of complex digital information has brought new challenges to digital library software. This scenario calls for the construction of Very Large Digital Libraries (VLDLs). New approaches and technologies are needed to deal with the various issues in designing and developing VLDLs. In this paper, we design a plug-in system, PuntStore, as a general solution for very large digital libraries. PuntStore supports different kinds of storage engines and index engines to deal with the problem of storing and retrieving data efficiently. We also design a new index engine, pLSM, in PuntStore to meet the specific needs of digital libraries. The successful adoption of PuntStore in the Digital Library on History of Science and Technology in China (DLHSTC) project shows that PuntStore can function effectively in supporting VLDL systems.

Book Chapter
09 Dec 2013
TL;DR: This paper presents the components of the SIARD Archive Browser, a simple-to-use, platform-independent tool for browsing a SIARD archive.
Abstract: The Software-Independent Archival of Relational Databases (SIARD) project developed a tool known as the "SIARD Suite" for preserving relational databases. The tool converts a relational database to an XML format. This paper presents the components of the SIARD Archive Browser, which is a simple-to-use and platform-independent tool for browsing a SIARD archive. This may be helpful for users interested in using the software. Moreover, it may be useful for people who want to re-use the code and develop software for browsing a SIARD archive with more functionality.

Book Chapter
09 Dec 2013
TL;DR: This paper analyzes the data structures and workloads of services provided by modern DLs and proposes a data-storage strategy model, and describes the development of PuntTable, a flexible repository system for DLs.
Abstract: Digital libraries (DLs) provide various contents and services which become increasingly comprehensive and customizable. This has placed growing pressure on the repository systems of DLs. Common repository tools, such as Fedora and DSpace, have been widely deployed in DL systems. However, those repository tools often use traditional relational database management systems plus file systems as the storage layer, which cannot provide additional functionality. Complex services, such as user-generated content, recommendations, social networks services, etc. generate complex and heavy workloads on structured, semi-structured, and unstructured data. Those common repositories are not designed to handle such workloads, so the pressures are transferred to upper application layers. In this paper, we analyze the data structures and workloads of services provided by modern DLs and propose a data-storage strategy model. Based on this model, we describe the development of PuntTable, a flexible repository system for DLs. By integrating various data stores and making it extensible and flexible, PuntTable can easily support complex content and services. We deploy PuntTable to the Digital Library on History of Science & Technology in China, and evaluate the data-storage strategy and PuntTable.

Book Chapter
09 Dec 2013
TL;DR: SRec, an automatic real-time slide capturing and sharing system, filters speaker body occlusion, recovers slide content, generates image/voice digital notes and shares them through social media services.
Abstract: With the popularity of social media services, academic lecture slides could be shared immediately and discussed among hundreds of friends, which may extend the audience scope. Aiming at slide recording in speaker-occlusion scenes, we present our prototype system SRec, an automatic real-time slide capturing and sharing system. Based on depth information, it filters speaker body occlusion, recovers slide content, generates image/voice digital notes and shares them through social media services. The experiment shows SRec could capture and recover common text and picture slides shown in LCD and projector screens efficiently.

Book Chapter
09 Dec 2013
TL;DR: This poster provides novel measures for analyzing the following phenomena in signed networks: (i) reciprocal behavior between pairs of actors in terms of trusting/distrusting each other, and (ii) equivalence of actors in terms of their patterns of trusting/distrusting other actors.
Abstract: Signed networks allow actors to explicitly show trust/distrust relationships with each other. In this poster, we provide novel measures for analyzing the following phenomena in signed networks: (i) reciprocal behavior between pairs of actors in terms of trusting/distrusting each other, and (ii) equivalence of actors in terms of their patterns of trusting/distrusting other actors.

Book Chapter
09 Dec 2013
TL;DR: This paper proposes a Hybrid Model that combines the Geometric Model's dimensions and the Network Model's relations, using the Manhattan distance method to compute the semantic distance between a geo-spatial query concept and the related geo-spatial concept in the data sources.
Abstract: Interest in geo-spatial information systems is increasing swiftly, which has led to the development of competent information retrieval systems. Among the several semantic similarity models, existing models such as the Geometric Model, which characterizes geo-spatial concepts using their dimensions (i.e. properties), and the Network Model, which uses their spatial relations, have yielded less precision. To retrieve geo-spatial information efficiently, both the dimensions and the spatial relations between geo-spatial concepts must be considered. Hence this paper proposes the Hybrid Model, which combines the Geometric Model's dimensions and the Network Model's relations, using the Manhattan distance method to compute the semantic distance between a geo-spatial query concept and the related geo-spatial concept in the data sources. The results and analysis illustrate that the Hybrid Model using the Manhattan distance method could yield better precision, recall and relevant information retrieval. Further, the Manhattan Based Similarity Measure (MBSM) algorithm is proposed, which uses the Manhattan distance method to compute the semantic similarity among geo-spatial concepts and yields a 10% increase in precision compared to the existing semantic similarity models.
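A rough sketch of how dimensions and spatial relations might be combined under a Manhattan distance follows. The abstract does not give the MBSM formula, so everything here is an assumption for illustration: dimension values are taken to be normalized to [0, 1], relations are encoded as 0/1 indicators, similarity is defined as one minus the mean absolute difference, and the concept names and feature names are hypothetical.

```python
def manhattan_distance(a, b):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def hybrid_vector(dimensions, relations, dim_names, rel_names):
    """Concatenate normalized dimension values (geometric part) with
    0/1 relation indicators (network part)."""
    dim_part = [dimensions.get(name, 0.0) for name in dim_names]
    rel_part = [1.0 if name in relations else 0.0 for name in rel_names]
    return dim_part + rel_part


def similarity(query_vec, candidate_vec):
    """Assumed similarity: 1 minus the Manhattan distance averaged over features."""
    return 1.0 - manhattan_distance(query_vec, candidate_vec) / len(query_vec)


# Hypothetical feature names and concepts.
DIM_NAMES = ["depth", "width"]
REL_NAMES = ["connected_to_sea", "contains_island"]

lake = hybrid_vector({"depth": 0.5, "width": 0.5}, {"contains_island"}, DIM_NAMES, REL_NAMES)
pond = hybrid_vector({"depth": 0.25, "width": 0.25}, {"contains_island"}, DIM_NAMES, REL_NAMES)
river = hybrid_vector({"depth": 0.5, "width": 0.0}, {"connected_to_sea"}, DIM_NAMES, REL_NAMES)

sim_pond = similarity(lake, pond)
sim_river = similarity(lake, river)
print(sim_pond, sim_river)
```

With these toy values the pond, which shares the query's spatial relation and has similar dimensions, ranks above the river, which differs in both relations.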