Topic

Data access

About: Data access is a research topic. Over its lifetime, 13,141 publications have been published within this topic, receiving 172,859 citations. The topic is also known as: Data access.


Papers
Journal Article
TL;DR: The architecture and implementation of a set of integrated replica management services, based on the web services model, are described and their performance under demanding Grid conditions is evaluated.
Abstract: Within the European DataGrid project, Work Package 2 has designed and implemented a set of integrated replica management services for use by data intensive scientific applications. These services, based on the web services model, enable movement and replication of data at high speed from one geographical site to another, management of distributed replicated data, optimization of access to data, and the provision of a metadata management tool. In this paper we describe the architecture and implementation of these services and evaluate their performance under demanding Grid conditions.

50 citations

Proceedings Article
26 Jun 2016
TL;DR: This paper implements a prototype system, Orthrus, that is motivated by the principles of separation of database component functionality and advanced planning of transactions, and finds that these two principles alone result in significantly improved scalability on high-contention workloads, and an order of magnitude increase in throughput for a non-trivial subset of these contended workloads.
Abstract: Although significant recent progress has been made in improving the multi-core scalability of high throughput transactional database systems, modern systems still fail to achieve scalable throughput for workloads involving frequent access to highly contended data. Most of this inability to achieve high throughput is explained by the fundamental constraints involved in guaranteeing ACID --- the addition of cores results in more concurrent transactions accessing the same contended data for which access must be serialized in order to guarantee isolation. Thus, linear scalability for contended workloads is impossible. However, there exist flaws in many modern architectures that exacerbate their poor scalability, and result in throughput that is much worse than fundamentally required by the workload. In this paper we identify two prevalent design principles that limit the multi-core scalability of many (but not all) transactional database systems on contended workloads: the multi-purpose nature of execution threads in these systems, and the lack of advanced planning of data access. We demonstrate the deleterious results of these design principles by implementing a prototype system, Orthrus, that is motivated by the principles of separation of database component functionality and advanced planning of transactions. We find that these two principles alone result in significantly improved scalability on high-contention workloads, and an order of magnitude increase in throughput for a non-trivial subset of these contended workloads.

50 citations

Journal Article
Tian Luo, Rubao Lee, Michael P. Mesnier, Feng Chen, Xiaodong Zhang
01 Jun 2012
TL;DR: The performance evaluation shows that hStorage-DB can automatically make proper decisions for data allocation in different storage devices and make substantial performance improvements in a cost-efficient way.
Abstract: As storage systems become increasingly heterogeneous and complex, they add burdens on DBAs, causing suboptimal performance even after considerable human effort. In addition, existing monitoring-based storage management, which relies on access pattern detection, has difficulty handling workloads that are highly dynamic and concurrent. To achieve high performance by best utilizing heterogeneous storage devices, we have designed and implemented a heterogeneity-aware software framework for DBMS storage management called hStorage-DB, where semantic information that is critical for storage I/O is identified and passed to the storage manager. According to the collected semantic information, requests are classified into different types. Each type is assigned a proper QoS policy supported by the underlying storage system, so that every request is served by a suitable storage device. With hStorage-DB, we can make good use of semantic information that cannot be detected through data access monitoring but is particularly important for a hybrid storage system. To show the effectiveness of hStorage-DB, we have implemented a system prototype that consists of an I/O request classification enabled DBMS, and a hybrid storage system that is organized into a two-level caching hierarchy. Our performance evaluation shows that hStorage-DB can automatically make proper decisions for data allocation in different storage devices and make substantial performance improvements in a cost-efficient way.

50 citations

Journal Article
TL;DR: Sharing spatially specific data, which includes the characteristics and behaviors of individuals, households, or communities in geographical space, raises distinct technical and ethical challenges.
Abstract: Scholarly communication is at an unprecedented turning point created in part by the increasing salience of data stewardship and data sharing. Formal data management plans represent a new emphasis in research, enabling access to data at higher volumes and more quickly, and the potential for replication and augmentation of existing research. Data sharing has recently transformed the practice, scope, content, and applicability of research in several disciplines, in particular in relation to spatially specific data. This offers exciting potential, but the most effective ways in which to implement such changes, particularly for disciplines involving human subjects and other sensitive information, demand consideration. Data management plans, stewardship, and sharing impart distinctive technical, sociological, and ethical challenges that remain to be adequately identified and remedied. Here, we consider these and propose potential solutions for their amelioration.

50 citations

Patent
03 Nov 1998
TL;DR: The Data Socket client allows the user or program to access any data source available on the user's machine as well as data anywhere on a network, such as a LAN, WAN or the Internet.
Abstract: A Data Socket client and associated applications and/or tools which provide programs with access to data from various sources and having various types or formats, wherein the access is provided invisibly to the user. The Data Socket client allows the user or program to access any data source available on the user's machine as well as data anywhere on a network, such as a LAN, WAN or the Internet. In the preferred embodiment, the Data Socket client addresses data sources or I/O sources using a URL (uniform resource locator), much the way that a URL is used to address web pages anywhere in the world. The present invention also includes new Data Socket URLs which allow the user to access I/O sources.

50 citations
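
The Data Socket abstract above describes addressing data sources by URL, so that a program reads local and networked data through one interface. The following is a minimal sketch of that general idea only, not the patented implementation: the names `read_data`, `read_file`, `read_http`, and `READERS` are hypothetical, and the choice of supported schemes (`file`, `http`, `https`) is an assumption made for illustration.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname, urlopen


def read_file(url: str) -> bytes:
    """Read a local file addressed by a file:// URL."""
    return Path(url2pathname(urlparse(url).path)).read_bytes()


def read_http(url: str) -> bytes:
    """Fetch a remote resource over HTTP or HTTPS."""
    with urlopen(url) as response:
        return response.read()


# Hypothetical registry mapping a URL scheme to the function that reads it.
READERS = {"file": read_file, "http": read_http, "https": read_http}


def read_data(url: str) -> bytes:
    """Return the bytes behind `url`, whichever kind of source it names."""
    scheme = urlparse(url).scheme
    try:
        reader = READERS[scheme]
    except KeyError:
        raise ValueError(f"unsupported URL scheme: {scheme!r}") from None
    return reader(url)
```

A caller passes a `file://` URL for data on the local machine or an `http(s)://` URL for data on a network, and supporting a new kind of source means adding one entry to `READERS`.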


Network Information
Related Topics (5)
Software
130.5K papers, 2M citations
86% related
Cloud computing
156.4K papers, 1.9M citations
86% related
Cluster analysis
146.5K papers, 2.9M citations
85% related
The Internet
213.2K papers, 3.8M citations
85% related
Information system
107.5K papers, 1.8M citations
83% related
Performance Metrics
No. of papers in the topic in previous years

Year  Papers
2023  51
2022  125
2021  403
2020  721
2019  906
2018  816