Topic

Data access

About: Data access is a research topic. Over its lifetime, 13,141 publications have been published within this topic, receiving 172,859 citations.


Papers
Patent
27 Feb 2002
TL;DR: A storage apparatus acquires static constituent information of a database management system (DBMS) through a network by using a DBMS information acquisition/communication program, DBMS information communication portion, and host information setting program, and stores it as DBMS data information in its memory.
Abstract: A storage apparatus acquires static constituent information of a database management system (DBMS) through a network by using a DBMS information acquisition/communication program, DBMS information communication portion, and host information setting program, and stores it as DBMS data information in its memory. The physical storage position managing/optimizing execution portion within a control program of the storage apparatus performs data allocation and cache control that take the characteristics of the database management system into account by using the DBMS data information, thereby improving the data access performance of the storage apparatus.

43 citations

Journal ArticleDOI
01 Oct 2014
TL;DR: ScalaGiST, a scalable generalized search tree that can be seamlessly integrated with Hadoop, is presented together with a cost-based data access optimizer for efficient query processing at run-time.
Abstract: MapReduce has become the state of the art for data-parallel processing. Nevertheless, Hadoop, an open-source implementation of MapReduce, has been noted to have sub-optimal performance in the database context, since it was initially designed to operate on raw data without utilizing any type of index. To alleviate the problem, we present ScalaGiST, a scalable generalized search tree that can be seamlessly integrated with Hadoop, together with a cost-based data access optimizer for efficient query processing at run-time. ScalaGiST provides extensibility in terms of data and query types, and is hence able to support unconventional queries (e.g., multi-dimensional range and k-NN queries) in MapReduce systems. It can be dynamically deployed in large cluster environments to handle large numbers of users and large volumes of data. We have built ScalaGiST and demonstrated that it can be easily instantiated to common B+-tree and R-tree indexes while targeting dynamic distributed environments. Our extensive performance study shows that ScalaGiST can provide efficient write and read performance, elastic scaling, and effective support for MapReduce execution of ad-hoc analytic queries. Performance comparisons with recent proposals of specialized distributed index structures, such as SpatialHadoop, Data Mapping, and RT-CAN, further confirm its efficiency.

43 citations

Proceedings Article
01 Jan 2005
TL;DR: RAP is a prototype implementation of an RDF store with integrated maintenance capabilities and access control using user-defined policies. All actions on the store are routed through the RAP policy engine to determine whether each action is permitted or prohibited.
Abstract: Specialized stores for RDF data are essential parts of many Semantic Web applications. Current RDF stores have primarily focused on efficiently storing and querying large volumes of data, and little attention has been given to other features common to many database systems, including how information can be updated and maintained or how access to data can be controlled. The problem is complicated by the fact that the addition or deletion of a simple fact (i.e., an RDF triple) is not atomic, since it can trigger reasoning that results in adding or deleting derived triples. Current access control mechanisms for RDF stores largely ignore this aspect. We describe a policy-based mechanism to determine access control for an RDF store. RAP is a prototype implementation of an RDF store with integrated maintenance capabilities and access control using user-defined policies. All actions on the store are routed through the RAP policy engine to determine whether the action is permitted or prohibited. In the RAP framework, the same RDF store is also used to store the policy, as well as metadata about the triples, allowing a greater range in policy specification.

43 citations

Book ChapterDOI
12 Mar 2012
TL;DR: This paper describes a novel compressed time series database named tsdb, whose goal is to allow large time series to be stored and consolidated in real time with limited disk space usage, and shows that tsdb is suitable for handling a large number of time series.
Abstract: Large-scale network monitoring systems require efficient storage and consolidation of measurement data. Relational databases and popular tools such as the Round-Robin Database show their limitations when handling a large number of time series, because data access time increases greatly with the cardinality of the data and the number of measurements. As a result, monitoring systems are forced to store very few metrics at low frequency in order to keep data access within acceptable time bounds. This paper describes a novel compressed time series database named tsdb, whose goal is to allow large time series to be stored and consolidated in real time with limited disk space usage. The validation has demonstrated the advantage of tsdb over traditional approaches, and has shown that tsdb is suitable for handling a large number of time series.

43 citations

Journal ArticleDOI
11 Jun 2014
TL;DR: This work describes programmatic testing of various federation access modes, including direct access over the wide area network and staging of remote data files to local disk. It also presents a time-dependent cost-of-data-access matrix that takes into account network performance and key site performance factors.
Abstract: In the past year the ATLAS Collaboration accelerated its program to federate data storage resources using an architecture based on XRootD with its attendant redirection and storage integration services. The main goal of the federation is an improvement in the data access experience for the end user while allowing more efficient and intelligent use of computing resources. Along with these advances comes integration with existing ATLAS production services (PanDA and its pilot services) and data management services (DQ2 and, in the next generation, Rucio). Functional testing of the federation has been integrated into the standard ATLAS and WLCG monitoring frameworks, and a dedicated set of tools provides high-granularity information on its current and historical usage. We use a federation topology designed to search from the site's local storage outward to its region and then to globally distributed storage resources. We describe programmatic testing of various federation access modes, including direct access over the wide area network and staging of remote data files to local disk. To support job-brokering decisions, a time-dependent cost-of-data-access matrix is constructed, taking into account network performance and key site performance factors. The system's response to production-scale physics analysis workloads, either from individual end users or from ATLAS analysis services, is discussed.

43 citations


Network Information
Related Topics (5)
Software
130.5K papers, 2M citations
86% related
Cloud computing
156.4K papers, 1.9M citations
86% related
Cluster analysis
146.5K papers, 2.9M citations
85% related
The Internet
213.2K papers, 3.8M citations
85% related
Information system
107.5K papers, 1.8M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    51
2022    125
2021    403
2020    721
2019    906
2018    816