Proceedings ArticleDOI

ES2: A cloud data storage system for supporting both OLTP and OLAP

TLDR
This paper presents ES2, the elastic data storage system of epiC, which is designed to support both OLTP and OLAP within the same storage, together with experimental results that demonstrate the efficiency of the system.
Abstract
Cloud computing represents a paradigm shift driven by the increasing demand of Web-based applications for elastic, scalable and efficient system architectures that can support their ever-growing data volume and large-scale data analysis. A typical data management system has to deal with real-time updates by individual users as well as periodic large-scale analytical processing, indexing, and data extraction. While such operations may take place in the same domain, the design and development of systems for transactional processing and for periodic analytical processing have evolved independently. Such a system-level separation has resulted in problems such as poor data freshness and serious data storage redundancy. Ideally, it would be more efficient to apply ad-hoc analytical processing directly on the same data. However, to the best of our knowledge, such an approach has not been adopted in a real implementation. Intrigued by this observation, we have designed and implemented epiC, an elastic power-aware data-intensive Cloud platform for supporting both data-intensive analytical operations (referred to as OLAP) and online transactions (referred to as OLTP). In this paper, we present ES2, the elastic data storage system of epiC, which is designed to support both functionalities within the same storage. We present the system architecture and the functions of each system component, as well as experimental results which demonstrate the efficiency of the system.


Citations
Journal ArticleDOI

In-Memory Big Data Management and Processing: A Survey

TL;DR: This survey aims to provide a thorough review of a wide range of in-memory data management and processing proposals and systems, including both data storage systems and data processing frameworks.
Journal ArticleDOI

Open challenges for data stream mining research

TL;DR: This article presents a discussion on eight open challenges for data stream mining, which cover the full cycle of knowledge discovery and involve such problems as protecting data privacy, dealing with legacy systems, handling incomplete and delayed information, analysis of complex data, and evaluation of stream mining algorithms.
Proceedings ArticleDOI

Big Data Processing in Cloud Computing Environments

TL;DR: This paper presents the key issues of big data processing, including cloud computing platform, cloud architecture, cloud database and data storage scheme, and introduces MapReduce optimization strategies and applications reported in the literature.
Journal ArticleDOI

Distributed data management using MapReduce

TL;DR: This article aims to provide a comprehensive review of a wide range of proposals and systems that focus fundamentally on the support of distributed data management and processing using the MapReduce framework.
Proceedings ArticleDOI

Query optimization for massively parallel data processing

TL;DR: This paper proposes a query optimization scheme for MapReduce-based query processing systems, embedding into Hive a query optimizer designed to generate an efficient query plan based on the proposed cost model.
References
Proceedings Article

Bigtable: A Distributed Storage System for Structured Data (Awarded Best Paper!).

TL;DR: Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth and Google Finance.
Proceedings ArticleDOI

Dynamo: Amazon's highly available key-value store

TL;DR: Dynamo is presented, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
Journal ArticleDOI

Bigtable: A Distributed Storage System for Structured Data

TL;DR: The simple data model provided by Bigtable is described, which gives clients dynamic control over data layout and format, and the design and implementation of Bigtable are described.
Journal ArticleDOI

Cassandra: a decentralized structured storage system

TL;DR: Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of failure.
Proceedings ArticleDOI

Dryad: distributed data-parallel programs from sequential building blocks

TL;DR: The Dryad execution engine handles all the difficult problems of creating a large distributed, concurrent application: scheduling the use of computers and their CPUs, recovering from communication or computer failures, and transporting data between vertices.