Journal Article
Data page layouts for relational databases on deep memory hierarchies
Anastassia Ailamaki, David J. DeWitt, Mark D. Hill +2 more
- Vol. 11, Iss: 3, pp 198-215
TLDR
This paper proposes a new data organization model called PAX (Partition Attributes Across) that significantly improves cache performance by grouping together all values of each attribute within each page, and shows that PAX performs well across different memory system designs.
Abstract:
Relational database systems have traditionally optimized for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM) (a.k.a. slotted pages). Recent research, however, indicates that cache utilization and performance is becoming increasingly important on modern platforms. In this paper, we first demonstrate that in-page data placement is the key to high cache performance and that NSM exhibits low cache utilization on modern platforms. Next, we propose a new data organization model called PAX (Partition Attributes Across) that significantly improves cache performance by grouping together all values of each attribute within each page. Because PAX only affects layout inside the pages, it incurs no storage penalty and does not affect I/O behavior. According to our experimental results (which were obtained without using any indices on the participating relations), when compared to NSM: (a) PAX exhibits superior cache and memory bandwidth utilization, saving at least 75% of NSM's stall time due to data cache accesses; (b) range selection queries and updates on memory-resident relations execute 17–25% faster; and (c) TPC-H queries involving I/O execute 11–48% faster. Finally, we show that PAX performs well across different memory system designs.
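The difference between the two in-page layouts described in the abstract can be illustrated with a minimal Python sketch. It is an illustration only, not the authors' implementation: the three-attribute schema, the function names, and the use of Python lists to stand in for page memory are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical schema for illustration only: (id, name, age).
Record = Tuple[int, str, int]


def nsm_page(records: Sequence[Record]) -> List[object]:
    """NSM-style layout: whole records stored one after another,
    so the values of a single attribute are interleaved with the others."""
    page: List[object] = []
    for rec in records:
        page.extend(rec)
    return page


def pax_page(records: Sequence[Record]) -> List[List[object]]:
    """PAX-style layout: the same records on the same page, but with
    all values of each attribute grouped together."""
    return [list(column) for column in zip(*records)]


def scan_attr_nsm(page: List[object], attr: int, n_attrs: int) -> List[object]:
    """Reading one attribute under NSM strides across every record."""
    return page[attr::n_attrs]


def scan_attr_pax(page: List[List[object]], attr: int) -> List[object]:
    """Reading one attribute under PAX touches one contiguous group."""
    return page[attr]


records: List[Record] = [(1, "ann", 31), (2, "bob", 45), (3, "cy", 27)]
nsm = nsm_page(records)
pax = pax_page(records)

# Both layouts hold the same data and answer the same scan.
ages_nsm = scan_attr_nsm(nsm, attr=2, n_attrs=3)
ages_pax = scan_attr_pax(pax, attr=2)
```

Both scans return the same values. The difference is that the PAX scan reads a contiguous run of one attribute, which is the property the abstract credits for better cache utilization, while the page still holds exactly the same records as under NSM.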
Citations
Proceedings Article
FAWN: a fast array of wimpy nodes
David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay K. Vasudevan +5 more
TL;DR: The key contributions of this paper are the principles of the FAWN architecture and the design and implementation of FAWN-KV--a consistent, replicated, highly available, and high-performance key-value storage system built on a FAWN prototype.
Book Chapter
Staircase join: teach a relational DBMS to watch its (axis) steps
TL;DR: This text proposes a local change to the database kernel, the staircase join, which encapsulates the tree knowledge needed to improve XPath performance, and reports on quite promising experiments with a staircase-join-enhanced main-memory database kernel.
Journal Article
SQL-on-Hadoop: full circle back to shared-nothing database architectures
TL;DR: This paper compares the performance of Impala and Hive, the new emerging class of SQL-on-Hadoop systems that exploit a shared-nothing parallel database architecture over Hadoop, and examines the strengths and limitations of each system.
Proceedings Article
Query processing techniques for solid state drives
TL;DR: This paper investigates data structures and algorithms that leverage fast random reads to speed up selection, projection, and join operations in relational query processing, and introduces FlashJoin, a general pipelined join algorithm that minimizes accesses to base and intermediate relational data.
Journal Article
Brighthouse: an analytic data warehouse for ad-hoc queries
TL;DR: Additional benefits resulting from Knowledge Grid for compressed, column-oriented databases, including assistance in query optimization and execution, are demonstrated by minimizing the need of data reads and data decompression.
References
Book
Computer Architecture: A Quantitative Approach
TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.
Proceedings Article
Access path selection in a relational database management system
TL;DR: System R is an experimental database management system developed to carry out research on the relational model of data. Given a user specification of desired data as a boolean expression of predicates, it chooses access paths for both simple (single-relation) queries and complex queries (such as joins).
Book
Database Management Systems
TL;DR: New to this edition are the early coverage of the ER model, new chapters on Internet databases, data mining, and spatial databases, and a new supplement on practical SQL assignments (with solutions for instructors' use).
Proceedings Article
DBMSs on a Modern Processor: Where Does Time Go?
TL;DR: This paper examines four commercial DBMSs running on an Intel Xeon and NT 4.0 and introduces a framework for analyzing query execution time, and finds that database developers should not expect the overall execution time to decrease significantly without addressing stalls related to subtle implementation issues.