Journal Article

Data page layouts for relational databases on deep memory hierarchies

Abstract
Relational database systems have traditionally optimized for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM), also known as slotted pages. Recent research, however, indicates that cache utilization and performance are becoming increasingly important on modern platforms. In this paper, we first demonstrate that in-page data placement is the key to high cache performance and that NSM exhibits low cache utilization on modern platforms. Next, we propose a new data organization model called PAX (Partition Attributes Across) that significantly improves cache performance by grouping together all values of each attribute within each page. Because PAX only affects layout inside the pages, it incurs no storage penalty and does not affect I/O behavior. According to our experimental results (which were obtained without using any indices on the participating relations), when compared to NSM: (a) PAX exhibits superior cache and memory bandwidth utilization, saving at least 75% of NSM's stall time due to data cache accesses; (b) range selection queries and updates on memory-resident relations execute 17-25% faster; and (c) TPC-H queries involving I/O execute 11-48% faster. Finally, we show that PAX performs well across different memory system designs.
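
The difference between the two in-page layouts can be illustrated with a minimal sketch. It is not the paper's implementation: the three-field record, the function names, and the cache model (line size counted in values rather than bytes, no page header, fixed-width attributes only) are all assumptions chosen to keep the example short.

```python
from typing import List, Tuple

# Hypothetical fixed-width record: (id, age, salary). Not from the paper.
Record = Tuple[int, int, int]


def nsm_layout(records: List[Record]) -> List[int]:
    """NSM: each record's fields are stored contiguously, record after record."""
    page: List[int] = []
    for rec in records:
        page.extend(rec)
    return page


def pax_layout(records: List[Record]) -> List[List[int]]:
    """PAX-style: all values of one attribute are grouped together in the page."""
    return [list(column) for column in zip(*records)]


def cache_lines_touched(offsets: List[int], line_size: int) -> int:
    """Count distinct cache lines hit; offsets and line_size are in values."""
    return len({off // line_size for off in offsets})


def scan_cost(n_records: int, n_attrs: int, attr: int, line_size: int) -> Tuple[int, int]:
    """Cache lines touched when scanning one attribute: (NSM, PAX-style)."""
    # NSM: the wanted value sits at a stride of n_attrs values.
    nsm_offsets = [r * n_attrs + attr for r in range(n_records)]
    # PAX-style: the wanted values are adjacent within their attribute group.
    pax_offsets = [attr * n_records + r for r in range(n_records)]
    return (
        cache_lines_touched(nsm_offsets, line_size),
        cache_lines_touched(pax_offsets, line_size),
    )


if __name__ == "__main__":
    rows = [(1, 30, 100), (2, 40, 200), (3, 50, 300), (4, 60, 400)]
    print(nsm_layout(rows))
    print(pax_layout(rows))
    print(scan_cost(n_records=8, n_attrs=4, attr=1, line_size=4))
```

Both layouts hold exactly the same values in the same page, which matches the abstract's claim of no storage penalty and unchanged I/O behavior. Only the order inside the page differs, so a scan over a single attribute touches fewer cache lines in the grouped layout under this simplified model.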
