Journal ArticleDOI
DB2 with BLU acceleration: so much more than just a column store
Vijayshankar Raman,Gopi K. Attaluri,Ronald J. Barber,Naresh K. Chainani,David Kalmuk,Vincent Kulandaisamy,Jens Leenstra,Sam Lightstone,Shaorong Liu,Guy M. Lohman,Tim Malkemus,Rene Mueller,Ippokratis Pandis,Berni Schiefer,David C. Sharpe,Richard S. Sidle,Adam J. Storm,Liping Zhang
- Vol. 6, Iss: 11, pp 1080-1091
TL;DR: Full integration with DB2 ensures that DB2 with BLU Acceleration benefits from the full functionality and robust utilities of a mature product, while still enjoying order-of-magnitude performance gains from revolutionary technology without even having to change the SQL.
Abstract:
DB2 with BLU Acceleration deeply integrates innovative new techniques for defining and processing column-organized tables that speed read-mostly Business Intelligence queries by 10 to 50 times and improve compression by 3 to 10 times, compared to traditional row-organized tables, without the complexity of defining indexes or materialized views on those tables. But DB2 BLU is much more than just a column store. Exploiting frequency-based dictionary compression and main-memory query processing technology from the Blink project at IBM Research - Almaden, DB2 BLU performs most SQL operations - predicate application (even range predicates and IN-lists), joins, and grouping - on the compressed values, which can be packed bit-aligned so densely that multiple values fit in a register and can be processed simultaneously via SIMD (single-instruction, multiple-data) instructions. Designed and built from the ground up to exploit modern multi-core processors, DB2 BLU's hardware-conscious algorithms are carefully engineered to maximize parallelism by using novel data structures that need little latching, and to minimize data-cache and instruction-cache misses. Though DB2 BLU is optimized for in-memory processing, database size is not limited by the size of main memory. Fine-grained synopses, late materialization, and a new probabilistic buffer pool protocol for scans minimize disk I/Os, while aggressive prefetching reduces I/O stalls. Full integration with DB2 ensures that DB2 with BLU Acceleration benefits from the full functionality and robust utilities of a mature product, while still enjoying order-of-magnitude performance gains from revolutionary technology without even having to change the SQL, and can mix column-organized and row-organized tables in the same tablespace and even within the same query.
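The idea of applying a range predicate directly on compressed values can be illustrated with a minimal sketch. It is not DB2 BLU's implementation: it assumes a plain order-preserving dictionary (no frequency partitioning), packs the codes into a single Python integer, and scans them with a scalar loop where a real engine would use SIMD instructions. All function names and the sample data are hypothetical.

```python
from bisect import bisect_left, bisect_right


def build_dictionary(values):
    """Sorted distinct values; a value's code is its position in this list."""
    return sorted(set(values))


def encode(values, dictionary):
    """Replace each value by its (order-preserving) dictionary code."""
    index = {value: code for code, value in enumerate(dictionary)}
    return [index[value] for value in values]


def range_to_codes(dictionary, low, high):
    """Translate the value range [low, high] into an inclusive code range."""
    lo_code = bisect_left(dictionary, low)
    hi_code = bisect_right(dictionary, high) - 1
    return lo_code, hi_code


def pack(codes, bits):
    """Pack fixed-width codes into one integer, first code in the lowest bits."""
    word = 0
    for i, code in enumerate(codes):
        word |= code << (i * bits)
    return word


def scan_packed(word, count, bits, lo_code, hi_code):
    """Return positions whose packed code lies in [lo_code, hi_code].

    The comparison runs on codes only; no value is decoded.
    """
    mask = (1 << bits) - 1
    return [
        i for i in range(count)
        if lo_code <= ((word >> (i * bits)) & mask) <= hi_code
    ]


# Hypothetical column of prices.
prices = [30, 10, 50, 20, 30, 40, 10, 50]
dictionary = build_dictionary(prices)
codes = encode(prices, dictionary)
bits = max(1, (len(dictionary) - 1).bit_length())  # code width in bits
word = pack(codes, bits)

# Predicate: price BETWEEN 20 AND 40, evaluated on the codes.
lo_code, hi_code = range_to_codes(dictionary, 20, 40)
matches = scan_packed(word, len(codes), bits, lo_code, hi_code)
print(matches)
```

Because the codes preserve the order of the values, the range on values becomes a range on codes, so the scan only compares small integers. Packing the codes at a fixed bit width is what lets several of them share one machine word or register.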
Citations
Journal ArticleDOI
In-Memory Big Data Management and Processing: A Survey
TL;DR: This survey aims to provide a thorough review of a wide range of in-memory data management and processing proposals and systems, including both data storage systems and data processing frameworks.
Proceedings Article
Impala: A Modern, Open-Source SQL Engine for Hadoop.
Marcel Kornacker,Alexander Behm,Victor Bittorf,Taras Bobrovytsky,Casey Ching,Alan Choi,Justin Erickson,Martin Grund,Daniel Hecht,Matthew Jacobs,Ishaan Joshi,Lenni Kuff,Dileep Kumar,Alex Leblang,Nong Li,Ippokratis Pandis,Henry Noel Robinson,David Rorke,Silvius Rus,John Russell,Dimitris Tsirogiannis,Skye Wanderman-Milne,Michael Yoder
TL;DR: This paper presents Impala from a user’s perspective, gives an overview of its architecture and main components and briefly demonstrates its superior performance compared against other popular SQL-on-Hadoop systems.
Proceedings ArticleDOI
Morsel-driven parallelism: a NUMA-aware query evaluation framework for the many-core age
TL;DR: The morsel-driven query execution framework is presented, where scheduling becomes a fine-grained run-time task that is NUMA-aware and the degree of parallelism is not baked into the plan but can elastically change during query execution, so the dispatcher can react to execution speed of different morsels but also adjust resources dynamically in response to newly arriving queries in the workload.
Proceedings ArticleDOI
Rethinking SIMD Vectorization for In-Memory Databases
TL;DR: This paper presents novel vectorized designs and implementations of database operators, based on advanced SIMD operations, such as gathers and scatters, and highlights the impact of efficient vectorization on the algorithmic design of in-memory database operators, as well as the architectural design and power efficiency of hardware.
Journal ArticleDOI
The Design and Implementation of Modern Column-Oriented Database Systems
TL;DR: This paper discusses the design and implementation of modern column-oriented database systems, with a specific focus on three influential research prototypes, MonetDB, C-Store, and X100, which form the basis for several well-known commercial column-store implementations.
References
Proceedings ArticleDOI
Access path selection in a relational database management system
TL;DR: System R is an experimental database management system developed to carry out research on the relational model of data. This paper describes how it chooses access paths for both simple (single-relation) and complex queries (such as joins), given a user specification of desired data as a boolean expression of predicates.
Journal ArticleDOI
ARIES: a transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging
TL;DR: ARIES is a transaction recovery method, based on write-ahead logging, that supports fine-granularity locking and partial rollbacks. It is applicable not only to database management systems but also to persistent object-oriented languages, recoverable file systems and transaction-based operating systems.
Book ChapterDOI
C-store: a column-oriented DBMS
Michael Stonebraker,Daniel J. Abadi,Adam Batkin,Xuedong Chen,Mitch Cherniack,Miguel Ferreira,Edmond Lau,Amerson Lin,Samuel Madden,Elizabeth O'Neil,Patrick O'Neil,Alexander Rasin,Nga Tran,Stan Zdonik
TL;DR: Preliminary performance data on a subset of TPC-H is presented and it is shown that the system the team is building, C-Store, is substantially faster than popular commercial products.
Proceedings Article
C-store: a column-oriented DBMS
Michael Stonebraker,Daniel J. Abadi,Adam Batkin,Xuedong Chen,Mitch Cherniack,Miguel Ferreira,Edmond Lau,Amerson Lin,Samuel Madden,Elizabeth O'Neil,Patrick O'Neil,Alexander Rasin,Nga Tran,Stan Zdonik
TL;DR: C-Store is a read-optimized relational DBMS that contrasts sharply with most current systems, which are write-optimized, and it uses bitmap indexes to complement B-tree structures.
Proceedings ArticleDOI
Implementation techniques for main memory database systems
David J. DeWitt,Randy H. Katz,Frank Olken,Leonard D. Shapiro,Michael Stonebraker,Darien Wood
TL;DR: This paper considers the changes necessary to permit a relational database system to take advantage of large amounts of main memory. It evaluates AVL vs. B+-tree access methods and hash-based vs. sort-merge query processing strategies, and studies recovery issues when most or all of the database fits in main memory.