Open Access Journal Article (DOI)

MBZip: Multiblock Data Compression

TL;DR
MBZip is a synergistic mechanism that compresses multiple data blocks into a single block (called a zipped block), both at the LLC and in DRAM, and improves system performance by 21.9%, with a maximum of 90.3%, on a 4-core system.
Abstract
Compression techniques at the last-level cache (LLC) and the DRAM play an important role in improving system performance by increasing their effective capacities. A compressed block in DRAM also reduces the transfer time over the memory bus to the caches, reducing the latency of an LLC miss. Usually, compression is achieved by exploiting data patterns present within a block. But applications can exhibit data locality that spreads across multiple consecutive data blocks. We observe that there is significant opportunity for compressing multiple consecutive data blocks into a single block, both at the LLC and in DRAM. Our studies using 21 SPEC CPU applications show that, at the LLC, around 25% (on average) of the cache blocks can be compressed into a single cache block when grouped together in groups of 2 to 8 blocks. In DRAM, more than 30% of the columns residing in a single DRAM page can be compressed into one DRAM column when grouped together in groups of 2 to 6. Motivated by these observations, we propose a mechanism, namely MBZip, that compresses multiple data blocks into a single block (called a zipped block), both at the LLC and in DRAM. At the cache, MBZip includes a simple tag structure to index into these zipped cache blocks, and the indexing does not incur any redirectional delay. At the DRAM, MBZip needs no changes to the address computation logic and works seamlessly with the conventional/existing logic. MBZip is a synergistic mechanism that coordinates these zipped blocks at the LLC and DRAM. Further, we also explore silent writes at the DRAM and show that certain writes need not access the memory when blocks are zipped. MBZip improves system performance by 21.9%, with a maximum of 90.3%, on a 4-core system.
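The core idea in the abstract can be sketched in a few lines: group consecutive blocks, run them through a compressor, and keep the zipped form only if it fits in one block. The sketch below is illustrative only; the 64-byte BLOCK_SIZE and the use of zlib as the compressor are assumptions, since the paper's hardware compression algorithm is not described in this summary.

```python
import zlib

BLOCK_SIZE = 64  # bytes per cache block / DRAM column (assumed, typical value)

def try_zip(blocks):
    """Try to compress a group of consecutive blocks into one zipped block.

    Returns the zipped payload if the whole group fits in a single block,
    otherwise None (the group would stay uncompressed). zlib stands in
    for the hardware compressor here.
    """
    assert all(len(b) == BLOCK_SIZE for b in blocks)
    payload = zlib.compress(b"".join(blocks), level=9)
    return payload if len(payload) <= BLOCK_SIZE else None

# Example: 4 consecutive blocks sharing a trivial data pattern (all zeros)
group = [bytes(BLOCK_SIZE) for _ in range(4)]
zipped = try_zip(group)
print(zipped is not None)  # True: zero-filled blocks easily fit in one block
```

In hardware, the tag structure the paper mentions would record which consecutive blocks were zipped together so lookups can land on the zipped block directly.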


Citations
Proceedings Article (DOI)

Safecracker: Leaking Secrets through Compressed Caches

TL;DR: This paper offers the first security analysis of cache compression, a promising technique likely to appear in future processors, and finds that cache compression is insecure because the compressibility of a cache line reveals information about its contents.
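The leak described here is easy to demonstrate in software: for any compressor, the output size depends on the input, so an attacker who can observe whether a line compressed (e.g., through timing or capacity effects) learns something about its contents. A minimal illustration, using zlib as a stand-in for a hardware cache compressor such as BDI or FPC:

```python
import zlib

def compressed_size(line: bytes) -> int:
    # Stand-in software compressor; the leak exists for any compressor
    # whose output size varies with the input.
    return len(zlib.compress(line))

secret_repetitive = b"AAAA" * 16      # 64-byte line, highly redundant
secret_distinct   = bytes(range(64))  # 64-byte line, all distinct bytes

# Same line size, different compressed sizes: compressibility is a
# content-dependent observable.
print(compressed_size(secret_repetitive) < compressed_size(secret_distinct))  # True
```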
Journal Article (DOI)

Optimized Lossless Embedded Compression for Mobile Multimedia Applications

TL;DR: Evaluation metrics are proposed to assess lossless embedded compression (LEC) algorithms under realistic design considerations for mobile multimedia scenarios, and an optimized LEC implementation based on these metrics is introduced for contemporary multimedia applications in mobile devices.
Proceedings Article (DOI)

CABLE: a CAche-based link encoder for bandwidth-starved manycores

TL;DR: This work presents CABLE, a novel CAche-Based Link Encoder that enables point-to-point link compression between coherent caches, re-purposing the data already stored in the caches as a massive and scalable dictionary for data compression.
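CABLE's insight, re-purposing data both endpoints already hold as a compression dictionary, maps naturally onto preset-dictionary compression. A hedged sketch using zlib's `zdict` preset-dictionary support as a software stand-in for the hardware scheme (the dictionary contents below are invented for illustration):

```python
import zlib

# Data assumed to already reside in both endpoint caches serves as the
# shared dictionary, so it never needs to travel over the link.
shared_cache_data = b"struct packet { int id; int len; char payload[48]; };" * 4

def encode(line: bytes, dictionary: bytes) -> bytes:
    c = zlib.compressobj(zdict=dictionary)
    return c.compress(line) + c.flush()

def decode(blob: bytes, dictionary: bytes) -> bytes:
    d = zlib.decompressobj(zdict=dictionary)
    return d.decompress(blob)

line = b"struct packet { int id; int len; char payload[48]; };"
with_dict = encode(line, shared_cache_data)
without_dict = zlib.compress(line)
print(len(with_dict) < len(without_dict))            # True: dictionary shrinks the transfer
print(decode(with_dict, shared_cache_data) == line)  # True: lossless round trip
```

Both sides must agree on the dictionary contents, which is why CABLE builds it from data the coherent caches already share.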
Journal Article (DOI)

MemSZ: Squeezing Memory Traffic with Lossy Compression

TL;DR: MemSZ introduces a low-latency, parallel design of the Squeeze (SZ) algorithm offering aggressive compression ratios, up to 16:1 in the authors' implementation, and improves execution time, energy, and memory traffic by up to 15%, 9%, and 64%, respectively.
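SZ-style compressors typically predict each value from its neighbors and quantize the prediction error to a small integer code within a user-specified error bound; the small codes then compress well. The following is a minimal, illustrative model of that idea, not the authors' implementation:

```python
def sz_encode(values, err):
    """Predict each value from the previous reconstruction and quantize
    the prediction error so the decoder stays within +/- err of the input."""
    codes, prev = [], 0.0
    for v in values:
        q = round((v - prev) / (2 * err))  # quantized prediction error
        prev = prev + q * 2 * err          # decoder-visible reconstruction
        codes.append(q)
    return codes

def sz_decode(codes, err):
    out, prev = [], 0.0
    for q in codes:
        prev = prev + q * 2 * err
        out.append(prev)
    return out

data = [0.0, 0.1, 0.22, 0.31, 0.4]
codes = sz_encode(data, err=0.05)
recon = sz_decode(codes, err=0.05)
print(all(abs(a - b) <= 0.05 + 1e-12 for a, b in zip(data, recon)))  # True: bound holds
```

Smooth data yields long runs of small codes (here mostly 1s), which is what makes ratios like 16:1 achievable after entropy coding.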
Journal Article (DOI)

Compacted CPU/GPU Data Compression via Modified Virtual Address Translation

TL;DR: A method is presented to reduce the footprint of compressed data by using modified virtual address translation to permit random access to the data; an important property of this method is that compression, decompression, and reallocation are managed automatically by the new hardware without operating-system intervention.
References
Journal Article (DOI)

SPEC CPU2006 benchmark descriptions

TL;DR: On August 24, 2006, the Standard Performance Evaluation Corporation (SPEC) announced CPU2006, which replaces CPU2000; the SPEC CPU benchmarks are widely used in both industry and academia.
Journal Article (DOI)

SPEC CPU2000: measuring CPU performance in the New Millennium

J.L. Henning, 01 Jul 2000
TL;DR: CPU2000 is a new CPU benchmark suite with 19 applications that have never before been in a SPEC CPU suite, spanning high-performance numeric computing, Web servers, and graphical subsystems.
Journal Article (DOI)

Symbiotic jobscheduling for a simultaneous multithreaded processor

TL;DR: It is demonstrated that performance on a hardware multithreaded processor is sensitive to the set of jobs that are coscheduled by the operating system jobscheduler, and that a small sample of the possible schedules is sufficient to identify a good schedule quickly.
Journal Article (DOI)

System-Level Performance Metrics for Multiprogram Workloads

TL;DR: The authors propose two performance metrics, average normalized turnaround time (a user-oriented metric) and system throughput (a system-oriented metric), developed in a top-down fashion starting from system-level objectives for multiprogram workloads.
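As these metrics are commonly defined in the literature (a hedged reading, assuming this paper follows the usual definitions), average normalized turnaround time (ANTT) averages each program's slowdown under coscheduling, while system throughput (STP) sums each program's normalized progress. A small sketch with hypothetical cycle counts:

```python
def antt(single_cycles, multi_cycles):
    """Average normalized turnaround time: mean per-program slowdown.
    Lower is better; 1.0 means no slowdown versus running alone."""
    n = len(single_cycles)
    return sum(m / s for s, m in zip(single_cycles, multi_cycles)) / n

def stp(single_cycles, multi_cycles):
    """System throughput: total normalized progress across programs.
    Higher is better; n means all n programs run at full speed."""
    return sum(s / m for s, m in zip(single_cycles, multi_cycles))

single = [100, 100]  # cycles each program needs running alone (hypothetical)
multi  = [150, 200]  # cycles under coscheduling (hypothetical)
print(round(antt(single, multi), 2))  # 1.75 -> programs run 1.75x slower on average
print(round(stp(single, multi), 2))   # 1.17 -> ~1.17 programs' worth of progress
```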