GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing

doi:10.1109/TCAD.2018.2821565

Journal Article

Guohao Dai, +8 more

- 01 Apr 2019 -

IEEE Transactions on Computer-Aided Desi...

- Vol. 38, Iss: 4, pp 640-653

TLDR

GraphH, a PIM architecture for graph processing on the hybrid memory cube array, is proposed to tackle four problems: random access patterns causing local bandwidth degradation, poor locality leading to unpredictable global data access, heavy conflicts on updating the same vertex, and unbalanced workloads across processing units.

Abstract:

Large-scale graph processing requires high data-access bandwidth. However, as graph computing continues to scale, achieving high bandwidth on generic computing architectures becomes increasingly challenging. The primary reasons are: random access patterns that cause local bandwidth degradation, poor locality that leads to unpredictable global data access, heavy conflicts when updating the same vertex, and unbalanced workloads across processing units. Processing-in-memory (PIM) has been explored as a promising way to provide high bandwidth, yet two questions about graph processing on PIM devices remain open: 1) how to design hardware specializations and the interconnection scheme to fully utilize the bandwidth of PIM devices and ensure locality, and 2) how to allocate data and schedule the processing flow to avoid conflicts and balance workloads. In this paper, we propose GraphH, a PIM architecture for graph processing on the hybrid memory cube array, to tackle all four problems mentioned above. From the architecture perspective, we integrate SRAM-based on-chip vertex buffers to eliminate local bandwidth degradation. We also introduce a reconfigurable double-mesh connection to provide high global bandwidth. From the algorithm perspective, partitioning and scheduling methods such as index mapping interval-block and round interval pair are introduced to GraphH, so that workloads are balanced and conflicts are avoided. Two optimization methods are further introduced to reduce synchronization overhead and reuse on-chip data. Experimental results on graphs with billions of edges demonstrate that GraphH outperforms DDR-based graph processing systems by up to two orders of magnitude and achieves a $5.12\times$ speedup over the previous PIM design.
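
The abstract names the partitioning and scheduling methods but does not spell them out. The following Python sketch illustrates the general idea under stated assumptions and is not the paper's implementation. It assumes that vertices are mapped to intervals by index modulo the number of intervals, that each edge falls into a block identified by its (source interval, destination interval) pair, and that in each round every processing unit works on a block with a distinct source interval and a distinct destination interval, so that no two units update the same vertex. The function names `interval_of`, `partition_blocks`, and `round_schedule` are hypothetical.

```python
from collections import defaultdict


def interval_of(vertex, num_intervals):
    """Assumed index mapping: vertex v belongs to interval v mod P."""
    return vertex % num_intervals


def partition_blocks(edges, num_intervals):
    """Group edges into blocks keyed by (source interval, destination interval)."""
    blocks = defaultdict(list)
    for src, dst in edges:
        key = (interval_of(src, num_intervals), interval_of(dst, num_intervals))
        blocks[key].append((src, dst))
    return blocks


def round_schedule(num_intervals):
    """Build P rounds of P (source interval, destination interval) pairs.

    In round r, unit i is assigned source interval (i + r) mod P and
    destination interval i. Within a round, all source intervals differ
    and all destination intervals differ, so no two units write to the
    same destination vertex. Over P rounds, every block is visited once.
    """
    return [
        [((unit + r) % num_intervals, unit) for unit in range(num_intervals)]
        for r in range(num_intervals)
    ]
```

With `num_intervals` equal to the number of processing units, iterating over the rounds returned by `round_schedule` and processing `blocks[(src_interval, dst_interval)]` on each unit visits every edge exactly once. Modulo mapping spreads consecutive vertex indices across intervals, which is one simple way to even out block sizes; the actual balance depends on the graph.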

Citations

GraphQ: Scalable PIM-Based Graph Processing

A Modern Primer on Processing in Memory.

Processing-in-memory: A workload-driven perspective

GraphiDe: A Graph Processing Accelerator leveraging In-DRAM-Computing

Alleviating Irregularity in Graph Analytics Acceleration: a Hardware/Software Co-Design Approach
