
Jiajia Li

Researcher at Pacific Northwest National Laboratory

Publications: 45
Citations: 1,073

Jiajia Li is an academic researcher at Pacific Northwest National Laboratory. The author has contributed to research on topics including sparse matrices and speedup, has an h-index of 14, and has co-authored 40 publications receiving 716 citations. Previous affiliations of Jiajia Li include the Chinese Academy of Sciences and the Georgia Institute of Technology.

Papers
Proceedings Article

Bridging the gap between deep learning and sparse matrix format selection

TL;DR: This work shows how to effectively bridge the gap between deep learning and the special needs of sparse matrix format selection, a pillar HPC problem, through a set of techniques on matrix representations, deep learning structure, and cross-architecture model migration.
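One common way to feed an arbitrarily sized sparse matrix into a CNN-based format selector is to downsample its nonzero pattern into a fixed-size density map. The sketch below illustrates that general idea only; the function name `density_image` and the exact representation are assumptions for illustration, not the paper's method.

```python
import numpy as np

def density_image(rows, cols, shape, res=8):
    """Bin the nonzero coordinates of a sparse matrix into a
    res x res density map -- a fixed-size input a CNN classifier
    could consume regardless of the original matrix dimensions."""
    img = np.zeros((res, res))
    # Map each nonzero (row, col) to a cell of the res x res grid.
    r = (np.asarray(rows) * res) // shape[0]
    c = (np.asarray(cols) * res) // shape[1]
    np.add.at(img, (r, c), 1.0)
    # Normalize so the map sums to 1 for any nonzero count.
    return img / max(len(rows), 1)
```

For example, a purely diagonal matrix concentrates all mass on the grid diagonal, a pattern a classifier could associate with the DIA format.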
Proceedings Article

Model-Driven Sparse CP Decomposition for Higher-Order Tensors

TL;DR: AdaTM is a novel, adaptive tensor memoization algorithm that lets a user make a space-time tradeoff by automatically tuning algorithmic and machine parameters with a model-driven framework, yielding more scalable performance on higher-order data problems.
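Sparse CP decomposition via alternating least squares is dominated by the MTTKRP kernel (matricized tensor times Khatri-Rao product), whose intermediate products a memoization scheme can reuse across modes. Below is a minimal, purely illustrative mode-1 MTTKRP over a third-order COO tensor; it is not AdaTM's implementation, just the kernel such schemes accelerate.

```python
import numpy as np

def mttkrp_mode1(inds, vals, dims, B, C):
    """Mode-1 MTTKRP for a 3rd-order COO sparse tensor:
    M[i, :] += val * (B[j, :] * C[k, :]) for each nonzero
    X(i, j, k) = val, where B and C are dense factor matrices
    with R columns each."""
    R = B.shape[1]
    M = np.zeros((dims[0], R))
    for (i, j, k), v in zip(inds, vals):
        # Elementwise (Hadamard) product of the two factor rows,
        # scaled by the nonzero value, accumulated into row i.
        M[i] += v * B[j] * C[k]
    return M
```

A memoizing variant would cache partial products such as `v * C[k]` shared between the mode-1 and mode-2 updates, trading memory for time; AdaTM chooses that tradeoff automatically.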
Proceedings Article

Understanding the GPU Microarchitecture to Achieve Bare-Metal Performance Tuning

TL;DR: The toolchain automatically cracks different GPU ISA encodings and adaptively builds an assembler, enabling bare-metal performance tuning of GPU applications.
Proceedings Article

A pattern based algorithmic autotuner for graph processing on GPUs

TL;DR: Gswitch is a pattern-based algorithmic auto-tuning system that dynamically switches between optimization variants with negligible overhead and provides a simple programming interface that conceals low-level tuning details from the user.
Proceedings Article

Optimizing sparse tensor times matrix on multi-core and many-core architectures

TL;DR: This work presents an optimized design and implementation of sparse tensor-times-dense matrix multiply (SpTTM) for CPU and GPU platforms; SpTTM is a critical bottleneck in data analysis and mining applications built on tensor methods such as the Tucker decomposition.
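SpTTM contracts one mode of a sparse tensor with a dense factor matrix, producing a semi-sparse result: sparse in the untouched modes but with dense fibers along the multiplied mode. A minimal mode-3 sketch over a third-order COO tensor, for illustration only (not the paper's optimized kernel):

```python
import numpy as np

def spttm_mode3(inds, vals, U):
    """Sparse tensor-times-dense-matrix along mode 3:
    Y(i, j, :) = sum_k X(i, j, k) * U(k, :).
    The result is semi-sparse: keyed by the sparse (i, j)
    coordinates, with a dense fiber of length R per key."""
    R = U.shape[1]
    Y = {}  # (i, j) -> dense fiber of length R
    for (i, j, k), v in zip(inds, vals):
        fib = Y.setdefault((i, j), np.zeros(R))
        # Accumulate the scaled row of U into the (i, j) fiber.
        fib += v * U[k]
    return Y
```

The accumulation over repeated (i, j) pairs is exactly where parallel CPU/GPU implementations must handle write conflicts, which is one reason SpTTM is harder to optimize than its dense counterpart.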