Yuan Tang

Researcher at Fudan University

Publications -  32
Citations -  566

Yuan Tang is an academic researcher at Fudan University who has contributed to research on cache-oblivious algorithms and caches. The author has an h-index of 9 and has co-authored 31 publications receiving 521 citations. Previous affiliations include the University of Electronic Science and Technology of China and the University of Tennessee.

Papers
Proceedings Article

The Pochoir stencil compiler

TL;DR: The Pochoir stencil compiler allows a programmer to write a simple specification of a stencil in a domain-specific stencil language embedded in C++, which the Pochoir compiler then translates into high-performing Cilk code that employs an efficient parallel cache-oblivious algorithm.
Proceedings Article

Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency

TL;DR: The techniques are applied to widely known dynamic programming problems, such as Floyd-Warshall all-pairs shortest paths, stencil computations, and longest common subsequence (LCS), removing artificial dependencies while preserving cache-optimality by inheriting the divide-and-conquer (DAC) strategy.
Proceedings Article

VNET/P: bridging the cloud and high performance computing through fast overlay networking

TL;DR: The design, implementation, and evaluation of a layer 2 virtual networking system that has negligible latency and bandwidth overheads in 1–10 Gbps networks are described, suggesting it is feasible to extend a software-based overlay network designed for computing at wide-area scales into tightly-coupled environments.
Proceedings Article

AUTOGEN: automatic discovery of cache-oblivious parallel recursive algorithms for solving dynamic programs

TL;DR: The experimental results show that several autodiscovered algorithms significantly outperform parallel looping and tiled loop-based algorithms and are less sensitive to fluctuations of memory and bandwidth compared with their looping counterparts, and their running times and energy profiles remain relatively more stable.
Proceedings Article

Provably Efficient Scheduling of Cache-oblivious Wavefront Algorithms

TL;DR: This paper systematically transforms standard cache-oblivious recursive divide-and-conquer algorithms into recursive wavefront algorithms to achieve optimal parallel cache complexity and high parallelism under state-of-the-art schedulers for fork-join programs.