Dong Li
Researcher at University of Texas at Austin
Publications - 11
Citations - 217
Dong Li is an academic researcher from University of Texas at Austin. The author has contributed to research in topics: Operand & Clos network. The author has an h-index of 6, co-authored 10 publications receiving 208 citations. Previous affiliations of Dong Li include University of Texas System & Chinese Academy of Sciences.
Papers
Proceedings ArticleDOI
Priority-based cache allocation in throughput processors
Dong Li, Minsoo Rhu, Daniel R. Johnson, Mike O'Connor, Mattan Erez, Doug Burger, Donald S. Fussell, Stephen W. Keckler
TL;DR: A priority-based cache allocation (PCAL) that provides preferential cache capacity to a subset of high-priority threads while simultaneously allowing lower priority threads to execute without contending for the cache is proposed.
Proceedings ArticleDOI
How to implement effective prediction and forwarding for fusable dynamic multicore architectures
Behnam Robatmili, Dong Li, Hadi Esmaeilzadeh, S. Govindan, Aaron L. Smith, Andrew Putnam, Doug Burger, Stephen W. Keckler
TL;DR: This paper proposes Iterative Path Prediction to address low next block prediction accuracy and low speculation rates, and proposes Exposed Operand Broadcasts to address the overhead of operand delivery for high fanout instructions by exposing a small number of broadcast operands in the ISA.
Journal ArticleDOI
Scaling Power and Performance via Processor Composability
Madhu Saravana Sibi Govindan, Behnam Robatmili, Dong Li, Bertrand A. Maher, Aaron L. Smith, Stephen W. Keckler, Doug Burger
TL;DR: The study shows that composing multiple dual-issue cores (up to eight) provides performance scaling that is as energy-efficient as frequency scaling in a balanced microarchitecture, and is considerably more efficient than scaling the voltage to achieve additional performance once the maximum frequency at the minimum voltage is attained.
Patent
Data multicasting in a distributed processor system
TL;DR: A first message is provided for each of the target instructions, including selected information commonly shared by the target instructions. When two target instructions are located in different directions from one another relative to a router, replicated messages are routed to the identified target instructions in the different directions.
Compiler-assisted Hybrid Operand Communication
TL;DR: With a compiler optimization that reuses broadcast tags for instructions with non-overlapping broadcast live ranges, the speedup is further improved without spending more power. The results show that this compiler-assisted hybrid token/broadcast model requires only eight architectural broadcasts per block, enabling highly efficient CAMs.