Joonmoo Huh

Researcher at North Carolina State University

Publications -  9
Citations -  29

Joonmoo Huh is an academic researcher from North Carolina State University. The author has contributed to research in topics: SIMD & Computing with Memory. The author has an h-index of 3, co-authored 9 publications receiving 26 citations. Previous affiliations of Joonmoo Huh include Intel.

Papers
Proceedings Article

Improving the effectiveness of searching for isomorphic chains in superword level parallelism

TL;DR: This work describes a new hierarchical approach for Superword Level Parallelism (SLP) that decouples the selection of isomorphic chains and arranges them in a hierarchy of choices at the local and global levels, thereby finding better opportunities for vectorization.
Proceedings Article

3D-enabled customizable embedded computer (3DECC)

TL;DR: This paper describes a 3D computer architecture designed to achieve the lowest possible power consumption for “embedded applications” like radar and signal processing, and introduces several unique concepts, including a low-power SIMD tile, low-power 3D memories, and 3D and 2.5D interconnect that can be tuned at run-time for a specific application.
Proceedings Article

Computing in 3D

TL;DR: 3D technologies offer significant potential to improve total performance and performance per unit of power, and the next frontier is to create sophisticated logic-on-logic solutions that promise further increases in performance/power beyond those attributable to memory interfaces alone.
Proceedings Article

Computing in 3D

TL;DR: The concept of Fast Thread Migration using 3DIC technologies is introduced and the design of a power optimized SIMD unit in which over half of the power is employed in the FP units is presented.
Patent

Instruction and logic for permute with out of order loading

TL;DR: A processor includes a core with logic to determine that an instruction will require strided data converted from source data in memory, logic to load source data into a plurality of preliminary vector registers, and logic to apply permute instructions to the contents of the preliminary vector registers to cause corresponding indexed elements from the plurality of structures to be loaded into respective source vector registers.