Author

Hung-Yi Liu

Other affiliations: National Taiwan University, Intel, IBM
Bio: Hung-Yi Liu is an academic researcher from Columbia University. The author has contributed to research on power optimization and design space exploration. The author has an h-index of 9 and has co-authored 17 publications receiving 440 citations. Previous affiliations of Hung-Yi Liu include National Taiwan University and Intel.

Papers
Proceedings ArticleDOI
29 May 2013
TL;DR: A study on the application of learning-based methods for the DSE problem is presented, and a learning model for HLS that is superior to the best models described in the literature is proposed.
Abstract: This paper makes several contributions to address the challenge of supervising HLS tools for design space exploration (DSE). We present a study on the application of learning-based methods for the DSE problem, and propose a learning model for HLS that is superior to the best models described in the literature. In order to speed up the convergence of the DSE process, we leverage transductive experimental design, a technique that we introduce for the first time to the CAD community. Finally, we consider a practical variant of the DSE problem, and present a solution based on randomized selection with a strong theoretical guarantee.

180 citations

Proceedings ArticleDOI
05 Nov 2006
TL;DR: This paper presents an effective voltage assignment technique based on dynamic programming. Given a netlist without reconvergent fanouts, the dynamic programming can guarantee an optimal solution for the voltage assignment.
Abstract: Power consumption is a crucial concern in nanometer chip design. Researchers have shown that multiple supply voltage (MSV) is an effective method for power consumption reduction. The underlying idea behind MSV is the trade-off between power saving and performance. In this paper, we present an effective voltage assignment technique based on dynamic programming. Given a netlist without reconvergent fanouts, the dynamic programming can guarantee an optimal solution for the voltage assignment. We then generate a level shifter for each net that connects two blocks in different voltage domains, and perform power-network aware floorplanning for the MSV design. Experimental results show that our floorplanner is very effective in optimizing power consumption under timing constraints.

77 citations

Proceedings ArticleDOI
05 Nov 2007
TL;DR: An economical graph-based representation is proposed that needs only a number of nodes linear in the number of blocks to model the block adjacency in a floorplan for voltage-island generation; the resulting algorithm can produce better voltage islands in terms of power-network routing resources.
Abstract: Power optimization is a crucial concern for modern circuit designs. Multiple supply voltages (MSVs) provide an effective technique for power optimization. This paper addresses the voltage-island generation problem for MSV designs at the post-floorplanning stage. We first present a general formulation of this problem that considers level-shifter planning and power-network routing resources. Without loss of solution quality, we propose an economical graph-based representation that needs only a number of nodes linear in the number of blocks to model the block adjacency in a floorplan for the voltage-island generation. In contrast, previous works need a quadratic number of nodes. To tackle the addressed problem, we employ an ILP formulation which consists of (1) level-shifter aware wirelength estimation to capture the timing overhead, (2) voltage-island-clustering inequalities to avoid complicated constraint transformations, and (3) inequalities to capture the power-network routing-resource usage. Compared with previous works, our algorithm can produce better voltage islands in terms of power-network routing resources. Experimental results show that our algorithm can effectively reduce the power-network routing resource by up to 19.46% with a reasonable overhead of 4.03% more power consumption and a reasonable running time.

58 citations

Proceedings ArticleDOI
12 Mar 2012
TL;DR: This work presents a concise library format for characterization and reuse of components specified in high-level languages like SystemC; an algorithm to prune alternative implementations of a component given the context of a specific SoC design; and an algorithm that explores compositionally the design space of the SoC and produces a detailed plan to run high-level synthesis on its components for the final implementation.
Abstract: The growing complexity of System-on-Chip (SoC) design calls for an increased usage of transaction-level modeling (TLM), high-level synthesis tools, and reuse of pre-designed components. In the framework of a compositional methodology for efficient SoC design exploration we present three main contributions: a concise library format for characterization and reuse of components specified in high-level languages like SystemC; an algorithm to prune alternative implementations of a component given the context of a specific SoC design; and an algorithm that explores compositionally the design space of the SoC and produces a detailed plan to run high-level synthesis on its components for the final implementation. The two algorithms are computationally efficient and enable an effective parallelization of the synthesis runs. Through a case study, we show how our methodology returns the essential properties of the design space at the system level by combining the information from the library of components and by identifying automatically those having the most critical impact on the overall design.

45 citations

Proceedings ArticleDOI
14 Mar 2016
TL;DR: The overall organization of SynTunSys is presented, its main components are described, and results are reported from employing it for the design of an industrial chip, the IBM z13 22nm high-performance server chip, where it yielded on average a 36% improvement in total negative slack and a 7% power reduction.
Abstract: Advanced logic and physical synthesis tools provide a vast number of tunable parameters that can significantly impact physical design quality, but the complexity of the parameter design space requires intelligent search algorithms. To fully utilize the optimization potential of these tools, we propose SynTunSys, a system that adds a new level of abstraction between designers and design tools for managing the design space exploration process. SynTunSys takes control of the synthesis-parameter tuning process, i.e., job submission, results analysis, and next-step decision making, by automating a key portion of a human designer's decision process. We present the overall organization of SynTunSys, describe its main components, and provide results from employing it for the design of an industrial chip, the IBM z13 22nm high-performance server chip. During this major design, SynTunSys provided significant savings in human design effort and achieved a quality of results beyond what human designers alone could achieve, yielding on average a 36% improvement in total negative slack and a 7% power reduction.

27 citations


Cited by
Journal ArticleDOI
TL;DR: This work uses a first-published methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and the use of resources.
Abstract: High-level synthesis (HLS) is increasingly popular for the design of high-performance and energy-efficient heterogeneous systems, shortening time-to-market and addressing today’s system complexity. HLS allows designers to work at a higher-level of abstraction by using a software program to specify the hardware functionality. Additionally, HLS is particularly interesting for designing field-programmable gate array circuits, where hardware implementations can be easily refined and replaced in the target device. Recent years have seen much activity in the HLS research community, with a plethora of HLS tool offerings, from both industry and academia. All these tools may have different input languages, perform different internal optimizations, and produce results of different quality, even for the very same input description. Hence, it is challenging to compare their performance and understand which is the best for the hardware to be implemented. We present a comprehensive analysis of recent HLS tools, as well as overview the areas of active interest in the HLS research community. We also present a first-published methodology to evaluate different HLS tools. We use our methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and the use of resources.

433 citations

Journal ArticleDOI
14 Jun 2014
TL;DR: Aladdin, a pre-RTL, power-performance accelerator modeling framework, is presented along with its application to system-on-chip (SoC) simulation; it provides researchers an approach to model the power and performance of accelerators in an SoC environment.
Abstract: Hardware specialization, in the form of accelerators that provide custom datapath and control for specific algorithms and applications, promises impressive performance and energy advantages compared to traditional architectures. Current research in accelerator analysis relies on RTL-based synthesis flows to produce accurate timing, power, and area estimates. Such techniques not only require significant effort and expertise but are also slow and tedious to use, making large design space exploration infeasible. To overcome this problem, we present Aladdin, a pre-RTL, power-performance accelerator modeling framework and demonstrate its application to system-on-chip (SoC) simulation. Aladdin estimates performance, power, and area of accelerators within 0.9%, 4.9%, and 6.6% with respect to RTL implementations. Integrated with architecture-level core and memory hierarchy simulators, Aladdin provides researchers an approach to model the power and performance of accelerators in an SoC environment.

269 citations

Proceedings ArticleDOI
15 Dec 2014
TL;DR: This work presents MachSuite, a collection of 19 benchmarks for evaluating high-level synthesis tools and accelerator-centric architectures, which spans a broad application space, captures a variety of different program behaviors, and provides implementations tailored towards the needs of accelerator designers and researchers.
Abstract: Recent high-level synthesis and accelerator-related architecture papers show a great disparity in workload selection. To improve standardization within the accelerator research community, we present MachSuite, a collection of 19 benchmarks for evaluating high-level synthesis tools and accelerator-centric architectures. MachSuite spans a broad application space, captures a variety of different program behaviors, and provides implementations tailored towards the needs of accelerator designers and researchers, including support for high-level synthesis. We illustrate these aspects by characterizing each benchmark along five different dimensions, highlighting trends and salient features.

203 citations

Proceedings ArticleDOI
05 Jun 2016
TL;DR: Lin-Analyzer is presented, a high-level accurate performance analysis tool that enables rapid design space exploration with various pragmas for FPGA-based accelerators without requiring RTL implementations.
Abstract: The increasing complexity of FPGA-based accelerators, coupled with time-to-market pressure, makes high-level synthesis (HLS) an attractive solution to improve designer productivity by abstracting the programming effort above register-transfer level (RTL). HLS offers various architectural design options with different trade-offs via pragmas (loop unrolling, loop pipelining, array partitioning). However, non-negligible HLS runtime renders manual or automated HLS-based exhaustive architectural exploration practically infeasible. To address this challenge, we present Lin-Analyzer, a high-level accurate performance analysis tool that enables rapid design space exploration with various pragmas for FPGA-based accelerators without requiring RTL implementations.

104 citations