Showing papers on "Physical design" published in 2019

Open Access

Proceedings Article•DOI•

INVITED: ALIGN – Open-Source Analog Layout Automation from the Ground Up

Kishor Kunal¹, Meghna Madhusudan¹, Arvind Sharma¹, Wenbin Xu², Steven M. Burns³, Ramesh Harjani¹, Jiang Hu², Desmond A. Kirkpatrick³, Sachin S. Sapatnekar¹•Institutions (3)

University of Minnesota¹, Texas A&M University², Intel³

02 Jun 2019

TL;DR: This paper provides a view of the current status of the ALIGN (“Analog Layout, Intelligently Generated from Netlists”) project, challenges in developing open-source code with an academic/industry team, and nuts-and-bolts issues such as working with abstracted PDKs, navigating the “wall” between secured IP and open-source software, and securing access to example designs.

Abstract: This paper presents analog layout automation efforts under the ALIGN ("Analog Layout, Intelligently Generated from Netlists") project for fast layout generation using a modular approach based on a mix of algorithmic and machine learning-based tools. The road to rapid turnaround is based on an approach that detects structure and hierarchy in the input netlist and uses a grid based philosophy for layout. The paper provides a view of the current status of the project, challenges in developing open-source code with an academic/industry team, and nuts-and-bolts issues such as working with abstracted PDKs, navigating the "wall" between secured IP and open-source software, and securing access to example designs.

55 citations

Proceedings Article•DOI•

Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning

Charles Schaff¹, David Yunis², Ayan Chakrabarti³, Matthew R. Walter¹•Institutions (3)

Toyota Technological Institute at Chicago¹, University of Chicago², Washington University in St. Louis³

20 May 2019

TL;DR: In this paper, the authors proposed a method that jointly optimizes over the physical design and control network, maintaining a distribution over designs and using reinforcement learning to optimize a control policy to maximize expected reward over the design distribution.

Abstract: The physical design of a robot and the policy that controls its motion are inherently coupled, and should be determined according to the task and environment. In an increasing number of applications, data-driven and learning-based approaches, such as deep reinforcement learning, have proven effective at designing control policies. For most tasks, the only way to evaluate a physical design with respect to such control policies is empirical—i.e., by picking a design and training a control policy for it. Since training these policies is time-consuming, it is computationally infeasible to train separate policies for all possible designs as a means to identify the best one. In this work, we address this limitation by introducing a method that jointly optimizes over the physical design and control network. Our approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution. We give the controller access to design parameters to allow it to tailor its policy to each design in the distribution. Throughout training, we shift the distribution towards higher-performing designs, eventually converging to a design and control policy that are jointly optimal. We evaluate our approach in the context of legged locomotion, and demonstrate that it discovers novel designs and walking gaits, outperforming baselines across different settings.
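
The idea of keeping a distribution over designs and shifting it toward higher-performing ones can be illustrated with a toy sketch. Everything below is an assumption made for illustration: a single design parameter (`leg_length`), a made-up `reward` function with a known optimum, and a plain score-function update of a Gaussian mean. It leaves out the learned control policy entirely and is not the authors' implementation.

```python
import random


def reward(leg_length: float) -> float:
    """Hypothetical task reward; by construction it peaks at leg_length = 1.5."""
    return -(leg_length - 1.5) ** 2


def shift_design_distribution(mean=0.0, std=0.3, steps=200, batch=32, lr=0.05, seed=0):
    """Move the mean of a Gaussian design distribution toward higher reward."""
    rng = random.Random(seed)
    for _ in range(steps):
        # Sample candidate designs from the current distribution.
        designs = [rng.gauss(mean, std) for _ in range(batch)]
        rewards = [reward(d) for d in designs]
        baseline = sum(rewards) / batch  # variance-reduction baseline
        # Score-function estimate of d E[reward] / d mean for a Gaussian.
        grad = sum((r - baseline) * (d - mean)
                   for r, d in zip(rewards, designs)) / (batch * std ** 2)
        mean += lr * grad
    return mean


final_mean = shift_design_distribution()
```

In the setting the abstract describes, the reward for each sampled design would come from rolling out a shared controller that also receives the design parameters as input, rather than from a closed-form function.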

54 citations

Journal Article•DOI•

ColdFlux Superconducting EDA and TCAD Tools Project: Overview and Progress

Coenrad J. Fourie¹, Kyle Jackman¹, Matthys M. Botha¹, Sasan Razmkhah², Pascal Febvre², Christopher L. Ayala, Qiuyun Xu, Nobuyuki Yoshikawa, Erin Patrick³, Mark E. Law³, Yanzi Wang⁴, Murali Annavaram⁵, Peter A. Beerel⁵, Sandeep K. S. Gupta⁵, Shaheen Nazarian⁵, Massoud Pedram⁵•Institutions (5)

Stellenbosch University¹, Los Angeles Harbor College², University of Florida³, Northeastern University⁴, University of Southern California⁵

10 Jan 2019-IEEE Transactions on Applied Superconductivity

TL;DR: An overview of the current and planned activities related to the ColdFlux project is presented and the design assumptions and decisions that were made to allow the development of design tools for million-gate circuits are justified.

Abstract: The IARPA SuperTools program requires the development of superconducting electronic design automation (S-EDA) and superconducting technology computer-aided design (S-TCAD) tools aimed at enabling the reliable design of complex superconducting digital circuits with millions of Josephson junctions. Within the SuperTools program, the ColdFlux project addresses S-EDA and S-TCAD tool research and development in four areas: 1) RTL synthesis, architectures and verification; 2) analog design and layout synthesis; 3) physical design and test; and 4) device and process modeling/simulation and cell library design. Capabilities include, but are not limited to, the following: device level modeling and simulation of Josephson junctions, modeling and simulation of the superconducting process manufacturing processes, powerful new electrical circuit simulation, parameterized schematic and layout libraries, optimization, compact SPICE-like model extraction, timing analysis, behavioral, register-transfer-level and logic syntheses, clock tree synthesis, placement and routing, layout-versus-schematic extraction, functional verification, and the evaluation of designs in the presence of magnetic fields and trapped flux. ColdFlux consists of six research groups from four continents. Here, we present an overview of the current and planned activities related to the project and justify the design assumptions and decisions that were made to allow the development of design tools for million-gate circuits.

54 citations

Journal Article•DOI•

Computational Bounds for Photonic Design

Guillermo Angeris¹, Jelena Vuckovic¹, Stephen Boyd¹•Institutions (1)

Stanford University¹

22 Apr 2019-ACS Photonics

TL;DR: Physical design problems such as photonic inverse design are typically solved using local optimization methods, which often produce what appear to be good or very good designs.

Abstract: Physical design problems, such as photonic inverse design, are typically solved using local optimization methods. These methods often produce what appear to be good or very good designs when compar...

48 citations

Journal Article•DOI•

A Robust Digital RRAM-Based Convolutional Block for Low-Power Image Processing and Learning Applications

Edouard Giacomin¹, Tzofnat Greenberg-Toledo², Shahar Kvatinsky², Pierre-Emmanuel Gaillardon¹•Institutions (2)

University of Utah¹, Technion – Israel Institute of Technology²

01 Feb 2019-IEEE Transactions on Circuits and Systems I-regular Papers

TL;DR: This paper presents a purely digital robust RRAM-based convolutional block using single-ended XNOR sensing capable of performing dot product operations in a single cycle and shows that at the circuit level, this architecture can tolerate a resistance window as low as 1.09, ensuring reliable operations even under a high RRAM variability.

Abstract: Currently, there is growing attention toward developing efficient hardware convolutional blocks for several applications such as computer vision or image processing. Recent works have shown that using binary values in convolutional blocks can considerably reduce the overall power consumption while achieving a high degree of accuracy. In parallel, some works employed resistive random-access memory (RRAM) as an in-memory accelerator to directly store the convolution kernels and perform analog dot product operations in the array, reducing the overall power consumption by limiting the number of memory accesses. However, such architectures are hampered by the limited resistance precision and large intrinsic variability of RRAMs. In this paper, we present a purely digital robust RRAM-based convolutional block using single-ended XNOR sensing capable of performing dot product operations in a single cycle. By carefully considering physical design and RRAM limitations at the 28-nm technology node, we show that at the circuit level, our architecture can tolerate a resistance window as low as 1.09, ensuring reliable operations even under a high RRAM variability ($\sigma/\mu = 25\%$ for a resistance window between both states around 50). When integrated in ISAAC, a state-of-the-art learning accelerator, our block can reduce the power by $2.7\times$ while guaranteeing robust operations.
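
For readers unfamiliar with binarized convolutions, the arithmetic behind an XNOR-based dot product can be sketched in software. This is only a behavioral illustration of the well-known XNOR/popcount identity for vectors with entries in {+1, -1}; the helper names `encode` and `xnor_dot` are hypothetical, and nothing here models the RRAM sensing circuit itself.

```python
def encode(values):
    """Pack a list of +1/-1 values into an int: bit i is 1 when values[i] == +1."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two n-element +1/-1 vectors given as packed bits."""
    mask = (1 << n) - 1
    # XNOR is 1 wherever the two operands agree (product +1).
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Agreements contribute +1 and disagreements -1.
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
result = xnor_dot(encode(a), encode(b), len(a))
```

Here `result` equals the ordinary dot product `sum(x * y for x, y in zip(a, b))`, obtained with one bitwise operation and a population count.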

45 citations

Proceedings Article•DOI•

Dr. CU 2.0: A Scalable Detailed Routing Framework with Correct-by-Construction Design Rule Satisfaction

Haocheng Li¹, Gengjie Chen¹, Bentian Jiang¹, Jingsong Chen¹, Evangeline F. Y. Young¹•Institutions (1)

The Chinese University of Hong Kong¹

01 Nov 2019

TL;DR: This paper proposes a detailed router that judiciously handles hard-to-access pins and new design rules, including length-dependent parallel run length spacing, end-of-line spacing with parallel edges, and corner-to-corner spacing; the framework effectively reduces the number of violations with comparable wirelength.

Abstract: Detailed routing has become a crucial challenge in VLSI design with shrinking feature size and increasing design complexity. More complicated design rules have been added to guarantee manufacturability, which makes detailed routing an even harder task in the design flow. In this paper, we propose a detailed router that judiciously handles hard-to-access pins and new design rules including length-dependent parallel run length spacing, end-of-line spacing with parallel edges, and corner-to-corner spacing. Our experimental results show that our framework can effectively reduce the number of violations with comparable wirelength. Compared with the best score for each released design in the ISPD'19 Contest, our algorithm achieves a 2% score improvement. Compared with the state-of-the-art work [1], our algorithm achieves 69% better scores. The source code of Dr. CU 2.0 is available at https://github.com/cuhk-eda/dr-cu.

40 citations

Journal Article•DOI•

Cross-Layer Optimization for High Speed Adders: A Pareto Driven Machine Learning Approach

Yuzhe Ma¹, Subhendu Roy², Jin Miao³, Jiamin Chen¹, Bei Yu¹•Institutions (3)

The Chinese University of Hong Kong¹, Intel², Cadence Design Systems³

01 Dec 2019-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: Experimental results demonstrate that the enhanced prefix adder synthesis framework can achieve a Pareto frontier of high quality over a wide design space, bridging the gap between architectural and physical designs.

Abstract: Despite the maturity of modern electronic design automation (EDA) tools, designs optimized at the architectural stage may become suboptimal after going through the physical design flow. Adder design has long been studied as a fundamental problem in the very large-scale integration industry, yet designers cannot achieve optimal solutions by running EDA tools on the set of available prefix adder architectures. In this paper, we enhance a state-of-the-art prefix adder synthesis algorithm to obtain a much wider solution space in the architectural domain. On top of that, a machine learning-based design space exploration methodology is applied to predict the Pareto frontier of the adders in the physical domain, which is infeasible to obtain by exhaustively running EDA tools for innumerable architectural solutions. Considering the high cost of obtaining the true values for learning, an active learning algorithm is proposed to select the representative data during the learning process, which uses less labeled data while achieving a better-quality Pareto frontier. Experimental results demonstrate that our framework can achieve a Pareto frontier of high quality over a wide design space, bridging the gap between architectural and physical designs. Source code and data are available at https://github.com/yuzhe630/adder-DSE.
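
As a reminder of what a Pareto frontier is in this context, the following minimal sketch filters a set of candidate designs down to the non-dominated ones, assuming every objective is minimized. The `(delay, area)` values are invented for illustration, and the function covers only the definition of the frontier, not the learning-based prediction the abstract describes.

```python
def dominates(p, q):
    """True if p is no worse than q in every objective and better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (delay, area) results for five adder implementations.
candidates = [(1.0, 9.0), (2.0, 5.0), (3.0, 6.0), (4.0, 2.0), (4.0, 3.0)]
front = pareto_front(candidates)
```

With these numbers, `(3.0, 6.0)` is dominated by `(2.0, 5.0)` and `(4.0, 3.0)` by `(4.0, 2.0)`, so three designs remain on the frontier.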

25 citations

Proceedings Article•DOI•

A Learning-Based Recommender System for Autotuning Design Flows of Industrial High-Performance Processors

Jihye Kwon¹, Matthew M. Ziegler², Luca P. Carloni¹•Institutions (2)

Columbia University¹, IBM²

02 Jun 2019

TL;DR: This work proposes an LSPD parameter recommender system that involves learning a collaborative prediction model through tensor decomposition and regression and demonstrates the transfer-learning properties of this approach by showing that this model can be successfully applied for 7nm designs.

Abstract: Logic synthesis and physical design (LSPD) tools automate complex design tasks previously performed by human designers. One time-consuming task that remains manual is configuring the LSPD flow parameters, which significantly impacts design results. To reduce the parameter-tuning effort, we propose an LSPD parameter recommender system that involves learning a collaborative prediction model through tensor decomposition and regression. Using a model trained with archived data from multiple state-of-the-art 14nm processors, we reduce the exploration cost while achieving comparable design quality. Furthermore, we demonstrate the transfer-learning properties of our approach by showing that this model can be successfully applied for 7nm designs.

23 citations

Journal Article•DOI•

Physical Design Obfuscation of Hardware: A Comprehensive Investigation of Device- and Logic-Level Techniques

Arunkumar Vijayakumar¹, Vinay C. Patil¹, Daniel Holcomb¹, Christof Paar¹, Sandip Kundu¹•Institutions (1)

University of Massachusetts Amherst¹

02 Oct 2019-arXiv: Cryptography and Security

TL;DR: This paper investigates physical obfuscation techniques, which perform alterations of circuit elements that are difficult or impossible for an adversary to observe, and provides a categorization of the available physical obfuscation techniques as they pertain to various design stages.

Abstract: The threat of hardware reverse engineering is a growing concern for a large number of applications. A main defense strategy against reverse engineering is hardware obfuscation. In this paper, we investigate physical obfuscation techniques, which perform alterations of circuit elements that are difficult or impossible for an adversary to observe. Examples of such stealthy manipulations are changes in the doping concentrations or dielectric manipulations. An attacker will, thus, extract a netlist that does not correspond to the logic function of the device-under-attack. This approach of camouflaging has garnered recent attention in the literature. In this paper, we expound on this promising direction to conduct a systematic end-to-end study of the VLSI design process to find multiple ways to obfuscate a circuit for hardware security. This paper makes three major contributions. First, we provide a categorization of the available physical obfuscation techniques as they pertain to various design stages. There is a large and multidimensional design space for introducing obfuscated elements and mechanisms, and the proposed taxonomy is helpful for a systematic treatment. Second, we provide a review of the methods that have been proposed or are in use. Third, we present recent and new device- and logic-level techniques for design obfuscation. For each technique considered, we discuss the feasibility of the approach and assess the likelihood of its detection. Then we turn our focus to open research questions, and conclude with suggestions for future research directions.

20 citations

Proceedings Article•DOI•

A Predictive Process Design Kit for Three-Independent-Gate Field-Effect Transistors

Ganesh Gore¹, Patsy Cadareanu¹, Edouard Giacomin¹, Pierre-Emmanuel Gaillardon¹•Institutions (1)

University of Utah¹

06 Oct 2019

TL;DR: This paper proposes a predictive PDK for the 10 nm-diameter silicon-nanowire TIGFET device and shows 26% and 41% area reductions for an XOR gate and a 1-bit full-adder design, respectively.

Abstract: The Three-Independent-Gate Field-Effect Transistor (TIGFET) is a promising beyond-CMOS technology which offers many unique properties, such as (i) dynamic control of the device polarity, (ii) dual threshold operation and (iii) more expressive logic capabilities. The efficient exploitation of these properties provides an opportunity to design area- and power-optimized logic circuits. However, the evaluation of TIGFET-based designs currently relies on a close approximation of the Power, Performance, and Area (PPA) rather than traditional layout-based methods. There is a need for a publicly available Process Design Kit (PDK) enabling systematic evaluation of the design area. In this paper, we propose a predictive PDK for the 10 nm-diameter silicon-nanowire TIGFET device. This work consists of a SPICE model and full custom physical design files including a Design Rule Manual, plus Design Rule Check and Layout Versus Schematic decks for Calibre®. We then validate the design rules through the implementation of basic logic gates and a full-adder and compare extracted metrics with the FreePDK15nm™ PDK. We show 26% and 41% area reductions for an XOR gate and a 1-bit full-adder design, respectively.

20 citations

Journal Article•DOI•

Transistor Count Reduction by Gate Merging

Calebe Conceicao, Ricardo Reis¹•Institutions (1)

Universidade Federal do Rio Grande do Sul¹

25 Apr 2019-IEEE Transactions on Circuits and Systems I-regular Papers

TL;DR: This approach reduces the number of transistors of a circuit by 21% on average compared to the traditional solution, and by 4% compared to other logic minimization tools, which provides an important leakage power reduction.

Abstract: Many ASICs use far more transistors than necessary, because they rely on a cell library with a limited number of logic functions. This small number of logic functions in a traditional cell library is an inherent limitation when optimizing the number of transistors. In modern technologies, static power consumption is related to the number of transistors, so reducing leakage power requires optimizing the transistor count. This in turn demands a library-free physical design approach, using tools that allow the automatic layout synthesis of any transistor network. The goal of this paper is to present a new method to optimize the logical netlist, aiming to reduce the number of transistors and connections, as well as the number of vias. The original logic netlist generated in a traditional design flow is post-processed, and a set of basic cells is replaced by a single new equivalent logic gate, thus reducing the number of transistors. Connected cells with unitary fanout are considered for merging into a new complex gate (usually not available in a traditional cell library). This approach reduces the number of transistors of a circuit by 21% on average compared to the traditional solution, and by 4% compared to other logic minimization tools. This provides an important leakage power reduction.
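
The unitary-fanout condition mentioned in the abstract can be sketched as a simple netlist scan. The netlist encoding (each gate drives a net that shares its name), the function name `merge_candidates`, and the example circuit are all assumptions made for illustration; the actual merging and layout synthesis steps of the paper are not modeled.

```python
from collections import Counter


def merge_candidates(gates, primary_outputs):
    """List (driver, sink) gate pairs where the driver feeds exactly one gate.

    gates: {gate_name: (cell_type, [input net names])}; every gate is assumed
    to drive a net with the same name as the gate.
    """
    # Count how many gate inputs each net is connected to.
    fanout = Counter(net for _, inputs in gates.values() for net in inputs)
    pairs = []
    for sink, (_, inputs) in gates.items():
        for net in inputs:
            driven_by_gate = net in gates
            if driven_by_gate and fanout[net] == 1 and net not in primary_outputs:
                pairs.append((net, sink))
    return pairs


# Hypothetical netlist: g1 feeds only g2, while g2 feeds both g3 and g4.
netlist = {
    "g1": ("AND2", ["a", "b"]),
    "g2": ("NOR2", ["g1", "c"]),
    "g3": ("INV", ["g2"]),
    "g4": ("NAND2", ["g2", "d"]),
}
pairs = merge_candidates(netlist, primary_outputs={"g3", "g4"})
```

In this example only `("g1", "g2")` qualifies. In standard static CMOS, an AND2 (6 transistors) feeding a NOR2 (4 transistors) computes the same function as a single AOI21 gate (6 transistors), which is the kind of saving the abstract refers to.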

Proceedings Article•DOI•

CRoute: A Fast High-Quality Timing-Driven Connection-Based FPGA Router

Dries Vercruyce¹, Elias Vansteenkiste¹, Dirk Stroobandt¹•Institutions (1)

Ghent University¹

01 Apr 2019

TL;DR: This work elaborates on the connection-based routing principle, improves the algorithm, and introduces a timing-driven version called CRoute, which obtains high-quality results in 3.4x less routing runtime.

Abstract: FPGA routing is an important part of physical design as the programmable interconnection network requires the majority of the total silicon area and the connections largely contribute to delay and power. It should also occur with minimum runtime to enable efficient design exploration. In this work we elaborate on the concept of the connection-based routing principle. The algorithm is improved and a timing-driven version is introduced. The router, called CRoute, is implemented in an easy to adapt FPGA CAD framework written in Java, which is publicly available on GitHub. Quality and runtime are compared to the state-of-the-art router in VPR 7.0.7. Benchmarking is done with the Titan23 design suite, which consists of large heterogeneous designs targeted to a detailed representation of the Stratix IV FPGA. CRoute gains in both the total wire-length and maximum clock frequency while reducing the routing runtime. The total wire-length reduces by 11% and the maximum clock frequency increases by 6%. These high-quality results are obtained in 3.4x less routing runtime.

Proceedings Article•DOI•

A New Paradigm in Split Manufacturing: Lock the FEOL, Unlock at the BEOL

Abhrajit Sengupta¹, Mohammed Nabeel², Johann Knechtel², Ozgur Sinanoglu²•Institutions (2)

New York University¹, New York University Abu Dhabi²

25 Mar 2019

TL;DR: In this paper, a new paradigm is proposed to enhance the security for split manufacturing by embedding keys in the BEOL layout in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks.

Abstract: Split manufacturing was introduced as an effective countermeasure against hardware-level threats such as IP piracy, overbuilding, and insertion of hardware Trojans. Nevertheless, the security promise of split manufacturing has been challenged by various attacks, which exploit the well-known working principles of physical design tools to infer the missing BEOL interconnects. In this work, we advocate a new paradigm to enhance the security for split manufacturing. Based on Kerckhoffs’s principle, we protect the FEOL layout in a formal and secure manner, by embedding keys. These keys are purposefully implemented and routed through the BEOL in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks. We provide our secure physical design flow to the community. We also define the security of split manufacturing formally and provide the associated proofs. At the same time, our technique is competitive with current schemes in terms of layout overhead, especially for practical, large-scale designs (ITC’99 benchmarks).
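
The notion of embedding keys in a netlist can be illustrated with a generic logic-locking toy. This sketch assumes XOR/XNOR key gates inserted into a made-up three-input function; the circuit, the key length, and the function names are hypothetical, and the sketch says nothing about how the paper actually routes keys through the BEOL.

```python
from itertools import product


def original(a, b, c):
    """Hypothetical unlocked function: (a AND b) XOR c."""
    return (a & b) ^ c


def locked(a, b, c, key):
    """Same function with two key gates; only key (0, 1) restores it."""
    k0, k1 = key
    a_locked = a ^ k0            # XOR key gate: transparent when k0 == 0
    n1 = (a_locked & b) ^ c
    return 1 - (n1 ^ k1)         # XNOR key gate: transparent when k1 == 1


def key_is_correct(key):
    """Check a key by exhaustive comparison over all input patterns."""
    return all(locked(a, b, c, key) == original(a, b, c)
               for a, b, c in product((0, 1), repeat=3))


working_keys = [k for k in product((0, 1), repeat=2) if key_is_correct(k)]
```

An attacker who sees only the locked structure without the key values cannot tell which of the four keys yields the intended function; here exactly one does.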

Proceedings Article•DOI•

IR-ATA: IR annotated timing analysis, a flow for closing the loop between PDN design, IR analysis & timing closure

Ashkan Vakil¹, Houman Homayoun¹, Avesta Sasan¹•Institutions (1)

George Mason University¹

21 Jan 2019

TL;DR: IR-ATA is presented, a novel flow for modeling the timing impact of IR drop during the physical design and timing closure of an ASIC chip, allowing the physical designers to explore tradeoffs that were previously, for lack of methodology, not possible.

Abstract: This paper presents IR-ATA, a novel flow for modeling the timing impact of IR drop during the physical design and timing closure of an ASIC chip. We first illustrate how the conventional mechanism for budgeting the IR drop and voltage noise (by using hard margins) leads to a sub-optimal design. Consequently, we propose a new approach for modeling and margining against voltage noise, such that each timing path is margined based on its own topology and its own view of voltage noise. With such a path-based margining mechanism, the margins for IR drop and voltage noise for most timing paths in the design are safely relaxed. The reduction in the margin increases the available timing slack that could be used for improving the power, performance, and area of a design. Finally, we illustrate how IR-ATA could be used to track the timing impact of physical or PDN changes, allowing the physical designers to explore tradeoffs that were previously, for lack of methodology, not possible.
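
The difference between a hard margin and path-based margins can be made concrete with a small arithmetic sketch. The derating model (`delay * (1 + margin)`), the clock period, and every delay and margin value below are hypothetical numbers chosen for illustration; they are not taken from the paper.

```python
def slack_ns(period_ns, delay_ns, margin):
    """Setup slack when the path delay is derated by a fractional margin."""
    return period_ns - delay_ns * (1.0 + margin)


PERIOD_NS = 1.0
FLAT_MARGIN = 0.10  # hypothetical worst-case margin applied to every path

# Hypothetical paths: nominal delay (ns) and a path-specific margin reflecting
# the voltage noise that path actually sees (never above the flat margin).
PATHS = {"p1": (0.90, 0.03), "p2": (0.93, 0.10), "p3": (0.95, 0.02)}

flat = {name: slack_ns(PERIOD_NS, d, FLAT_MARGIN) for name, (d, _) in PATHS.items()}
per_path = {name: slack_ns(PERIOD_NS, d, m) for name, (d, m) in PATHS.items()}
recovered = {name: per_path[name] - flat[name] for name in PATHS}
```

With these made-up numbers, path `p3` fails timing under the flat margin but passes once it is margined by its own noise exposure, while `p2`, which really does see the worst-case noise, is unchanged.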

Proceedings Article•DOI•

Finding placement-relevant clusters with fast modularity-based clustering

Mateus Fogaca¹, Andrew B. Kahng², Ricardo Reis¹, Lutong Wang²•Institutions (2)

Universidade Federal do Rio Grande do Sul¹, University of California, San Diego²

21 Jan 2019

TL;DR: This work studies a new criterion for the classic challenge of VLSI netlist clustering: how well netlist clusters "stay together" through final implementation, and empirically demonstrates that modularity-based clustering achieves better correlation to actual netlist placements than traditional VLSI CAD methods.

Abstract: In advanced technology nodes, IC implementation faces increasing design complexity as well as ever-more demanding design schedule requirements. This raises the need for new decomposition approaches that can help reduce problem complexity, in conjunction with new predictive methodologies that can help avoid bottlenecks and loops in the physical implementation flow. Notably, with modern design methodologies it would be very valuable to better predict final placement of the gate-level netlist: this would enable more accurate early assessment of performance, congestion and floorplan viability in the SOC floorplanning/RTL planning stages of design. In this work, we study a new criterion for the classic challenge of VLSI netlist clustering: how well netlist clusters "stay together" through final implementation. We propose use of several evaluators of this criterion. We also explore the use of modularity-driven clustering to identify natural clusters in a given graph without the tuning of parameters and size balance constraints typically required by VLSI CAD partitioning methods. We find that the netlist hypergraph-to-graph mapping can significantly affect quality of results, and we experimentally identify an effective recipe for weighting that also comprehends topological proximity to I/Os. Further, we empirically demonstrate that modularity-based clustering achieves better correlation to actual netlist placements than traditional VLSI CAD methods (our method is also 4X faster than use of hMetis for our largest testcases). Finally, we show a potential flow with fast "blob placement" of clusters to evaluate netlist and floorplan viability in early design stages; this flow can predict gate-level placement of 370K cells in 200 seconds on a single core.
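
Two ingredients named in the abstract, the hypergraph-to-graph mapping and the modularity score, can be sketched as follows. The clique model with edge weight 1/(k-1) for a k-pin net is one common convention and is an assumption here (the paper reports its own weighting recipe, which is not reproduced), and `modularity` implements the standard weighted Newman definition for a given clustering rather than the clustering search itself.

```python
from collections import defaultdict
from itertools import combinations


def clique_expand(hyperedges):
    """Map each k-pin net to a clique whose edges each weigh 1/(k-1)."""
    weights = defaultdict(float)
    for net in hyperedges:
        pins = sorted(set(net))
        if len(pins) < 2:
            continue  # single-pin nets carry no connectivity
        for u, v in combinations(pins, 2):
            weights[(u, v)] += 1.0 / (len(pins) - 1)
    return dict(weights)


def modularity(weights, cluster_of):
    """Weighted modularity Q = sum_c [ L_c / m - (d_c / (2m))^2 ]."""
    total = sum(weights.values())   # m: total edge weight
    internal = defaultdict(float)   # L_c: edge weight inside cluster c
    degree = defaultdict(float)     # d_c: summed weighted degree of cluster c
    for (u, v), w in weights.items():
        degree[cluster_of[u]] += w
        degree[cluster_of[v]] += w
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += w
    return sum(internal[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


# Hypothetical netlist: two triangles of two-pin nets joined by one bridge net.
nets = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
graph = clique_expand(nets)
q_split = modularity(graph, {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"})
q_single = modularity(graph, {n: "A" for n in range(6)})
```

Splitting the example into its two triangles scores 5/14, while lumping all cells into one cluster scores 0, so the modularity criterion prefers the natural grouping without any size-balance parameter.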

Proceedings Article•DOI•

ROAD: Routability Analysis and Diagnosis Framework Based on SAT Techniques

Dong-won Park¹, Ilgweon Kang², Yeseong Kim¹, Sicun Gao¹, Bill Lin¹, Chung-Kuan Cheng¹•Institutions (2)

University of California, San Diego¹, Cadence Design Systems²

04 Apr 2019

TL;DR: This paper proposes a novel framework, called ROAD, which diagnoses explicit reasons for routing failures and provides human-interpretable explanations for conflicted routing conditions, and demonstrates that ROAD successfully examines conflict causes for diverse pin layouts.

Abstract: Routability diagnosis has increasingly become the bottleneck in detailed routing for sub-10nm technology due to the limited tracks, high density, and complex design rules. The conventional ways to examine the routability of detailed routing are ILP- and SAT-based techniques. However, once we identify the routability, the diagnosis remains an open problem for physical designers. In this paper, we propose a novel framework, called ROAD, which diagnoses explicit reasons for routing failures. The proposed ROAD framework utilizes a diagnosis-friendly SAT formulation to represent design's layout and diagnoses the routability with SAT solving techniques. Based on the diagnosis, ROAD provides human-interpretable explanations for conflicted routing conditions. To show the practical value of our framework, we also generate comprehensive test-sets that enable exhaustive exploration of layouts based on Rent's rule. We demonstrate that ROAD successfully examines conflict causes for diverse pin layouts. Throughout extensive diagnosis, we also present several key findings for design failure. ROAD performs routability diagnosis within 2 minutes on average for 90 grids testsets, while diagnosing the exact causes of routing failures in terms of congestion and conditional design rules.

Proceedings Article•DOI•

A Reinforcement Learning-Based Framework for Solving Physical Design Routing Problem in the Absence of Large Test Sets

Upma Gandhi¹, Ismail Bustany², William Swartz³, Laleh Behjat¹•Institutions (3)

University of Calgary¹, Xilinx², University of Texas at Dallas³

01 Sep 2019

TL;DR: This work proposes a data-independent reinforcement learning (RL) based routing model called Alpha-PD-Router, which learns to route a circuit and correct short violations, based on a two-player collaborative game model that has been trained on a small circuit.

Abstract: Advances in Electronic Design Automation (EDA) methods have led designers and programmers to search for new ways to solve the complex problems seen in today’s Very Large Scale Integration circuits. Machine learning (ML), especially supervised learning, has been used to predict design rule violations. However, supervised learning requires a large amount of labeled data. Given the competitive nature of EDA companies, there is limited access to benchmarks and labeled data. In this work, we propose a data-independent reinforcement learning (RL) based routing model called Alpha-PD-Router, which learns to route a circuit and correct short violations. The Alpha-PD-Router is based on a two-player collaborative game model that has been trained on a small circuit and successfully resolves 75 violations in 99 cases of two-pin net arrangements in the testing phase. The proposed model has the potential to be used as a framework to develop RL-based routing techniques untethered by the scarce availability of large routing data samples or designer expertise.

Proceedings Article•DOI•

Clock Tree Synthesis Techniques for Optimal Power and Timing Convergence in SoC Partitions

Priya V Vishnu, A R Priyarenjini, Naveen Kotha¹•Institutions (1)

Intel¹

17 May 2019

TL;DR: This paper focuses on analyzing efficient CTS techniques for optimal power and timing convergence in SoC Partition with Multisource Clock Tree Synthesis and Multibit Flip-Flop usage with Clock Tree awareness.

Abstract: Physical design is the process of converting a circuit description at Register Transfer Level into the physical layout. It primarily focuses on timing, power and area optimization by applying different optimization techniques at each stage of the design. Clock Tree Synthesis (CTS) is an important step in physical design flow. CTS builds the clock tree by balancing the skew in the entire design for all the clocks present. The conventional flow of CTS is inefficient at many points due to the increasing complexity of Integrated Circuits as a result of changing technology nodes. This paper focuses on analyzing efficient CTS techniques for optimal power and timing convergence in SoC Partition. The methodologies adopted for CTS are Multisource Clock Tree Synthesis and Multibit Flip-Flop usage with Clock Tree awareness.

Journal Article•DOI•

A low-power and area-efficient quaternary adder based on CNTFET switching logic

Shirin Fakhari¹, Narges Hajizadeh Bastani², Mohammad Hossein Moaiyeri¹•Institutions (2)

Shahid Beheshti University¹, Islamic Azad University²

01 Jan 2019-Analog Integrated Circuits and Signal Processing

TL;DR: A low-power and area-efficient quaternary adder based on CNTFET switching logic is proposed, which significantly reduces the number of transistors, area and power consumption, while maintaining output driving capability and full swing operation.

Abstract: Due to the increasing short-channel effects in scaled CMOS circuits, the need for alternative technologies has substantially increased. Moreover, the limitation in space consumed by interconnects and increased power density in nanoscale binary circuits have challenged the scaling process to achieve more efficient and denser circuits. Accordingly, designing efficient nanoscale multiple-valued circuits is of great importance. In this paper, a low-power and area-efficient quaternary adder based on CNTFET switching logic is proposed. The proposed design significantly reduces the number of transistors, area and power consumption, while maintaining output driving capability and full swing operation. The proposed design is comprehensively simulated using HSPICE and the Stanford CNTFET model. Furthermore, the layout of the proposed circuit is drawn using the physical design tool for CNTFET-based circuits. The results confirm significant improvements in area, average power consumption, PDP, static power dissipation and sensitivity to process variations compared to its state-of-the-art counterparts. Also, the proposed quaternary full adder is used as the building block of a 4-digit quaternary ripple carry adder, and the simulation results indicate its superiority in energy efficiency.

Proceedings Article•DOI•

Ignore Clocking Constraints: An Alternative Physical Design Methodology for Field-Coupled Nanotechnologies

Robert Wille¹, Marcel Walter², Frank Sill Torres, Daniel Große², Rolf Drechsler•Institutions (2)

Johannes Kepler University of Linz¹, University of Bremen²

15 Jul 2019

TL;DR: A physical design methodology is proposed which tackles the FCN design problem by simply ignoring the clocking constraints and using adjusted conventional place and route algorithms, and results extracted from a physics simulator confirm the feasibility of the approach.

Abstract: Field-Coupled Nanocomputing (FCN) allows for conducting computations with a power consumption that is magnitudes below current CMOS technologies. Recent physical implementations confirmed these prospects and put pressure on the Electronic Design Automation (EDA) community to develop physical design methods comparable to those available for conventional circuits. While the major design task boils down to a place and route problem, certain characteristics of FCN circuits introduce further challenges in terms of dedicated clock arrangements which lead to rather cumbersome clocking constraints. Thus far, those constraints have been addressed in a rather unsatisfactory fashion only. In this work, we propose a physical design methodology which tackles this problem by simply ignoring the clocking constraints and using adjusted conventional place and route algorithms. In order to deal with the resulting ramifications, a dedicated synchronization element is introduced. Results extracted from a physics simulator confirm the feasibility of the approach. A proof of concept implementation illustrates that ignoring clocking constraints indeed allows for a promising alternative direction for FCN design that overcomes the obstacles preventing the development of efficient solutions thus far.

Proceedings Article•DOI•

Congestion-aware Global Routing using Deep Convolutional Generative Adversarial Networks

Zhonghua Zhou¹, Ziran Zhu², Jianli Chen², Yuzhe Ma³, Bei Yu³, Tsung-Yi Ho⁴, Guy G.F. Lemieux¹, Andre Ivanov¹•Institutions (4)

University of British Columbia¹, Center for Discrete Mathematics and Theoretical Computer Science², The Chinese University of Hong Kong³, National Tsing Hua University⁴

01 Sep 2019

TL;DR: This paper presents a routing strategy that decomposes global routing into three stages, with different objectives associated with each stage, in contrast to conventional approaches, which usually use a single global optimization objective for driving the entire process.

Abstract: The routing stage is one of the most time-consuming steps in System on Chip (SoC) physical design. For large designs, it can take days of effort to find a complete routing solution, and the result directly affects the circuit performance. In this paper, we present a routing strategy that decomposes global routing into three stages, with different objectives associated with each stage. This is in contrast to conventional approaches, which usually use a single global optimization objective for driving the entire process. Furthermore, we propose to use generative adversarial networks (GAN) to predict the congestion heatmap. This deep learning method has been used to successfully improve image recognition results. We adapt its use to global routing by converting data between the router and the image-based model. This model needs only placement and netlist information as input to make the forecast. Our GAN-based congestion estimator produces congestion heatmaps that show good fidelity with actual heatmaps produced by state-of-the-art global routers. Using this heatmap along with our modified routing flow, we achieve comparable global routing quality in terms of the total overflow and wirelength, but the runtime speedup on hard-to-route designs is significant.
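
The total overflow metric named in this abstract can be sketched as follows. This is a simplified per-cell version that assumes demand and capacity are given as equally sized 2D grids; global routers usually track overflow on the edges of a grid graph, and the grid values here are hypothetical rather than taken from the paper.

```python
def total_overflow(demand, capacity):
    """Sum max(0, demand - capacity) over every cell of two equally sized grids."""
    total = 0
    for demand_row, capacity_row in zip(demand, capacity):
        for d, c in zip(demand_row, capacity_row):
            total += max(0, d - c)
    return total


# Hypothetical 2x2 congestion map: routing tracks requested vs. tracks available.
demand = [[3, 5], [2, 7]]
capacity = [[4, 4], [4, 4]]
overflow = total_overflow(demand, capacity)
```

A predicted heatmap is useful to a router to the extent that it flags the same over-capacity cells as the heatmap an actual routing run would produce.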

Proceedings Article•DOI•

A simulation framework for the design and evaluation of computational cameras

Thomas Nürnberg¹, Maximilian Schambach¹, David Uhlig¹, Michael Heizmann¹, Fernando Puente León¹•Institutions (1)

Karlsruhe Institute of Technology¹

21 Jun 2019

TL;DR: This article shows that, depending on the application, the image formation on a sensor and phenomena like image noise have to be simulated accurately in order to achieve meaningful results while other aspects, such as photorealistic scene modeling, can be omitted.

Abstract: In the emerging field of computational imaging, rapid prototyping of new camera concepts becomes increasingly difficult since the signal processing is intertwined with the physical design of a camera. As novel computational cameras capture information other than the traditional two-dimensional information, ground truth data, which can be used to thoroughly benchmark a new system design, is also hard to acquire. We propose to bridge this gap by using simulation. In this article, we present a raytracing framework tailored for the design and evaluation of computational imaging systems. We show that, depending on the application, the image formation on a sensor and phenomena like image noise have to be simulated accurately in order to achieve meaningful results while other aspects, such as photorealistic scene modeling, can be omitted. Therefore, we focus on accurately simulating the mandatory components of computational cameras, namely apertures, lenses, spectral filters and sensors. Besides the simulation of the imaging process, the framework is capable of generating various ground truth data, which can be used to evaluate and optimize the performance of a particular imaging system. Due to its modularity, it is easy to further extend the framework to the needs of other fields of application. We make the source code of our simulation framework publicly available and encourage other researchers to use it to design and evaluate their own camera designs.

Proceedings Article•DOI•

Simultaneous Placement and Clock Tree Construction for Modern FPGAs

Wuxi Li¹, Mehrdad Eslami Dehkordi², Stephen Yang², David Z. Pan¹•Institutions (2)

University of Texas at Austin¹, Xilinx²

20 Feb 2019

TL;DR: This paper proposes a generic FPGA placement framework that can simultaneously optimize placement quality and ensure clock feasibility by explicit clock tree construction and demonstrates the effectiveness and efficiency of the proposed approach using the ISPD 2017 Clock-Aware Placement Contest benchmark suite.

Abstract: Modern field-programmable gate array (FPGA) devices often contain complex clocking architectures to achieve high-performance and flexible clock networks. The physical structure of these clock networks, however, is pre-manufactured and unadjustable, with only limited routing resources. Most conventional FPGA placement algorithms rarely consider clock feasibility, and therefore lead to clock routing failures. Some recent works adopt simplified clock routing models (e.g., the bounding box model) to force clock legality during placement, which, however, can often overestimate clock routing demands and result in unnecessary placement quality degradation. To address these limitations, in this paper, we propose a generic FPGA placement framework that can simultaneously optimize placement quality and ensure clock feasibility by explicit clock tree construction. We demonstrate the effectiveness and efficiency of the proposed approach using the ISPD 2017 Clock-Aware Placement Contest benchmark suite. Compared with other state-of-the-art clock legalization algorithms, the proposed approach can achieve the best routed wirelength with competitive runtime.
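
The bounding box model mentioned in this abstract can be illustrated with a short sketch. It assumes a grid of clock regions addressed by (column, row) and a per-region clock capacity; the function names, clock names, load positions, and capacity value are all hypothetical, and this shows only the simplified model the paper argues against, not the paper's own clock tree construction.

```python
from collections import defaultdict


def bbox_clock_demand(clock_loads):
    """Map each clock region to the set of clocks whose load bounding box covers it."""
    demand = defaultdict(set)
    for clock, regions in clock_loads.items():
        cols = [col for col, _ in regions]
        rows = [row for _, row in regions]
        # Charge the clock to every region inside the bounding box of its loads.
        for col in range(min(cols), max(cols) + 1):
            for row in range(min(rows), max(rows) + 1):
                demand[(col, row)].add(clock)
    return demand


def overfull_regions(demand, capacity):
    """Return the sorted regions whose clock count exceeds the capacity."""
    return sorted(region for region, clocks in demand.items() if len(clocks) > capacity)


# Hypothetical clock regions occupied by the loads of each clock net.
loads = {
    "clk0": [(0, 0), (1, 1)],
    "clk1": [(1, 1)],
    "clk2": [(1, 0), (1, 1)],
}
demand = bbox_clock_demand(loads)
violations = overfull_regions(demand, capacity=2)
```

In this example `clk0` is charged to regions (0, 1) and (1, 0) even though it has no loads there, which is the kind of overestimation the abstract attributes to the bounding box model.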

Journal Article•DOI•

RotaSYN: Rotary Traveling Wave Oscillator SYNthesizer

Ragh Kuttappa¹, Adarsha Balaji¹, Vasil Pano¹, Baris Taskin¹, Hamid Mahmoodi²•Institutions (2)

Drexel University¹, San Francisco State University²

27 Feb 2019-IEEE Transactions on Circuits and Systems I-regular Papers

TL;DR: A physical design methodology is presented to synchronize digital application specific integrated circuit (ASIC) designs by a resonant rotary clock network, demonstrating that the ASIC products of RotaSYN can operate at previously unattainable (relatively) low-frequency ranges of hundreds of megahertz.

Abstract: A physical design methodology is presented to synchronize digital application specific integrated circuit (ASIC) designs by a resonant rotary clock network. One novelty of the proposed RotaSYN flow is that the ASIC products of RotaSYN can operate at previously unattainable (relatively) low-frequency ranges of hundreds of megahertz. A dynamic resonant frequency divider is used to implement this low-frequency operation, which is low in comparison to the gigahertz-range norm of operation for resonant clocking reported in this paper. In SPICE-based simulations, the efficacy of the proposed flow and novel algorithms in RotaSYN is demonstrated using performance metrics of wirelength, skew, and power on the International Symposium on Physical Design (ISPD) 2010 clock benchmark circuits. In addition, RotaSYN is evaluated on three publicly available industrial designs, including the ARM Cortex M0, against equivalent clocks generated with a traditional phase locked loop (PLL) and distributed with an industrial clock tree synthesis tool flow. The RotaSYN methodology is implemented at three different target frequencies of 880 MHz, 500 MHz, and 220 MHz for the industrial designs. SPICE simulations show an average of 29% power savings for the industrial designs overall, solely thanks to 66% power savings on the clock generation and distribution networks, operating at a frequency of 880 MHz in comparison to the PLL-based design with a clock tree synthesized with an industrial EDA tool.

Proceedings Article•DOI•

Optimized Spatial-Spectral CT for Multi-Material Decomposition.

Matthew Tivnan¹, Wenying Wang¹, Steven Tilley¹, Jeffrey H. Siewerdsen¹, J. Webster Stayman¹•Institutions (1)

Johns Hopkins University¹

28 May 2019

TL;DR: This work examines an alternate design based on a spatial-spectral filter made up of a linear array of materials that divide the incident x-ray beam into spectrally varied beamlets and characterize the effects of design parameters including filter tile order and filter tile width and their impact on material decomposition performance.

Abstract: Spectral CT is an emerging modality that uses a data acquisition scheme with varied spectral responses to provide enhanced material discrimination in addition to the structural information of conventional CT. Existing clinical and preclinical designs with this capability include kV-switching, split-filtration, and dual-layer detector systems to provide two spectral channels of projection data. In this work, we examine an alternate design based on a spatial-spectral filter. This source-side filter is made up of a linear array of materials that divide the incident x-ray beam into spectrally varied beamlets. This design allows for any number of spectral channels; however, each individual channel is sparse in the projection domain. Model-based iterative reconstruction methods can accommodate such sparse spatial-spectral sampling patterns and allow for the incorporation of advanced regularization. With the goal of an optimized physical design, we characterize the effects of design parameters including filter tile order and filter tile width and their impact on material decomposition performance. We present results of numerical simulations that characterize the impact of each design parameter using a realistic CT geometry and noise model to demonstrate feasibility. Results for filter tile order show little change, indicating that filter order is a low-priority design consideration. We observe improved performance for narrower filter widths; however, the performance drop-off is relatively flat, indicating that wider filter widths are also feasible designs.

Proceedings Article•DOI•

A New Paradigm in Split Manufacturing: Lock the FEOL, Unlock at the BEOL

Abhrajit Sengupta¹, Mohammed Nabeel², Johann Knechtel², Ozgur Sinanoglu¹•Institutions (2)

New York University¹, New York University Abu Dhabi²

07 Mar 2019-arXiv: Cryptography and Security

Abstract: Split manufacturing was introduced as an effective countermeasure against hardware-level threats such as IP piracy, overbuilding, and insertion of hardware Trojans. Nevertheless, the security promise of split manufacturing has been challenged by various attacks, which exploit the well-known working principles of physical design tools to infer the missing BEOL interconnects. In this work, we advocate a new paradigm to enhance the security of split manufacturing. Based on Kerckhoffs's principle, we protect the FEOL layout in a formal and secure manner, by embedding keys. These keys are purposefully implemented and routed through the BEOL in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks. We provide our secure physical design flow to the community. We also define the security of split manufacturing formally and provide the associated proofs. At the same time, our technique is competitive with current schemes in terms of layout overhead, especially for practical, large-scale designs (ITC'99 benchmarks).

Proceedings Article•DOI•

Hierarchical Layout Synthesis and Design Automation for 2.5D Heterogeneous Multi-Chip Power Modules

Imam Al Razi¹, Quang Le¹, H. Alan Mantooth¹, Yarui Peng¹•Institutions (1)

University of Arkansas¹

01 Sep 2019

TL;DR: This paper proposes a generic, scalable, and efficient algorithm to automate not only 2D but also 2.5D and 3D heterogeneous MCPM layouts considering hierarchy, and demonstrates the benefits of a hierarchical design methodology over the state-of-the-art approaches.

Abstract: Multi-chip power module (MCPM) layout design automation has been identified as one of the primary research interests in the power electronics community with the advent of wide bandgap circuits. MCPM physical design requires a time-consuming iterative procedure that is so far explored manually based on the experience of the designers. Though the number of components and routing layers is limited in power electronics, careful physical design is required because of thermal and reliability issues. In this paper, the benefits of a hierarchical design methodology are demonstrated over the state-of-the-art approaches. We propose a generic, scalable, and efficient algorithm to automate not only 2D but also 2.5D (multiple substrates in a planar package) and 3D (multiple device layers stacked on the same substrate) heterogeneous MCPM layouts considering hierarchy. A complete optimization approach for a full-bridge 2.5D power module is demonstrated using hardware-validated electrical and thermal models.

Proceedings Article•DOI•

Exploiting Proximity Information in a Satisfiability Based Attack Against Split Manufactured Circuits

Suyuan Chen¹, Ranga Vemuri¹•Institutions (1)

University of Cincinnati¹

05 May 2019

TL;DR: This paper proposes an effective method to exploit proximity information extracted from the FEOL circuit to reduce the size of the interconnection network that models the missing BEOL layers, which in turn significantly reduces the size of the resulting SAT problem.

Abstract: Split Manufacturing (SM) was introduced as an effective countermeasure to reverse engineering of integrated circuits and as a potential deterrent to Trojan insertion and overproduction. In SM, some wires, assigned to the back-end-of-line (BEOL) layers and fabricated at a secure facility, are hidden from the attacker. However, proximity-information-based attacks use physical design hints such as wire length, combinational cycles and routing directions obtained from the front-end-of-line (FEOL) netlist to recover some or all of the BEOL signals. In addition, a recently proposed satisfiability (SAT) based attack models the BEOL signal recovery problem as a problem of configuring a key-controlled interconnect network and solves for the key values using a SAT solver. While this method can recover 100% of the BEOL signals, it takes an impractically long time for large circuits. In this paper, we propose an effective method to exploit proximity information extracted from the FEOL circuit to reduce the size of the interconnection network that models the missing BEOL layers, which in turn significantly reduces the size of the resulting SAT problem. This leads to efficient recovery of 100% of the ‘hidden’ BEOL signals even for large circuits. Experimental results using circuits from the ISCAS85, ISCAS89 and ITC99 benchmark suites show that the proposed method is up to 80x faster than the SAT-only attack (without proximity information) while maintaining 100% attack correctness for all combinational and sequential benchmarks.

Patent•

Methods, systems, and computer program product for implementing virtual prototyping for electronic designs

Ginetti Arnold¹, Pic Jean-Noel¹•Institutions (1)

Cadence Design Systems¹

25 Jun 2019

TL;DR: In this paper, methods, systems, and articles of manufacture for implementing virtual prototyping for electronic designs are described, including methods to identify a plurality of leaf cells into a hierarchical physical design of an electronic design.

Abstract: Disclosed are methods, systems, and articles of manufacture for implementing virtual prototyping for electronic designs. These techniques identify a plurality of leaf cells into a hierarchical physical design of an electronic design, generate the hierarchical physical design at least by performing hierarchical placement for the plurality of leaf cells based in part or in whole upon one or more factors, and revise the placed hierarchical physical design at least by performing hierarchical routing for the plurality of leaf cells on the hierarchical physical design. One aspect may further detach a virtual cell in the hierarchical physical design at least by grouping a first set of leaf cells and representing the first set of leaf cells with a first placeholder.

Book•

A Practical Approach to VLSI System on Chip (SoC) Design: A Comprehensive Guide

Veena S. Chakravarthi

25 Sep 2019

TL;DR: In this article, the authors present a design methodology for low power UPF flow in SoC design methodology using Static Timing Analysis (STA) and VLSI System Verification.

Abstract: Introduction -- SoC Design Methodology -- System on Chip Components -- DFT and Synthesis -- Static Timing Analysis (STA) -- VLSI System Verification -- Physical Design -- Advanced Techniques: Low Power UPF Flow -- Reference Design: Specification to Layout.
