
Showing papers on "Physical design" published in 2019


Proceedings ArticleDOI
02 Jun 2019
TL;DR: This paper provides a view of the current status of the ALIGN (“Analog Layout, Intelligently Generated from Netlists”) project, challenges in developing open-source code with an academic/industry team, and nuts-and-bolts issues such as working with abstracted PDKs, navigating the “wall” between secured IP and open-source software, and securing access to example designs.
Abstract: This paper presents analog layout automation efforts under the ALIGN ("Analog Layout, Intelligently Generated from Netlists") project for fast layout generation using a modular approach based on a mix of algorithmic and machine learning-based tools. The road to rapid turnaround is based on an approach that detects structure and hierarchy in the input netlist and uses a grid based philosophy for layout. The paper provides a view of the current status of the project, challenges in developing open-source code with an academic/industry team, and nuts-and-bolts issues such as working with abstracted PDKs, navigating the "wall" between secured IP and open-source software, and securing access to example designs.

55 citations


Proceedings ArticleDOI
20 May 2019
TL;DR: In this paper, the authors proposed a method that jointly optimizes over the physical design and control network, maintaining a distribution over designs and using reinforcement learning to optimize a control policy to maximize expected reward over the design distribution.
Abstract: The physical design of a robot and the policy that controls its motion are inherently coupled, and should be determined according to the task and environment. In an increasing number of applications, data-driven and learning-based approaches, such as deep reinforcement learning, have proven effective at designing control policies. For most tasks, the only way to evaluate a physical design with respect to such control policies is empirical—i.e., by picking a design and training a control policy for it. Since training these policies is time-consuming, it is computationally infeasible to train separate policies for all possible designs as a means to identify the best one. In this work, we address this limitation by introducing a method that jointly optimizes over the physical design and control network. Our approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution. We give the controller access to design parameters to allow it to tailor its policy to each design in the distribution. Throughout training, we shift the distribution towards higher-performing designs, eventually converging to a design and control policy that are jointly optimal. We evaluate our approach in the context of legged locomotion, and demonstrate that it discovers novel designs and walking gaits, outperforming baselines across different settings.
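The idea of keeping a distribution over designs and shifting it toward higher-performing ones can be illustrated with a toy sketch. This is not the paper's method or code: the scalar design, the quadratic `reward`, the linear controller `action = w * design`, and all hyperparameters are assumptions chosen only to make the loop short and runnable. The design mean is updated with a score-function (REINFORCE-style) gradient, and the controller gain with a plain analytic gradient.

```python
import random

def reward(design, action):
    # Hypothetical task: the best design is 2.0 and the best action equals the design.
    return -(design - 2.0) ** 2 - (action - design) ** 2

def co_optimize(steps=400, batch=64, sigma=0.5, lr=0.05, seed=0):
    rng = random.Random(seed)
    mu = 0.0  # mean of the Gaussian distribution over designs
    w = 0.0   # controller gain; the controller sees the design: action = w * design
    for _ in range(steps):
        designs = [rng.gauss(mu, sigma) for _ in range(batch)]
        rewards = [reward(d, w * d) for d in designs]
        baseline = sum(rewards) / batch
        # Score-function gradient of expected reward with respect to mu.
        grad_mu = sum((r - baseline) * (d - mu) / sigma ** 2
                      for r, d in zip(rewards, designs)) / batch
        # Analytic gradient of the reward with respect to w, averaged over designs.
        grad_w = sum(-2.0 * (w * d - d) * d for d in designs) / batch
        mu += lr * grad_mu  # shift the design distribution toward better designs
        w += lr * grad_w    # improve the controller over the current distribution
    return mu, w

mu, w = co_optimize()
print(f"design mean ~ {mu:.2f}, controller gain ~ {w:.2f}")
```

In this toy setting the design mean should drift toward 2.0 and the gain toward 1.0; a real system would replace the closed-form reward with simulated rollouts and the gain with a neural policy.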

54 citations


Journal ArticleDOI
TL;DR: An overview of the current and planned activities related to the ColdFlux project is presented and the design assumptions and decisions that were made to allow the development of design tools for million-gate circuits are justified.
Abstract: The IARPA SuperTools program requires the development of superconducting electronic design automation (S-EDA) and superconducting technology computer-aided design (S-TCAD) tools aimed at enabling the reliable design of complex superconducting digital circuits with millions of Josephson junctions. Within the SuperTools program, the ColdFlux project addresses S-EDA and S-TCAD tool research and development in four areas: 1) RTL synthesis, architectures and verification; 2) analog design and layout synthesis; 3) physical design and test; and 4) device and process modeling/simulation and cell library design. Capabilities include, but are not limited to, the following: device level modeling and simulation of Josephson junctions, modeling and simulation of superconducting manufacturing processes, powerful new electrical circuit simulation, parameterized schematic and layout libraries, optimization, compact SPICE-like model extraction, timing analysis, behavioral, register-transfer-level and logic syntheses, clock tree synthesis, placement and routing, layout-versus-schematic extraction, functional verification, and the evaluation of designs in the presence of magnetic fields and trapped flux. ColdFlux consists of six research groups from four continents. Here, we present an overview of the current and planned activities related to the project and justify the design assumptions and decisions that were made to allow the development of design tools for million-gate circuits.

54 citations


Journal ArticleDOI
TL;DR: Physical design problems, such as photonic inverse design, are typically solved using local optimization methods, which often produce what appear to be good or very good designs.
Abstract: Physical design problems, such as photonic inverse design, are typically solved using local optimization methods. These methods often produce what appear to be good or very good designs when compar...

48 citations


Journal ArticleDOI
TL;DR: This paper presents a purely digital robust RRAM-based convolutional block using single-ended XNOR sensing capable of performing dot product operations in a single cycle and shows that at the circuit level, this architecture can tolerate a resistance window as low as 1.09, ensuring reliable operations even under a high RRAM variability.
Abstract: Currently, there is growing attention toward developing efficient hardware convolutional blocks for several applications such as computer vision or image processing. Recent works have shown that using binary values in convolutional blocks can considerably reduce the overall power consumption while achieving a high degree of accuracy. In parallel, some works employed resistive random-access memory (RRAM) as an in-memory accelerator to directly store the convolution kernels and perform analog dot product operations in the array, reducing the overall power consumption by limiting the number of memory accesses. However, such an architecture is hampered by the limited resistance precision and large intrinsic variability of RRAMs. In this paper, we present a purely digital robust RRAM-based convolutional block using single-ended XNOR sensing capable of performing dot product operations in a single cycle. By carefully considering physical design and RRAM limitations at the 28-nm technology node, we show that at the circuit level, our architecture can tolerate a resistance window as low as 1.09, ensuring reliable operations even under a high RRAM variability ($\sigma/\mu = 25\%$ for a resistance window between both states around 50). When integrated in ISAAC, a state-of-the-art learning accelerator, our block can reduce the power by $2.7\times$ while guaranteeing robust operations.
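For readers unfamiliar with why XNOR is enough for a binary dot product, the following software sketch shows the standard XNOR-popcount identity for vectors of +1/-1 values. It says nothing about the RRAM circuit itself; the function names `encode` and `xnor_dot` are illustrative and not taken from the paper.

```python
def encode(values):
    """Pack a list of +1/-1 values into an integer, one bit per value (+1 -> 1, -1 -> 0)."""
    bits = 0
    for v in values:
        bits = (bits << 1) | (1 if v == 1 else 0)
    return bits

def xnor_dot(x_bits, w_bits, n):
    """Dot product of two n-element +1/-1 vectors given their bit encodings."""
    mask = (1 << n) - 1
    # XNOR is 1 exactly where the two signs agree, i.e. where the product is +1.
    matches = bin(~(x_bits ^ w_bits) & mask).count("1")
    # Agreements contribute +1 and disagreements -1: matches - (n - matches).
    return 2 * matches - n

x = [1, -1, 1, 1]
w = [1, 1, -1, 1]
print(xnor_dot(encode(x), encode(w), len(x)))       # same value as the line below
print(sum(a * b for a, b in zip(x, w)))
```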

45 citations


Proceedings ArticleDOI
01 Nov 2019
TL;DR: This paper proposes a detailed router that judiciously handles hard-to-access pins and new design rules including length-dependent parallel run length spacing, end-of-line spacing with parallel edges, and corner-to-corner spacing, and that can effectively reduce the number of violations with comparable wirelength.
Abstract: Detailed routing becomes a crucial challenge in VLSI design with shrinking feature size and increasing design complexity. More complicated design rules were added to guarantee manufacturability, which made detailed routing an even harder task to achieve in the design flow. In this paper, we propose a detailed router that judiciously handles hard-to-access pins and new design rules including length-dependent parallel run length spacing, end-of-line spacing with parallel edges, and corner-to-corner spacing. Our experimental results show that our framework can effectively reduce the number of violations with comparable wirelength. Comparing our algorithm with the best score of each released design in the ISPD'19 Contest, there is a 2% score improvement. Compared with the state-of-the-art work [1], our algorithm achieves 69% better scores. The source code of Dr. CU 2.0 is available at https://github.com/cuhk-eda/dr-cu.

40 citations


Journal ArticleDOI
TL;DR: Experimental results demonstrate that the enhanced prefix adder synthesis algorithm can achieve a Pareto frontier of high quality over a wide design space, bridging the gap between architectural and physical designs.
Abstract: Despite the maturity of modern electronic design automation (EDA) tools, designs optimized at the architectural stage may become suboptimal after going through the physical design flow. Adder design has long been studied as a fundamental problem in the very large-scale integration industry, yet designers cannot achieve optimal solutions by running EDA tools on the set of available prefix adder architectures. In this paper, we enhance a state-of-the-art prefix adder synthesis algorithm to obtain a much wider solution space in the architectural domain. On top of that, a machine learning-based design space exploration methodology is applied to predict the Pareto frontier of the adders in the physical domain, which is infeasible by exhaustively running EDA tools for innumerable architectural solutions. Considering the high cost of obtaining the true values for learning, an active learning algorithm is proposed to select the representative data during the learning process, which uses less labeled data while achieving a better quality of Pareto frontier. Experimental results demonstrate that our framework can achieve a Pareto frontier of high quality over a wide design space, bridging the gap between architectural and physical designs. Source code and data are available at https://github.com/yuzhe630/adder-DSE.
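As a reminder of what a Pareto frontier is in this setting, here is a minimal sketch that extracts the non-dominated points from a set of (area, delay) pairs, where lower is better in both. The sample numbers are made up for illustration, and the sketch covers only the frontier definition, not the paper's learning-based prediction of it.

```python
def pareto_front(points):
    """Return the (area, delay) points not dominated by any other point.

    A point is dominated if another point is no worse in both objectives
    and strictly better in at least one. Lower is better for both.
    """
    front = []
    best_delay = float("inf")
    # After sorting by area, a point is on the frontier only if it improves
    # on the best delay seen among all points with smaller or equal area.
    for area, delay in sorted(set(points)):
        if delay < best_delay:
            front.append((area, delay))
            best_delay = delay
    return front

# Hypothetical adder implementations as (area, delay) pairs.
candidates = [(10, 5.0), (12, 4.0), (11, 6.0), (15, 4.0), (9, 7.0)]
print(pareto_front(candidates))
```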

25 citations


Proceedings ArticleDOI
02 Jun 2019
TL;DR: This work proposes an LSPD parameter recommender system that involves learning a collaborative prediction model through tensor decomposition and regression and demonstrates the transfer-learning properties of this approach by showing that this model can be successfully applied for 7nm designs.
Abstract: Logic synthesis and physical design (LSPD) tools automate complex design tasks previously performed by human designers. One time-consuming task that remains manual is configuring the LSPD flow parameters, which significantly impacts design results. To reduce the parameter-tuning effort, we propose an LSPD parameter recommender system that involves learning a collaborative prediction model through tensor decomposition and regression. Using a model trained with archived data from multiple state-of-the-art 14nm processors, we reduce the exploration cost while achieving comparable design quality. Furthermore, we demonstrate the transfer-learning properties of our approach by showing that this model can be successfully applied for 7nm designs.

23 citations


Journal ArticleDOI
TL;DR: This paper investigates physical obfuscation techniques, which perform alterations of circuit elements that are difficult or impossible for an adversary to observe, and provides a categorization of the available physical obfuscation techniques as they pertain to various design stages.
Abstract: The threat of hardware reverse engineering is a growing concern for a large number of applications. A main defense strategy against reverse engineering is hardware obfuscation. In this paper, we investigate physical obfuscation techniques, which perform alterations of circuit elements that are difficult or impossible for an adversary to observe. Examples of such stealthy manipulations are changes in the doping concentrations or dielectric manipulations. An attacker will, thus, extract a netlist that does not correspond to the logic function of the device-under-attack. This approach of camouflaging has garnered recent attention in the literature. In this paper, we expound on this promising direction to conduct a systematic end-to-end study of the VLSI design process to find multiple ways to obfuscate a circuit for hardware security. This paper makes three major contributions. First, we provide a categorization of the available physical obfuscation techniques as they pertain to various design stages. There is a large and multidimensional design space for introducing obfuscated elements and mechanisms, and the proposed taxonomy is helpful for a systematic treatment. Second, we provide a review of the methods that have been proposed or are in use. Third, we present recent and new device and logic-level techniques for design obfuscation. For each technique considered, we discuss the feasibility of the approach and assess the likelihood of its detection. Then we turn our focus to open research questions, and conclude with suggestions for future research directions.

20 citations


Proceedings ArticleDOI
06 Oct 2019
TL;DR: This paper proposes a predictive PDK for the 10 nm-diameter silicon-nanowire TIGFET device and shows 26% and 41% area reduction in the case of an XOR gate and a 1-bit full-adder design respectively.
Abstract: The Three-Independent-Gate Field-Effect Transistor (TIGFET) is a promising beyond-CMOS technology which offers many unique properties, such as (i) dynamic control of the device polarity, (ii) dual threshold operation and (iii) more expressive logic capabilities. The efficient exploitation of these properties provides an opportunity to design area- and power-optimized logic circuits. However, the evaluation of TIGFET-based design currently relies on a close approximation for the Power, Performance, and Area (PPA) rather than traditional layout-based methods. There is a need for a publicly available Process Design Kit (PDK) enabling systematic evaluation of the design area. In this paper, we propose a predictive PDK for the 10 nm-diameter silicon-nanowire TIGFET device. This work consists of a SPICE model and full custom physical design files including a Design Rule Manual and Design Rule Check and Layout Versus Schematic decks for Calibre®. We then validate the design rules through the implementation of basic logic gates and a full-adder and compare extracted metrics with the FreePDK15nm™ PDK. We show 26% and 41% area reduction in the case of an XOR gate and a 1-bit full-adder design respectively.

20 citations


Journal ArticleDOI
TL;DR: This approach is capable of reducing the number of transistors of a circuit by 21% on average when compared to the traditional solution, and by 4% when compared to other logic minimization tools, which provides an important leakage power reduction.
Abstract: Many ASICs use far more transistors than necessary, as they use a library of cells with a limited number of logic functions. This small number of logic functions in a traditional cell library represents an inherent limitation in the optimization of the number of transistors. In modern technologies, static power consumption is related to the number of transistors. To reduce leakage power, it is necessary to optimize the number of transistors. This demands a library-free physical design approach, using tools that allow the automatic layout synthesis of any transistor network. The goal of this paper is to present a new method to optimize the logical netlist, aiming to reduce the number of transistors and connections, as well as the number of vias. The original logic netlist generated in a traditional design flow is post-processed, and a set of basic cells is replaced by just one new equivalent logic gate, thus reducing the number of transistors. Connected cells of unitary fanout are considered for merging into a new complex gate (usually not available in a traditional cell library). This approach is capable of reducing the number of transistors of a circuit by 21% on average when compared to the traditional solution, and by 4% when compared to other logic minimization tools. This provides an important leakage power reduction.

Proceedings ArticleDOI
01 Apr 2019
TL;DR: The concept of the connection-based routing principle is elaborated on, the algorithm is improved and a timing-driven version is introduced and high-quality results are obtained in 3.4x less routing runtime.
Abstract: FPGA routing is an important part of physical design as the programmable interconnection network requires the majority of the total silicon area and the connections largely contribute to delay and power. It should also run with minimum runtime to enable efficient design exploration. In this work we elaborate on the concept of the connection-based routing principle. The algorithm is improved and a timing-driven version is introduced. The router, called CRoute, is implemented in an easy-to-adapt FPGA CAD framework written in Java, which is publicly available on GitHub. Quality and runtime are compared to the state-of-the-art router in VPR 7.0.7. Benchmarking is done with the Titan23 design suite, which consists of large heterogeneous designs targeted to a detailed representation of the Stratix IV FPGA. CRoute gains in both the total wire-length and maximum clock frequency while reducing the routing runtime. The total wire-length reduces by 11% and the maximum clock frequency increases by 6%. These high-quality results are obtained in 3.4x less routing runtime.

Proceedings ArticleDOI
25 Mar 2019
TL;DR: In this paper, a new paradigm is proposed to enhance the security for split manufacturing by embedding keys in the BEOL layout in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks.
Abstract: Split manufacturing was introduced as an effective countermeasure against hardware-level threats such as IP piracy, overbuilding, and insertion of hardware Trojans. Nevertheless, the security promise of split manufacturing has been challenged by various attacks, which exploit the well-known working principles of physical design tools to infer the missing BEOL interconnects. In this work, we advocate a new paradigm to enhance the security for split manufacturing. Based on Kerckhoffs’s principle, we protect the FEOL layout in a formal and secure manner, by embedding keys. These keys are purposefully implemented and routed through the BEOL in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks. We provide our secure physical design flow to the community. We also define the security of split manufacturing formally and provide the associated proofs. At the same time, our technique is competitive with current schemes in terms of layout overhead, especially for practical, large-scale designs (ITC’99 benchmarks).

Proceedings ArticleDOI
21 Jan 2019
TL;DR: IR-ATA is presented, a novel flow for modeling the timing impact of IR drop during the physical design and timing closure of an ASIC chip, allowing the physical designers to explore tradeoffs that were previously, for lack of methodology, not possible.
Abstract: This paper presents IR-ATA, a novel flow for modeling the timing impact of IR drop during the physical design and timing closure of an ASIC chip. We first illustrate how the current and conventional mechanism for budgeting the IR drop and voltage noise (by using hard margins) leads to sub-optimal design. Consequently, we propose a new approach for modeling and margining against voltage noise, such that each timing path is margined based on its own topology and its own view of voltage noise. By having such a path-based margining mechanism, the margins for IR drop and voltage noise for most timing paths in the design are safely relaxed. The reduction in the margin increases the available timing slack that could be used for improving the power, performance, and area of a design. Finally, we illustrate how IR-ATA could be used to track the timing impact of physical or PDN changes, allowing the physical designers to explore tradeoffs that were previously, for lack of methodology, not possible.
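The difference between a hard margin and a path-based margin can be made concrete with a small arithmetic sketch. Everything here is an assumption for illustration and not taken from IR-ATA: cell delay is modeled as growing linearly with local voltage drop through a hypothetical `sensitivity` (fractional delay increase per mV), and the delays, drops, and clock period are invented numbers.

```python
def slack_flat(period, cell_delays, worst_drop_mv, sensitivity):
    """Slack when every cell is derated by the chip-wide worst-case IR drop."""
    derate = 1.0 + sensitivity * worst_drop_mv
    return period - sum(cell_delays) * derate

def slack_path_based(period, cell_delays, local_drops_mv, sensitivity):
    """Slack when each cell on the path is derated by its own local IR drop."""
    derated = (d * (1.0 + sensitivity * v)
               for d, v in zip(cell_delays, local_drops_mv))
    return period - sum(derated)

# Hypothetical three-cell path (delays in ps, drops in mV).
delays = [100.0, 120.0, 80.0]
local_drops = [10.0, 30.0, 5.0]
flat = slack_flat(340.0, delays, worst_drop_mv=50.0, sensitivity=0.002)
path = slack_path_based(340.0, delays, local_drops, sensitivity=0.002)
print(f"flat-margin slack: {flat:.1f} ps, path-based slack: {path:.1f} ps")
```

Under these assumed numbers the path-based view recovers about 20 ps of slack on this path, because none of its cells actually sees the worst-case drop.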

Proceedings ArticleDOI
21 Jan 2019
TL;DR: This work studies a new criterion for the classic challenge of VLSI netlist clustering: how well netlist clusters "stay together" through final implementation, and empirically demonstrates that modularity-based clustering achieves better correlation to actual netlist placements than traditional VLSI CAD methods.
Abstract: In advanced technology nodes, IC implementation faces increasing design complexity as well as ever-more demanding design schedule requirements. This raises the need for new decomposition approaches that can help reduce problem complexity, in conjunction with new predictive methodologies that can help avoid bottlenecks and loops in the physical implementation flow. Notably, with modern design methodologies it would be very valuable to better predict final placement of the gate-level netlist: this would enable more accurate early assessment of performance, congestion and floorplan viability in the SOC floorplanning/RTL planning stages of design. In this work, we study a new criterion for the classic challenge of VLSI netlist clustering: how well netlist clusters "stay together" through final implementation. We propose use of several evaluators of this criterion. We also explore the use of modularity-driven clustering to identify natural clusters in a given graph without the tuning of parameters and size balance constraints typically required by VLSI CAD partitioning methods. We find that the netlist hypergraph-to-graph mapping can significantly affect quality of results, and we experimentally identify an effective recipe for weighting that also comprehends topological proximity to I/Os. Further, we empirically demonstrate that modularity-based clustering achieves better correlation to actual netlist placements than traditional VLSI CAD methods (our method is also 4X faster than use of hMetis for our largest testcases). Finally, we show a potential flow with fast "blob placement" of clusters to evaluate netlist and floorplan viability in early design stages; this flow can predict gate-level placement of 370K cells in 200 seconds on a single core.
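Two ingredients mentioned above, the hypergraph-to-graph mapping and modularity as a clustering score, can be sketched in a few lines. The clique expansion with weight 1/(k-1) per pair used here is a common textbook choice and is an assumption, not the weighting recipe the paper identifies; the tiny netlist is likewise invented.

```python
from collections import defaultdict
from itertools import combinations

def clique_expand(nets):
    """Map a netlist hypergraph to a weighted graph: a k-pin net adds 1/(k-1) per pin pair."""
    weights = defaultdict(float)
    for net in nets:
        k = len(net)
        if k < 2:
            continue
        for u, v in combinations(sorted(net), 2):
            weights[(u, v)] += 1.0 / (k - 1)
    return dict(weights)

def modularity(weights, cluster_of):
    """Weighted modularity: sum over clusters of (internal/m) - (cluster degree/(2m))^2."""
    total = sum(weights.values())
    degree = defaultdict(float)
    inside = defaultdict(float)
    for (u, v), w in weights.items():
        degree[u] += w
        degree[v] += w
        if cluster_of[u] == cluster_of[v]:
            inside[cluster_of[u]] += w
    cluster_degree = defaultdict(float)
    for node, d in degree.items():
        cluster_degree[cluster_of[node]] += d
    return sum(inside[c] / total - (cluster_degree[c] / (2 * total)) ** 2
               for c in cluster_degree)

# Two 3-pin nets joined by one 2-pin net.
nets = [{"a", "b", "c"}, {"d", "e", "f"}, {"c", "d"}]
graph = clique_expand(nets)
two_clusters = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_cluster = {n: 0 for n in "abcdef"}
print(modularity(graph, two_clusters), modularity(graph, one_cluster))
```

A modularity-driven clusterer searches for the assignment that maximizes this score; here the natural two-cluster split scores higher than lumping all cells together.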

Proceedings ArticleDOI
04 Apr 2019
TL;DR: This paper proposes a novel framework, called ROAD, which diagnoses explicit reasons for routing failures and provides human-interpretable explanations for conflicted routing conditions, and demonstrates that ROAD successfully examines conflict causes for diverse pin layouts.
Abstract: Routability diagnosis has increasingly become the bottleneck in detailed routing for sub-10nm technology due to the limited tracks, high density, and complex design rules. The conventional ways to examine the routability of detailed routing are ILP- and SAT-based techniques. However, once we identify the routability, the diagnosis remains an open problem for physical designers. In this paper, we propose a novel framework, called ROAD, which diagnoses explicit reasons for routing failures. The proposed ROAD framework utilizes a diagnosis-friendly SAT formulation to represent a design's layout and diagnoses the routability with SAT solving techniques. Based on the diagnosis, ROAD provides human-interpretable explanations for conflicted routing conditions. To show the practical value of our framework, we also generate comprehensive test-sets that enable exhaustive exploration of layouts based on Rent's rule. We demonstrate that ROAD successfully examines conflict causes for diverse pin layouts. Through extensive diagnosis, we also present several key findings for design failure. ROAD performs routability diagnosis within 2 minutes on average for 90 grids testsets, while diagnosing the exact causes of routing failures in terms of congestion and conditional design rules.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This work proposes a data-independent reinforcement learning (RL) based routing model called Alpha-PD-Router, which learns to route a circuit and correct short violations, based on a two-player collaborative game model that has been trained on a small circuit.
Abstract: Advances in Electronic Design Automation (EDA) methods have led designers and programmers to search for new ways to solve the complex problems seen in today’s Very Large Scale Integration circuits. Machine learning (ML), especially supervised learning, has been used to predict design rule violations. However, supervised learning requires a large amount of labeled data. With the competitive nature of EDA-based companies, there is limited access to benchmarks and labeled data. In this work, we propose a data-independent reinforcement learning (RL) based routing model called Alpha-PD-Router, which learns to route a circuit and correct short violations. The Alpha-PD-Router is based on a two-player collaborative game model that has been trained on a small circuit and successfully resolves 75 violations in 99 cases of 2-pin net arrangements in the testing phase. The proposed model has the potential to be used as a framework to develop RL-based routing techniques untethered by the scarce availability of large routing data samples or designer expertise.

Proceedings ArticleDOI
17 May 2019
TL;DR: This paper focuses on analyzing efficient CTS techniques for optimal power and timing convergence in SoC Partition with Multisource Clock Tree Synthesis and Multibit Flip-Flop usage with Clock Tree awareness.
Abstract: Physical design is the process of converting a circuit description at Register Transfer Level into the physical layout. It primarily focuses on timing, power and area optimization by applying different optimization techniques at each stage of the design. Clock Tree Synthesis (CTS) is an important step in physical design flow. CTS builds the clock tree by balancing the skew in the entire design for all the clocks present. The conventional flow of CTS is inefficient at many points due to the increasing complexity of Integrated Circuits as a result of changing technology nodes. This paper focuses on analyzing efficient CTS techniques for optimal power and timing convergence in SoC Partition. The methodologies adopted for CTS are Multisource Clock Tree Synthesis and Multibit Flip-Flop usage with Clock Tree awareness.

Journal ArticleDOI
TL;DR: A low-power and area-efficient quaternary adder based on CNTFET switching logic is proposed, which significantly reduces the number of transistors, area and power consumption, while maintaining output driving capability and full swing operation.
Abstract: Due to the increasing short channel effects in scaled CMOS circuits, the need for alternative technologies has substantially increased. Moreover, the limitation in space consumed by interconnects and increased power density in nanoscale binary circuits have challenged the scaling process to achieve more efficient and denser circuits. Accordingly, designing efficient nanoscale multiple-valued circuits is of great importance. In this paper, a low-power and area-efficient quaternary adder based on CNTFET switching logic is proposed. The proposed design significantly reduces the number of transistors, area and power consumption, while maintaining output driving capability and full swing operation. The proposed design is comprehensively simulated using HSPICE and the Stanford CNTFET model. Furthermore, the layout of the proposed circuit is drawn using the physical design tool for CNTFET-based circuits. The results confirm significant improvements in terms of area, average power consumption, PDP, static power dissipation and sensitivity to process variations compared to its state-of-the-art counterparts. Also, the proposed quaternary full adder is used as the building block of a 4-digit quaternary ripple carry adder, and the simulation results indicate its superiority in terms of energy efficiency.

Proceedings ArticleDOI
15 Jul 2019
TL;DR: A physical design methodology is proposed which tackles the FCN design problem by simply ignoring the clocking constraints and using adjusted conventional place and route algorithms, and results extracted from a physics simulator confirm the feasibility of the approach.
Abstract: Field-Coupled Nanocomputing (FCN) allows for conducting computations with a power consumption that is orders of magnitude below current CMOS technologies. Recent physical implementations confirmed these prospects and put pressure on the Electronic Design Automation (EDA) community to develop physical design methods comparable to those available for conventional circuits. While the major design task boils down to a place and route problem, certain characteristics of FCN circuits introduce further challenges in terms of dedicated clock arrangements which lead to rather cumbersome clocking constraints. Thus far, those constraints have been addressed in a rather unsatisfactory fashion only. In this work, we propose a physical design methodology which tackles this problem by simply ignoring the clocking constraints and using adjusted conventional place and route algorithms. In order to deal with the resulting ramifications, a dedicated synchronization element is introduced. Results extracted from a physics simulator confirm the feasibility of the approach. A proof of concept implementation illustrates that ignoring clocking constraints indeed allows for a promising alternative direction for FCN design that overcomes the obstacles preventing the development of efficient solutions thus far.

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper presents a routing strategy that decomposes global routing into three stages, with different objectives associated with each stage, in contrast to conventional approaches, which usually use a single global optimization objective for driving the entire process.
Abstract: The routing stage is one of the most time-consuming steps in System on Chip (SoC) physical design. For large designs, it can take days of effort to find a complete routing solution, and the result directly affects the circuit performance. In this paper, we present a routing strategy that decomposes global routing into three stages, with different objectives associated with each stage. This is in contrast to conventional approaches, which usually use a single global optimization objective for driving the entire process. Furthermore, we propose to use generative adversarial networks (GAN) to predict the congestion heatmap. This deep learning method has been used to successfully improve image recognition results. We adapt its use to global routing by converting data between the router and the image-based model. This model needs only placement and netlist information as input to make the forecast. Our GAN-based congestion estimator produces congestion heatmaps that show good fidelity with actual heatmaps produced by state-of-the-art global routers. Using this heatmap along with our modified routing flow, we achieve comparable global routing quality in terms of the total overflow and wirelength, but the runtime speedup on hard-to-route designs is significant.

Proceedings ArticleDOI
21 Jun 2019
TL;DR: This article shows that, depending on the application, the image formation on a sensor and phenomena like image noise have to be simulated accurately in order to achieve meaningful results while other aspects, such as photorealistic scene modeling, can be omitted.
Abstract: In the emerging field of computational imaging, rapid prototyping of new camera concepts becomes increasingly difficult since the signal processing is intertwined with the physical design of a camera. As novel computational cameras capture information other than the traditional two-dimensional information, ground truth data, which can be used to thoroughly benchmark a new system design, is also hard to acquire. We propose to bridge this gap by using simulation. In this article, we present a raytracing framework tailored for the design and evaluation of computational imaging systems. We show that, depending on the application, the image formation on a sensor and phenomena like image noise have to be simulated accurately in order to achieve meaningful results while other aspects, such as photorealistic scene modeling, can be omitted. Therefore, we focus on accurately simulating the mandatory components of computational cameras, namely apertures, lenses, spectral filters and sensors. Besides the simulation of the imaging process, the framework is capable of generating various ground truth data, which can be used to evaluate and optimize the performance of a particular imaging system. Due to its modularity, it is easy to further extend the framework to the needs of other fields of application. We make the source code of our simulation framework publicly available and encourage other researchers to use it to design and evaluate their own camera designs.

Proceedings ArticleDOI
20 Feb 2019
TL;DR: This paper proposes a generic FPGA placement framework that can simultaneously optimize placement quality and ensure clock feasibility by explicit clock tree construction and demonstrates the effectiveness and efficiency of the proposed approach using the ISPD 2017 Clock-Aware Placement Contest benchmark suite.
Abstract: Modern field-programmable gate array (FPGA) devices often contain complex clocking architectures to achieve high-performance and flexible clock networks. The physical structure of these clock networks, however, is pre-manufactured and unadjustable, with only limited routing resources. Most conventional FPGA placement algorithms rarely consider clock feasibility, and therefore lead to clock routing failures. Some recent works adopt simplified clock routing models (e.g., the bounding box model) to force clock legality during placement, which, however, can often overestimate clock routing demands and result in unnecessary placement quality degradation. To address these limitations, in this paper, we propose a generic FPGA placement framework that can simultaneously optimize placement quality and ensure clock feasibility by explicit clock tree construction. We demonstrate the effectiveness and efficiency of the proposed approach using the ISPD 2017 Clock-Aware Placement Contest benchmark suite. Compared with other state-of-the-art clock legalization algorithms, the proposed approach can achieve the best routed wirelength with competitive runtime.
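
To illustrate why a bounding box model can overestimate clock routing demand, the sketch below compares the clock regions that actually contain cells of a clock net with the regions covered by the net's bounding box. This is a simplified illustration, not the paper's algorithm: the rectangular clock-region grid, the region dimensions, and the function names are assumptions. A real clock tree would use some set of regions between these two bounds.

```python
def occupied_regions(cells, region_w, region_h):
    """Clock regions that contain at least one cell of the clock net."""
    return {(int(x // region_w), int(y // region_h)) for x, y in cells}

def bounding_box_regions(cells, region_w, region_h):
    """All clock regions inside the bounding box of the net's occupied regions."""
    occupied = occupied_regions(cells, region_w, region_h)
    cols = [col for col, _ in occupied]
    rows = [row for _, row in occupied]
    return {
        (col, row)
        for col in range(min(cols), max(cols) + 1)
        for row in range(min(rows), max(rows) + 1)
    }

# Two clock sinks at opposite corners of a 3x3 block of 10x10 clock regions.
cells = [(1.0, 1.0), (25.0, 25.0)]
occupied = occupied_regions(cells, 10.0, 10.0)
bbox = bounding_box_regions(cells, 10.0, 10.0)
```

Here the net touches only two regions, while the bounding box model charges clock demand to all nine, which is the kind of pessimism that explicit clock tree construction is meant to avoid.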

Journal ArticleDOI
TL;DR: A physical design methodology is presented to synchronize digital application specific integrated circuit (ASIC) designs by a resonant rotary clock network, demonstrating that the ASIC products of RotaSYN can operate at previously unattainable (relatively) low-frequency ranges of hundreds of megahertz.
Abstract: A physical design methodology is presented to synchronize digital application-specific integrated circuit (ASIC) designs by a resonant rotary clock network. One novelty of the proposed RotaSYN flow is that the ASIC products of RotaSYN can operate at previously unattainable (relatively) low-frequency ranges of hundreds of megahertz. A dynamic resonant frequency divider is used to implement the low-frequency operation, where "low" is relative to the gigahertz-range norm of operation for resonant clocking. In SPICE-based simulations, the efficacy of the proposed flow and novel algorithms in RotaSYN is demonstrated using the performance metrics of wirelength, skew, and power on the International Symposium on Physical Design (ISPD-10) clock benchmark circuits. In addition, RotaSYN is evaluated on three publicly available industrial designs, including the ARM Cortex M0, against equivalent clocks generated with a traditional phase-locked loop (PLL) and distributed with an industrial clock tree synthesis tool flow. The RotaSYN methodology is implemented at three different target frequencies of 880 MHz, 500 MHz, and 220 MHz for the industrial designs. SPICE simulations show an average of 29% power savings for the industrial designs overall, solely thanks to 66% power savings on the clock generation and distribution networks, operating at a frequency of 880 MHz in comparison to the PLL-based design with a clock tree synthesized with an industrial EDA tool.

Proceedings ArticleDOI
28 May 2019
TL;DR: This work examines an alternate design based on a spatial-spectral filter made up of a linear array of materials that divide the incident x-ray beam into spectrally varied beamlets and characterize the effects of design parameters including filter tile order and filter tile width and their impact on material decomposition performance.
Abstract: Spectral CT is an emerging modality that uses a data acquisition scheme with varied spectral responses to provide enhanced material discrimination in addition to the structural information of conventional CT. Existing clinical and preclinical designs with this capability include kV-switching, split-filtration, and dual-layer detector systems to provide two spectral channels of projection data. In this work, we examine an alternate design based on a spatial-spectral filter. This source-side filter is made up of a linear array of materials that divide the incident x-ray beam into spectrally varied beamlets. This design allows for any number of spectral channels; however, each individual channel is sparse in the projection domain. Model-based iterative reconstruction methods can accommodate such sparse spatial-spectral sampling patterns and allow for the incorporation of advanced regularization. With the goal of an optimized physical design, we characterize the effects of design parameters including filter tile order and filter tile width and their impact on material decomposition performance. We present results of numerical simulations that characterize the impact of each design parameter using a realistic CT geometry and noise model to demonstrate feasibility. Results for filter tile order show little change, indicating that filter order is a low-priority design consideration. We observe improved performance for narrower filter widths; however, the performance drop-off is relatively flat, indicating that wider filter widths are also feasible designs.
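
The two design parameters studied above, tile order and tile width, can be pictured with a minimal sketch that maps a position along the filter to a spectral channel. This is a toy model under stated assumptions: the tiles repeat periodically, the positions are measured in the filter plane, the material labels "A", "B", "C" are placeholders, and the projection geometry from source-side filter to detector is ignored.

```python
def tile_channel(position_mm, tile_width_mm, tile_order):
    """Return the filter material covering a given position along the tiled filter."""
    index = int(position_mm // tile_width_mm) % len(tile_order)
    return tile_order[index]

def channel_coverage(positions_mm, tile_width_mm, tile_order):
    """Fraction of sampled positions seen by each spectral channel."""
    counts = {material: 0 for material in tile_order}
    for position in positions_mm:
        counts[tile_channel(position, tile_width_mm, tile_order)] += 1
    return {material: n / len(positions_mm) for material, n in counts.items()}

# Twelve samples spaced 1 mm apart behind 2 mm wide tiles in the order A, B, C.
order = ["A", "B", "C"]
positions = [i + 0.5 for i in range(12)]
pattern = [tile_channel(p, 2.0, order) for p in positions]
coverage = channel_coverage(positions, 2.0, order)
```

With three materials each channel covers only a third of the positions, which is the sparsity in the projection domain that the abstract says model-based iterative reconstruction has to accommodate.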

Proceedings ArticleDOI
TL;DR: In this paper, a new paradigm is proposed to enhance the security for split manufacturing by embedding keys in the BEOL layout in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks.
Abstract: Split manufacturing was introduced as an effective countermeasure against hardware-level threats such as IP piracy, overbuilding, and insertion of hardware Trojans. Nevertheless, the security promise of split manufacturing has been challenged by various attacks, which exploit the well-known working principles of physical design tools to infer the missing BEOL interconnects. In this work, we advocate a new paradigm to enhance the security for split manufacturing. Based on Kerckhoffs's principle, we protect the FEOL layout in a formal and secure manner, by embedding keys. These keys are purposefully implemented and routed through the BEOL in such a way that they become indecipherable to the state-of-the-art FEOL-centric attacks. We provide our secure physical design flow to the community. We also define the security of split manufacturing formally and provide the associated proofs. At the same time, our technique is competitive with current schemes in terms of layout overhead, especially for practical, large-scale designs (ITC'99 benchmarks).

Proceedings ArticleDOI
01 Sep 2019
TL;DR: This paper proposes a generic, scalable, and efficient algorithm to automate not only 2D but also 2.5D and 3D heterogeneous MCPM layouts considering hierarchy, and demonstrates the benefits of a hierarchical design methodology over the state-of-the-art approaches.
Abstract: Multi-chip power module (MCPM) layout design automation has been identified as one of the primary research interests in the power electronics community with the advent of wide bandgap circuits. MCPM physical design requires a time-consuming iterative procedure that is so far explored manually based on the experience of the designers. Though the number of components and routing layers is limited in power electronics, careful physical design is required because of thermal and reliability issues. In this paper, the benefits of a hierarchical design methodology are demonstrated over the state-of-the-art approaches. We propose a generic, scalable, and efficient algorithm to automate not only 2D but also 2.5D (multiple substrates in a planar package) and 3D (multiple device layers stacked on the same substrate) heterogeneous MCPM layouts considering hierarchy. A complete optimization approach for a full-bridge 2.5D power module is demonstrated using hardware-validated electrical and thermal models.

Proceedings ArticleDOI
05 May 2019
TL;DR: This paper proposes an effective method to exploit proximity information extracted from the FEOL circuit to reduce the size of the interconnection network that models the missing BEOL layers, which in turn significantly reduces the size of the resulting SAT problem.
Abstract: Split Manufacturing (SM) was introduced as an effective countermeasure to reverse engineering of integrated circuits and as a potential deterrent to Trojan insertion and overproduction. In SM, some wires, assigned to the back-end-of-line (BEOL) layers and fabricated at a secure facility, are hidden from the attacker. However, proximity information based attacks use physical design hints such as wire-length, combinational cycles and routing directions obtained from the FEOL (front-end-of-line) netlist to recover some or all of the BEOL signals. In addition, a recently proposed satisfiability (SAT) based attack models the BEOL signal recovery problem as a problem of configuring a key-controlled interconnect network and solves for the key values using a SAT solver. While this method can recover 100% of the BEOL signals, it takes an impractically long time for large circuits. In this paper, we propose an effective method to exploit proximity information extracted from the FEOL circuit to reduce the size of the interconnection network that models the missing BEOL layers, which in turn significantly reduces the size of the resulting SAT problem. This leads to efficient recovery of 100% of the ‘hidden’ BEOL signals even for large circuits. Experimental results using circuits from ISCAS85, ISCAS89 and ITC99 benchmark suites show that the proposed method is up to 80x faster than the SAT-only attack (without proximity information) while maintaining the 100% attack correctness for all combinational and sequential benchmarks.
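
The idea of shrinking the interconnection network with proximity information can be sketched as candidate pruning: each dangling FEOL sink keeps only its nearest candidate drivers instead of all of them. This is a minimal sketch, not the paper's method. It uses Manhattan distance alone (the abstract also lists combinational cycles and routing directions as hints), and the pin names, coordinates, and cutoff `k` are hypothetical. A cutoff that excludes the true driver would make full recovery impossible, so `k` would have to be chosen conservatively.

```python
def prune_candidates(sinks, drivers, k):
    """For each sink pin, keep the k nearest driver pins by Manhattan distance."""
    pruned = {}
    for sink, (sx, sy) in sinks.items():
        ranked = sorted(
            drivers,
            key=lambda d: abs(drivers[d][0] - sx) + abs(drivers[d][1] - sy),
        )
        pruned[sink] = ranked[:k]
    return pruned

def network_size(candidates):
    """Number of sink-driver connection options the SAT model must encode."""
    return sum(len(options) for options in candidates.values())

# Hypothetical dangling pins left in the FEOL layout, as (x, y) coordinates.
sinks = {"s1": (0, 0), "s2": (10, 10)}
drivers = {"d1": (1, 0), "d2": (9, 10), "d3": (5, 5)}

full = {sink: list(drivers) for sink in sinks}
pruned = prune_candidates(sinks, drivers, k=2)
```

In this toy case the option count drops from 6 to 4; on large circuits the same kind of reduction applies to a much larger full network, which is where a smaller SAT instance would pay off.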

Patent
25 Jun 2019
TL;DR: In this paper, methods, systems, and articles of manufacture for implementing virtual prototyping for electronic designs are described, including methods to identify a plurality of leaf cells into a hierarchical physical design of an electronic design.
Abstract: Disclosed are methods, systems, and articles of manufacture for implementing virtual prototyping for electronic designs. These techniques identify a plurality of leaf cells into a hierarchical physical design of an electronic design, generate the hierarchical physical design at least by performing hierarchical placement for the plurality of leaf cells based in part or in whole upon one or more factors, and revise the placed hierarchical physical design at least by performing hierarchical routing for the plurality of leaf cells on the hierarchical physical design. One aspect may further detach a virtual cell in the hierarchical physical design at least by grouping a first set of leaf cells and representing the first set of leaf cells with a first placeholder.

Book
25 Sep 2019
TL;DR: This book covers SoC design methodology from specification to layout, including DFT and synthesis, static timing analysis (STA), VLSI system verification, physical design, and a low-power UPF flow.
Abstract: Introduction -- SoC Design Methodology -- System on Chip Components -- DFT and Synthesis -- Static Timing Analysis (STA) -- VLSI System Verification -- Physical Design -- Advanced Techniques: Low Power UPF Flow -- Reference Design: Specification to Layout.