
Showing papers on "Very-large-scale integration published in 2022"


Book ChapterDOI
01 Jan 2022
TL;DR: Several machine learning algorithms that have been developed and are being widely used in Very Large Scale Integrated Circuits (VLSI) are summarized.
Abstract: In the past few decades, machine learning, a subset of artificial intelligence (AI), has emerged as a disruptive technology that is now extensively used and has stretched across various domains. Among its numerous applications, one of the most significant advancements due to machine learning is in the field of Very Large Scale Integrated Circuits (VLSI). Further growth and improvements in this field are highly anticipated in the near future. The fabrication of thousands of transistors in VLSI is time-consuming and complex, which demanded the automation of the design process; hence, computer-aided design (CAD) tools and technologies started to evolve. The incorporation of machine learning in VLSI involves the application of machine learning algorithms at different abstraction levels of VLSI CAD. In this paper, we summarize several machine learning algorithms that have been developed and are being widely used. We also briefly discuss how machine learning methods have permeated the layers of the VLSI design process, from register transfer level (RTL) assertion generation to static timing analysis (STA), with smart and efficient models and methodologies, further enhancing the quality of chip design through power, performance, and area improvements as well as reductions in complexity and turnaround time.
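As a hedged illustration of the kind of ML-for-STA task this survey covers, the sketch below fits a regression model that predicts a timing path's sign-off slack from a few pre-route features. The feature set and training data are synthetic stand-ins invented for the example, not anything from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
# Hypothetical per-path features: logic depth, max fanout, wirelength estimate (um).
X = rng.uniform(low=[1, 1, 10], high=[40, 64, 5000], size=(n, 3))
# Synthetic stand-in for sign-off slack (ns): worse with depth, fanout, wirelength.
y = 1.0 - 0.02 * X[:, 0] - 0.005 * X[:, 1] - 1e-4 * X[:, 2] + rng.normal(0, 0.05, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out paths:", round(model.score(X_te, y_te), 3))
```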

15 citations


Journal ArticleDOI
TL;DR: In this paper, the impact of placement and routing congestion on the performance of integrated circuits is investigated using the Improved Harmonic Search Optimization (IHOSO) algorithm and the perimeter degree technique (PDT).
Abstract: When used in conjunction with the current floorplan and the optimization technique in circuit design engineering, this research allows the evaluation of design parameters that can reduce congestion during integrated circuit fabrication. Testing the multiple alternative consequences of IC design is extremely beneficial in this situation. If placement and routing congestion concerns are underappreciated, the IC implementation may experience significant nonlinear problems throughout the process. Standard optimization techniques are not the most effective strategy for precisely estimating the nonlinear aspects of integrated circuit design. To this end, advanced tools such as Xilinx VIVADO and ICC2 have been developed, in addition to ICC1 and VIRTUOSO, to explore computations and recover the parameters required to design optimal placement and routing for a well-organized and ordered physical design. Furthermore, this work employs the perimeter degree technique (PDT) to measure routing congestion in both horizontal and vertical directions over a silicon chip region, and then applies the technique to lower the density of superfluous routing (DSR). Metaheuristic approaches to computation have grown in favor, particularly over the last two decades; placement is a classic graph theory problem and a common topic in optimization, yet graph formulations alone do not provide correct information about where and how nodes should be placed. Consequently, in conjunction with the optimized floorplan data, the optimized model created by the Improved Harmonic Search Optimization algorithm is tested and investigated to estimate and minimize the congestion that occurs during routing in VLSI circuit design.
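For concreteness, here is a minimal sketch of a plain harmony search loop over one-dimensional cell positions. The paper's improved variant (IHOSO) and its real congestion objective are not spelled out in this abstract, so the objective below — peak placement-bin occupancy plus a spread term — is purely a stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)
N_CELLS, N_BINS = 20, 10
IDEAL = np.linspace(0.0, 1.0, N_CELLS)        # evenly spread target slots

def congestion(x):
    # Stand-in objective: peak placement-bin occupancy plus a spread penalty.
    counts, _ = np.histogram(x, bins=N_BINS, range=(0.0, 1.0))
    return counts.max() + np.abs(np.sort(x) - IDEAL).mean()

# Harmony search parameters: memory size, memory-consideration rate,
# pitch-adjustment rate, bandwidth, iterations.
HMS, HMCR, PAR, BW, ITERS = 30, 0.9, 0.3, 0.05, 2000
memory = rng.uniform(0.0, 1.0, size=(HMS, N_CELLS))
scores = np.array([congestion(h) for h in memory])

for _ in range(ITERS):
    new = np.empty(N_CELLS)
    for j in range(N_CELLS):
        if rng.random() < HMCR:                    # recall a stored value
            new[j] = memory[rng.integers(HMS), j]
            if rng.random() < PAR:                 # pitch adjustment
                new[j] += rng.uniform(-BW, BW)
        else:                                      # random consideration
            new[j] = rng.random()
    new = np.clip(new, 0.0, 1.0)
    s = congestion(new)
    worst = scores.argmax()
    if s < scores[worst]:                          # replace the worst harmony
        memory[worst], scores[worst] = new, s

print("best stand-in congestion score:", round(scores.min(), 3))
```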

15 citations


Proceedings ArticleDOI
06 Jun 2022
TL;DR: This paper provides a comprehensive survey of available EDA and CAD tools, methods, processes, and techniques for Integrated Circuits (ICs) that use machine learning algorithms.
Abstract: An increase in demand for semiconductor ICs, recent advancements in machine learning, and the slowing down of Moore's law have all contributed to the increased interest in using Machine Learning (ML) to enhance Electronic Design Automation (EDA) and Computer-Aided Design (CAD) tools and processes. This paper provides a comprehensive survey of available EDA and CAD tools, methods, processes, and techniques for Integrated Circuits (ICs) that use machine learning algorithms. The ML-based EDA/CAD tools are classified based on the IC design steps. They are utilized in Synthesis, Physical Design (Floorplanning, Placement, Clock Tree Synthesis, Routing), IR drop analysis, Static Timing Analysis (STA), Design for Test (DFT), Power Delivery Network analysis, and Sign-off. The current landscape of ML-based VLSI-CAD tools, current trends, and future perspectives of ML in VLSI-CAD are also discussed.

12 citations


Journal ArticleDOI
TL;DR: The paper gives an overview of the recent methodologies that have been developed for the performance improvement of VLSI design, and it points out future directions for the areas of VLSI circuit design that deserve attention.
Abstract: Low power design is one of the primary goals for any integrated circuit. Very Large-Scale Integration (VLSI) is a kind of Integrated Circuit (IC) that consists of hundreds of thousands of transistors connected on a small chip. Communication and computer applications have grown very fast in the past decade thanks to VLSI circuit designs such as microcontrollers and microprocessors. However, research on VLSI is still moving rapidly toward power and area minimization. The paper gives an overview of the recent methodologies that have been developed for the performance improvement of VLSI design, and it shows future directions for the areas of VLSI circuit design on which effort should be concentrated.

8 citations


Journal ArticleDOI
27 Jan 2022 - Crystals
TL;DR: In this paper, the authors propose a configuration of carbon nanotube (CNT) bundles, namely, squarely packed bundles of mixed CNTs, for high-speed very-large-scale integration (VLSI) interconnects.
Abstract: The quest to reduce delay at the interconnect level is the main motivation of this paper, which arrives at a configuration of carbon nanotube (CNT) bundles, namely, squarely packed bundles of mixed CNTs. The demonstrated approach makes the mixed CNT bundle suitable for high-speed very-large-scale integration (VLSI) interconnects as technology shrinks. To reduce the delay of the proposed mixed CNT bundle configuration, the behavioral change of resistance (R), inductance (L), and capacitance (C) is observed with respect to both the width of the bundle and the diameter of the CNTs in the bundle. The performance of the modified bundle configuration is then compared with a previously developed configuration, namely, squarely packed bundles of dimorphic MWCNTs, in terms of propagation delay and crosstalk delay at local-, semiglobal-, and global-level interconnects. The proposed bundle configuration ultimately proves to be the better one for 32-nm and 16-nm technology nodes, and is suitable for 7-nm nodes as well.
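As a back-of-the-envelope companion to the R, L, C discussion, the sketch below computes an Elmore-style 50% delay for a driven distributed RC interconnect. The per-unit-length values are illustrative placeholders, not the paper's extracted mixed-CNT-bundle parasitics, and inductance is ignored for simplicity.

```python
def t50(rd, cl, r_pul, c_pul, length):
    """Elmore-style 50% delay of a driver (rd) driving a distributed RC
    line (r_pul, c_pul per meter, given length) into a load cap cl."""
    R, C = r_pul * length, c_pul * length
    return 0.69 * rd * (C + cl) + 0.38 * R * C + 0.69 * R * cl

# Hypothetical numbers for a 100 um local-level wire.
d = t50(rd=1e3, cl=1e-15, r_pul=1e7, c_pul=2e-10, length=100e-6)
print(f"estimated 50% delay: {d * 1e12:.2f} ps")
```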

8 citations


Proceedings ArticleDOI
20 Jan 2022
TL;DR: In this article, a lower built-in self-test (LBIST) mechanism is used to design a microprocessor, and the proposed methodology achieves performance measures such as 97.5% power efficiency together with delay and area improvements.
Abstract: Major VLSI circuits such as sequential circuits, linear chips, and op-amps are very important elements for providing many logic functions. Today's competitive devices such as cell phones, tablets, and notepads are most prominent, and they are used to perform 5G-related operations. In this work, a lower built-in self-test (LBIST) mechanism is used to design a microprocessor. The proposed methodology achieves performance measures of 97.5% power efficiency, a 2.5% improvement in delay, and a 32% improvement in area. This methodology outperforms and competes with present technology. The proposed equipment and execution for our approach require a constrained overhead (lower than 3% power) over conventional LBIST.
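To make the BIST machinery concrete, here is a minimal, generic sketch of the LBIST flavor of testing: an LFSR generates pseudo-random patterns and a MISR compacts the responses of a toy circuit under test into a signature. The tap positions and the circuit are illustrative, not the paper's design.

```python
def lfsr_step(state, taps, width):
    """One step of a Fibonacci LFSR: XOR the tap bits, shift left, insert."""
    fb = 0
    for t in taps:
        fb ^= (state >> t) & 1
    return ((state << 1) | fb) & ((1 << width) - 1)

def cut(pattern):
    """Toy combinational circuit under test (stand-in for real logic)."""
    return (pattern ^ (pattern >> 1)) & 0xF

WIDTH, TAPS = 4, (3, 2)          # x^4 + x^3 + 1, a maximal-length polynomial
state, signature = 0b1001, 0
for _ in range(15):              # full period of the 4-bit sequence
    state = lfsr_step(state, TAPS, WIDTH)
    # MISR-style compaction: shift the signature and fold in the response.
    signature = lfsr_step(signature, TAPS, WIDTH) ^ cut(state)
print(f"compacted signature: {signature:04b}")
```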

7 citations


Proceedings ArticleDOI
01 Aug 2022
TL;DR: Experimental results show that the proposed approach is very effective at finding suitable predictive models while simultaneously reducing overall power consumption, and the approach is applicable in the context of approximate computing, where hardware accelerators tolerate a certain degree of error at their outputs.
Abstract: With most VLSI design companies now being fabless, it is imperative to develop methods to protect their Intellectual Property (IP). One approach that has become very popular due to its relative simplicity and practicality is logic locking. One of the problems with traditional locking mechanisms is that the locking circuitry is built into the netlist that the VLSI design company delivers to the foundry, which then has access to the entire design, including the locking mechanism. This implies that the foundry could potentially tamper with this circuitry or reverse engineer it to obtain the locking key. One relatively new approach, coined logic locking through omission, or hardware redaction, maps a portion of the design to an embedded FPGA (eFPGA). The bitstream of the eFPGA then acts as the locking key. This new approach has been shown to be more secure, as the foundry has no access to the bitstream during the manufacturing stage. The obvious drawbacks are the increase in design complexity and the area and performance overheads associated with the eFPGA. In this work we propose, to the best of our knowledge, the first attack on this type of locking mechanism: we substitute the exact logic mapped onto the eFPGA with a synthesizable predictive model that replicates its behavior. We show that this approach is applicable in the context of approximate computing, where hardware accelerators tolerate a certain degree of error at their outputs. Experimental results show that our proposed approach is very effective at finding suitable predictive models while simultaneously reducing overall power consumption.
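A minimal sketch of the attack idea, under stated assumptions: treat the redacted eFPGA block as a black-box oracle, sample input/output pairs, and fit a synthesizable predictive model that approximates it. The 8-input "secret" function below is a hypothetical stand-in for the redacted logic, and the decision tree stands in for the paper's predictive models.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

def secret(bits):
    # Hypothetical redacted logic: (b0 AND b3) OR (b5 XOR b7).
    return int((bits[0] & bits[3]) | (bits[5] ^ bits[7]))

X = rng.integers(0, 2, size=(4000, 8))           # sampled input patterns
y = np.array([secret(x) for x in X])             # oracle responses

model = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X[:3000], y[:3000])
print("approximation accuracy on unseen inputs:",
      round(model.score(X[3000:], y[3000:]), 3))
```

A decision tree maps directly to combinational logic, which is what makes the substituted model synthesizable; in an approximate-computing setting, less-than-perfect accuracy at the outputs is tolerable by construction.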

6 citations


Proceedings ArticleDOI
Xu He, Zhiyong Fu, Yao Wang, Chang Liu, Yang Guo 
10 Jul 2022
TL;DR: This work proposes a fast machine learning-based net delay predictor that extracts full timing features using a look-ahead RC network, and shows that the proposed predictor achieves average correlation over 0.99 with the post-routing sign-off timing results obtained in Synopsys PrimeTime.
Abstract: Timing closure is a critical but effort-intensive task in VLSI design. At the placement stage, a fast and accurate net delay estimator is highly desirable to guide timing optimization prior to routing, and thus reduce timing pessimism and shorten the design turn-around time. To handle the timing uncertainty at the placement stage, we propose a fast machine learning-based net delay predictor that extracts full timing features using a look-ahead RC network. Experimental results show that the proposed timing predictor achieves average correlation over 0.99 with the post-routing sign-off timing results obtained in Synopsys PrimeTime.
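One plausible ingredient of such a look-ahead RC feature extraction is the classic Elmore delay over an RC tree, sketched below; the tree topology and element values are invented for illustration and are not the paper's features.

```python
# Each node: (parent index, resistance to parent in ohms, node cap in farads).
rc_tree = [(None, 0.0, 5e-15),   # 0: driver/root
           (0, 100.0, 3e-15),    # 1: internal node
           (1, 150.0, 2e-15),    # 2: sink A
           (1, 80.0,  4e-15)]    # 3: sink B

def downstream_cap(tree):
    # Children have larger indices than parents here, so sweep backwards.
    cap = [c for _, _, c in tree]
    for i in range(len(tree) - 1, 0, -1):
        cap[tree[i][0]] += cap[i]
    return cap

def elmore(tree, sink):
    # Sum of (resistance to parent) * (cap downstream of that resistance)
    # along the path from the sink back up to the root.
    cdown = downstream_cap(tree)
    delay, node = 0.0, sink
    while tree[node][0] is not None:
        delay += tree[node][1] * cdown[node]
        node = tree[node][0]
    return delay

print(f"Elmore delay to sink A: {elmore(rc_tree, 2) * 1e12:.3f} ps")
```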

Journal ArticleDOI
TL;DR: In this article, the authors present the first open-source dataset, called CircuitNet, for ML tasks in VLSI CAD; it targets cross-stage prediction tasks in the design flow to achieve faster design convergence.
Abstract: The electronic design automation (EDA) community has been actively exploring machine learning (ML) for very large-scale integrated computer-aided design (VLSI CAD). Many studies have explored learning-based techniques for cross-stage prediction tasks in the design flow to achieve faster design convergence. Although building ML models usually requires a large amount of data, most studies can only generate small internal datasets for validation because of the lack of large public datasets. In this paper, we present the first open-source dataset, called CircuitNet, for ML tasks in VLSI CAD.

Journal ArticleDOI
TL;DR: This paper focuses on the electrical, thermal, and process compatibility issues of current on-chip interconnects, and reviews the advantages, recent developments, and dilemmas of carbon nanotube (CNT)-based interconnects from the perspective of different interconnect lengths and through-silicon-via (TSV) applications.
Abstract: Along with deeply scaled transistors and complex electronic information exchange networks, very-large-scale integrated (VLSI) circuits require high performance and ultra-low power consumption. To meet the demands of data-abundant workloads and their energy efficiency, improving only the transistor performance would not be sufficient: super high-speed microprocessors are useless if the capacity of the data lines is not increased accordingly. Meanwhile, traditional on-chip copper interconnects are reaching their physical limits of resistivity and reliability and may no longer be able to keep pace with a processor's data throughput. As one of the potential alternatives, carbon nanotubes (CNTs) have attracted significant attention as future emerging on-chip interconnects, opening possible new development directions. In this paper, we focus on the electrical, thermal, and process compatibility issues of current on-chip interconnects. We review the advantages, recent developments, and dilemmas of CNT-based interconnects from the perspective of different interconnect lengths and through-silicon-via (TSV) applications.

Journal ArticleDOI
TL;DR: In this paper, a resource-efficient and low-power architecture using the Integer Haar Wavelet Transform (IHT) for the complete delineation of the ECG signal is presented; it uses single-scale wavelet coefficients to delineate P-QRS-T features, making it computationally simple.

Journal ArticleDOI
TL;DR: This article develops a hardware-efficient architecture for the fractional-order correntropy adaptive filter (FoCAF) for efficient real-time VLSI implementation, and demonstrates that the reformulations cause negligible performance degradation under a 16-bit fixed-point implementation.
Abstract: Conventional adaptive filters, which assume Gaussian distributions for signal and noise, exhibit significant performance degradation when operating in non-Gaussian environments. Recently proposed fractional-order adaptive filters (FoAFs) address this concern by assuming that the signal and noise are symmetric α-stable random processes. However, the literature does not include any VLSI architectures for these algorithms. Toward that end, this article develops a hardware-efficient architecture for the fractional-order correntropy adaptive filter (FoCAF). We first reformulate the FoCAF for efficient real-time VLSI implementation and then demonstrate that these reformulations cause negligible performance degradation under a 16-bit fixed-point implementation. Using this reformulated algorithm, we design an FoCAF architecture. Furthermore, we analyze the critical path of the design to select the appropriate level of pipelining based on the sampling rate of the application. According to the critical-path analysis, the FoCAF design is pipelined using retiming techniques to obtain the delayed FoCAF (DFoCAF), which is then synthesized using 45-nm CMOS technology. Synthesis results reveal that the DFoCAF architecture requires a minimal increase in hardware over the prominent least mean square (LMS) filter architecture and achieves a significant increase in performance in symmetric α-stable environments where LMS fails to converge.
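For intuition, here is a hedged sketch of a correntropy-weighted LMS update: the Gaussian-kernel weight shrinks updates on impulsive samples, which is why correntropy-based filters behave well under heavy-tailed (α-stable-like) noise. This is a generic MCC-LMS recursion, not the paper's exact FoCAF algorithm or its fixed-point architecture.

```python
import numpy as np

rng = np.random.default_rng(3)
N, L, mu, sigma = 5000, 8, 0.05, 1.0
w_true = rng.normal(size=L)      # unknown system to identify
w = np.zeros(L)
x = rng.normal(size=N)

for n in range(L, N):
    xn = x[n - L:n][::-1]                          # input regressor
    d = w_true @ xn + 0.1 * rng.standard_cauchy()  # heavy-tailed noise
    e = d - w @ xn
    # Correntropy weight: near 1 for small errors, near 0 for impulses.
    w += mu * np.exp(-e**2 / (2 * sigma**2)) * e * xn

print("coefficient error norm:", round(float(np.linalg.norm(w - w_true)), 4))
```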

Book ChapterDOI
TL;DR: In this paper, a robust system for machine learning-based optimal adder analysis is presented that connects prefix adder design synthesis to the final physical design; among the 16-bit adders, the design based on the Brent-Kung adder shows reduced time and higher speed at the cost of slightly more area.
Abstract: In addition to evaluating the FPGA design, this paper aims to achieve the best possible time reduction by enhancing FPGA performance, and to demonstrate its applicability in reconfigurable high-performance computing. The carry select adder is one of the key components used in arithmetic operations; it is a high-speed VLSI adder architecture, but at the expense of area and power. This paper presents VLSI architectures of the chosen adder. Among the 16-bit adders, the proposed design based on the Brent-Kung adder shows reduced time and higher speed at the cost of slightly more area. A robust system for machine learning-based optimal adder analysis is presented that connects prefix adder design synthesis to the final physical design. Tests of the proposed work are carried out through Xilinx 14.7 simulations.
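The prefix-adder formulation behind Brent-Kung can be stated compactly: per-bit generate/propagate pairs are combined with the associative operator (g, p) ∘ (g', p') = (g | (p & g'), p & p'), and Brent-Kung arranges this scan as a log-depth tree. The sketch below uses the simplest sequential scan as a correct functional reference, not the tree-structured hardware.

```python
def prefix_add(a, b, width=16):
    ai = [(a >> i) & 1 for i in range(width)]
    bi = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(ai, bi)]      # generate
    p = [x ^ y for x, y in zip(ai, bi)]      # propagate (XOR form)
    G, P, carries = 0, 1, [0]                # carry-in is 0
    for i in range(width):                   # sequential prefix scan
        G, P = g[i] | (p[i] & G), p[i] & P   # the (g, p) combine operator
        carries.append(G)                    # carry into bit i + 1
    s = sum((p[i] ^ carries[i]) << i for i in range(width))
    return s | (carries[width] << width)     # append carry-out

a, b = 12345, 54321
assert prefix_add(a, b) == a + b
print(prefix_add(a, b))
```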

Proceedings ArticleDOI
10 Jul 2022
TL;DR: This paper proposes a differentiable-timing-driven global placement framework inspired by deep neural networks, establishing an analogy between static timing analysis and neural network propagation to explicitly optimize timing metrics such as total negative slack (TNS) and worst negative slack (WNS).
Abstract: Placement is critical to the timing closure of the very-large-scale integrated (VLSI) circuit design flow. This paper proposes a differentiable-timing-driven global placement framework inspired by deep neural networks. By establishing an analogy between static timing analysis and neural network propagation, we propose a differentiable timing objective for placement to explicitly optimize timing metrics such as total negative slack (TNS) and worst negative slack (WNS). The framework achieves up to 32.7% and 59.1% improvements in WNS and TNS, respectively, compared with the state-of-the-art timing-driven placer, and a 1.80× speed-up when both run on GPU.
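A small sketch of the core trick: hard TNS/WNS use min/clamp operations that block gradients, so smooth surrogates (softplus and a log-sum-exp soft-minimum) can be used instead, making the timing metrics differentiable with respect to placement coordinates. The temperature and slack values below are illustrative; this is the general relaxation idea, not the paper's exact objective.

```python
import numpy as np

def soft_wns(slacks, tau=0.1):
    # Log-sum-exp soft minimum: a smooth, differentiable proxy for min(s).
    s = np.asarray(slacks)
    return -tau * np.log(np.sum(np.exp(-s / tau)))

def soft_tns(slacks, tau=0.1):
    # Softplus(-s) smoothly approximates max(-s, 0), so the negated sum
    # approximates the total negative slack.
    s = np.asarray(slacks)
    return -tau * np.sum(np.log1p(np.exp(-s / tau)))

slacks = [0.3, -0.1, 0.05, -0.4]
print("hard WNS:", min(slacks), "  soft WNS:", round(soft_wns(slacks), 3))
print("hard TNS:", sum(min(v, 0.0) for v in slacks),
      "  soft TNS:", round(soft_tns(slacks), 3))
```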

Journal ArticleDOI
TL;DR: In this paper, the authors propose an improved KNN algorithm to classify defects; this test pattern (TP) selection approach utilizes a classification model to select only the valid patterns, shortening the test time (TT).

Proceedings ArticleDOI
01 Aug 2022
TL;DR: In this article, a hierarchical physical design flow is proposed to enable the building of high-density, commercial-quality, two-tier face-to-face-bonded hierarchical 3D ICs; it significantly reduces the associated manufacturing cost compared with existing 3D implementation flows and achieves cost competitiveness against the 2D reference in large modern designs.
Abstract: Hierarchical very-large-scale integration (VLSI) flows are an understudied yet critical approach to achieving design closure at giga-scale complexity and gigahertz frequency targets. This paper proposes a novel hierarchical physical design flow enabling the building of high-density, commercial-quality, two-tier face-to-face-bonded hierarchical 3D ICs. We significantly reduce the associated manufacturing cost compared to existing 3D implementation flows and, for the first time, achieve cost competitiveness against the 2D reference in large modern designs. Experimental results on complex industrial and open manycore processors demonstrate in two advanced nodes that the proposed flow provides major power, performance, and area/cost (PPAC) improvements of 1.2 to 2.2× compared with 2D, where all metrics, including power, are improved simultaneously.


Journal ArticleDOI
TL;DR: A comparison between various traditional flip-flops and the TSPC flip-flop with regard to power usage, delay, power-delay product (PDP), area, and power flow is given, using findings obtained from the Microwind simulator.
Abstract: Very large-scale integration involves implementing a significant transistor count in an extremely condensed space. Combinational logic has been shown to be particularly effective in quantum computing as well as other design applications. In VLSI design, the primary goal is to cut down on power consumption as well as latency. To advance the technology and support the increased use of electronic machines, it is vital to decrease the sub-threshold leakage current. This research explores the feasibility of implementing a shift register with and without the Multi-threshold CMOS (MTCMOS) approach. At the process technologies of 0.18 µm, 0.12 µm, and 90 nm, an investigation into the power loss and delay characteristics of a variety of flip-flops is carried out. As technology shrinks, the amount of power lost through leakage rises; using the best technique among run-time strategies, namely MTCMOS, helps to limit the power lost due to leakage. The purpose of this article is to give a comparison between various traditional flip-flops and the TSPC flip-flop with regard to power usage, delay, power-delay product (PDP), area, and power flow, using the findings obtained from the Microwind simulator.
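For reference, the power-delay product mentioned above is simply average power times propagation delay; the quick arithmetic below uses hypothetical numbers, not the paper's Microwind results.

```python
p_avg = 12e-6    # 12 uW average power (hypothetical)
t_pd = 150e-12   # 150 ps propagation delay (hypothetical)
pdp = p_avg * t_pd
print(f"PDP = {pdp:.3e} J = {pdp * 1e15:.2f} fJ")   # energy per switching event
```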

Proceedings ArticleDOI
29 Oct 2022
TL;DR: In this paper, the authors argue that the effectiveness of GNNs in EDA comes from the fact that they implicitly embed the prior knowledge and inductive biases associated with given VLSI tasks, which is one of the three approaches to making a learning algorithm physics-informed.
Abstract: In this paper, we discuss the source of the effectiveness of Graph Neural Networks (GNNs) in EDA, particularly in the VLSI design automation domain. We argue that the effectiveness comes from the fact that GNNs implicitly embed the prior knowledge and inductive biases associated with given VLSI tasks, which is one of the three approaches to making a learning algorithm physics-informed. These inductive biases are different from those commonly used in GNNs designed for other structured data, such as social networks and citation networks. We illustrate this principle with several recent GNN examples in the VLSI domain, including predictive tasks such as switching activity prediction, timing prediction, parasitics prediction, and layout symmetry prediction, as well as optimization tasks such as gate sizing and macro and cell transistor placement. We also discuss the challenges of applying GNNs and the opportunity of applying self-supervised learning techniques with GNNs for VLSI optimization.
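A minimal sketch of the inductive bias being described: one round of message passing over a small hypothetical gate-level netlist graph, where each node's new embedding mixes its own features with its neighbors', mirroring how signals propagate locally in a circuit. The weights are random and untrained; this is an illustration, not any of the cited models.

```python
import numpy as np

rng = np.random.default_rng(4)
# Adjacency of a 5-gate toy netlist graph (undirected for simplicity).
A = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 1, 0]], dtype=float)
H = rng.normal(size=(5, 8))                    # per-gate input features
W_self = rng.normal(size=(8, 8))
W_nbr = rng.normal(size=(8, 8))

# One message-passing round: each gate mixes its own features with the
# mean of its neighbors' features, then applies a ReLU nonlinearity.
deg = A.sum(axis=1, keepdims=True)
H_next = np.maximum(H @ W_self + (A / deg) @ H @ W_nbr, 0.0)
print("updated embedding shape:", H_next.shape)
```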

Journal ArticleDOI
TL;DR: In this paper , a method based on trigonometric expansion properties of the hyperbolic function for hardware implementation which can be easily tuned for different accuracy and precision requirements is presented. But, it is not suitable for DNNs that use different precision in different layers.
Abstract: Hyperbolic tangent and Sigmoid functions are used as non-linear activation units in the artificial and deep neural networks. Since, these networks are computationally expensive, customized accelerators are designed for achieving the required performance at lower cost and power. The activation function and MAC units are the key building blocks of these neural networks. A low complexity and accurate hardware implementation of the activation function is required to meet the performance and area targets of such neural network accelerators. Moreover, a scalable implementation is required as the recent studies show that the DNNs may use different precision in different layers. This paper presents a novel method based on trigonometric expansion properties of the hyperbolic function for hardware implementation which can be easily tuned for different accuracy and precision requirements.
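One plausible reading of the expansion idea, sketched below as a hedged software model: split the argument into a coarse part served by a small LUT and a fine part served by a short odd Taylor series, then recombine via the identity tanh(a + b) = (tanh a + tanh b)/(1 + tanh a · tanh b). The segment width and series order are exactly the kind of knobs that trade accuracy against cost; the paper's exact scheme may differ.

```python
import math

STEP = 0.25
LUT = {k: math.tanh(k * STEP) for k in range(33)}   # coarse part, |x| <= 8

def tanh_approx(x):
    sign, x = (-1.0, -x) if x < 0 else (1.0, x)
    k = min(int(x / STEP), 32)                 # coarse segment index
    xf = x - k * STEP                          # fine remainder (< 0.25 in range)
    tf = xf - xf**3 / 3 + 2 * xf**5 / 15       # short odd Taylor series
    tc = LUT[k]
    return sign * (tc + tf) / (1 + tc * tf)    # tanh addition identity

for x in (0.1, 0.9, 2.5, 5.0):
    print(x, round(tanh_approx(x), 6), round(math.tanh(x), 6))
```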

Journal ArticleDOI
01 Apr 2022
TL;DR: This brief presents a high-performance VLSI architecture of a delayed least mean square (DLMS) adaptive filter for fast convergence and low mean square error (MSE) using distributed arithmetic (DA).
Abstract: This brief presents a high-performance VLSI architecture of a delayed least mean square (DLMS) adaptive filter for fast convergence and low mean square error (MSE) using distributed arithmetic (DA). The proposed design estimates the response against the adaptation delays using a parallel predictive adder tree followed by a shift-accumulate (SA) unit. An efficient quantization scheme with two bits of scaled error signal is also suggested. A single SA unit for multiple DA bases is used to reduce the number of adders and registers. Simulation and synthesis results show that the proposed 32nd-order design provides 19.72% less area, 25.51% less power, 28.89% lower MSE, and 59.91% lower MSE/area than the best existing design.


Journal ArticleDOI
TL;DR: In this article, the authors propose a redundancy-elimination method so that the LDA-MRMR algorithm can reduce test cost without noticeably increasing the defect level, sacrificing 3% of predictive accuracy in exchange for a 3.7× time saving compared with traditional methods.


Journal ArticleDOI
TL;DR: In this paper, the transient analysis of the equivalent single conductor (ESC) model of hybrid Cu-CNT on-chip interconnects for nanopackaging is carried out using the matrix rational approximation (MRA) modeling technique, for single and coupled Cu-CNT interconnect lines at 14 nm and 22 nm technology nodes.
Abstract: This paper presents the transient analysis of the equivalent single conductor (ESC) model of hybrid Cu-CNT on-chip interconnects for nanopackaging using the matrix rational approximation (MRA) modeling technique. The analysis of propagation delay and peak crosstalk noise is carried out for single and coupled Cu-CNT interconnect lines at 14 nm and 22 nm technology nodes. The proposed MRA model provides a speed-up factor of 131 compared to HSPICE, and an error of less than 1% relative to SPICE simulations confirms its accuracy. It is also observed that Cu-CNT lines are more immune to crosstalk than Cu and CNT interconnects because of weaker coupling effects. The efficacy, accuracy, and comprehensiveness of the analysis ensure the model's broad applicability in VLSI design automation tools at the nanopackaging level.
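As a flavor of the transient analysis being validated, the sketch below integrates the step response of an N-segment lumped RC ladder with backward Euler and reports the 50% delay at the far end. Inductance is omitted for brevity and the segment values are illustrative, not extracted Cu-CNT parasitics; the paper's MRA technique itself is not reproduced here.

```python
import numpy as np

N, R, C, dt, VDD = 20, 50.0, 1e-15, 1e-13, 1.0   # segments, ohms, farads, s, V
g = 1.0 / R

# Nodal equations of the ladder: C dv/dt = G v + b, with a step at the source.
G = np.zeros((N, N))
for i in range(N):
    G[i, i] = -2 * g if i < N - 1 else -g        # last node: one resistor only
    if i > 0:
        G[i, i - 1] = G[i - 1, i] = g
b = np.zeros(N)
b[0] = g * VDD                                   # source drives node 0 via R

v = np.zeros(N)
A = np.eye(N) - (dt / C) * G                     # backward-Euler system matrix
t = 0.0
while v[-1] < 0.5 * VDD:                         # march until the 50% crossing
    v = np.linalg.solve(A, v + (dt / C) * b)
    t += dt
print(f"50% delay at the far end: {t * 1e12:.2f} ps")
```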