
Showing papers on "System on a chip published in 2023"


Journal ArticleDOI
TL;DR: In this article, the authors propose a virtual prototype (VP) with integrated cryptographic accelerators for a cryptographic SoC based on RISC-V to accelerate the functional and performance simulation of the SoC.
Abstract: Embedded hardware accelerators with limited resources are increasingly employed in security applications. To accelerate system-on-chip (SoC) design, an efficient HW/SW co-design approach and validation platform become extremely important. The Electronic System Level (ESL) simulator based on SystemC is the primary solution for fast hardware modeling and verification. However, most existing simulators cannot achieve a good trade-off between accuracy and performance, and no ESL simulator has been proposed specifically for cryptographic SoCs. To this end, this brief proposes a virtual prototype (VP) with integrated cryptographic accelerators for a cryptographic SoC based on RISC-V to accelerate the functional and performance simulation of the SoC. The VP is designed as an extensible and configurable platform dedicated to cryptographic SoCs using an efficient HW/SW co-design approach. To accurately emulate real hardware, a flexible AHB-TLM interface and a core timing model are presented. Compared to RTL simulation, our custom VP runs about 10-450 times faster, and the simulation error is only about 4%. Our code is available at https://github.com/LX-IC/VP.

1 citation


Journal ArticleDOI
TL;DR: In this paper, the authors propose a scalable RTL-level SoC validation scheme, SeVNoC, for the systematic detection of security violations in inter-IP communications for SoC designs with NoC fabrics.
Abstract: Modern System-on-Chip (SoC) designs include a variety of Network-on-Chip (NoC) fabrics to implement coordination and communication of integrated hardware intellectual property (IP) blocks. An important class of security vulnerabilities involves a rogue hardware IP interfering with this communication to compromise the integrity of the system. Such interference includes message mutation, misdirection, delivery prevention, or IP masquerading, among others. In this article, we propose a scalable RTL-level SoC validation scheme, SeVNoC, for the systematic detection of security violations in inter-IP communications for SoC designs with NoC fabrics. Given a target security property to be validated, SeVNoC entails extraction of the control-flow graph of the relevant SoC, which is analyzed through a security property-based model comparison, without incurring state-space explosion. Our experiments on full-scale realistic SoC designs with multiple IPs and NoC architecture indicate that SeVNoC detects security violations in NoC communications with near-perfect accuracy, within only a few minutes.

1 citation



Proceedings ArticleDOI
21 May 2023
TL;DR: In this paper, the authors describe a simulator for two types of network-on-chip designs that employ MPLS as an essential communication tool for switching and routing at the packet level inside a multiprocessor system-on-chip implemented as a network-on-chip.
Abstract: This paper describes a simulator for two types of network-on-chip designs that employ MPLS as an essential communication tool for switching and routing at the packet level inside a multiprocessor system-on-chip implemented as a network-on-chip. Modeling and simulation results demonstrate that using MPLS in a network-on-chip system yields efficient results, with reduced latency, a lower packet error ratio, and fault tolerance achieved through the inherent principles of MPLS.

Journal ArticleDOI
TL;DR: In this article, a coprocessor IP core that can be flexibly embedded in a universal SoC was designed to accelerate the multi-motor vector control current loop operation according to the hardware-software coordination scheme.
Abstract: The multi-axis servo control system has been extensively used in industrial control. However, the applications of traditional MCU and DSP chips in high-performance multi-axis servo control systems are becoming increasingly difficult due to their lack of computing power. Although FPGA chips can meet the computing power requirements of high-performance multi-axis servo control systems, their versatility is insufficient, and the chip is too costly for large-scale use. Therefore, when designing the universal SoC, it is better to directly embed the coprocessor IP core dedicated to accelerating the multi-motor vector control current loop operation into the universal SoC. In this study, a coprocessor IP core that can be flexibly embedded in a universal SoC was designed. The IP core based on time division multiplexing (TDM) technology could accelerate the multi-motor vector control current loop operation according to the hardware–software coordination scheme proposed in this study. The IP was first integrated into a universal SoC to verify its performance, and then the FPGA prototype verification for the SoC was performed under three-axis servo control systems. Secondly, the ASIC implementation of the IP was also conducted based on the CSMC 90 nm process library. The experimental results revealed that the IP had a small area and low power consumption and was suitable for application in universal SoC. Therefore, the cheap and low-power single universal SoC with the coprocessor IP can be suitable for multi-axis servo control.

Proceedings ArticleDOI
26 May 2023
TL;DR: In this article, a hardware platform scheme based on heterogeneous SoC chips and a dual-system software architecture based on Linux and an RTOS are adopted to meet the demand for high-precision harmonic sampling in the electronics power grid and to further improve the performance of secondary equipment such as substation measurement and control.
Abstract: To meet the demand for high-precision harmonic sampling in the electronics power grid and to further improve the performance of secondary equipment such as substation measurement and control, PMU, and broadband measurement devices, the application scenarios and performance requirements of a high-performance hardware platform are analyzed, and the technical status of domestic chips in mass production is investigated. A hardware platform scheme based on heterogeneous SoC chips and a dual-system software architecture based on Linux and an RTOS are adopted. We designed a large-capacity real-time data parallel processing and layered drive technology scheme, fully exploited the computing potential of heterogeneous multi-core chips, improved the performance of the hardware platform based on domestic chips, and completed the development and testing of the hardware platform for a broadband measurement device prototype. Finally, the feasibility of the hardware platform technology scheme based on heterogeneous SoC chips is verified through a comparison of various performance tests.


Posted ContentDOI
09 Jan 2023
TL;DR: In this article, the authors propose a configurable reinforcement learning (RL) algorithm implemented in a System on Chip (SoC), which offers flexibility, configurability, and scalability while maintaining computation speed and accuracy.
Abstract: This paper proposes a FAst paRAllel and pipeliNE Q-learning accelerator (FARANE-Q) for a configurable Reinforcement Learning (RL) algorithm implemented in a System on Chip (SoC). The proposed work offers flexibility, configurability, and scalability while maintaining computation speed and accuracy to overcome the challenges of a dynamic environment and increasing complexity. The proposed method includes a Hardware/Software (HW/SW) design methodology for the SoC architecture to achieve flexibility. We also propose joint optimizations on the algorithm, architecture, and implementation to obtain optimum (high efficiency) performance, specifically in energy and area efficiency. Furthermore, we implemented the proposed design in a real-time Zynq Ultra96-V2 FPGA platform to evaluate the functionality with an actual use case of smart navigation. Experimental results confirm that the proposed accelerator FARANE-Q outperforms state-of-the-art works by achieving a throughput of up to 148.55 MSps. It corresponds to the energy efficiency of 1747.64 MSps/W per agent for 32-bit and 2424.33 MSps/W per agent for 16-bit FARANE-Q. Moreover, the proposed 16-bit FARANE-Q outperforms other related works by an improvement of at least 1.23× in energy efficiency. The designed system also maintains an error accuracy of less than 0.4% with optimized bit precision for more than eight fraction bits. The proposed FARANE-Q also offers a speed up of processing time up to 1795× compared to embedded SW computation executed on ARM Zynq processor and 280× of computation of full software executed on i7 processor. Hence, the proposed work has the potential to be used for smart navigation, robotic control, and predictive maintenance.
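For readers unfamiliar with Q-learning, the update that such an accelerator would need to evaluate can be sketched in a few lines. This is a minimal software illustration of the textbook tabular update, not the FARANE-Q datapath; the function name `q_update`, the learning rate `alpha`, and the discount factor `gamma` are illustrative assumptions, not values from the paper.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    alpha (learning rate) and gamma (discount factor) are assumed values.
    """
    best_next = max(q_table[next_state])        # greedy value of the successor state
    td_target = reward + gamma * best_next      # bootstrapped learning target
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Two states with two actions each; only Q[1][1] starts non-zero.
q = [[0.0, 0.0], [0.0, 1.0]]
new_value = q_update(q, state=0, action=1, reward=1.0, next_state=1)
```

A hardware implementation would typically replace the floating-point arithmetic with fixed-point operations, which is where the bit-precision trade-off mentioned in the abstract comes in.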

Proceedings ArticleDOI
01 Apr 2023
TL;DR: In this article, an AXI4-Lite interconnect is implemented in SystemVerilog (SV), and its arbitration is verified using both SV and the Universal Verification Methodology (UVM).
Abstract: The communication bus is one of the critical components in System-on-Chip design. The on-chip bus communication architecture impacts the overall performance of an SoC design. As design complexity increases, designs become more error-prone, and verification becomes challenging. According to a few studies, verification consumes about 70% of the total development time and is considered a bottleneck in the ASIC design cycle. This paper focuses on the design and verification of the AXI4-Lite arbitration algorithm. We present a design and implementation of a high-performance AXI4-Lite interconnect using SystemVerilog (SV) to connect three managers and six subordinate devices. We performed the verification of an AXI4-Lite arbitration system using two methodologies, i.e., SV and the Universal Verification Methodology (UVM). Several test cases were applied to check all possible scenarios for arbitration verification. The SV and UVM verification techniques are compared in this paper, and it is observed that the UVM methodology is a better choice for verifying large and complex designs.
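The abstract does not say which arbitration policy the interconnect uses. As a rough illustration of what an arbiter for three managers has to decide each cycle, here is a small reference model assuming a round-robin policy; both the policy and the function name `round_robin_grant` are assumptions made for this sketch.

```python
def round_robin_grant(requests, last_granted):
    """Return the index of the next requesting manager, or None if idle.

    Round-robin is an assumed policy: the search starts just after the
    manager that was granted last, so no requester can be starved.
    """
    count = len(requests)
    for offset in range(1, count + 1):
        candidate = (last_granted + offset) % count
        if requests[candidate]:
            return candidate
    return None


# Three managers all requesting; manager 0 was served last, so 1 goes next.
grant = round_robin_grant([True, True, True], last_granted=0)
```

A behavioural model like this is the kind of reference a testbench scoreboard could compare against the RTL grant signals.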

Journal ArticleDOI
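TL;DR: In this article, state-of-charge (SOC) estimation based on an Extended Kalman Filter (EKF) is used for cell balancing in a battery management system (BMS) and compared with conventional voltage-based balancing.
Abstract: A battery management system (BMS) performs energy balancing between cells to reduce the energy imbalance that arises within an electric vehicle battery pack. The cell terminal voltage or pre-mapped state-of-charge (SOC) versus open-circuit-voltage (OCV) tables that previous studies mainly used for energy balancing do not reflect the actual state of charge of the battery, so they can hardly be said to achieve accurate cell balancing. This paper performs cell balancing using SOC estimation based on an extended Kalman filter (EKF) and, through a comparative analysis with the conventional voltage-based balancing method, presents an effective balancing operation method for the BMS. Equivalent-circuit modeling and step-by-step parameter extraction were carried out to design the EKF-based algorithm, and the SOC estimate showed a low error of less than 1% over the entire SOC range except for a short initial interval. A buck-boost converter was designed to balance energy between cells based on the estimated SOC, and the balancing results showed an SOC deviation of 3.97% at the point where the cell voltages became equal, confirming that voltage equalization through balancing differs somewhat from equalization of the actual SOC. The cell balancing based on EKF SOC estimation proposed in this paper is expected to improve the reliability of battery operation in the BMS by contributing to the analysis of actual SOC-based balancing and its effects.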
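To make the idea of EKF-based SOC estimation concrete, the following is a heavily simplified one-state sketch. It assumes a made-up quadratic OCV curve, ignores the internal resistance and RC branches of a real equivalent-circuit model, and uses illustrative noise and capacity values; none of these come from the paper.

```python
def ocv(soc):
    """Hypothetical open-circuit voltage curve in volts (assumed, not measured)."""
    return 3.2 + 0.7 * soc + 0.3 * soc ** 2


def ocv_slope(soc):
    """Derivative of the assumed OCV curve, used as the EKF measurement Jacobian."""
    return 0.7 + 0.6 * soc


def ekf_step(soc_est, p, current_a, voltage_v, dt_s=1.0,
             capacity_as=3600.0, q_noise=1e-7, r_noise=1e-4):
    """One predict/update cycle of a scalar EKF for state of charge."""
    # Predict: coulomb counting moves the estimate; uncertainty grows by q_noise.
    soc_pred = soc_est - current_a * dt_s / capacity_as
    p_pred = p + q_noise
    # Update: linearise the OCV curve around the prediction and correct it.
    h = ocv_slope(soc_pred)
    gain = p_pred * h / (h * h * p_pred + r_noise)
    soc_new = soc_pred + gain * (voltage_v - ocv(soc_pred))
    p_new = (1.0 - gain * h) * p_pred
    return soc_new, p_new


# Noise-free simulation: the filter starts with a deliberately wrong SOC guess.
soc_true, soc_est, p = 0.8, 0.5, 0.1
for _ in range(100):
    soc_true -= 1.0 * 1.0 / 3600.0   # 1 A discharge for 1 s on a 1 Ah cell
    soc_est, p = ekf_step(soc_est, p, current_a=1.0, voltage_v=ocv(soc_true))
final_error = abs(soc_est - soc_true)
```

The point of the sketch is that the voltage measurement pulls a wrong initial SOC guess toward the true value, which a plain lookup of terminal voltage in a static SOC-OCV table cannot do under load.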

Journal ArticleDOI
TL;DR: This installment of Computer's series highlighting work published in IEEE Computer Society journals features IEEE Transactions on Computers.
Abstract: This installment of Computer’s series highlighting the work published in IEEE Computer Society journals comes from IEEE Transactions on Computers.

Proceedings ArticleDOI
13 Mar 2023
TL;DR: In this paper, trace monitoring of the transactions on the Advanced eXtensible Interface (AXI) interface of the interconnect is performed by programming different operational pointers and filters.
Abstract: The semiconductor industry has evolved significantly since its founding in 1950. Transistors and diodes are the primarily used electronic devices, but advancements in technology have led to more complex semiconductor devices, from printed circuit boards to multimillion gate design, i.e., a System on Chip (SoC) design. Almost 70–80 percent of the total SoC design effort is aimed at functional verification. In this paper, verification of an interconnect block in a processing system is presented. Trace monitoring of the transactions on the Advanced eXtensible Interface (AXI) interface of the interconnect is performed by programming different operational pointers and filters. Results were simulated from Synopsys—a Verilog Compiler Simulator (VCS) tool-2022v (Hyderabad, India).

Proceedings ArticleDOI
26 May 2023
TL;DR: Zhang et al. use the Software-Based Self Test (SBST) method to explore and implement in-field testing of the data cache controller functions of the embedded processors in Zynq-7000 series programmable SoCs.
Abstract: With the progress of integrated circuit technology and the diverse demands for scalar and reconfigurable operations in intelligent electronic systems, a new chip architecture combining a traditional FPGA and an embedded processor has appeared, namely the programmable SoC. Programmable SoC products have been widely used in mission- and safety-critical applications, but production defects in hardware and time-related defects that arise during operation often lead to system misbehavior, which can have disastrous consequences. Therefore, in view of the long-term reliability requirements of programmable SoC products, it is necessary to carry out research on in-field test technology for them. In this paper, we use the Software-Based Self Test (SBST) method to explore and implement in-field testing of the data cache controller functions (Enable/Disable, Invalidate, Clean) of the embedded processors in Zynq-7000 series programmable SoCs. By making full use of the fully programmable hardware and software resources in Zynq-7000, we realized test generation based on the PS and test observation based on the PL, and designed a prototype test system. Finally, the experimental results show that the prototype test system meets our expectations and can perform the in-field test of the cache controller of the embedded processor in the Zynq-7000 series programmable SoC. At the end of this paper, we summarize the work done and look ahead to future research directions.


Proceedings ArticleDOI
01 Apr 2023
TL;DR: Proteus is a configurable and modular NoC simulator and RTL generator that uses an HLS compiler to develop NoCs from a C++ description of the NoC circuit.
Abstract: Networks-on-chip (NoCs) form the backbone fabric for connecting multi-core SoCs containing several processor cores and memories. Design-space exploration (DSE) of NoCs is a crucial part of the SoC design process to ensure that it does not become a bottleneck. DSE today is often hindered by the inherent trade-off between software simulation vs hardware emulation/evaluation. Software simulators are easily extendable and allow for the evaluation of new ideas but are not able to capture the hardware complexity. Meanwhile, RTL development is known to be time-consuming. This has forced DSE to use simulators followed by RTL development, evaluation and feedback, which slows down the overall design process. In an effort to tackle this problem, we present Proteus, a configurable and modular NoC simulator and RTL generator. Proteus is the first of its kind framework to use HLS compiler to develop NoCs from a C++ description of the NoC circuit. These generated NoCs can be simulated in software and tested on FPGAs. This allows users to do rapid DSE by providing the opportunity to tweak and test NoC architectures in real-time. We also compare Proteus-generated RTL with Chisel-generated and hand-written RTL in terms of area, timing and productivity. The ability to synthesize the NoC design on FPGAs can benefit large designs as the custom hardware results in faster run-time than cycle-accurate software simulators. Proteus is modeled similar to existing state-of-the-art simulators and offers users modifiable parameters to generate custom topologies, routing algorithms, and router microarchitectures.

Journal ArticleDOI
TL;DR: In this paper, the authors propose ReDeSIGN, a framework to reuse the DFD infrastructure during in-field operation for performance enhancement of NoC-based MPSoCs, which includes reuse of the trace buffer as an extended Virtual Channel (VC) for network throughput improvement, trace prioritization hardware for critical data prioritization, and a packet monitor module for packet starvation control.
Abstract: Network-on-Chip (NoC) is considered a scalable interconnect medium for Multiprocessor System-on-Chip (MPSoC) due to its ability to provide high-bandwidth and low-latency communication. With the increasing intricacy of modern-day systems, state-of-the-art NoCs are becoming extremely complex. Design-for-Debug (DFD) structures are integrated into the system for the validation of such complex modules during post-silicon debug. However, after system validation and mass production, the DFD hardware remains vestigial on the design. In this context, we propose ReDeSIGN, a framework to reuse the DFD infrastructure during in-field operation for performance enhancement of NoC-based MPSoCs. Major contributions of our work include reuse of (i) trace buffer as extended Virtual Channel (VC) for network throughput improvement, (ii) trace prioritization hardware for critical data prioritization, and (iii) packet monitor module for packet starvation control. Experimental evaluations with real benchmarks show an average of 11.46% increase in network throughput, 34.93% decrease in critical data latency, and 19.17% decrease in packet starvation for an 8x8 homogeneous system.



Journal ArticleDOI
01 Jan 2023
TL;DR: In this paper, an optimal energy-aware earliest deadline first (EA-EDF) scheduling technique with task migration is proposed for multiprocessor environments; it enhances performance and efficiency in multi-core systems-on-chip while lowering energy and power consumption.
Abstract: Increasing the life span and efficiency of Multiprocessor System on Chip (MPSoC) by reducing power and energy utilization has become a critical chip design challenge for multiprocessor systems. With the advancement of technology, the performance management of the central processing unit (CPU) is changing. Power densities and thermal effects are quickly increasing in multi-core embedded technologies due to shrinking chip size. When energy consumption reaches a threshold, it creates a delay in complementary metal oxide semiconductor (CMOS) circuits and reduces the speed by 10%–15%, and excessive on-chip temperature shortens the chip’s life cycle. In this paper, we address the scheduling and energy utilization problem by introducing and evaluating an optimal energy-aware earliest deadline first (EA-EDF) scheduling technique for multiprocessor environments with task migration that enhances performance and efficiency in multiprocessor system-on-chip while lowering energy and power consumption. The selection of cores and migration of tasks prevents the system from reaching its maximum energy utilization while effectively using the dynamic power management (DPM) policy. As more tasks execute, the on-chip temperature and utilization factor increase, which dissipates more power. The proposed approach migrates such tasks to the core that produces less heat and consumes less power, distributing the load on other cores to lower the temperature, and optimizes the duration of idle and sleep times across multiple CPUs. The performance of the EA-EDF algorithm was evaluated by an extensive set of experiments, where excellent results were reported when compared to other current techniques. The proposed methodology reduces power and energy consumption by 4.3%–4.7% on a utilization of 6%, 36% & 46% at 520 & 624 MHz operating frequency in comparison to other energy-aware methods for MPSoCs. Tasks are running and accurately scheduled to make an energy-efficient processor by controlling and managing the thermal effects on-chip and optimizing the energy consumption of MPSoCs.
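The combination of deadline-driven selection and temperature-driven core choice described above can be illustrated with a toy scheduling step. This is a simplified sketch, not the authors' EA-EDF algorithm: it picks the ready task with the earliest deadline and places it on the coolest core, and the field names `name` and `deadline_ms` are hypothetical.

```python
def schedule_step(ready_tasks, core_temps_c):
    """Pick the earliest-deadline task and the coolest core to run it on.

    Returns (task_name, core_index), or None when no task is ready.
    """
    if not ready_tasks:
        return None
    task = min(ready_tasks, key=lambda t: t["deadline_ms"])                 # EDF choice
    core = min(range(len(core_temps_c)), key=lambda c: core_temps_c[c])    # coolest core
    return task["name"], core


tasks = [
    {"name": "a", "deadline_ms": 30},
    {"name": "b", "deadline_ms": 10},
]
decision = schedule_step(tasks, core_temps_c=[70.0, 55.0, 62.0])
```

A fuller model would also account for core power states and migration cost, which the abstract mentions but does not quantify.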

Journal ArticleDOI
TL;DR: In this article, a 16-cell stackable multi-channel battery monitoring integrated circuit (BMIC) is proposed to assist battery management systems in effectively managing battery data, which is the key to improving the reliability of electric vehicles (EVs).

Proceedings ArticleDOI
05 Feb 2023
TL;DR: In this article, the authors propose a high-level AMBA (Advanced Microcontroller Bus Architecture) monitoring platform where various traffic statistics can be obtained with C++ modeling using open-source Verilator.
Abstract: As System on Chip (SoC) hardware complexity grows dramatically, it becomes more difficult to find the optimal SoC architecture with various hardware IPs. Accordingly, SoC architecture exploration should be performed before the chip-level implementation, where various types of on-chip interconnect topology are compared according to the on-chip traffic patterns from a number of hardware IPs in terms of area, transaction latency, power consumption, etc. In this paper, we propose a high-level AMBA (Advanced Microcontroller Bus Architecture) Monitoring Platform where various traffic statistics can be obtained with C++ modeling using open-source Verilator. For the evaluation, we built the baseline SoC platform with an Arm Cortex-M4F CPU core and various hardware IPs. With the proposed high-level AMBA monitoring platform, high-level C++ modeling of on-chip traffic makes it possible to find optimal AMBA on-chip interconnects in the early stages with fast analysis time.

Journal ArticleDOI
01 Jan 2023
TL;DR: In this article, an energy-aware dynamic power management technique based on energy-aware earliest deadline first (EA-EDF) scheduling is proposed for improving performance and reliability by reducing energy and power consumption in the system on chip (SoC).
Abstract: Minimizing energy consumption to increase the life span and performance of multiprocessor system on chip (MPSoC) has become an integral chip design issue for multiprocessor systems. The performance measurement of computational systems is changing with the advancement in technology. Due to shrinking chip size, on-chip power densities are increasing rapidly, which raises chip temperature in multi-core embedded technologies. The operating speed of the device decreases when power consumption reaches a threshold that causes a delay in complementary metal oxide semiconductor (CMOS) circuits, and high on-chip temperature adversely affects the life span of the chip. In this paper, an energy-aware dynamic power management technique based on energy-aware earliest deadline first (EA-EDF) scheduling is proposed for improving performance and reliability by reducing energy and power consumption in the system on chip (SoC). Dynamic power management (DPM) enables the MPSoC to reduce power and energy consumption by adopting a suitable core configuration for task migration. Task migration avoids peak temperature values in the multi-core system. A high utilization factor on a central processing unit (CPU) core consumes more energy and increases the on-chip temperature. Our technique switches the core by migrating such a task to a core that has a lower temperature and is in a low-power state. The proposed EA-EDF scheduling technique migrates load across different cores to attain temperature stability among multiple cores of the CPU and optimizes the duration of the idle and sleep periods to enable the low-temperature core. The EA-EDF approach reduces utilization and energy consumption compared to other existing methods and works. The simulation results show an improvement in performance of 4.8% on 9%, 16%, 23% and 25% at 520 MHz operating frequency as compared to other energy-aware techniques for MPSoCs when the least number of tasks is in the running state, and the approach can schedule more tasks to make an energy-efficient processor by controlling and managing the energy consumption of the MPSoC.

Journal ArticleDOI
TL;DR: In this article, the authors propose a runtime 3PIP Trojan detection framework, named IP-Tag, which is a tag-based structure to track the requests on SoC and enforce fine-grained access control in individual IPs.
Abstract: The complexity of modern system-on-chip (SoC) designs and the ever-shortening time-to-market (TTM) make third-party intellectual property (3PIP) a cornerstone of the modern SoC supply chain. Various 3PIPs are involved in modern SoCs, performing functionality ranging from computation acceleration to sensitive data processing. The wide use of 3PIPs also raises security concerns, e.g., hardware Trojans inserted in 3PIPs may compromise the security of the whole system. While SoC integrators carefully evaluate the functionality of the acquired 3PIPs, effective and low-cost solutions for third-party IP security validation in the SoC environment are lacking. Exacerbating the issue, Trojans may be located in multiple IPs and will only perform malicious tasks collaboratively. To address these limitations and to protect modern SoCs, we propose a runtime 3PIP Trojan detection framework. The new framework, named IP-Tag, is a tag-based structure to track the requests on SoC and enforce fine-grained access control in individual IPs. The proposed framework can detect and prevent illegal access and sensitive data leakage on IPs within the SoC environment. The proposed IP-Tag framework was demonstrated on a RISC-V-based SoC and also implemented on an FPGA platform for security and performance analysis. Our experimental results show that the developed IP-Tag can detect and prevent illegal access and sensitive data leakage in SoCs with malicious IPs. The hardware overhead is 7.9% LUTs and 7.8% Flip-Flops, and the performance overhead is 2.2%.

Proceedings ArticleDOI
19 Feb 2023
TL;DR: In this paper, the authors present a fully autonomous self-powered system-on-chip (SoC) that can be distributed along a fiber strand and is capable of simultaneously harvesting energy, cooperatively scaling performance, sharing power, and booting up with other SoCs in-fiber.
Abstract: Rapid reductions in power and size of SoC have paved the way for mm-scale textile-based self-powered systems capable of sensing a variety of biological and environmental data such as sodium, glucose, temperature, and neural signals [1,3-6]. SoCs built for these applications need to be fully autonomous and miniaturized, capable of continuous sensing at nW-level to operate from scarce amounts of harvested energy, and able to communicate in a distributed sensing network. A prior smart E-textile system [1] enables self-powered Na+ sensing but is built with cm-scale commercial-off-the-shelf (COTS) components that consume >4 mW. A mm-scale system-in-fiber in [2] with COTS components requires batteries for >10 µW power. For systems using custom SoCs with nW power and mm-scale form factor [3–6], a base station is required to provide light (>3 klux [4], >60 klux [5]) to communicate and power the devices. This leads to reduced system autonomy and an inability for direct inter-SoC communication. We address these limitations with a fully autonomous self-powered system-on-chip (SoC) that can be distributed along a fiber strand, capable of simultaneously harvesting energy, cooperatively scaling performance, sharing power, and booting-up with other SoCs in-fiber. The SoC achieves 33 nW power consumption for the whole chip under 92 lux light and can reduce control power down to 2.7 nW for the energy harvesting and power management unit (EHPMU). With the proposed power sharing and cooperative dynamic voltage and frequency scaling (DVFS), the proposed SoC reduces the illuminance needed to stay alive by >7× down to 12 lux. We integrate the SoC into a 2.2×1 mm cross-section polymer fiber with an embedded electrical bus via a 4.7×3.7 mm interposer board, as shown in Fig. 15.1.1 (bottom). The timing waveform in Fig. 15.1.1 (right) shows how the SoCs can cooperatively scale their performance based on both the local [7–9] and adjacent SoCs' conditions. This allows the energy and performance of all the in-fiber SoCs to be flexibly and jointly balanced, thereby improving the system viability and adaptability.

Proceedings ArticleDOI
26 Apr 2023
TL;DR: In this article, the authors propose a test compression method that employs both an efficient dictionary and creating and capturing value collection to reduce test data quantity without affecting overall system performance, and apply this data compression method to the test patterns generated by the BIST technique; the compressed data is then applied to the module being tested.
Abstract: BIST is a systematic methodology capable of addressing many of the issues encountered when testing systems-on-chip. However, larger registers are required to handle the large amount of test information produced in each clock cycle, which has a significant impact on overall circuit performance. Huge data volume generally requires not only more memory but also a longer testing time. The proposed design develops a test compression method that employs both an efficient dictionary and creating and capturing value collection to dramatically reduce the high memory requirements of testing. Data compression reduces test data quantity without affecting overall system performance. This data compression method is applied to the test patterns generated by the BIST technique, and the compressed data is then applied to the module being tested. Following that, a simple processor is designed. The concept of Null Conventional Logic is commonly used in the testing of the basic processing units in the processor designed.
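The abstract gives no detail on the dictionary scheme, so the following is only a generic sketch of dictionary-based test data compression: frequent test words are replaced by short dictionary indices, and rare words are kept raw. The function names, the `"D"`/`"R"` token tags, and the 4-bit word width are assumptions for illustration.

```python
from collections import Counter


def build_dictionary(words, size):
    """Keep the `size` most frequent test words as dictionary entries."""
    return [word for word, _ in Counter(words).most_common(size)]


def compress(words, dictionary):
    """Encode each word as ("D", index) if it is in the dictionary, else ("R", raw)."""
    index = {word: i for i, word in enumerate(dictionary)}
    return [("D", index[w]) if w in index else ("R", w) for w in words]


def decompress(tokens, dictionary):
    """Invert compress(): expand dictionary indices back to the original words."""
    return [dictionary[value] if tag == "D" else value for tag, value in tokens]


patterns = ["1010", "1010", "0001", "1010", "1111", "0001"]
dictionary = build_dictionary(patterns, size=2)
tokens = compress(patterns, dictionary)
restored = decompress(tokens, dictionary)
```

With a two-entry dictionary, each hit needs only one index bit plus a tag instead of four raw bits, which is where the storage saving comes from.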

Proceedings ArticleDOI
19 May 2023
TL;DR: In this paper, the presence of multiple HTs in the routing unit is modelled and its impact is analysed for both synthetic traffic and real benchmarks using the gem5 simulator; the presence of multiple Trojans is observed to decrease the overall performance of the system.
Abstract: Tiled Chip Multicore Processors (TCMP) with packet-switching Network-on-Chip (NoC) have become a common method for on-chip connectivity. The performance of the entire system may suffer if a malicious Hardware Trojan (HT) is present in the NoC routers, as it might disrupt communication between tiles. Detecting Trojans in complicated multi-processor System on Chips (SoCs) using traditional pre- and post-silicon validation approaches is very difficult. In this paper, the presence of multiple HTs in the routing unit is modelled and its impact is analysed for both synthetic traffic and real benchmarks using the gem5 simulator. It can be observed that the presence of multiple Trojans decreases the overall performance of the system.

Journal ArticleDOI
TL;DR: In this paper, the authors explore multiple FCA design configurations and demonstrate that this technology can decrease the temperature of a heterogeneous 3-D MPSoC by 78 °C, and its total power consumption by 46%, compared to a high-performance cold-plate-based liquid cooling solution.
Abstract: Flow cell arrays (FCAs) concurrently provide efficient on-chip liquid cooling and electrochemical power generation. This technology is especially promising for 3-D multiprocessor systems-on-chip (3-D MPSoCs) realized in deeply scaled technologies, which present very challenging power and thermal requirements. Indeed, FCAs effectively improve power delivery network (PDN) performance, particularly if switched capacitor (SC) converters are employed to decouple the flow cells and the systems-on-chip voltages, allowing each to operate at their optimal point. Nonetheless, the design of FCA-based solutions entails nonobvious considerations and tradeoffs, stemming from their dual role in governing both the thermal and power delivery characteristics of 3-D MPSoCs. Showcasing them in this article, we explore multiple FCA design configurations and demonstrate that this technology can decrease the temperature of a heterogeneous 3-D MPSoC by 78 °C, and its total power consumption by 46%, compared to a high-performance cold-plate-based liquid cooling solution. At the same time, FCAs enable up to 90% voltage drop recovery across dies, using SC converters occupying a small fraction of the chip area. Such outcomes provide an opportunity to boost 3-D MPSoC computing performance by increasing the operating frequency of dies. Leveraging these results, we introduce a novel temperature and voltage-aware model-predictive control (MPC) strategy that optimizes power efficiency during runtime. We achieve application-wide speedups of up to 16% on various machine learning (ML), data mining, and other high-performance benchmarks while keeping the 3-D MPSoC temperature below 83 °C and voltage drops below 5%.

Proceedings ArticleDOI
17 Apr 2023
TL;DR: This session reviews the main phases of AI SoC design and the points to consider in them, along with best practices from the joint AI Chip Design Lab with ITRI, including HW/SW co-design and an AI SoC reference design that greatly reduce risk and shorten the development cycle.
Abstract: While artificial neural networks (ANNs) are well developed in many respects, it remains hard for IC designers to develop an AI SoC that fulfills the performance requirements while also balancing power and die area for AI applications. Large variations in the design space drive differences in SoC architecture across AI applications. The classic approach is to build some kind of high-level analytical model, write up an implementation spec, and proceed to the HW design. However, the first time you can really quantitatively measure your power and performance KPIs is when you run the SW image and the HW RTL on an emulator or FPGA prototype. This data comes too late in the process from the architecture perspective. Also, the turnaround time is too long for architecture design exploration and optimization. Continuously evolving AI algorithms and new market requirements make chip design even harder. In this session, we will review the main phases of AI SoC design and the points to consider in them, and the best practices of the joint AI Chip Design Lab with ITRI. Topics include the reference design flow that helps explore the optimized AI SoC architecture; MobileNet as a benchmark application for architecture exploration; how to minimize power and energy consumption when targeting an inference latency of 4 milliseconds for processing 5 frames of MobileNet (1250 frames/sec); and HW/SW co-design and the AI SoC reference design that greatly reduce risk and shorten the development cycle.

Proceedings ArticleDOI
06 Jun 2023
TL;DR: In this article, the authors present a Zynq 7020 FPGA implementation and evaluation of a middle-sized dense neural network based on approximate computation by linearly approximated functions.
Abstract: Integrating artificial intelligence technologies into embedded systems requires efficient implementation of neural networks in hardware. The paper presents a Zynq 7020 FPGA implementation and evaluation of a middle-sized dense neural network based on approximate computation by linearly approximated functions. Three well-known benchmarks were used for classification accuracy evaluation and hardware testing. We use our highly pipelined neural hardware architecture that takes weights from block RAMs to save logic resources and enables their update from the processing system. The architecture reaches excellent design scalability, allowing us to estimate the number of neurons implemented in programmable logic based on single-neuron resources. We reached nearly full chip utilization while preserving the high clock frequency for the FPGA used.

Journal ArticleDOI
TL;DR: In this article, the authors propose an operator design method to improve the efficiency of SoC design, and design algorithmic operators, using pulse signal processing as an example, to enhance the efficiency of designing health detection SoCs.
Abstract: Improving the efficiency of System-on-Chip (SoC) design has always been a hot topic in SoC design methods. The quality of SoC architecture design directly affects the core elements of SoC performance and cost. However, architecture design usually requires highly experienced and skilled personnel, who are a scarce resource. Operators in an SoC, like instructions in a CPU, are extracted from algorithms and are both universal and flexible. An SoC can be designed based on operators, and algorithmic functions can be implemented through the interconnection of operators. In this paper, we propose an operator design method to improve the efficiency of SoC design. Additionally, we design algorithmic operators using pulse signal processing as an example to enhance the efficiency of designing health detection SoCs.