Topic

System on a chip

About: System on a chip is a research topic. Over its lifetime, 11,331 publications have appeared on this topic, receiving 147,395 citations. The topic is also known as system-on-a-chip and SoC.


Papers
Proceedings Article
13 Jun 2015
TL;DR: This work argues that the conventional concept of processing-in-memory (PIM) can be a viable solution to achieve memory-capacity-proportional performance and designs a programmable PIM accelerator for large-scale graph processing called Tesseract.
Abstract: The explosion of digital data and the ever-growing need for fast data analysis have made in-memory big-data processing in computer systems increasingly important. In particular, large-scale graph processing is gaining attention due to its broad applicability from social science to machine learning. However, scalable hardware design that can efficiently process large graphs in main memory is still an open problem. Ideally, cost-effective and scalable graph processing systems can be realized by building a system whose performance increases proportionally with the sizes of graphs that can be stored in the system, which is extremely challenging in conventional systems due to severe memory bandwidth limitations. In this work, we argue that the conventional concept of processing-in-memory (PIM) can be a viable solution to achieve such an objective. The key modern enabler for PIM is the recent advancement of the 3D integration technology that facilitates stacking logic and memory dies in a single package, which was not available when the PIM concept was originally examined. In order to take advantage of such a new technology to enable memory-capacity-proportional performance, we design a programmable PIM accelerator for large-scale graph processing called Tesseract. Tesseract is composed of (1) a new hardware architecture that fully utilizes the available memory bandwidth, (2) an efficient method of communication between different memory partitions, and (3) a programming interface that reflects and exploits the unique hardware design. It also includes two hardware prefetchers specialized for memory access patterns of graph processing, which operate based on the hints provided by our programming model. 
Our comprehensive evaluations using five state-of-the-art graph processing workloads with large real-world graphs show that the proposed architecture improves average system performance by a factor of ten and achieves 87% average energy reduction over conventional systems.

718 citations

Proceedings Article
16 Feb 2004
TL;DR: This work presents NMAP, a fast algorithm that maps cores onto a mesh NoC architecture under bandwidth constraints while minimizing the average communication delay, for both single minimum-path routing and split-traffic routing.
Abstract: We address the design of complex monolithic systems, where processing cores generate and consume a varying and large amount of data, thus bringing the communication links to the edge of congestion. Typical applications are in the area of multimedia processing. We consider a mesh-based network-on-chip (NoC) architecture, and we explore the assignment of cores to mesh cross-points so that the traffic on links satisfies bandwidth constraints. A single-path deterministic routing between the cores places high bandwidth demands on the links. The bandwidth requirements can be significantly reduced by splitting the traffic between the cores across multiple paths. In this paper, we present NMAP, a fast algorithm that maps the cores onto a mesh NoC architecture under bandwidth constraints, minimizing the average communication delay. The NMAP algorithm is presented for both single minimum-path routing and split-traffic routing. The algorithm is applied to a benchmark DSP design and the resulting NoC is built and simulated at cycle-accurate level in SystemC using macros from the ×pipes library. Also, experiments with six video processing applications show significant savings in bandwidth and communication cost for the NMAP algorithm when compared to existing algorithms.

714 citations
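
The mapping problem in the NMAP abstract above can be illustrated with a small sketch. This is not the NMAP algorithm itself; it only evaluates a given core-to-tile mapping under single-path, dimension-ordered (X then Y) routing, which is one assumed form of deterministic routing. The core names, traffic values, and link capacity are hypothetical, and the cost (bandwidth times hop count) is an assumed stand-in for communication delay.

```python
from collections import defaultdict

def xy_route(src, dst):
    """Return the directed links on an X-then-Y path between two mesh tiles."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:                      # travel along X first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:                      # then travel along Y
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def evaluate(mapping, traffic, capacity):
    """Return (cost, feasible, loads) for a mapping.

    cost     : sum of bandwidth * hop count over all flows (assumed proxy)
    feasible : True if no link carries more than `capacity`
    loads    : bandwidth accumulated on each directed link
    """
    loads = defaultdict(float)
    cost = 0
    for (a, b), bw in traffic.items():
        path = xy_route(mapping[a], mapping[b])
        cost += bw * len(path)
        for link in path:
            loads[link] += bw
    feasible = all(load <= capacity for load in loads.values())
    return cost, feasible, dict(loads)

# Hypothetical three-core example on a 2x2 mesh (bandwidths in MB/s).
traffic = {("cpu", "mem"): 400, ("cpu", "dsp"): 100, ("dsp", "mem"): 300}
mapping_a = {"cpu": (0, 0), "mem": (1, 0), "dsp": (0, 1)}
mapping_b = {"cpu": (0, 0), "mem": (1, 1), "dsp": (1, 0)}

cost_a, ok_a, _ = evaluate(mapping_a, traffic, capacity=500)
cost_b, ok_b, _ = evaluate(mapping_b, traffic, capacity=500)
```

In this toy case mapping_a keeps every link within capacity, while mapping_b pushes two flows over the same link and violates the constraint. Split-traffic routing, as described in the abstract, would relax this by dividing a flow across several paths.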

Journal Article
TL;DR: In this article, the authors demonstrate a new resonator with a record Q-factor of 875 million for on-chip devices, which sets a new benchmark for the Q factor on a chip, and also provides full compatibility of this important device class with conventional semiconductor processing.
Abstract: Ultrahigh-Q optical resonators are being studied across a wide range of fields, including quantum information, nonlinear optics, cavity optomechanics and telecommunications. Here, we demonstrate a new resonator with a record Q-factor of 875 million for on-chip devices. The fabrication of our device avoids the requirement for a specialized processing step, which in microtoroid resonators has made it difficult to control their size and achieve millimetre- and centimetre-scale diameters. Attaining these sizes is important in applications such as microcombs and potentially also in rotation sensing. As an application of size control, stimulated Brillouin lasers incorporating our device are demonstrated. The resonators not only set a new benchmark for the Q-factor on a chip, but also provide, for the first time, full compatibility of this important device class with conventional semiconductor processing. This feature will greatly expand the range of possible ‘system on a chip’ functions enabled by ultrahigh-Q devices.

632 citations
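
To give a feel for what a Q-factor of 875 million means, the sketch below converts it to a resonance linewidth and photon lifetime using the standard relations Q = ν₀/Δν and τ = Q/(2πν₀). The 1550 nm operating wavelength is an assumption made for illustration; the abstract above does not state the wavelength.

```python
import math

C = 299_792_458.0          # speed of light in vacuum, m/s

def resonator_figures(q_factor, wavelength_m):
    """Return (linewidth_hz, photon_lifetime_s) for a given Q and wavelength."""
    nu0 = C / wavelength_m                     # optical carrier frequency
    linewidth = nu0 / q_factor                 # full width at half maximum
    lifetime = q_factor / (2 * math.pi * nu0)  # energy decay time
    return linewidth, lifetime

# Q from the abstract; wavelength is an assumed telecom-band value.
linewidth_hz, lifetime_s = resonator_figures(875e6, 1550e-9)
print(f"linewidth ~ {linewidth_hz / 1e3:.0f} kHz, lifetime ~ {lifetime_s * 1e6:.2f} us")
```

Under that assumed wavelength the linewidth comes out at a few hundred kilohertz and the photon lifetime at a little under a microsecond.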

Proceedings Article
29 Aug 2005
TL;DR: A CELL processor is a multi-core chip consisting of a 64b Power Architecture processor, multiple streaming processors, a flexible IO interface, and a memory interface controller, implemented in 90nm SOI technology.
Abstract: A CELL processor is a multi-core chip consisting of a 64b Power Architecture processor, multiple streaming processors, a flexible IO interface, and a memory interface controller. This SoC is implemented in 90nm SOI technology. The chip is designed with a high degree of modularity and reuse to maximize the custom circuit content and achieve a high-frequency clock-rate.

611 citations

Proceedings Article
21 Jan 2003
TL;DR: An algorithm which automatically maps the IPs/cores onto a generic regular Network on Chip (NoC) architecture such that the total communication energy is minimized and the performance of the mapped system is guaranteed to satisfy the specified constraints through bandwidth reservation.
Abstract: In this paper, we present an algorithm which automatically maps the IPs/cores onto a generic regular Network on Chip (NoC) architecture such that the total communication energy is minimized. At the same time, the performance of the mapped system is guaranteed to satisfy the specified constraints through bandwidth reservation. As the main contribution, we first formulate the problem of energy-aware mapping, in a topological sense, and then propose an efficient branch-and-bound algorithm to solve it. Experimental results show that the proposed algorithm is very fast and robust, and significant energy savings can be achieved. For instance, for a complex video/audio SoC design, on average, 60.4% energy savings have been observed compared to an ad-hoc implementation.

585 citations
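
The energy-aware mapping idea in the abstract above can be sketched with a toy branch-and-bound search. This is a minimal illustration, not the authors' algorithm: the energy model (data volume times Manhattan hop count times a per-bit, per-hop energy `e_bit`) is an assumed simplification, the lower bound is simply the energy of already-placed flows, and the bandwidth-reservation constraint is omitted. Core names and volumes are hypothetical.

```python
def manhattan(a, b):
    """Hop count between two mesh tiles under minimal routing."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def branch_and_bound(cores, tiles, volume, e_bit=1.0):
    """Return (mapping, energy) minimizing the assumed communication energy."""
    best = {"energy": float("inf"), "mapping": None}

    def partial_energy(mapping):
        # Energy of flows whose endpoints are both placed. Because every
        # unplaced flow adds a non-negative amount, this is a valid lower bound.
        return e_bit * sum(
            v * manhattan(mapping[s], mapping[d])
            for (s, d), v in volume.items()
            if s in mapping and d in mapping
        )

    def search(i, mapping, free):
        bound = partial_energy(mapping)
        if bound >= best["energy"]:
            return                      # prune: cannot beat the best so far
        if i == len(cores):
            best["energy"], best["mapping"] = bound, dict(mapping)
            return
        for tile in sorted(free):       # branch: try each free tile
            mapping[cores[i]] = tile
            search(i + 1, mapping, free - {tile})
            del mapping[cores[i]]

    search(0, {}, frozenset(tiles))
    return best["mapping"], best["energy"]

# Hypothetical four-core design on a 2x2 mesh (volumes in arbitrary units).
cores = ["a", "b", "c", "d"]
tiles = [(x, y) for x in range(2) for y in range(2)]
volume = {("a", "b"): 10, ("b", "c"): 5, ("c", "d"): 10, ("a", "d"): 1}

mapping, energy = branch_and_bound(cores, tiles, volume)
```

Here the four flows form a ring, so the search can place every communicating pair on adjacent tiles. A tighter bound that also estimates the cost of unplaced cores would prune more aggressively on larger meshes.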


Network Information
Related Topics (5)
CMOS
81.3K papers, 1.1M citations
94% related
Integrated circuit
82.7K papers, 1M citations
91% related
Electronic circuit
114.2K papers, 971.5K citations
88% related
Semiconductor memory
45.4K papers, 663.1K citations
87% related
Transistor
138K papers, 1.4M citations
86% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    37
2022    89
2021    247
2020    327
2019    360
2018    426