
Brick and mortar chip fabrication

01 Jan 2008
TL;DR: Brick and mortar chips are introduced, which aim to obtain the benefits of Moore's Law without the financial side effects, and software partitioning and mapping techniques which balance communication costs against computational resource contention are developed.
Abstract: While Moore's Law has advanced the semiconductor and technology industries, it has simultaneously driven up the cost of engineering a chip in a modern silicon process. The result is that fewer and fewer chips are produced in larger and larger volumes, stifling hardware diversity. This thesis introduces brick and mortar chips, which aim to obtain the benefits of Moore's Law without the financial side effects. Brick and mortar chips are made from small, pre-fabricated hardware components (called bricks) that are bonded in a designer-specified arrangement to a communication backbone chip which serves as the mortar (called the I/O cap). Our research examines several aspects of this chip manufacturing system. We develop a family of functional bricks, demonstrating a methodology for developing families that make efficient use of physical computation and communication resources. For high-performance communication between arbitrary combinations of bricks we propose a polymorphic on-chip network. This network allows a single I/O cap to be configured to implement the ideal network for any particular application. We analyze a low-cost, physical component assembly technique called fluidic self-assembly, and find that the chip production rate is intertwined with the architectural design of the components. To minimize application execution time on these partitioned chips, we develop software partitioning and mapping techniques which balance communication costs against computational resource contention. We close with a case study: an analysis of a brick and mortar implementation of a chip multiprocessor. Despite this being a highly latency sensitive design, our measurements indicate a worst case 36% average slowdown in application execution compared to a traditional, monolithic chip. Based on this, our cost analysis, and a survey of related technologies, we conclude that brick and mortar offers the best available performance for its price.
Citations
23 Jul 2003
TL;DR: In this article, a low-power, high-speed chip-to-chip interface scheme with a density of 625 pins/mm² is described. The interface uses capacitively coupled contactless minipads, return-to-half-V_DD signaling, and sense-amplifying F/F.
Abstract: A low-power high-speed chip-to-chip interface scheme is described having a density of 625 pins/mm². The interface utilizes capacitively coupled contactless minipads, return-to-half-V_DD signaling, and sense-amplifying F/F. The measured test chip fabricated in 0.35 µm CMOS delivers up to 1.27 Gb/s/pin at 3 mW/pin.
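
As a rough back-of-the-envelope reading of the figures quoted in this abstract, the sketch below derives energy per bit, aggregate bandwidth density, and power density. It assumes every pin runs at the peak measured rate at the same time, which the abstract does not claim, and the function names are hypothetical helpers introduced only for illustration.

```python
# Figures quoted in the abstract above.
PINS_PER_MM2 = 625          # pin density, pins per square millimetre
RATE_GBPS_PER_PIN = 1.27    # peak measured data rate per pin, Gb/s
POWER_MW_PER_PIN = 3.0      # power per pin, mW


def energy_pj_per_bit(power_mw, rate_gbps):
    """Energy per transferred bit in picojoules: power divided by bit rate."""
    joules_per_bit = (power_mw * 1e-3) / (rate_gbps * 1e9)
    return joules_per_bit * 1e12


def bandwidth_density_gbps_per_mm2(pins_per_mm2, rate_gbps):
    """Aggregate bandwidth per unit area, ASSUMING all pins run at peak rate."""
    return pins_per_mm2 * rate_gbps


def power_density_w_per_mm2(pins_per_mm2, power_mw):
    """Interface power per unit area, ASSUMING all pins are active."""
    return pins_per_mm2 * power_mw / 1000.0


if __name__ == "__main__":
    print(energy_pj_per_bit(POWER_MW_PER_PIN, RATE_GBPS_PER_PIN))
    print(bandwidth_density_gbps_per_mm2(PINS_PER_MM2, RATE_GBPS_PER_PIN))
    print(power_density_w_per_mm2(PINS_PER_MM2, POWER_MW_PER_PIN))
```

Under those assumptions the quoted numbers work out to roughly 2.36 pJ/bit, about 794 Gb/s/mm², and about 1.9 W/mm². These are upper-bound estimates derived from the abstract, not measurements reported by the authors.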

4 citations

Proceedings ArticleDOI
13 Jun 2005
TL;DR: This lively panel will discuss whether it is FPGAs, structured/platform ASICs, or something else that stand to gain the most ground from the projected $25B ASIC market, and why.
Abstract: Moore's law delivers higher performance and lower cost for FPGAs and ASICs alike, but at the 90nm process node and below, design schedules using the traditional cell-based ASIC design methodology hit a wall of uncertainty. At 90nm and below, the emerging alternative ASIC design platform is either Platform ASIC or FPGAs. Which way will the cell-based ASIC designer turn for their next design? Over time, FPGAs and structured/platform ASICs are together poised to replace today's cell-based ASIC market, but which is the real answer to future digital design? Can companies really use these platforms to achieve the system cost reduction and functionality that they need to stay competitive? Which applications will migrate to these platforms the fastest? Is it possible to just tweak the existing cell-based methodology to more efficiently reach the benefits of 90nm process nodes and below? This lively panel will discuss whether it is FPGAs, structured/platform ASICs, or something else that stand to gain the most ground from the projected $25B ASIC market, and why.

1 citation

References
Proceedings ArticleDOI
01 May 1995
TL;DR: This paper quantitatively characterize the SPLASH-2 programs in terms of fundamental properties and architectural interactions that are important to understand them well, including the computational load balance, communication to computation ratio and traffic needs, important working set sizes, and issues related to spatial locality.
Abstract: The SPLASH-2 suite of parallel applications has recently been released to facilitate the study of centralized and distributed shared-address-space multiprocessors. In this context, this paper has two goals. One is to quantitatively characterize the SPLASH-2 programs in terms of fundamental properties and architectural interactions that are important to understand them well. The properties we study include the computational load balance, communication to computation ratio and traffic needs, important working set sizes, and issues related to spatial locality, as well as how these properties scale with problem size and the number of processors. The other, related goal is methodological: to assist people who will use the programs in architectural evaluations to prune the space of application and machine parameters in an informed and meaningful way. For example, by characterizing the working sets of the applications, we describe which operating points in terms of cache size and problem size are representative of realistic situations, which are not, and which are redundant. Using SPLASH-2 as an example, we hope to convey the importance of understanding the interplay of problem size, number of processors, and working sets in designing experiments and interpreting their results.

4,002 citations


"Brick and mortar chip fabrication" refers background or methods in this paper

  • ...We ran the SPLASH2 [145] suite of multi-threaded benchmarks....

    [...]

  • ...We scheduled nine sample applications from the Spec2000 [122] and SPLASH2 [145] benchmark suites (art, equake, gzip, mcf, radix, twolf and fft, lu, ocean, respectively)....

    [...]

  • ...We find the SPLASH2 [145] benchmarks execute an average of 36% slower on a brick and mortar chip....

    [...]

Book
01 Jan 2004
TL;DR: This book offers a detailed and comprehensive presentation of the basic principles of interconnection network design, clearly illustrating them with numerous examples, chapter exercises, and case studies, allowing a designer to see all the steps of the process from abstract design to concrete implementation.
Abstract: One of the greatest challenges faced by designers of digital systems is optimizing the communication and interconnection between system components. Interconnection networks offer an attractive and economical solution to this communication crisis and are fast becoming pervasive in digital systems. Current trends suggest that this communication bottleneck will be even more problematic when designing future generations of machines. Consequently, the anatomy of an interconnection network router and science of interconnection network design will only grow in importance in the coming years. This book offers a detailed and comprehensive presentation of the basic principles of interconnection network design, clearly illustrating them with numerous examples, chapter exercises, and case studies. It incorporates hardware-level descriptions of concepts, allowing a designer to see all the steps of the process from abstract design to concrete implementation.
· Case studies throughout the book draw on extensive author experience in designing interconnection networks over a period of more than twenty years, providing real world examples of what works, and what doesn't.
· Tightly couples concepts with implementation costs to facilitate a deeper understanding of the tradeoffs in the design of a practical network.
· A set of examples and exercises in every chapter help the reader to fully understand all the implications of every design decision.
Table of Contents
Chapter 1 Introduction to Interconnection Networks: 1.1 Three Questions About Interconnection Networks; 1.2 Uses of Interconnection Networks; 1.3 Network Basics; 1.4 History; 1.5 Organization of this Book
Chapter 2 A Simple Interconnection Network: 2.1 Network Specifications and Constraints; 2.2 Topology; 2.3 Routing; 2.4 Flow Control; 2.5 Router Design; 2.6 Performance Analysis; 2.7 Exercises
Chapter 3 Topology Basics: 3.1 Nomenclature; 3.2 Traffic Patterns; 3.3 Performance; 3.4 Packaging Cost; 3.5 Case Study: The SGI Origin 2000; 3.6 Bibliographic Notes; 3.7 Exercises
Chapter 4 Butterfly Networks: 4.1 The Structure of Butterfly Networks; 4.2 Isomorphic Butterflies; 4.3 Performance and Packaging Cost; 4.4 Path Diversity and Extra Stages; 4.5 Case Study: The BBN Butterfly; 4.6 Bibliographic Notes; 4.7 Exercises
Chapter 5 Torus Networks: 5.1 The Structure of Torus Networks; 5.2 Performance; 5.3 Building Mesh and Torus Networks; 5.4 Express Cubes; 5.5 Case Study: The MIT J-Machine; 5.6 Bibliographic Notes; 5.7 Exercises
Chapter 6 Non-Blocking Networks: 6.1 Non-Blocking vs. Non-Interfering Networks; 6.2 Crossbar Networks; 6.3 Clos Networks; 6.4 Benes Networks; 6.5 Sorting Networks; 6.6 Case Study: The Velio VC2002 (Zeus) Grooming Switch; 6.7 Bibliographic Notes; 6.8 Exercises
Chapter 7 Slicing and Dicing: 7.1 Concentrators and Distributors; 7.2 Slicing and Dicing; 7.3 Slicing Multistage Networks; 7.4 Case Study: Bit Slicing in the Tiny Tera; 7.5 Bibliographic Notes; 7.6 Exercises
Chapter 8 Routing Basics: 8.1 A Routing Example; 8.2 Taxonomy of Routing Algorithms; 8.3 The Routing Relation; 8.4 Deterministic Routing; 8.5 Case Study: Dimension-Order Routing in the Cray T3D; 8.6 Bibliographic Notes; 8.7 Exercises
Chapter 9 Oblivious Routing: 9.1 Valiant's Randomized Routing Algorithm; 9.2 Minimal Oblivious Routing; 9.3 Load-Balanced Oblivious Routing; 9.4 Analysis of Oblivious Routing; 9.5 Case Study: Oblivious Routing in the Avici Terabit Switch Router (TSR); 9.6 Bibliographic Notes; 9.7 Exercises
Chapter 10 Adaptive Routing: 10.1 Adaptive Routing Basics; 10.2 Minimal Adaptive Routing; 10.3 Fully Adaptive Routing; 10.4 Load-Balanced Adaptive Routing; 10.5 Search-Based Routing; 10.6 Case Study: Adaptive Routing in the Thinking Machines CM-5; 10.7 Bibliographic Notes; 10.8 Exercises
Chapter 11 Routing Mechanics: 11.1 Table-Based Routing; 11.2 Algorithmic Routing; 11.3 Case Study: Oblivious Source Routing in the IBM Vulcan Network; 11.4 Bibliographic Notes; 11.5 Exercises
Chapter 12 Flow Control Basics: 12.1 Resources and Allocation Units; 12.2 Bufferless Flow Control; 12.3 Circuit Switching; 12.4 Bibliographic Notes; 12.5 Exercises
Chapter 13 Buffered Flow Control: 13.1 Packet-Buffer Flow Control; 13.2 Flit-Buffer Flow Control; 13.3 Buffer Management and Backpressure; 13.4 Flit-Reservation Flow Control; 13.5 Bibliographic Notes; 13.6 Exercises
Chapter 14 Deadlock and Livelock: 14.1 Deadlock; 14.2 Deadlock Avoidance; 14.3 Adaptive Routing; 14.4 Deadlock Recovery; 14.5 Livelock; 14.6 Case Study: Deadlock Avoidance in the Cray T3E; 14.7 Bibliographic Notes; 14.8 Exercises
Chapter 15 Quality of Service: 15.1 Service Classes and Service Contracts; 15.2 Burstiness and Network Delays; 15.3 Implementation of Guaranteed Services; 15.4 Implementation of Best-Effort Services; 15.5 Separation of Resources; 15.6 Case Study: ATM Service Classes; 15.7 Case Study: Virtual Networks in the Avici TSR; 15.8 Bibliographic Notes; 15.9 Exercises
Chapter 16 Router Architecture: 16.1 Basic Router Architecture; 16.2 Stalls; 16.3 Closing the Loop with Credits; 16.4 Reallocating a Channel; 16.5 Speculation and Lookahead; 16.6 Flit and Credit Encoding; 16.7 Case Study: The Alpha 21364 Router; 16.8 Bibliographic Notes; 16.9 Exercises
Chapter 17 Router Datapath Components: 17.1 Input Buffer Organization; 17.2 Switches; 17.3 Output Organization; 17.4 Case Study: The Datapath of the IBM Colony Router; 17.5 Bibliographic Notes; 17.6 Exercises
Chapter 18 Arbitration: 18.1 Arbitration Timing; 18.2 Fairness; 18.3 Fixed Priority Arbiter; 18.4 Variable Priority Iterative Arbiters; 18.5 Matrix Arbiter; 18.6 Queuing Arbiter; 18.7 Exercises
Chapter 19 Allocation: 19.1 Representations; 19.2 Exact Algorithms; 19.3 Separable Allocators; 19.4 Wavefront Allocator; 19.5 Incremental vs. Batch Allocation; 19.6 Multistage Allocation; 19.7 Performance of Allocators; 19.8 Case Study: The Tiny Tera Allocator; 19.9 Bibliographic Notes; 19.10 Exercises
Chapter 20 Network Interfaces: 20.1 Processor-Network Interface; 20.2 Shared-Memory Interface; 20.3 Line-Fabric Interface; 20.4 Case Study: The MIT M-Machine Network Interface; 20.5 Bibliographic Notes; 20.6 Exercises
Chapter 21 Error Control: 21.1 Know Thy Enemy: Failure Modes and Fault Models; 21.2 The Error Control Process: Detection, Containment, and Recovery; 21.3 Link Level Error Control; 21.4 Router Error Control; 21.5 Network-Level Error Control; 21.6 End-to-end Error Control; 21.7 Bibliographic Notes; 21.8 Exercises
Chapter 22 Buses: 22.1 Bus Basics; 22.2 Bus Arbitration; 22.3 High Performance Bus Protocol; 22.4 From Buses to Networks; 22.5 Case Study: The PCI Bus; 22.6 Bibliographic Notes; 22.7 Exercises
Chapter 23 Performance Analysis: 23.1 Measures of Interconnection Network Performance; 23.2 Analysis; 23.3 Validation; 23.4 Case Study: Efficiency and Loss in the BBN Monarch Network; 23.5 Bibliographic Notes; 23.6 Exercises
Chapter 24 Simulation: 24.1 Levels of Detail; 24.2 Network Workloads; 24.3 Simulation Measurements; 24.4 Simulator Design; 24.5 Bibliographic Notes; 24.6 Exercises
Chapter 25 Simulation Examples: 25.1 Routing; 25.2 Flow Control Performance; 25.3 Fault Tolerance
Appendix A Nomenclature
Appendix B Glossary
Appendix C Network Simulator

3,233 citations


"Brick and mortar chip fabrication" refers background or methods in this paper

  • ...This previous art used a variety of topologies, amongst them those included in our study: fat tree [78], flattened butterfly [69], mesh [37] and ring [37]....

    [...]

  • ...The design space includes two buffered arbitration policies: store-and-forward and wormhole [37] the latter of which reserves and preserves a connection at each switch until all packets in a...

    [...]

Journal ArticleDOI
TL;DR: In this paper, aqueous solutions of poly(N-isopropyl acrylamide) are shown to have a lower critical solution temperature, a phenomenon ascribed primarily to an entropy effect; near phase separation, aggregation due to the formation of nonpolar and intermolecular hydrogen bonds also appears to be important.
Abstract: Aqueous solutions of poly(N-isopropyl acrylamide) show a lower critical solution temperature. The thermodynamic properties of the system have been evaluated from the phase diagram and the heat absorbed during phase separation and the phenomenon is ascribed to be primarily due to an entropy effect. From viscosity, sedimentation, and light-scattering studies of solutions close to conditions of phase separation, it appears that aggregation due to formation of nonpolar and intermolecular hydrogen bonds is important. In addition, a weakening of the ordering effect of the water-amide hydrogen bonds as the temperature is raised contributes to the stability of the two-phase system.

2,698 citations


"Brick and mortar chip fabrication" refers background in this paper

  • ...The substrate on which bricks are assembled can be coated with a polymer pNIPAM (poly-N-isopropylacrylamide [55, 96, 58, 148]) which can be reversibly switched between hydrophobic and hydrophilic states through a small change in the local temperature of the binding site....

    [...]

Journal ArticleDOI
TL;DR: Simics is a platform for full system simulation that can run actual firmware and completely unmodified kernel and driver code, and it provides both functional accuracy for running commercial workloads and sufficient timing accuracy to interface to detailed hardware models.
Abstract: Full system simulation seeks to strike a balance between accuracy and performance. Many of its possibilities have been obvious to practitioners in both academia and industry for quite some time, perhaps decades, but Simics supports more of these possibilities within a single framework than other tools do. Simics is a platform for full system simulation that can run actual firmware and completely unmodified kernel and driver code. It is sufficiently abstract to achieve tolerable performance levels, and it provides both functional accuracy for running commercial workloads and sufficient timing accuracy to interface to detailed hardware models. Simics can also run a heterogeneous network of systems from different vendors within the same framework. Exceptionally fast, Simics can easily add new components and leverage older ones within a practical abstraction level. It offers a platform with a rich API and a powerful scripting environment for use in a broad range of applications.

2,133 citations


"Brick and mortar chip fabrication" refers methods in this paper

  • ...1 CMP simulator To time the execution of applications on the CMP we used the Virtutech Simics [82] simulation framework and GEMS [85] tool set....

    [...]

Book
01 Jun 1994
TL;DR: A deadlock-free routing algorithm can be generated for arbitrary interconnection networks using the concept of virtual channels, which is used to develop deadlock-free routing algorithms for k-ary n-cubes, for cube-connected cycles, and for shuffle-exchange networks.
Abstract: A deadlock-free routing algorithm can be generated for arbitrary interconnection networks using the concept of virtual channels. A necessary and sufficient condition for deadlock-free routing is the absence of cycles in the channel dependency graph. Given an arbitrary network and a routing function, the cycles of the channel dependency graph can be removed by splitting physical channels into groups of virtual channels. This method is used to develop deadlock-free routing algorithms for k-ary n-cubes, for cube-connected cycles, and for shuffle-exchange networks. (This is a revised version of 5206-tr-86)
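
The acyclicity condition stated in this abstract can be illustrated with a minimal sketch. The example below builds the channel dependency graph of a small unidirectional ring, first with one channel per link and then with each link split into two virtual channels, and checks each graph for cycles. The ring topology, the "switch to virtual channel 1 after crossing the wraparound link" rule, and all function names are assumptions chosen for illustration; they are not taken from the cited work's text.

```python
from itertools import product


def ring_route(src, dst, n, use_virtual_channels):
    """Channels used by a packet travelling clockwise from src to dst.

    A channel is a pair (link, vc); link i goes from node i to node (i+1) % n.
    Illustrative rule (an assumption): a packet uses vc 0 until it has
    crossed the wraparound link n-1, and vc 1 on every link after that.
    """
    path, node, wrapped = [], src, False
    while node != dst:
        vc = 1 if (use_virtual_channels and wrapped) else 0
        path.append((node, vc))
        if node == n - 1:
            wrapped = True
        node = (node + 1) % n
    return path


def dependency_graph(n, use_virtual_channels):
    """Edge a -> b whenever some route holds channel a and then requests b."""
    graph = {}
    for src, dst in product(range(n), repeat=2):
        if src == dst:
            continue
        path = ring_route(src, dst, n, use_virtual_channels)
        for a, b in zip(path, path[1:]):
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set())
    return graph


def has_cycle(graph):
    """Depth-first search with three colours; a grey successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {v: WHITE for v in graph}

    def visit(v):
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY or (colour[w] == WHITE and visit(w)):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in graph)


if __name__ == "__main__":
    print(has_cycle(dependency_graph(4, use_virtual_channels=False)))
    print(has_cycle(dependency_graph(4, use_virtual_channels=True)))
```

With a single channel per link the dependencies wrap all the way around the ring and form a cycle. With the split, a packet that has crossed the wraparound link never needs virtual channel 1 on the last links of the ring, so the dependency chain ends and the graph is acyclic.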

2,035 citations


"Brick and mortar chip fabrication" refers methods in this paper

  • ...One way to work around this, which we employ, is to implement wormhole arbitration [36], allowing a lead packet to carry the route and establish necessary connections, that are used by the subsequent packets belonging to the same message....

    [...]