Author

Zvonko G. Vranesic

Bio: Zvonko G. Vranesic is an academic researcher from the University of Toronto. The author has contributed to research in topics: Logic gate & Field-programmable gate array. The author has an h-index of 27 and has co-authored 100 publications receiving 3243 citations.


Papers
Proceedings Article
01 Jun 1991
TL;DR: A new technology mapping algorithm for lookup table-based Field Programmable Gate Arrays (FPGAs) is presented. The major innovation is a method for choosing gate-level decompositions based on bin packing that is up to 28 times faster than a previous exhaustive approach.
Abstract: A new technology mapping algorithm for lookup table-based Field Programmable Gate Arrays (FPGAs) is presented. The major innovation is a method for choosing gate-level decompositions based on bin packing. This approach is up to 28 times faster than a previous exhaustive approach. The algorithm also exploits reconvergent paths and replication of logic at fanout nodes to reduce the number of lookup tables in the circuit. The new algorithm is implemented in the Chortle-crf program. In an experimental comparison, Chortle-crf requires 14% fewer lookup tables than Chortle [Fran90] and 10% fewer lookup tables than mis-pga [Murg90a] to implement a set of benchmark networks. Chortle-crf can also implement a network as a circuit of Xilinx 3000 series Configurable Logic Blocks (CLBs). To implement the benchmark networks as circuits of CLBs, Chortle-crf requires 12% fewer CLBs than mis-pga and 22% fewer CLBs than XNFOPT [Xili89]. In these experiments Chortle-crf was an average of 68 times faster than mis-pga and 30 times faster than XNFOPT.

277 citations
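
The abstract above names bin packing as the basis for choosing gate-level decompositions but does not spell out the procedure. As a rough illustration only, the sketch below shows a generic first-fit-decreasing bin-packing heuristic. It assumes each item is the number of inputs a sub-function needs and each bin is a lookup table with `capacity` inputs; the function name and this framing are hypothetical and are not taken from Chortle-crf itself.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack item sizes into bins of a fixed capacity (generic FFD heuristic).

    Illustrative assumption: each size is the input count of a sub-function,
    and each bin stands for one lookup table with `capacity` inputs.
    """
    bins = []  # each bin is a list of the item sizes placed in it
    for size in sorted(sizes, reverse=True):  # largest items first
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds capacity {capacity}")
        for b in bins:
            # Place the item in the first bin that still has room.
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            # No existing bin fits, so open a new one.
            bins.append([size])
    return bins


if __name__ == "__main__":
    packed = first_fit_decreasing([3, 2, 2, 1, 1, 1], capacity=4)
    print(packed)
```

First-fit decreasing is a heuristic, so it does not always find the minimum number of bins; it is shown here only because it is simple and fast compared with exhaustive search.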

Proceedings Article
01 Dec 1997
TL;DR: A static instruction scheduling algorithm is developed for the multicluster architecture. The evaluation indicates that, for the configurations considered, the architecture may have significant performance advantages at feature sizes below 0.35 μm and warrants further investigation.
Abstract: The multicluster architecture that we introduce offers a decentralized, dynamically scheduled architecture, in which the register files, dispatch queue, and functional units of the architecture are distributed across multiple clusters, and each cluster is assigned a subset of the architectural registers. The motivation for the multicluster architecture is to reduce the clock cycle time, relative to a single-cluster architecture with the same number of hardware resources, by reducing the size and complexity of components on critical timing paths. Resource partitioning, however, introduces instruction-execution overhead and may reduce the number of concurrently executing instructions. To counter these two negative by-products of partitioning, we developed a static instruction scheduling algorithm. We describe this algorithm and, using trace-driven simulations of SPEC92 benchmarks, evaluate its effectiveness. This evaluation indicates that, for the configurations considered, the multicluster architecture may have significant performance advantages at feature sizes below 0.35 μm, and warrants further investigation.

275 citations

Book
01 Aug 1984
TL;DR: The book highlights modern developments in computer design, I/O, and performance and presents real system examples from around the world.
Abstract: From the Publisher: Always praised for its comprehensive yet accessible treatment and real system examples. The book highlights modern developments in computer design, I/O, and performance.

248 citations

Journal Article
11 Nov 1990
TL;DR: A detailed routing algorithm, called the coarse graph expander (CGE), that has been designed specifically for field-programmable gate arrays (FPGAs) is described, which can route relatively large FPGAs in very close to the minimum number of tracks as determined by global routing.
Abstract: A detailed routing algorithm, called the coarse graph expander (CGE), that has been designed specifically for field-programmable gate arrays (FPGAs) is described. The algorithm approaches this problem in a general way, allowing it to be used over a wide range of different FPGA routing architectures. It addresses the issue of scarce routing resources by considering the side effects that the routing of one connection has on another, and also has the ability to optimize the routing delays of time-critical connections. CGE has been used to obtain excellent routing results for several industrial circuits implemented in FPGAs with various routing architectures. The results show that CGE can route relatively large FPGAs in very close to the minimum number of tracks as determined by global routing, and it can successfully optimize the routing delays of time-critical connections. CGE has a linear run time over circuit size.

154 citations

Proceedings Article
11 Nov 1991
TL;DR: A novel technology mapping algorithm is presented that reduces the delay of combinational circuits implemented with lookup-table-based field-programmable gate arrays (FPGAs) by reducing the number of lookup tables on the critical path.
Abstract: A novel technology mapping algorithm that reduces the delay of combinational circuits implemented with lookup-table-based field-programmable gate arrays (FPGAs) is presented. The algorithm reduces the contribution of logic block delays to the critical path delay by reducing the number of lookup tables on the critical path. The key feature of the algorithm is the use of bin packing to determine the gate-level decomposition of every node in the network. In addition, reconvergent paths and the replication of logic at fanout nodes are exploited to further reduce the depth of the lookup table circuit. For fanout-free trees the algorithm will construct the optimal depth K-input table circuit when K is less than or equal to 6.

146 citations


Cited by
Journal Article
TL;DR: This paper provides an overview of SIS and contains descriptions of the input specification, STG (state transition graph) manipulation, new logic optimization and verification algorithms, ASTG (asynchronous signal transition graph) manipulation, and synthesis for PGAs (programmable gate arrays).
Abstract: SIS is an interactive tool for synthesis and optimization of sequential circuits. Given a state transition table, a signal transition graph, or a logic-level description of a sequential circuit, it produces an optimized net-list in the target technology while preserving the sequential input-output behavior. Many different programs and algorithms have been integrated into SIS, allowing the user to choose among a variety of techniques at each stage of the process. It is built on top of MISII [5] and includes all (combinational) optimization techniques therein as well as many enhancements. SIS serves as both a framework within which various algorithms can be tested and compared, and as a tool for automatic synthesis and optimization of sequential circuits. This paper provides an overview of SIS. The first part contains descriptions of the input specification, STG (state transition graph) manipulation, new logic optimization and verification algorithms, ASTG (asynchronous signal transition graph) manipulation, and synthesis for PGAs (programmable gate arrays). The second part contains a tutorial example illustrating the design process using SIS.

1,854 citations

Journal Article
TL;DR: This survey explores the hardware aspects of reconfigurable computing machines, from single chip architectures to multi-chip systems, including internal structures and external coupling, and also covers the software that targets these machines.
Abstract: Due to its potential to greatly accelerate a wide variety of applications, reconfigurable computing has become a subject of a great deal of research. Its key feature is the ability to perform computations in hardware to increase performance, while retaining much of the flexibility of a software solution. In this survey, we explore the hardware aspects of reconfigurable computing machines, from single chip architectures to multi-chip systems, including internal structures and external coupling. We also focus on the software that targets these machines, such as compilation tools that map high-level algorithms directly to the reconfigurable substrate. Finally, we consider the issues involved in run-time reconfigurable systems, which reuse the configurable hardware during program execution.

1,666 citations

Book
15 Aug 1998
TL;DR: This book explains the forces behind this convergence of shared-memory, message-passing, data parallel, and data-driven computing architectures and provides comprehensive discussions of parallel programming for high performance and of workload-driven evaluation, based on understanding hardware-software interactions.
Abstract: The most exciting development in parallel computer architecture is the convergence of traditionally disparate approaches on a common machine structure. This book explains the forces behind this convergence of shared-memory, message-passing, data parallel, and data-driven computing architectures. It then examines the design issues that are critical to all parallel architecture across the full range of modern design, covering data access, communication performance, coordination of cooperative work, and correct implementation of useful semantics. It not only describes the hardware and software techniques for addressing each of these issues but also explores how these techniques interact in the same system. Examining architecture from an application-driven perspective, it provides comprehensive discussions of parallel programming for high performance and of workload-driven evaluation, based on understanding hardware-software interactions.
* synthesizes a decade of research and development for practicing engineers, graduate students, and researchers in parallel computer architecture, system software, and applications development
* presents in-depth application case studies from computer graphics, computational science and engineering, and data mining to demonstrate sound quantitative evaluation of design trade-offs
* describes the process of programming for performance, including both the architecture-independent and architecture-dependent aspects, with examples and case-studies
* illustrates bus-based and network-based parallel systems with case studies of more than a dozen important commercial designs
Table of Contents: 1 Introduction; 2 Parallel Programs; 3 Programming for Performance; 4 Workload-Driven Evaluation; 5 Shared Memory Multiprocessors; 6 Snoop-based Multiprocessor Design; 7 Scalable Multiprocessors; 8 Directory-based Cache Coherence; 9 Hardware-Software Tradeoffs; 10 Interconnection Network Design; 11 Latency Tolerance; 12 Future Directions; Appendix A: Parallel Benchmark Suites

1,571 citations

Book Chapter
01 Sep 1997
TL;DR: In terms of minimizing routing area, VPR outperforms all published FPGA place and route tools to which the authors can compare and presents placement and routing results on a new set of circuits more typical of today's industrial designs.
Abstract: We describe the capabilities of and algorithms used in a new FPGA CAD tool, Versatile Place and Route (VPR). In terms of minimizing routing area, VPR outperforms all published FPGA place and route tools to which we can compare. Although the algorithms used are based on previously known approaches, we present several enhancements that improve run-time and quality. We present placement and routing results on a new set of large circuits to allow future benchmark comparisons of FPGA place and route tools on circuit sizes more typical of today's industrial designs.

1,133 citations

Book
29 Sep 2011
TL;DR: The Fifth Edition of Computer Architecture focuses on the shift to mobile clients and cloud computing, exploring the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices.
Abstract: The computing world today is in the middle of a revolution: mobile clients and cloud computing have emerged as the dominant paradigms driving programming and hardware innovation today. The Fifth Edition of Computer Architecture focuses on this dramatic shift, exploring the ways in which software and technology in the "cloud" are accessed by cell phones, tablets, laptops, and other mobile computing devices. Each chapter includes two real-world examples, one mobile and one datacenter, to illustrate this revolutionary change. Updated to cover the mobile computing revolution. Emphasizes the two most important topics in architecture today: memory hierarchy and parallelism in all its forms. Develops common themes throughout each chapter: power, performance, cost, dependability, protection, programming models, and emerging trends ("What's Next"). Includes three review appendices in the printed text. Additional reference appendices are available online. Includes updated Case Studies and completely new exercises.

984 citations