Book
Computational Aspects of VLSI
01 Jan 1984
About: The book was published on 1984-01-01 and is currently open access. It has received 862 citations to date. It focuses on the topic of very-large-scale integration.
Citations
••
TL;DR: A new interconnection topology is presented that unifies a wide range of connection topologies, such as the Boolean cube and the classical Fibonacci cube, and new classes of discrete orthogonal transforms are developed based on generalized Fibonacci recursions.
Abstract: We present a new interconnection topology called the generalized Fibonacci topology, which unifies a wide range of connection topologies such as the Boolean cube (or hypercube), the classical Fibonacci cube, etc. Some basic topological properties of generalized Fibonacci cubes are established. Finally, we develop new classes of discrete orthogonal transforms based on the generalized Fibonacci recursions. They can be implemented efficiently by butterfly-type networks (like the Fourier or the Haar transforms). A processor architecture based on the generalized Fibonacci cube (generalizing the known SIMD hypercube processor architecture) can be used efficiently for hardware implementation of the proposed discrete orthogonal transforms.
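For orientation, the two special cases named in the abstract can be built in a few lines. This is a minimal sketch, assuming the standard definition of the classical Fibonacci cube (binary strings of length n with no two consecutive 1s, joined when they differ in exactly one bit). The function name `fibonacci_cube` and the parameter `k` (forbid runs of k consecutive 1s) are illustrative assumptions and may differ from the generalization defined in the cited paper.

```python
from itertools import product


def fibonacci_cube(n, k=2):
    """Build a Fibonacci-cube-like graph on binary strings of length n.

    Vertices are the strings that contain no run of k consecutive '1's
    (k is a hypothetical parameter for this sketch). k=2 gives the
    classical Fibonacci cube; any k > n forbids nothing and gives the
    full Boolean cube (hypercube).
    Returns (vertices, edges) with edges as sorted pairs of strings.
    """
    forbidden = "1" * k
    vertices = []
    for bits in product("01", repeat=n):
        label = "".join(bits)
        if forbidden not in label:
            vertices.append(label)

    vertex_set = set(vertices)
    edges = set()
    for v in vertices:
        for i in range(n):
            # Flip bit i; keep the edge only if the neighbour is a valid vertex.
            flipped = v[:i] + ("1" if v[i] == "0" else "0") + v[i + 1:]
            if flipped in vertex_set:
                edges.add(tuple(sorted((v, flipped))))
    return vertices, sorted(edges)


if __name__ == "__main__":
    for n in range(1, 6):
        fib_vertices, fib_edges = fibonacci_cube(n)          # classical case
        cube_vertices, cube_edges = fibonacci_cube(n, n + 1)  # hypercube case
        print(n, len(fib_vertices), len(fib_edges),
              len(cube_vertices), len(cube_edges))
```

In the classical case the vertex counts follow the Fibonacci numbers, which is where the name comes from; the hypercube case has 2^n vertices.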
25 citations
Cites background from "Computational Aspects of VLSI"
...identified with the butterfly concerning ith and jth input at the stage r [10]....
[...]
••
TL;DR: The results for geometric graphs, which have become a frequently-used benchmark in the evaluation of partitioning algorithms, show that PO holds an advantage over the others.
25 citations
••
20 Jun 1990
TL;DR: Embeddings of complete binary trees, pyramids and X-trees into 2-dimensional meshes achieve optimal expansion with congestion 2 for trees and congestion 6 for X-trees, and constant expansion ≤3 with congestion 3 for pyramids.
Abstract: In the following we present embeddings of complete binary trees, pyramids and X-trees into 2-dimensional meshes. The presented embeddings achieve optimal expansion with congestion 2 for trees and congestion 6 for X-trees, and constant expansion ≤3 with congestion 3 for pyramids. The dilations are shown to be near optimal.
25 citations
01 Jan 1989
TL;DR: This dissertation proposes a methodology for top-down analysis of parallel program executions on shared-memory multiprocessors, based on a formal model of shared-memory communication among processes in a parallel program.
Abstract: One of the most serious problems in the development cycle of large-scale parallel programs is the lack of tools for debugging and performance analysis. Parallel programs are more difficult to analyze than their sequential counterparts for several reasons. First, race conditions in parallel programs can cause non-deterministic behavior, which reduces the effectiveness of traditional cyclic debugging techniques. Second, invasive, interactive analysis can distort a parallel program's execution beyond recognition. Finally, comprehensive analysis of a parallel program's execution requires collection, management, and presentation of an enormous amount of information. This dissertation addresses the problem of debugging and analysis of large-scale parallel programs executing on shared-memory multiprocessors. It proposes a methodology for top-down analysis of parallel program executions that replaces previous ad-hoc approaches. To support this methodology, a formal model for shared-memory communication among processes in a parallel program is developed. It is shown how synchronization traces based on this abstract model can be used to create indistinguishable executions that form the basis for debugging. This result is used to develop a practical technique for tracing parallel program executions on shared-memory parallel processors so that their executions can be repeated deterministically on demand. Next, it is shown how these traces can be augmented with additional information that increases their utility for debugging and performance analysis. The design of an integrated, extensible toolkit based on these traces is proposed. This toolkit uses execution traces to support interactive, graphics-based, top-down analysis of parallel program executions. A prototype implementation of the toolkit is described explaining how it exploits our execution tracing model to facilitate debugging and analysis. Case studies of the behavior of several versions of two parallel programs are presented to demonstrate both the utility of our execution tracing model and the leverage it provides for debugging and performance analysis.
25 citations
••
TL;DR: A circuit taxonomy along the space and time dimensions is presented, followed by an analysis of optimal power supply and threshold voltages and transistor sizing for minimizing the energy-delay product of a class of complementary metal-oxide-semiconductor (CMOS) digital circuits.
Abstract: We first present a circuit taxonomy along the space and time dimensions, which is useful for classifying generic low-power techniques, followed by an analysis of optimal power supply and threshold voltages and transistor sizing for minimizing the energy-delay product of a class of complementary metal-oxide-semiconductor (CMOS) digital circuits.
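To make the idea of an optimal supply voltage concrete, here is a minimal sketch under a textbook-style model that is an assumption of this example, not the cited paper's model: switching energy proportional to Vdd^2 and gate delay proportional to Vdd / (Vdd - Vt)^alpha, with the threshold voltage Vt held fixed. Under those assumptions the energy-delay product is proportional to Vdd^3 / (Vdd - Vt)^alpha, and setting its derivative to zero gives Vdd = 3 Vt / (3 - alpha) for 0 < alpha < 3. The function names and the example values of `vt` and `alpha` are hypothetical.

```python
def energy_delay_product(vdd, vt, alpha):
    """Energy-delay product up to a constant factor (assumed model).

    Energy ~ vdd**2 and delay ~ vdd / (vdd - vt)**alpha; both
    proportionality constants are dropped because they do not move
    the location of the minimum.
    """
    if vdd <= vt:
        raise ValueError("supply voltage must exceed the threshold voltage")
    energy = vdd ** 2
    delay = vdd / (vdd - vt) ** alpha
    return energy * delay


def optimal_vdd(vt, alpha):
    """Closed-form minimiser of the assumed model: 3 * vt / (3 - alpha)."""
    if not 0 < alpha < 3:
        raise ValueError("the closed form assumes 0 < alpha < 3")
    return 3 * vt / (3 - alpha)


if __name__ == "__main__":
    vt, alpha = 0.3, 2.0  # illustrative values only
    v_opt = optimal_vdd(vt, alpha)
    print("optimal Vdd:", v_opt)
    for scale in (0.9, 1.0, 1.1):
        v = v_opt * scale
        print(v, energy_delay_product(v, vt, alpha))
```

The abstract describes optimizing the threshold voltage and transistor sizing as well; this sketch varies only the supply voltage, so it should be read as an illustration of the objective rather than a reproduction of the analysis.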
24 citations
Cites background from "Computational Aspects of VLSI"
...The "classic" two-dimensional VLSI design space tries to minimize the circuit area (A) and delay (T) in order to reduce cost and improve performance, by using optimizations with objective functions such as A, AT, and AT² [12]....
[...]