Book

Computational Aspects of VLSI

01 Jan 1984
About: The book was published on 1984-01-01 and is currently open access. It has received 862 citations to date. It focuses on the topic: Very-large-scale integration.
Citations
Journal ArticleDOI
TL;DR: This article introduces a new implementation which is more robust than Rudolph's network and needs no redundancy or external permuters, and considers a class of single-stage designs with redundancy and compares the characteristics of networks discussed.
Abstract: Much research has been done on sorting networks but there are very few results concerning their robustness. Our starting point is the balanced sorting network introduced by Dowd et al. and its single-block robust design of Rudolph obtained at the cost of some redundancy and two permuters external to the network. In this article we introduce a new implementation which is more robust than Rudolph's network and needs no redundancy or external permuters. We also consider a class of single-stage designs with redundancy and compare the characteristics of networks discussed.
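The balanced sorting network that this abstract takes as its starting point can be illustrated with a short sketch. The code below follows the commonly described construction (log2 n identical blocks, each comparing every line with its mirror image inside progressively halved groups). It is not taken from the article, and it does not model Rudolph's design or the robust implementation the authors propose. The names `balanced_block`, `balanced_sort`, and `sorts_all_zero_one_inputs` are hypothetical.

```python
from itertools import product


def balanced_block(n):
    """Comparator stages of one balanced block on n = 2**k lines (assumed layout)."""
    stages = []
    size = n
    while size >= 2:
        stage = []
        for start in range(0, n, size):
            # Inside each group, compare line k with its mirror image.
            for k in range(size // 2):
                stage.append((start + k, start + size - 1 - k))
        stages.append(stage)
        size //= 2
    return stages


def balanced_sort(values):
    """Sort by passing the data through log2(n) identical blocks."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    block = balanced_block(n)
    for _ in range(n.bit_length() - 1):
        for stage in block:
            for lo, hi in stage:
                if data[lo] > data[hi]:
                    data[lo], data[hi] = data[hi], data[lo]
    return data


def sorts_all_zero_one_inputs(n):
    """Exhaustive 0-1 check: a comparator network that sorts every 0-1 input sorts everything."""
    return all(balanced_sort(bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))
```

Because every block is identical, the same hardware stage sequence can be reused on each pass, which is the property that robustness schemes for this network build on.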

6 citations

Journal ArticleDOI
TL;DR: The circuit communication complexity introduced here is related to the area of Boolean circuits that have all input vertices on the border of their layout, and also to the three-dimensional layout of Boolean circuits.

6 citations

Proceedings ArticleDOI
01 Jan 1988
TL;DR: This paper identifies communication-sensitive heuristics which promote good contractions for graph-based parallel algorithms on non-shared memory multiprocessors, presents algorithms which utilize these heuristics, and discusses their performance on a group of diverse benchmarks.
Abstract: The mapping problem arises when parallel algorithms are implemented on parallel machines. When the number of processes exceeds the number of available processing elements, the mapping problem includes the contraction problem. In this paper, we identify communication-sensitive heuristics which promote good contractions for graph-based parallel algorithms on non-shared memory multiprocessors. We present algorithms which utilize these heuristics and discuss their performance on a group of diverse benchmarks.

6 citations

Proceedings ArticleDOI
Sourav Roy, Xiaomin Lu, Edmund J. Gieske, Peng Yang, Jim Holt
21 Oct 2013
TL;DR: An analytical model shows that coupled with fixed voltage-frequency scaling, asymmetric scaling can maintain the power density of the chip at the same level for several process generations, while increasing computational capabilities according to Dennardian scaling.
Abstract: This paper introduces a new architectural technique called asymmetric scaling on heterogeneous multi-core network processor architectures to mitigate the problem of dark silicon in future process technologies. In asymmetric scaling, the number of low power cores is increased at a higher rate than the number of high performance cores over process generations. Using an analytical model we show that coupled with fixed voltage-frequency scaling, asymmetric scaling can maintain the power density of the chip at the same level for several process generations, while increasing computational capabilities according to Dennardian scaling. Asymmetric scaling aligns nicely with the application characteristics on a network packet processor. To illustrate the concept, we discuss the Layerscape network processor architecture that incorporates a general purpose layer of high performance cores with an accelerated packet processing layer of low power cores. We discuss several techniques that can be applied to reduce the power density of low power cores. Using a representative packet forwarding workload, we show that shallow-pipeline, dual-issue, in-order cores with appropriate hardware acceleration and limited on-chip memory are a good choice for the low power processor layer.
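The power-density argument in this abstract can be checked with a small calculation. The sketch below uses only the first-order relations quoted in the citation excerpt from this paper (A = WL, C/A ∝ 1/tox, Iav ∝ (C/A)·W·V, tp ∝ C·V/Iav, dynamic power density ∝ C·V²·f/A). It is not the authors' analytical model, it ignores leakage, and the function name `scaling_factors` and the regime labels are hypothetical.

```python
def scaling_factors(S, regime):
    """Per-generation scaling factors for a linear shrink by S > 1.

    regime is one of "constant_field", "fixed_voltage",
    "fixed_voltage_frequency" (labels assumed for this sketch).
    """
    regimes = ("constant_field", "fixed_voltage", "fixed_voltage_frequency")
    if regime not in regimes:
        raise ValueError(f"unknown regime: {regime}")

    dims = 1 / S                                   # W, L, tox
    v = 1 / S if regime == "constant_field" else 1  # Vdd, Vt
    area = dims * dims                             # A/device = W * L
    cap_per_area = 1 / dims                        # C/A ~ 1/tox
    cap = cap_per_area * area                      # C
    i_av = cap_per_area * dims * v                 # Iav ~ (C/A) * W * V
    t_p = cap * v / i_av                           # tp ~ C * V / Iav
    # Frequency follows gate delay unless it is held fixed.
    f = 1 if regime == "fixed_voltage_frequency" else 1 / t_p
    power_density = cap * v ** 2 * f / area        # (1/A) * C * V^2 * f
    return {"V": v, "Iav": i_av, "tp": t_p, "f": f,
            "dynamic_power_density": power_density}
```

For S = 2, the dynamic power density factor comes out as 1 under constant-field scaling, 4 (S²) under fixed-voltage scaling, and 2 (S) when frequency is also held fixed, which shows why holding frequency fixed softens, but does not remove, the power-density growth.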

6 citations


Cites background from "Computational Aspects of VLSI"

  • ...FIXED VOLTAGE-FREQUENCY SCALING: Contrary to constant electric field scaling of previous technologies, today we are more in an era of constant voltage scaling, where Vdd and threshold voltage Vt are maintained at almost the same levels....

    Table 1: Scaling relationships

    Parameter | Relation | Constant Electric Field Scaling | Fixed Voltage Scaling | Fixed Voltage Freq Scaling
    W, L, tox | | 1/S | 1/S | 1/S
    Vdd, Vt | | 1/S | 1 | 1
    A/device | WL | 1/S² | 1/S² | 1/S²
    C/A | 1/tox | S | S | S
    Iav | (C/A)WV | 1/S | 1 | 1
    tp | CV/Iav | 1/S | 1/S | 1/S
    f | | S | S | 1
    Dynamic power density | (1/A)(CV²f) | 1 | S² | S
    Sub-threshold leakage power density* | (1/A)(CV)e^((Vgs − Vt)/(nVT)) | S, S (only two of the three column values appear in the excerpt)


Journal ArticleDOI
TL;DR: Efficient data movement and partitioning techniques are used to derive optimal parallel algorithms for several geometric problems on n × n images using a fixed-size linear array with p processors, where 1 ≤ p ≤ n.
Abstract: Linear arrays are characterized by a small communication bandwidth and a large communication diameter rendering them unsuited to the implementation of global computations. This paper presents efficient data movement and partitioning techniques to overcome several shortcomings of linear arrays. These techniques are used to derive optimal parallel algorithms for several geometric problems on n × n images using a fixed-size linear array with p processors, where 1 ≤ p ≤ n. O(n²/p) time solutions are presented for labeling connected image regions, computing the convex hull of each region, and computing nearest neighbors. Consequently, a linear array with n processors can solve several image problems in O(n) time which is the same time taken by a two dimensional mesh-connected computer with n² processors. Limitations of linear arrays are analyzed by presenting a class of image problems which can be solved sequentially in O(n²) time, but require Ω(n²) time on a linear array, irrespective of the number of processors used and the partitioning of the input image among the processors. An alternate communication-efficient fixed-size organization with p processors is proposed to solve such problems in O(n²/p) time, for 1 ≤ p ≤ n.

6 citations