Showing papers on "Very-large-scale integration" published in 2010

Journal Article•DOI•

Bio-Inspired Imprecise Computational Blocks for Efficient VLSI Implementation of Soft-Computing Applications

Hamid Reza Mahdiani¹, Ali Ahmadi¹, Sied Mehdi Fakhraie¹, Caro Lucas¹•Institutions (1)

01 Apr 2010-IEEE Transactions on Circuits and Systems I: Regular Papers

TL;DR: It is shown that these proposed Bio-inspired Imprecise Computational blocks (BICs) can be exploited to efficiently implement a three-layer face recognition neural network and the hardware defuzzification block of a fuzzy processor.

Abstract: The conventional digital hardware computational blocks with different structures are designed to compute the precise results of the assigned calculations. The main contribution of our proposed Bio-inspired Imprecise Computational blocks (BICs) is that they are designed to provide an applicable estimation of the result instead of its precise value at a lower cost. These novel structures are more efficient in terms of area, speed, and power consumption with respect to their precise rivals. Complete descriptions of sample BIC adder and multiplier structures as well as their error behaviors and synthesis results are introduced in this paper. It is then shown that these BIC structures can be exploited to efficiently implement a three-layer face recognition neural network and the hardware defuzzification block of a fuzzy processor.
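To make the idea of trading precision for cost concrete, the following minimal Python sketch models one common imprecise-adder scheme, in which the low-order bits are combined with a bitwise OR instead of a carry chain. The function name, the default split point, and the choice of scheme are illustrative assumptions; the abstract does not specify the internal structure of the BIC adder, so this should not be read as the authors' design.

```python
def lower_part_or_add(a: int, b: int, low_bits: int = 4) -> int:
    """Approximate a + b for unsigned operands (hypothetical helper).

    The upper bits are added exactly; the lower `low_bits` bits are
    combined with a bitwise OR, which removes the low carry chain.
    """
    low_mask = (1 << low_bits) - 1
    # Low part: OR replaces addition, so no carries propagate here.
    low = (a & low_mask) | (b & low_mask)
    # Estimate the carry into the upper part from the top bit of each low part.
    carry = ((a >> (low_bits - 1)) & (b >> (low_bits - 1))) & 1
    # High part: exact addition of the remaining bits plus the estimated carry.
    high = (a >> low_bits) + (b >> low_bits) + carry
    return (high << low_bits) | low
```

For example, `lower_part_or_add(3, 5)` returns 7 rather than 8. With a 4-bit low part the absolute error of this sketch never exceeds 8, which is the kind of bounded estimate that error-tolerant applications such as neural networks may accept.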

458 citations

Journal Article•DOI•

Design of Low-Power High-Speed Truncation-Error-Tolerant Adder and Its Application in Digital Signal Processing

Ning Zhu¹, Wang Ling Goh¹, Weija Zhang¹, Kiat Seng Yeo¹, Zhi Hui Kong¹•Institutions (1)

Nanyang Technological University¹

01 Aug 2010-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: A novel error-tolerant adder (ETA) is proposed that is able to ease the strict restriction on accuracy, and at the same time achieve tremendous improvements in both the power consumption and speed performance.

Abstract: In modern VLSI technology, the occurrence of all kinds of errors has become inevitable. By adopting an emerging concept in VLSI design and test, error tolerance (ET), a novel error-tolerant adder (ETA) is proposed. The ETA is able to ease the strict restriction on accuracy, and at the same time achieve tremendous improvements in both the power consumption and speed performance. When compared to its conventional counterparts, the proposed ETA is able to attain more than 65% improvement in the Power-Delay Product (PDP). One important potential application of the proposed ETA is in digital signal processing systems that can tolerate a certain amount of error.

286 citations

Book•

VLSI Physical Design Automation: Theory and Practice

Sadiq M. Sait, Habib Youssef

01 Jan 2010

TL;DR: VLSI Physical Design Automation is an essential introduction for senior undergraduates, postgraduates and anyone starting work in the field of CAD for VLSI.

Abstract: From the Publisher: VLSI is an important area of electronic and computer engineering: however, there are few textbooks available for undergraduate education in VLSI design automation and chip layout. VLSI Physical Design Automation fills the void and is an essential introduction for senior undergraduates, postgraduates and anyone starting work in the field of CAD for VLSI. It covers all aspects of physical design, together with such related areas as automatic cell generation, silicon compilation, layout editors and compaction. A problem solving approach has been adopted and each solution has been illustrated with examples. Each topic is treated in a standard format of Problem Definition, Cost Functions and Constraints, Possible Approaches and Latest Developments.

273 citations

Journal Article•DOI•

An Efficient VLSI Architecture for Nonbinary LDPC Decoders

Jun Lin¹, Jin Sha¹, Zhongfeng Wang², Li Li¹•Institutions (2)

Nanjing University¹, Broadcom²

01 Jan 2010-IEEE Transactions on Circuits and Systems II: Express Briefs

TL;DR: An efficient selective computation algorithm, which totally avoids the sorting process, is proposed for Min-Max decoding and an efficient VLSI architecture for a nonbinary Min-Max decoder is presented.

Abstract: Low-density parity-check (LDPC) codes constructed over the Galois field GF(q), which are also called nonbinary LDPC codes, are an extension of binary LDPC codes with significantly better performance. Although various kinds of low-complexity quasi-optimal iterative decoding algorithms have been proposed, the VLSI implementation of nonbinary LDPC decoders has rarely been discussed due to their hardware-unfriendly properties. In this brief, an efficient selective computation algorithm, which totally avoids the sorting process, is proposed for Min-Max decoding. In addition, an efficient VLSI architecture for a nonbinary Min-Max decoder is presented. The synthesis results are given to demonstrate the efficiency of the proposed techniques.

157 citations

Proceedings Article•DOI•

Degradation in FPGAs: measurement and modelling

Edward Stott¹, Justin S. J. Wong¹, P. Sedcole¹, Peter Y. K. Cheung¹•Institutions (1)

Imperial College London¹

21 Feb 2010

TL;DR: A method for measuring and monitoring degradation in an FPGA was developed and used to conduct an accelerated life test on a modern device, revealing a clear, gradual degradation in timing performance that matches the expected effects of Negative-Bias Temperature Instability and Hot Carrier Injection.

Abstract: Progress in VLSI technology is driven by increasing circuit density through process scaling, but with shrinking geometry comes an increasing threat to reliability. FPGAs are uniquely placed to tackle degradation and faults due to their regular structure and ability to reconfigure, giving them the potential to implement system-level reliability enhancements. To assess the scale of the challenge, a method for measuring and monitoring degradation in an FPGA was developed and used to conduct an accelerated life test on a modern device. This revealed a clear, gradual degradation in timing performance that matches the expected effects of Negative-Bias Temperature Instability and Hot Carrier Injection, two of the most important VLSI degradation mechanisms. Further insight into ageing phenomena was gained using modelling -- showing how degradation in a typical LUT would be affected by different usage conditions, and predicting in detail the effects on circuit behaviour.

115 citations

Journal Article•DOI•

VLSI Implementation of BCH Error Correction for Multilevel Cell NAND Flash Memory

Hyojin Choi¹, Wei Liu¹, Wonyong Sung¹•Institutions (1)

Seoul National University¹

01 May 2010-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: Three error-correcting architectures, named whole-page, sector-pipelined, and multistrip, are proposed and the VLSI design applies both algorithmic and architectural-level optimizations that include parallel algorithm transformation, resource sharing, and time multiplexing.

Abstract: Bit-error correction is crucial for realizing cost-effective and reliable NAND Flash-memory-based storage systems. In this paper, low-power and high-throughput error-correction circuits have been developed for multilevel cell (MLC) NAND Flash memories. The developed circuits employ the Bose-Chaudhuri-Hocquenghem code to correct multiple random bit errors. The error-correcting codes for them are designed based on the bit-error characteristics of MLC NAND Flash memories for solid-state drives. To trade off code rate, circuit complexity, and power consumption, three error-correcting architectures, named whole-page, sector-pipelined, and multistrip, are proposed. The VLSI design applies both algorithmic and architectural-level optimizations that include parallel algorithm transformation, resource sharing, and time multiplexing. The chip area, power consumption, and throughput results for these three architectures are presented.

94 citations

Proceedings Article•DOI•

Compressive sampling hardware reconstruction

Avi Septimus¹, Raphael Steinberg¹•Institutions (1)

Technion – Israel Institute of Technology¹

03 Aug 2010

TL;DR: This work presents a VLSI implementation of a computationally efficient algorithm named Orthogonal Matching Pursuit, further optimizes the algorithm to meet typical hardware constraints, and describes the different block units of the design.

Abstract: Compressive Sampling reconstruction techniques require computationally intensive algorithms, often using L1 optimization to reconstruct a signal that was originally sampled at a sub-Nyquist rate. In this work we present a VLSI implementation of a computationally efficient algorithm named Orthogonal Matching Pursuit. We further optimize the algorithm to meet typical hardware constraints and describe the different block units of our design. We synthesize our design for the Xilinx Virtex 5 FPGA and give timing and area results. We summarize our work with a short discussion of the possible uses for our system.
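As a rough illustration of the algorithm named in the abstract, the sketch below implements textbook Orthogonal Matching Pursuit in plain Python: pick the dictionary atom most correlated with the residual, refit by least squares on the selected atoms, and update the residual. The function names, the list-of-columns data layout, and the use of normal equations are assumptions made for readability; this is not the hardware-optimized variant the paper describes, and it assumes unit-norm atoms.

```python
def solve_linear(G, b):
    """Solve G x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(G, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def omp(columns, y, sparsity):
    """Recover a sparse x with y ~= sum_i x[i] * columns[i].

    columns: list of dictionary atoms, each a list with len(y) entries.
    """
    residual = list(y)
    support, coeffs = [], []
    for _ in range(sparsity):
        # Select the unused atom most correlated with the current residual.
        j = max((i for i in range(len(columns)) if i not in support),
                key=lambda i: abs(dot(columns[i], residual)))
        support.append(j)
        # Least-squares fit of y on the selected atoms (normal equations).
        S = [columns[i] for i in support]
        G = [[dot(u, v) for v in S] for u in S]
        coeffs = solve_linear(G, [dot(u, y) for u in S])
        # Residual is what the current fit does not explain.
        residual = [yi - sum(c * u[k] for c, u in zip(coeffs, S))
                    for k, yi in enumerate(y)]
    x = [0.0] * len(columns)
    for i, c in zip(support, coeffs):
        x[i] = c
    return x
```

The least-squares step dominates the cost of each iteration, which is presumably why a hardware implementation would restructure it; the abstract does not say how.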

94 citations

Journal Article•DOI•

Power gating: Circuits, design methodologies, and best practice for standard-cell VLSI designs

Youngsoo Shin¹, Jun Seomun¹, Kyu-Myung Choi², Takayasu Sakurai³•Institutions (3)

KAIST¹, Samsung², University of Tokyo³

07 Oct 2010-ACM Transactions on Design Automation of Electronic Systems

TL;DR: Power gating has become one of the most widely used circuit design techniques for reducing leakage current as discussed by the authors, but its application to standard-cell VLSI designs involves many careful considerations.

Abstract: Power Gating has become one of the most widely used circuit design techniques for reducing leakage current. Its concept is very simple, but its application to standard-cell VLSI designs involves many careful considerations. The great complexity of designing a power-gated circuit originates from the side effects of inserting current switches, which have to be resolved by a combination of extra circuitry and customized tools and methodologies. In this tutorial we survey these design considerations and look at the best practice within industry and academia. Topics include output isolation and data retention, current switch design and sizing, and physical design issues such as power networks, increases in area and wirelength, and power grid analysis. Designers can benefit from this tutorial by obtaining a better understanding of implications of power gating during an early stage of VLSI designs. We also review the ways in which power gating has been improved. These include reducing the sizes of switches, cutting transition delays, applying power gating to smaller blocks of circuitry, and reducing the energy dissipated in mode transitions. Power Gating has also been combined with other circuit techniques, and these hybrids are also reviewed. Important open problems are identified as a stimulus to research.

80 citations

Journal Article•DOI•

Design and Implementation of a Sort-Free K-Best Sphere Decoder

S. Mondal¹, Ahmed M. Eltawil², Chung-An Shen², Khaled N. Salama³•Institutions (3)

Rensselaer Polytechnic Institute¹, University of California, Irvine², King Abdullah University of Science and Technology³

01 Oct 2010-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: A novel sort-free approach to path extension, as well as quantized metrics result in a high-throughput VLSI architecture with lower power and area consumption compared to state-of-the-art published systems.

Abstract: This paper describes the design and very-large-scale integration (VLSI) architecture for a 4 × 4 breadth-first K-best multiple-input-multiple-output (MIMO) decoder using a 64 quadrature-amplitude modulation (QAM) scheme. A novel sort-free approach to path extension, together with quantized metrics, results in a high-throughput VLSI architecture with lower power and area consumption compared to state-of-the-art published systems. Functionality is confirmed via a field-programmable gate array (FPGA) implementation on a Xilinx Virtex II Pro FPGA. Comparisons of simulation and measurement results are given, and FPGA utilization figures are provided. Finally, VLSI architectural tradeoffs are explored for a synthesized application-specific IC (ASIC) implementation in a 65-nm CMOS technology.

78 citations

Journal Article•DOI•

Standby Leakage Power Reduction Technique for Nanoscale CMOS VLSI Systems

HeungJun Jeon¹, Yong-Bin Kim¹, Minsu Choi²•Institutions (2)

Northeastern University¹, Missouri University of Science and Technology²

25 Mar 2010-IEEE Transactions on Instrumentation and Measurement

TL;DR: The proposed approach demonstrates that the optimal body bias reduces a considerable amount of standby leakage power dissipation in nanoscale CMOS integrated circuits.

Abstract: In this paper, a novel low-power design technique is proposed to minimize the standby leakage power in nanoscale CMOS very large scale integration (VLSI) systems by generating the adaptive optimal reverse body-bias voltage. The adaptive optimal body-bias voltage is generated from the proposed leakage monitoring circuit, which compares the subthreshold current (I_SUB) and the band-to-band tunneling (BTBT) current (I_BTBT). The proposed circuit was simulated in HSPICE using 32-nm bulk CMOS technology and evaluated using ISCAS85 benchmark circuits at different operating temperatures (ranging from 25°C to 100°C). Analysis of the results shows a maximum of 551 and 1491 times leakage power reduction at 25°C and 100°C, respectively, on a circuit with 546 gates. The proposed approach demonstrates that the optimal body bias reduces a considerable amount of standby leakage power dissipation in nanoscale CMOS integrated circuits. In this approach, the temperature and supply voltage variations are compensated by the proposed feedback loop.

76 citations

Journal Article•DOI•

Efficient Decoder Design for Nonbinary Quasicyclic LDPC Codes

Jun Lin¹, Jin Sha¹, Zhongfeng Wang², Li Li¹•Institutions (2)

Nanjing University¹, Broadcom²

01 May 2010-IEEE Transactions on Circuits and Systems I: Regular Papers

TL;DR: This paper addresses decoder design for nonbinary quasicyclic low-density parity-check (QC-LDPC) codes with a novel decoding algorithm proposed to eliminate the multiplications over Galois field for check node processing.

Abstract: This paper addresses decoder design for nonbinary quasicyclic low-density parity-check (QC-LDPC) codes. First, a novel decoding algorithm is proposed to eliminate the multiplications over Galois field for check node processing. Then, a partially parallel architecture for check node processing units and an optimized architecture for variable node processing units are developed based on the new decoding algorithm. Thereafter, an efficient decoder structure dedicated to a promising class of high-performance nonbinary QC-LDPC codes is presented for the first time. Moreover, an ASIC implementation for a (620, 310) nonbinary QC-LDPC code decoder over GF(32) is designed to demonstrate the efficiency of the presented techniques.

Journal Article•DOI•

A PSO-based intelligent decision algorithm for VLSI floorplanning

Guolong Chen¹, Wenzhong Guo¹, Chen Yuzhong¹•Institutions (1)

Fuzhou University¹

01 Oct 2010

TL;DR: A novel intelligent decision algorithm based on the particle swarm optimization (PSO) technique is proposed to obtain a feasible floorplan in VLSI circuit physical placement; it can avoid local minima and performs well in convergence.

Abstract: Floorplanning is an important issue in very-large-scale integrated (VLSI) circuit design automation, as it determines the performance, size, yield, and reliability of VLSI chips. This paper proposes a novel intelligent decision algorithm based on the particle swarm optimization (PSO) technique to obtain a feasible floorplan in VLSI circuit physical placement. The PSO is applied with integer coding based on module number and a new recommended value of the acceleration coefficients for an optimal placement solution. Inspired by the genetic algorithm (GA), the principles of the mutation and crossover operators in GA are incorporated into the proposed PSO algorithm so that it can break away from local optima and achieve better diversity. Experiments employing MCNC and GSRC benchmarks show that the proposed algorithm is effective. The proposed algorithm can avoid local minima and performs well in convergence. The experimental results of the proposed method can also greatly help floorplanning decision making in VLSI circuit design automation.

Journal Article•DOI•

A Radius Adaptive K-Best Decoder With Early Termination: Algorithm and VLSI Architecture

Chung-An Shen¹, Ahmed M. Eltawil¹•Institutions (1)

University of California, Irvine¹

01 Sep 2010

TL;DR: A novel algorithm and architecture for K-Best decoding that combines the benefits of radius shrinking commonly associated with sphere decoding and the architectural benefits associated with K-Best decoding approaches is presented.

Abstract: This paper presents a novel algorithm and architecture for K-Best decoding that combines the benefits of radius shrinking commonly associated with sphere decoding and the architectural benefits associated with K-Best decoding approaches. The proposed algorithm requires much smaller K and possesses the advantages of branch pruning and adaptively updated pruning threshold while still achieving near-optimum performance. The algorithm examines a much smaller subset of points as compared to the K-Best decoder. The VLSI architecture of the decoder is based on a pipelined sorter-free scheme. The proposed K-Best decoder is designed to support a 4 × 4 64-QAM system and is synthesized with 65-nm technology at 158-MHz clock frequency and 1-V supply. The synthesized decoder can support a throughput of 285.8 Mb/s at 25-dB signal-to-noise ratio with an area of 210 kGE at 12.8-mW power consumption.

Posted Content•

Optimization of reversible sequential circuits

Abu Sadat Md. Sayem, Masashi Ueda

23 Jun 2010-arXiv: Other Computer Science

TL;DR: This paper proposes a reversible D latch and JK latch that are better than the existing designs available in the literature, reducing the required number of gates, garbage outputs, delay, and hardware complexity.

Abstract: In recent years reversible logic has been considered an important issue for designing low-power digital circuits. It has numerous applications in emerging nanotechnologies such as DNA computing, quantum computing, low-power VLSI, and quantum-dot automata. In this paper we propose an optimized design of reversible sequential circuits in terms of number of gates, delay, and hardware complexity. We design the latches with a new reversible gate and reduce the required number of gates, garbage outputs, delay, and hardware complexity. Because the number of gates and garbage outputs increases the complexity of reversible circuits, this design significantly enhances performance. We propose a reversible D latch and JK latch that are better than the existing designs available in the literature.

Journal Article•DOI•

Enabling Power-Efficient DVFS Operations on Silicon

Dongsheng Ma¹, Rajdeep Bondade¹•Institutions (1)

University of Arizona¹

01 Mar 2010-IEEE Circuits and Systems Magazine

TL;DR: This paper investigates key design issues, control schemes, circuit architectures and future research directions, involved in the development of application-aware, multiple- and variable-output DC-DC power converters, and addresses the importance of hardware-software co-design for future power management systems.

Abstract: With the perpetual power increase in modern VLSI systems, efficient and effective power management has been critical to next-generation IC designs. To overcome this grand challenge, techniques, such as dynamic voltage/frequency scaling (DVFS), have been proposed to jointly optimize power, energy and operating performance, leading to significantly improved system reliability, efficiency and battery lifetime. From both system-level and circuit-level perspectives, this paper investigates key design issues, control schemes, circuit architectures and future research directions, involved in the development of application-aware, multiple- and variable-output DC-DC power converters. The article first discusses key multiple-output converters such as the single-inductor multiple-output (SIMO) DC-DC converters and their system-level integration for DVFS power management, followed by our investigation on various adaptive-output power converter topologies and corresponding design challenges. The paper also addresses the importance of hardware-software co-design for future power management systems. With the integration of the enabling hardware platform, power processing, in addition to the traditional signal processing, rises to become another key factor to next-generation VLSI designs. This naturally enables effective on-chip power tracking, power processing and thermal monitoring. More importantly, it would significantly change traditional design concepts and largely benefit signal processing in return, eventually leading to revolutionary changes on system reliability, performance, efficiency and operating lifetime.

Journal Article•DOI•

A Scalable VLSI Architecture for Soft-Input Soft-Output Single Tree-Search Sphere Decoding

E.M. Witte¹, Filippo Borlenghi¹, Gerd Ascheid¹, Rainer Leupers¹, Heinrich Meyr¹•Institutions (1)

RWTH Aachen University¹

01 Sep 2010-IEEE Transactions on Circuits and Systems II: Express Briefs

TL;DR: The first VLSI architecture for SISO SD applying a single tree-search approach is introduced; compared with a soft-output-only base architecture similar to the one proposed by Studer in IEEE J-SAC 2008, the modifications for soft input still allow one-node-per-cycle execution.

Abstract: Multiple-input multiple-output (MIMO) wireless transmission imposes huge challenges on the design of efficient hardware architectures for iterative receivers. A major challenge is soft-input soft-output (SISO) MIMO demapping, often approached by sphere decoding (SD). In this brief, we introduce, to the best of our knowledge, the first VLSI architecture for SISO SD applying a single tree-search approach. Compared with a soft-output-only base architecture similar to the one proposed by Studer in IEEE J-SAC 2008, the architectural modifications for soft input still allow a one-node-per-cycle execution. For a 4×4 antenna system using quadrature amplitude modulation (QAM) with order 16, the area increases by 57%, and the operating frequency degrades by only 34%.

Journal Article•DOI•

A Self-Aligned InGaAs HEMT Architecture for Logic Applications

N. Waldron¹, Dae-Hyun Kim¹, J.A. del Alamo¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Jan 2010-IEEE Transactions on Electron Devices

TL;DR: In this article, a trilayer TLM model is used to predict the expected evolution of the contact resistance as it is scaled to realistic VLSI dimensions, showing that the current technology results in resistance values that are two orders of magnitude higher than the desired target for sub-22-nm nodes.

Abstract: In this paper, we present a novel self-aligned process for future III-V logic FETs. Using this process, we have demonstrated enhancement-mode 90-nm-gate-length InGaAs HEMTs with excellent logic figures of merit. We have carried out a detailed analysis of this device architecture to determine its future scaling capabilities. We find that, as the insulator is scaled to achieve enhancement mode, the performance of the device is limited by degradation of the I_ON/I_OFF ratio due to gate leakage current. By use of TLM test structures, we have determined that the barrier resistance dominates the source resistance. We use a trilayer TLM model to predict the expected evolution of the contact resistance as it is scaled to realistic VLSI dimensions and find that the current technology results in resistance values that are two orders of magnitude higher than the desired target for sub-22-nm nodes. Using the model, we explore different options for device redesign. Both I_ON/I_OFF and source-resistance limitations imply that the use of a high-k gate dielectric will be required for future device implementations.

Proceedings Article•DOI•

Stochastic computational models for accurate reliability evaluation of logic circuits

Hao Chen¹, Jie Han¹•Institutions (1)

University of Alberta¹

16 May 2010

TL;DR: A computational approach using the stochastic computational models (SCMs) accurately determines the reliability of a circuit with its precision only limited by the random fluctuations inherent in the representation of random binary bit streams.

Abstract: As reliability becomes a major concern with the continuous scaling of CMOS technology, several computational methodologies have been developed for the reliability evaluation of logic circuits. Previous accurate analytical approaches, however, have a computational complexity that generally increases exponentially with the size of a circuit, making the evaluation of large circuits intractable. This paper presents novel computational models based on stochastic computation, in which probabilities are encoded in the statistics of random binary bit streams, for the reliability evaluation of logic circuits. A computational approach using the stochastic computational models (SCMs) accurately determines the reliability of a circuit with its precision only limited by the random fluctuations inherent in the representation of random binary bit streams. The SCM approach has a linear computational complexity and is therefore scalable for use for any large circuits. Our simulation results demonstrate the accuracy and scalability of the SCM approach, and suggest its possible applications in VLSI design.
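The encoding principle mentioned in the abstract, probabilities carried by the statistics of random bit streams, can be sketched in a few lines of Python. The example below pushes two independent streams through a bitwise AND and optionally flips the output with a gate error probability. The function names, the `gate_error` parameter, and the single-gate setup are hypothetical illustrations and are not taken from the paper's SCMs.

```python
import random


def bitstream(p, length, rng):
    """Encode probability p as a random bit stream of the given length."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def and_gate_probability(p_a, p_b, gate_error=0.0, length=100_000, seed=0):
    """Estimate P(output = 1) of an AND gate whose output flips with
    probability gate_error (hypothetical fault model for illustration)."""
    rng = random.Random(seed)
    a = bitstream(p_a, length, rng)
    b = bitstream(p_b, length, rng)
    flips = bitstream(gate_error, length, rng)
    # Bitwise AND multiplies the encoded probabilities of independent streams;
    # XOR with the flip stream models an unreliable gate output.
    out = [(x & y) ^ f for x, y, f in zip(a, b, flips)]
    return sum(out) / length
```

With inputs 0.9 and 0.8 the estimate should land near 0.72 for a fault-free gate, and near 0.72 × 0.95 + 0.28 × 0.05 = 0.698 when `gate_error` is 0.05. The work grows linearly with the stream length, while the statistical error shrinks only as the inverse square root of that length, which matches the abstract's remark that precision is limited by random fluctuations.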

Proceedings Article•DOI•

VLSI implementation of a WiMAX/LTE compliant low-complexity high-throughput soft-output K-Best MIMO detector

Dimpesh Patel¹, Vadim Smolyakov¹, Mahdi Shabany¹, P. Glenn Gulak¹•Institutions (1)

University of Toronto¹

03 Aug 2010

TL;DR: This paper presents a VLSI architecture of a novel soft-output K-Best MIMO detector that attains low computational complexity using three improvement ideas: relevant discarded paths selection, last stage on-demand expansion, and relaxed LLR computation.

Abstract: This paper presents a VLSI architecture of a novel soft-output K-Best MIMO detector. The proposed detector attains low computational complexity using three improvement ideas: relevant discarded paths selection, last stage on-demand expansion, and relaxed LLR computation. A deeply pipelined architecture for a soft-output MIMO detector is implemented for a 4×4 64-QAM MIMO system realizing a peak throughput of 655 Mbps, while consuming 174K gates and 195 mW in 0.13-µm CMOS. Synthesis results in 65-nm CMOS show the potential to support a sustained throughput up to 2 Gbps, achieving the data rates envisioned by emerging IEEE 802.16m and LTE-Advanced wireless standards.

Proceedings Article•DOI•

A subthreshold aVLSI implementation of the Izhikevich simple neuron model

Venkat Rangan¹, Abhishek Ghosh², Vladimir Aparin¹, Gert Cauwenberghs³•Institutions (3)

Qualcomm¹, University of California, Los Angeles², University of California, San Diego³

11 Nov 2010

TL;DR: A circuit architecture for compact analog VLSI implementation of the Izhikevich neuron model, which efficiently describes a wide variety of neuron spiking and bursting dynamics using two state variables and four adjustable parameters, is presented.

Abstract: We present a circuit architecture for compact analog VLSI implementation of the Izhikevich neuron model, which efficiently describes a wide variety of neuron spiking and bursting dynamics using two state variables and four adjustable parameters. Log-domain circuit design utilizing MOS transistors in subthreshold results in high energy efficiency, with less than 1pJ of energy consumed per spike. We also discuss the effects of parameter variations on the dynamics of the equations, and present simulation results that replicate several types of neural dynamics. The low power operation and compact analog VLSI realization make the architecture suitable for human-machine interface applications in neural prostheses and implantable bioelectronics, as well as large-scale neural emulation tools for computational neuroscience.
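For readers unfamiliar with the model, the sketch below simulates the standard published form of the Izhikevich equations in software, with state variables v (membrane potential) and u (recovery) and parameters a, b, c, d. The forward-Euler integration, the step size, the input current, and the function name are assumptions chosen for illustration; none of this reflects the log-domain analog circuit described in the paper.

```python
def simulate_izhikevich(a=0.02, b=0.2, c=-65.0, d=8.0, current=10.0,
                        duration_ms=1000.0, dt=0.5):
    """Return spike times (ms) of one neuron under a constant input current.

    Defaults are the commonly quoted "regular spiking" parameter set.
    """
    v, u = c, b * c
    spike_times = []
    for step in range(int(duration_ms / dt)):
        if v >= 30.0:
            # Spike: record it, then reset v and bump the recovery variable.
            spike_times.append(step * dt)
            v, u = c, u + d
        # Model equations: dv/dt = 0.04 v^2 + 5 v + 140 - u + I,
        #                  du/dt = a (b v - u)
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
    return spike_times
```

Changing only the four parameters and the input current moves the same two equations between different firing patterns, which is the property that makes the model attractive for a compact implementation.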

Journal Article•DOI•

Energy-Efficient Design Methodologies: High-Performance VLSI Adders

Bart R. Zeydel, Dursun Baran¹, Vojin G. Oklobdzija¹•Institutions (1)

University of Texas at Dallas¹

07 Jun 2010-IEEE Journal of Solid-State Circuits

TL;DR: In this article, a methodology for energy-efficient design applied to 64-bit adders implemented with static CMOS, dynamic CMOS and CMOS compound domino logic families is presented.

Abstract: Energy-efficient design requires exploration of available algorithms, recurrence structures, energy and wire tradeoffs, circuit design techniques, circuit sizing and system constraints. In this paper, a methodology for energy-efficient design applied to 64-bit adders implemented with static CMOS, dynamic CMOS and CMOS compound domino logic families is presented. We also examined 65 nm, 45 nm, 32 nm, and 22 nm technology nodes to explore the applicability of the results in deep submicron technologies. By applying energy-delay tradeoffs on various levels, we developed an adder topology yielding up to 20% performance improvement and 4.5× energy reduction over existing designs.

Journal Article•DOI•

Synctium: a Near-Threshold Stream Processor for Energy-Constrained Parallel Applications

Evgeni Krimer¹, Robert Pawlowski², Mattan Erez¹, Patrick Chiang²•Institutions (2)

University of Texas at Austin¹, Oregon State University²

01 Jan 2010-IEEE Computer Architecture Letters

TL;DR: A near energy-optimal, stream processor family that relies on massively parallel, near-threshold VLSI circuits and interconnect, incorporating cooperative circuit/architecture techniques to tolerate the expected large delay variations, enabling a new class of energy-constrained, high-throughput computing applications.

Abstract: While Moore's law scaling continues to double transistor density every technology generation, supply voltage reduction has essentially stopped, increasing both power density and total energy consumed in conventional microprocessors. Therefore, future processors will require an architecture that can: a) take advantage of the massive amount of transistors that will be available; and b) operate these transistors in the near-threshold supply domain, thereby achieving near optimal energy/computation by balancing the leakage and dynamic energy consumption. Unfortunately, this optimality is typically achieved while running at very low frequencies (i.e., 0.1-10 MHz) and with only one computation executing per cycle, such that performance is limited. Further, near-threshold designs suffer from severe process variability that can introduce extremely large delay variations. In this paper, we propose a near energy-optimal, stream processor family that relies on massively parallel, near-threshold VLSI circuits and interconnect, incorporating cooperative circuit/architecture techniques to tolerate the expected large delay variations. Initial estimations from circuit simulations show that it is possible to achieve greater than 1 Giga-Operations per second (1 GOP/s) with less than 1 mW total power consumption, enabling a new class of energy-constrained, high-throughput computing applications.

Journal Article•DOI•

A VLSI Architecture and Algorithm for Lucas–Kanade-Based Optical Flow Computation

V. Mahalingam¹, Koustav Bhattacharya¹, Nagarajan Ranganathan¹, H. Chakravarthula, R.R. Murphy², K.S. Pratt¹•Institutions (2)

University of South Florida¹, Texas A&M University²

01 Jan 2010-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: An efficient VLSI architecture is proposed for the accurate computation of the Lucas-Kanade (L-K)-based optical flow and results indicate 42% improvement in accuracy and a speed up of five times, compared to a recent hardware implementation of the L-K algorithm.

Abstract: Optical flow computation in vision-based systems demands substantial computational power and storage area. Hence, to enable real-time processing at high resolution, the design of an application-specific system for optical flow becomes essential. In this paper, we propose an efficient VLSI architecture for the accurate computation of the Lucas-Kanade (L-K)-based optical flow. The L-K algorithm is first converted to a scaled fixed-point version, with optimal bit widths, to improve the feasibility of high-speed hardware implementation without much loss in accuracy. The algorithm is mapped onto an efficient VLSI architecture, and the data flow exploits the principles of pipelining and parallelism. The optical flow estimation involves several tasks, such as Gaussian smoothing, gradient computation, least-squares matrix calculation, and velocity estimation, which are processed in a pipelined fashion. The proposed architecture was simulated and verified by synthesizing it onto a Xilinx Field Programmable Gate Array, where it utilizes less than 40% of system resources while operating at a frequency of 55 MHz. Experimental results on benchmark sequences indicate a 42% improvement in accuracy and a speedup of five times, compared to a recent hardware implementation of the L-K algorithm.
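The least-squares and velocity-estimation stages named in the abstract can be illustrated with the textbook Lucas-Kanade normal equations for a single window. This is a minimal floating-point sketch, not the paper's scaled fixed-point pipeline; the function name, argument layout, and the `eps` threshold are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def lucas_kanade_velocity(
    ix: Sequence[float],
    iy: Sequence[float],
    it: Sequence[float],
    eps: float = 1e-9,
) -> Tuple[float, float]:
    """Solve the 2x2 Lucas-Kanade normal equations for one window.

    ix, iy, it hold the spatial and temporal image gradients at each
    pixel of the window. Returns the estimated velocity (u, v).
    """
    # Entries of the least-squares matrix and right-hand side.
    sxx = sum(x * x for x in ix)
    sxy = sum(x * y for x, y in zip(ix, iy))
    syy = sum(y * y for y in iy)
    sxt = sum(x * t for x, t in zip(ix, it))
    syt = sum(y * t for y, t in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < eps:
        # Gradients are too uniform to determine motion (aperture problem).
        raise ValueError("ill-conditioned window")

    # Closed-form inverse of [[sxx, sxy], [sxy, syy]] applied to [-sxt, -syt].
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Synthetic window: build temporal gradients from a known velocity using
# the brightness-constancy relation ix*u + iy*v + it = 0, then recover it.
true_u, true_v = 0.5, -0.25
gx = [1.0, 0.0, 2.0, 1.0]
gy = [0.0, 1.0, 1.0, -1.0]
gt = [-(x * true_u + y * true_v) for x, y in zip(gx, gy)]
est_u, est_v = lucas_kanade_velocity(gx, gy, gt)
print(est_u, est_v)
```

A hardware version would replace the floating-point sums and the division with fixed-point arithmetic at chosen bit widths, which is the conversion step the abstract describes.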

Proceedings Article•DOI•

A device mismatch compensation method for VLSI neural networks

Emre Neftci¹, Giacomo Indiveri¹•Institutions (1)

University of Zurich¹

02 Jun 2010

TL;DR: This work proposes a method that selectively changes the connectivity profile in the neural network to normalize its response and demonstrates its effectiveness with experimental data obtained from a VLSI Soft Winner-Take-All network.

Abstract: Device mismatch in neuromorphic VLSI implementations of spiking neural networks can be a serious and limiting problem. Classical engineering solutions can reduce the effect of mismatch, but require increasing layout sizes or using additional precious silicon real-estate. Here we propose a complementary strategy which exploits the Address-Event Representation used in neuromorphic systems and does not affect the device layout. We propose a method that selectively changes the connectivity profile in the neural network to normalize its response. We provide a theoretical analysis of the approach proposed and demonstrate its effectiveness with experimental data obtained from a VLSI Soft Winner-Take-All network.

Journal Article•DOI•

A $Q$-Modification Neuroadaptive Control Architecture for Discrete-Time Systems

Hsin Chen¹, Sylvain Saïghi², Laure Buhry², Sylvie Renaud²•Institutions (2)

National Tsing Hua University¹, University of Bordeaux²

01 Sep 2010-IEEE Transactions on Neural Networks

TL;DR: The feasibility of simulating the stochastic behavior of biological neurons in a very large scale integrated (VLSI) system, which implements a programmable and configurable Hodgkin-Huxley model is explored.

Abstract: Neuronal variability has been thought to play an important role in the brain. As the variability mainly comes from the uncertainty in biophysical mechanisms, stochastic neuron models have been proposed for studying how neurons compute with noise. However, most papers are limited to simulating stochastic neurons on a digital computer. The speed and the efficiency are thus limited, especially when a large neuronal network is of concern. This brief explores the feasibility of simulating the stochastic behavior of biological neurons in a very large scale integrated (VLSI) system, which implements a programmable and configurable Hodgkin-Huxley model. By simply injecting noise into the VLSI neuron, various stochastic behaviors observed in biological neurons are reproduced realistically in VLSI. The noise-induced variability is further shown to enhance the signal modulation of a neuron. These results point toward the development of analog VLSI systems for exploring the stochastic behaviors of biological neuronal networks at large scale.

Proceedings Article•DOI•

A VLSI neural monitoring system with ultra-wideband telemetry for awake behaving subjects

Elliot Greenwald¹, Mohsen Mollazadeh¹, Nitish V. Thakor¹, Wei Tang², Eugenio Culurciello²•Institutions (2)

Johns Hopkins University¹, Yale University²

03 Aug 2010

TL;DR: A miniature, lightweight, and low-power recording system for monitoring neural activity in awake behaving animals, which includes the VLSI circuits, a digital interface board, a battery, and a custom housing that can be chronically mounted on small animals.

Abstract: Long-term monitoring of neuronal activity in awake behaving subjects can provide fundamental information about brain dynamics for both neuroscience and neuroengineering applications. Recent advances in VLSI systems have focused on designing wireless neural recording systems which can be mounted on animals and acquire neural signals in real time. These advances provide an unparalleled opportunity to study phenomena such as neural plasticity in both a basic science setting (learning and memory) and a clinical setting (injury and recovery). Here we present an integrated VLSI system for wireless telemetry of the entire spectrum of neural signals: spikes, local field potentials, electrocorticograms (ECoG), and electroencephalograms (EEG). The system integrates two custom-designed VLSI chips: a 16-channel neural interface, which can amplify, filter, and digitize neural data at up to 16 kS/sec and 12 bits, and a low-power ultra-wideband (UWB) chip, which can transmit data at rates up to 14 Mbps. The entire system, which includes these VLSI circuits, a digital interface board, and a battery, is small (1.2×1.2×2.6 in³) and lightweight (33 grams), so it can be chronically mounted on a rat. The system consumes 32.8 mA at 3.3 V and can record for 6 hours running from the 200 mAh coin cell battery. Bench-top and in vitro characterization of the system showed comparable performance to the wired recording system.
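The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above; it assumes that the 16 kS/sec rate applies per channel (the abstract does not say whether it is per channel or aggregate), and the function names are illustrative.

```python
def battery_life_hours(capacity_mah: float, current_ma: float) -> float:
    """Ideal runtime, ignoring battery derating and regulator losses."""
    return capacity_mah / current_ma


def raw_data_rate_bps(channels: int, samples_per_s: float, bits: int) -> float:
    """Uncompressed payload rate, assuming the sample rate is per channel."""
    return channels * samples_per_s * bits


def power_mw(current_ma: float, voltage_v: float) -> float:
    """Power draw in milliwatts from supply current and voltage."""
    return current_ma * voltage_v


hours = battery_life_hours(capacity_mah=200, current_ma=32.8)
rate = raw_data_rate_bps(channels=16, samples_per_s=16_000, bits=12)
draw = power_mw(current_ma=32.8, voltage_v=3.3)

print(f"runtime:   {hours:.1f} h")          # runtime:   6.1 h
print(f"data rate: {rate / 1e6:.3f} Mbps")  # data rate: 3.072 Mbps
print(f"power:     {draw:.1f} mW")          # power:     108.2 mW
```

Under these assumptions, the ideal runtime agrees with the stated 6 hours, and the raw payload of about 3.07 Mbps fits within the 14 Mbps UWB link.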

Book•

On and Off-Chip Crosstalk Avoidance in VLSI Design

Chunjie Duan¹, Brock J. LaMeres², Sunil P. Khatri³•Institutions (3)

Mitsubishi Electric Research Laboratories¹, Montana State University², Texas A&M University³

08 Jan 2010

TL;DR: This book focuses on crosstalk avoidance with bus encoding, one of the techniques that selectively mitigates the impact of crosstalk and improves the speed and power consumption of the bus interconnect.

Abstract: Deep Sub-Micron (DSM) processes present many challenges to Very Large Scale Integration (VLSI) circuit designers. One of the greatest challenges is crosstalk, which becomes significant with the shrinking feature sizes of VLSI fabrication processes. The presence of crosstalk greatly limits the speed and increases the power consumption of an IC design. This book focuses on crosstalk avoidance with bus encoding, one of the techniques that selectively mitigates the impact of crosstalk and improves the speed and power consumption of the bus interconnect. This technique encodes data before transmission over the bus to avoid certain undesirable crosstalk conditions and thereby improve the bus speed and/or energy consumption.
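To make the idea of "encoding data to avoid undesirable crosstalk conditions" concrete, the sketch below enumerates one commonly described class of crosstalk avoidance codes: codewords that never contain the adjacent-wire bit patterns 010 or 101. The abstract does not say which codes the book covers, so the choice of this class and all names in the code are assumptions made for illustration.

```python
from itertools import product
from typing import List

# Three-wire patterns in which the middle wire differs from both neighbours.
FORBIDDEN_PATTERNS = ("010", "101")


def is_forbidden_pattern_free(word: str) -> bool:
    """True if no three adjacent wires carry 010 or 101."""
    return not any(pattern in word for pattern in FORBIDDEN_PATTERNS)


def fpf_codewords(width: int) -> List[str]:
    """All bus words of the given width that avoid the forbidden patterns."""
    words = ("".join(bits) for bits in product("01", repeat=width))
    return [w for w in words if is_forbidden_pattern_free(w)]


for width in (3, 4):
    codebook = fpf_codewords(width)
    print(width, len(codebook), codebook)
```

A 4-wire bus has 10 such codewords, enough to carry 3 data bits (8 values), which illustrates the wire overhead that this kind of encoding trades for a more favourable worst-case coupling condition.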

Journal Article•DOI•

A Low-Cost VLSI Implementation for Efficient Removal of Impulse Noise

Pei-Yin Chen¹, Chih-Yuan Lien¹, Hsu-Ming Chuang¹•Institutions (1)

National Cheng Kung University¹

01 Mar 2010-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: An efficient VLSI implementation for removing impulse noise is presented and the proposed technique preserves the edge features and obtains excellent performances in terms of quantitative evaluation and visual quality.

Abstract: Image and video signals might be corrupted by impulse noise in the process of signal acquisition and transmission. In this paper, an efficient VLSI implementation for removing impulse noise is presented. Our extensive experimental results show that the proposed technique preserves the edge features and obtains excellent performance in terms of quantitative evaluation and visual quality. The design requires only low computational complexity and two line memory buffers. Its hardware cost is quite low. Compared with previous VLSI implementations, our design achieves better image quality with less hardware cost. Synthesis results show that the proposed design yields a processing rate of about 167 M samples/second using TSMC 0.18 μm technology.

Journal Article•DOI•

Characterization, test and logic synthesis of novel conservative and reversible logic gates for QCA

Kunal Das¹, Debashis De¹•Institutions (1)

West Bengal University of Technology¹

01 Jun 2010-International Journal of Nanoscience

TL;DR: In this article, a parity preserving reversible (PPR) or conservative reversible logic gate (CRLG) was proposed for low power quantum computing and QCA in the field of very large scale integration (VLSI).

Abstract: Quantum dot cellular automaton (QCA) is an emerging technology in the field of nanotechnology. Reversible logic is emerging as a promising computing paradigm with applications in low-power quantum computing and QCA in the field of very large scale integration (VLSI) design. In this paper, we study conservative logic gates (CLG) and reversible logic gates (RLG). We show that RLG and CLG are two intersecting classes of logic families; their intersection is the parity-preserving reversible (PPR), or conservative reversible, logic gate (CRLG). We propose three algorithms to find different k × k RLGs as well as CLGs. Here, we demonstrate only the two most promising proposed gates, from different categories, and compare the results with those of the earlier Fredkin gate. The results show that logic synthesis using these two gates is a promising step toward low-power QCA design. We present a parity-preserving approach to designing all possible CLGs. We also discuss a coupled Majority–minority-Voter (MmV) in a single nanostructure, whose dual outputs are driven simultaneously. This MmV gate is used for implementing n-variable symmetric functions and for testing the conservative gates, since, as we explain, parity must be preserved if the Majority and Minority outputs are the same at the input as at the output of the CLG.
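The properties named in this abstract (reversible, conservative, parity preserving) can be checked exhaustively for the Fredkin gate that the authors use as their baseline. The sketch below uses the standard controlled-swap definition of the Fredkin gate; it does not reproduce the paper's proposed gates or its three search algorithms, and the variable names are illustrative.

```python
from itertools import product
from typing import Tuple

Bits = Tuple[int, int, int]


def fredkin(c: int, a: int, b: int) -> Bits:
    """Controlled swap: a and b are exchanged when the control bit c is 1."""
    return (c, b, a) if c else (c, a, b)


inputs = list(product((0, 1), repeat=3))
outputs = [fredkin(*bits) for bits in inputs]

# Reversible: the 3x3 mapping is a bijection (no two inputs collide).
is_reversible = len(set(outputs)) == len(inputs)

# Conservative: the number of 1s is the same at the input and the output.
is_conservative = all(sum(i) == sum(o) for i, o in zip(inputs, outputs))

# Parity preserving: follows from conservation, checked here independently.
is_parity_preserving = all(
    sum(i) % 2 == sum(o) % 2 for i, o in zip(inputs, outputs)
)

print(is_reversible, is_conservative, is_parity_preserving)
```

A gate that passes both the reversibility and the conservation checks lies in the intersection of the RLG and CLG classes that the abstract describes.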

Journal Article•DOI•

On the Hardware Implementation Cost of Crypto-Processors Architectures

Nicolas Sklavos¹•Institutions (1)

Technological Educational Institute of Patras¹

01 Jan 2010-Information Security Journal: A Global Perspective

TL;DR: This paper aims to introduce aspects of design, architecture, and implementation of crypto-processors, and to demonstrate efficient realizations of cryptographic mechanisms and tools in terms of hardware integration.

Abstract: A variety of modern technologies such as networks, Internet, and electronic services demand private and secure communications for a great number of everyday transactions. Security and cryptography provide a huge set of primitives, methods, and operation modes to support the special needs of data transmission. This paper aims to introduce aspects of design, architecture, and implementation of crypto-processors. It also aims to demonstrate efficient realizations of cryptographic mechanisms and tools in terms of hardware integration. Computational methodologies, computer arithmetic, and encryption algorithms need deep investigation and research to obtain efficient integrations of crypto-processors, with desirable improvements and optimizations. Silicon implementations achieve high speed and bandwidth. VLSI designs are realized with FPGA and ASIC devices, two alternative design methodologies for implementing crypto-processors that offer several advantages such as flexibility, high performance, and fast time to market. Reconfigurable computing techniques can change the system architecture to several different modes of operation without sacrificing design efficiency or performance.
