Showing papers in "IEEE Journal on Emerging and Selected Topics in Circuits and Systems in 2012"


Journal ArticleDOI
TL;DR: This paper proposes an energy harvesting device that scavenges energy from radiofrequency (RF) electromagnetic waves using a dual-stage circuit composed of a seven-stage and a ten-stage design; the former is more receptive in the low input power region, while the latter is more suitable for the higher power range.
Abstract: A new design for an energy harvesting device is proposed in this paper, which enables scavenging energy from radiofrequency (RF) electromagnetic waves. Compared to common alternative energy sources like solar and wind, RF harvesting has the lowest energy density. The existing state-of-the-art solutions are effective only over narrow frequency ranges, are limited in efficiency response, and require higher levels of input power. This paper has a twofold contribution. First, we propose a dual-stage energy harvesting circuit composed of a seven-stage and a ten-stage design, the former being more receptive in the low input power region, while the latter is more suitable for the higher power range. Each stage is a modified voltage multiplier, and the stages are arranged in series; our design provides guidelines on component choice and precise selection of the crossover operational point for these two stages between the high (20 dBm) and low (-20 dBm) power extremities. Second, we fabricate our design on a printed circuit board to demonstrate how such a circuit can run a commercial Mica2 sensor mote, with accompanying simulations under both ideal and non-ideal conditions for identifying the upper bound on achievable efficiency. With a simple yet optimal dual-stage design, experiments and characterization plots reveal approximately 100% improvement over other existing designs in the power range of -20 to 7 dBm.

444 citations
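
The power levels quoted in the abstract above are given in dBm, so a quick conversion helps put the operating range in perspective. The following is a minimal Python sketch, not the authors' design procedure: `dbm_to_mw` applies the standard definition of dBm, while `select_stage` and its `crossover_dbm` parameter are hypothetical names, and the default crossover value is a placeholder rather than a figure from the paper.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts (0 dBm = 1 mW)."""
    return 10 ** (p_dbm / 10)


def select_stage(p_dbm, crossover_dbm=0.0):
    """Pick the multiplier branch for a given input power.

    Hypothetical helper: the abstract says the seven-stage branch suits low
    input power and the ten-stage branch suits higher power, but it does not
    state the crossover point, so crossover_dbm is a placeholder assumption.
    """
    return "seven-stage" if p_dbm < crossover_dbm else "ten-stage"


if __name__ == "__main__":
    # The three power levels mentioned in the abstract.
    for level in (-20, 7, 20):
        print(level, "dBm =", dbm_to_mw(level), "mW ->", select_stage(level))
```

The quoted extremes of -20 dBm and 20 dBm correspond to 0.01 mW and 100 mW, a span of four orders of magnitude.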


Journal ArticleDOI
TL;DR: Various challenges and emerging solutions regarding the design of an efficient and reliable WiNoC architecture are presented.
Abstract: Current commercial system-on-chip (SoC) designs integrate an increasingly large number of predesigned cores, and this number is predicted to increase significantly in the near future. For example, molecular-scale computing promises single or even multiple order-of-magnitude improvements in device densities. The network-on-chip (NoC) is an enabling technology for integration of large numbers of embedded cores on a single die. The existing method of implementing a NoC with planar metal interconnects is deficient due to high latency and significant power consumption arising out of long multi-hop links used in data exchange. The latency, power consumption and interconnect routing problems of conventional NoCs can be addressed by replacing or augmenting multi-hop wired paths with high-bandwidth single-hop long-range wireless links. This opens up new opportunities for detailed investigations into the design of wireless NoCs (WiNoCs) with on-chip antennas, suitable transceivers and routers. Moreover, as it is an emerging technology, the on-chip wireless links also need to overcome significant challenges pertaining to reliable integration. In this paper, we present various challenges and emerging solutions regarding the design of an efficient and reliable WiNoC architecture.

285 citations


Journal ArticleDOI
TL;DR: A new framework for image compressive sensing recovery via collaborative sparsity is proposed, which enforces local 2-D sparsity and nonlocal 3-D sparsity simultaneously in an adaptive hybrid space-transform domain, thus substantially utilizing intrinsic sparsity of natural images and greatly confining the CS solution space.
Abstract: Compressive sensing (CS) has drawn considerable attention as a joint sampling and compression approach. Its theory shows that when the signal is sparse enough in some domain, it can be decoded from many fewer measurements than suggested by the Nyquist sampling theory. One of the main research challenges in CS is therefore to find a domain where a signal can exhibit a high degree of sparsity and hence be recovered faithfully. Most of the conventional CS recovery approaches, however, exploit a set of fixed bases (e.g., DCT, wavelet, and gradient domain) for the entirety of a signal, which ignore the nonstationarity of natural signals and cannot achieve a high enough degree of sparsity, thus resulting in poor rate-distortion performance. In this paper, we propose a new framework for image compressive sensing recovery via collaborative sparsity, which enforces local 2-D sparsity and nonlocal 3-D sparsity simultaneously in an adaptive hybrid space-transform domain, thus substantially utilizing intrinsic sparsity of natural images and greatly confining the CS solution space. In addition, an efficient augmented Lagrangian-based technique is developed to solve the above optimization problem. Experimental results on a wide range of natural images are presented to demonstrate the efficacy of the new CS recovery strategy.

151 citations


Journal ArticleDOI
TL;DR: In this article, the authors report on the main advances enabling the demonstration of functional and performant stacked CMOS-FETs; i.e., wafer bonding, low temperature processes (<650°C) and salicide stabilization achievements.
Abstract: 3-D sequential integration stands out from other 3-D schemes as it enables the full use of the third dimension. Indeed, in this approach, 3-D contact density matches with the transistor scale. In this paper, we report on the main advances enabling the demonstration of functional and performant stacked CMOS-FETs; i.e., wafer bonding, low temperature processes (<650°C) and salicide stabilization achievements. This integration scheme enables fine grain partitioning and thus a gain in performance versus cost ratio linked to separation of heterogeneous technologies on distinct levels. In this work, we will detail examples taking advantage of the unique 3-D contact pitch achieved with sequential 3-D.

141 citations


Journal ArticleDOI
TL;DR: A wide bandwidth, compressed sensing based nonuniform sampling (NUS) system with a custom sample-and-hold chip designed to take advantage of a low average sampling rate is presented.
Abstract: We present a wide bandwidth, compressed sensing based nonuniform sampling (NUS) system with a custom sample-and-hold chip designed to take advantage of a low average sampling rate. By sampling signals nonuniformly, the average sample rate can be more than an order of magnitude lower than the Nyquist rate, provided that these signals have a relatively low information content as measured by the sparsity of their spectrum. The hardware design combines a wideband Indium-Phosphide heterojunction bipolar transistor sample-and-hold with a commercial off-the-shelf analog-to-digital converter to digitize an 800 MHz to 2 GHz band (having 100 MHz of noncontiguous spectral content) at an average sample rate of 236 Ms/s. Signal reconstruction is performed via a nonlinear compressed sensing algorithm, and the challenges of developing an efficient implementation are discussed. The NUS system is a general purpose digital receiver. As an example of its real signal capabilities, measured bit-error-rate data for a GSM channel is presented, and comparisons to a conventional wideband 4.4 Gs/s ADC are made.

136 citations
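
The sampling figures in the abstract above can be checked with simple arithmetic. The sketch below is an illustration only: the three constants are taken from the abstract, while `reduction_factor` is a hypothetical helper name and the choice of which ratios to compute is an assumption about what a reader might want to see.

```python
AVERAGE_RATE_HZ = 236e6        # average NUS sample rate quoted in the abstract
CONVENTIONAL_ADC_HZ = 4.4e9    # conventional wideband ADC used for comparison
OCCUPIED_BANDWIDTH_HZ = 100e6  # noncontiguous spectral content within the band


def reduction_factor(reference_rate_hz, average_rate_hz):
    """Return how many times lower the average rate is than the reference."""
    return reference_rate_hz / average_rate_hz


if __name__ == "__main__":
    # Rate reduction relative to the conventional ADC.
    print(round(reduction_factor(CONVENTIONAL_ADC_HZ, AVERAGE_RATE_HZ), 1))
    # Average samples per second for each hertz of occupied spectrum.
    print(AVERAGE_RATE_HZ / OCCUPIED_BANDWIDTH_HZ)
```

The first ratio is roughly 18.6, consistent with the claim of more than an order of magnitude; the second shows that the average rate is still above twice the occupied bandwidth.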


Journal ArticleDOI
TL;DR: A complete (hardware/software) sub-Nyquist rate (×13) wideband signal acquisition chain capable of acquiring radar pulse parameters in an instantaneous bandwidth spanning 100 MHz-2.5 GHz with the equivalent of 8 effective number of bits (ENOB) digitizing performance is presented.
Abstract: In this paper we present a complete (hardware/software) sub-Nyquist rate (×13) wideband signal acquisition chain capable of acquiring radar pulse parameters in an instantaneous bandwidth spanning 100 MHz-2.5 GHz with the equivalent of 8 effective number of bits (ENOB) digitizing performance. The approach is based on the alternative sensing-paradigm of compressed sensing (CS). The hardware platform features a fully-integrated CS receiver architecture named the random-modulation preintegrator (RMPI) fabricated in Northrop Grumman's 450 nm InP HBT bipolar technology. The software back-end consists of a novel CS parameter recovery algorithm which extracts information about the signal without performing full time-domain signal reconstruction. This approach significantly reduces the computational overhead involved in retrieving desired information which demonstrates an avenue toward employing CS techniques in power-constrained real-time applications. The developed techniques are validated on CS samples physically measured by the fabricated RMPI and measurement results are presented. The parameter estimation algorithms are described in detail and a complete description of the physical hardware is given.

101 citations


Journal ArticleDOI
TL;DR: This paper uses the ongoing work on leveraging nanophotonics in an on-chip tile-to-tile network, processor-to-main-memory network, and dynamic random-access memory (DRAM) channel to illustrate this design process.
Abstract: Technology scaling will soon enable high-performance processors with hundreds of cores integrated onto a single die, but the success of such systems could be limited by the corresponding chip-level interconnection networks. There have been many recent proposals for nanophotonic interconnection networks that attempt to provide improved performance and energy-efficiency compared to electrical networks. This paper discusses the approach we have used when designing such networks, and provides a foundation for designing new networks. We begin by briefly reviewing the basic silicon-photonic device technology before outlining design issues and surveying previous nanophotonic network proposals at the architectural level, the microarchitectural level, and the physical level. In designing our own networks, we use an iterative process that moves between these three levels of design to meet application requirements given our technology constraints. We use our ongoing work on leveraging nanophotonics in an on-chip tile-to-tile network, processor-to-main-memory network, and dynamic random-access memory (DRAM) channel to illustrate this design process.

87 citations


Journal ArticleDOI
TL;DR: This work revisits the design of crossbar and high-radix interconnects in light of advances in circuit and layout techniques that improve crossbar scalability, obviating the need for deep multi-stage networks and employs the Swizzle-Switch, an energy and area-efficient switching element that has recently been validated via silicon test chips in 45 nm technology.
Abstract: This work revisits the design of crossbar and high-radix interconnects in light of advances in circuit and layout techniques that improve crossbar scalability, obviating the need for deep multi-stage networks. We employ a new building block, the Swizzle-Switch, an energy- and area-efficient switching element that can readily scale to radix 64, which has recently been validated via silicon test chips in 45 nm technology. We evaluate the Swizzle-Switch as both the high-radix building block of a Flattened Butterfly and as a single-stage interconnect, the Swizzle-Switch Network. In the process we address the architectural and layout challenges associated with centralized crossbar systems. Compared to a conventional Mesh, the Flattened Butterfly provides a 15% performance improvement with a 2.5× reduction in the standard deviation of on-chip access times. The Swizzle-Switch Network achieves further gains, providing a 21% improvement in performance, a 3× reduction in on-chip access variability, a 33% reduction in interconnect power, and a 25% reduction in total system energy while only increasing chip area by 7%. Finally, this paper details a 3-D integrated version of the Swizzle-Switch Network, showing up to a 30% gain in performance over the 2-D Swizzle-Switch Network for benchmarks sensitive to interconnect latency. One major concern with 3-D designs is thermal dissipation. We show through detailed thermal analysis that, with the highly energy-efficient Swizzle-Switch Network design, the thermal budget is well within that of passive cooling solutions.

83 citations


Journal ArticleDOI
TL;DR: This paper presents two generic very-large-scale integration (VLSI) architectures that implement the approximate message passing (AMP) algorithm for sparse signal recovery and shows that AMP-T is superior to AMP-M with respect to silicon area, throughput, and power consumption, whereas AMP-M offers more flexibility.
Abstract: Sparse signal recovery finds use in a variety of practical applications, such as signal and image restoration and the recovery of signals acquired by compressive sensing. In this paper, we present two generic very-large-scale integration (VLSI) architectures that implement the approximate message passing (AMP) algorithm for sparse signal recovery. The first architecture, referred to as AMP-M, employs parallel multiply-accumulate units and is suitable for recovery problems based on unstructured (e.g., random) matrices. The second architecture, referred to as AMP-T, takes advantage of fast linear transforms, which arise in many real-world applications. To demonstrate the effectiveness of both architectures, we present corresponding VLSI and field-programmable gate array implementation results for an audio restoration application. We show that AMP-T is superior to AMP-M with respect to silicon area, throughput, and power consumption, whereas AMP-M offers more flexibility.

82 citations
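
For readers unfamiliar with AMP, one iteration of the generic textbook form with a soft-thresholding denoiser can be sketched as follows. This is a minimal pure-Python illustration, not the AMP-M or AMP-T architecture from the paper: the helper names are hypothetical, the threshold `theta` is passed in by the caller rather than tuned, and the dense matrix products correspond only loosely to the multiply-accumulate view of AMP-M.

```python
import math


def soft_threshold(v, theta):
    """Shrink each entry toward zero by theta (entries within theta become 0)."""
    return [math.copysign(max(abs(x) - theta, 0.0), x) for x in v]


def matvec(A, x):
    """Compute A @ x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def matvec_t(A, r):
    """Compute A^T @ r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def amp_step(A, y, x, r, theta):
    """Run one AMP iteration and return the new estimate and residual.

    x is the current estimate, r the current residual (start with x = 0, r = y).
    """
    m = len(A)
    # Pseudo-data: current estimate plus the back-projected residual.
    pseudo = [xi + ci for xi, ci in zip(x, matvec_t(A, r))]
    x_new = soft_threshold(pseudo, theta)
    # Onsager correction: (number of nonzero entries / M) times the old residual.
    nnz = sum(1 for v in x_new if v != 0.0)
    ax = matvec(A, x_new)
    r_new = [yi - axi + (nnz / m) * ri for yi, axi, ri in zip(y, ax, r)]
    return x_new, r_new


if __name__ == "__main__":
    # Toy example with an identity matrix, chosen only to make the step easy to follow.
    A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    y = [3.0, 0.0, -2.0]
    print(amp_step(A, y, [0.0, 0.0, 0.0], y, theta=1.0))
```

The Onsager term is what distinguishes AMP from plain iterative soft thresholding; in practice the threshold is usually adapted at each iteration, which this sketch leaves to the caller.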


Journal ArticleDOI
TL;DR: A novel energy efficient traffic-aware dynamic (TAD) MAC protocol for WBASN that allows the wake-up interval to converge to a steady state for fixed and variable traffic rates, which results in optimized energy consumption.
Abstract: A wireless body area sensor network (WBASN) demands ultra-low-power and energy-efficient protocols. The medium access control (MAC) layer plays a pivotal role in energy management in a WBASN. Moreover, idle listening is the dominant source of energy waste in most MAC protocols. A WBASN exhibits a wide range of traffic variations based on the different physiological data emanating from the monitored patient. For example, the electrocardiogram data rate is several times higher than the body temperature data rate. In this context, we propose a novel energy efficient traffic-aware dynamic (TAD) MAC protocol for WBASN. The protocol relies on dynamic adaptation of the wake-up interval based on a traffic status register bank. The proposed technique allows the wake-up interval to converge to a steady state for fixed and variable traffic rates, which results in optimized energy consumption. A comparison with other energy efficient protocols for three different widely used radio chips, i.e., cc2420, cc1000, and amis52100, is presented. The results show that TAD-MAC outperforms all the other protocols under fixed and variable traffic rates. Finally, the lifetime of a WBASN was estimated and found to be 3-6 times better than with other protocols.

79 citations


Journal ArticleDOI
TL;DR: The spread spectrum random modulator pre-integrator (SRMPI), which is a new design and implementation of a CS-based A2I read-out system that uses spread spectrum techniques prior to random modulation in order to produce the low rate set of digital samples, is introduced.
Abstract: The long-standing analog-to-digital conversion paradigm based on Shannon/Nyquist sampling has been challenged lately, mostly in situations such as radar and communication signal processing where signal bandwidth is so large that sampling architectures constraints are simply not manageable. Compressed sensing (CS) is a new emerging signal acquisition/compression paradigm that offers a striking alternative to traditional signal acquisition. Interestingly, by merging the sampling and compression steps, CS also removes a large part of the digital architecture and might thus considerably simplify analog-to-information (A2I) conversion devices. This so-called “analog CS,” where compression occurs directly in the analog sensor readout electronics prior to analog-to-digital conversion, could thus be of great importance for applications where bandwidth is moderate, but computationally complex, and power resources are severely constrained. In our previous work (Mamaghanian, 2011), we quantified and validated the potential of digital CS systems for real-time and energy-efficient electrocardiogram compression on resource-constrained sensing platforms. In this paper, we review the state-of-the-art implementations of CS-based signal acquisition systems and perform a complete system-level analysis for each implementation to highlight their strengths and weaknesses regarding implementation complexity, performance and power consumption. Then, we introduce the spread spectrum random modulator pre-integrator (SRMPI), which is a new design and implementation of a CS-based A2I read-out system that uses spread spectrum techniques prior to random modulation in order to produce the low rate set of digital samples. Finally, we experimentally built an SRMPI prototype to compare it with state-of-the-art CS-based signal acquisition systems, focusing on critical system design parameters and constraints, and show that this new proposed architecture offers a compelling alternative, in particular for low power and computationally-constrained embedded systems.

Journal ArticleDOI
TL;DR: This ECG sensor node can accurately record and detect the QRS peaks of ECG waveform with high-frequency noise suppression and is convenient for long-term monitoring of cardiovascular condition of patients, and is very suitable for on-body WBAN applications.
Abstract: This paper proposes a power and area efficient electrocardiogram (ECG) acquisition and signal processing application sensor node for wireless body area networks (WBAN). This sensor node can accurately record and detect the QRS peaks of ECG waveform with high-frequency noise suppression. The proposed system is implemented in 0.18-μm complementary metal-oxide-semiconductor technology with two chips: analog front end integrated circuit (IC) and digital application specific integrated circuit (ASIC), where the analog IC consumes only 79.6 μW with an area of 4.25 mm² and the digital ASIC consumes 9 μW at 32 kHz with 1.2 mm². Therefore, this ECG sensor node is convenient for long-term monitoring of cardiovascular condition of patients, and is very suitable for on-body WBAN applications.

Journal ArticleDOI
TL;DR: A unified design methodology is proposed to determine the optimal location of the power supplies and decoupling capacitors in high performance integrated circuits.
Abstract: The performance of an integrated circuit depends strongly upon the power delivery system. With the introduction of ultra-small on-chip voltage regulators, novel design methodologies are needed to simultaneously determine the location of the on-chip power supplies and decoupling capacitors. In this paper, a unified design methodology is proposed to determine the optimal location of the power supplies and decoupling capacitors in high performance integrated circuits. Optimization algorithms widely used for facility location problems are applied in the proposed methodology. The effect of the number and location of the power supplies and decoupling capacitors on the power noise and response time is discussed.

Journal ArticleDOI
TL;DR: The Nyquist Folding Receiver (NYFR), an efficient A2I architecture that folds the broadband RF input prior to digitization by a narrowband ADC, enables information recovery with very low computational complexity algorithms in addition to traditional CS reconstruction techniques.
Abstract: Recovering even a small amount of information from a broadband radio frequency (RF) environment using conventional analog-to-digital converter (ADC) technology is computationally complex and presents significant challenges. For sparse or compressible RF environments, an alternate approach to conventional sampling is analog-to-information (A2I) to enable sub-Nyquist rate sampling based on compressive sensing (CS) principles. This paper presents the Nyquist Folding Receiver (NYFR), an efficient A2I architecture that folds the broadband RF input prior to digitization by a narrowband ADC. The folding is achieved by undersampling the RF spectrum with a stream of short pulses that have a phase modulated sampling period. The undersampled signals then fold down into a low pass interpolation filter. The pulse sample time modulation induces a corresponding phase modulation on the received signals that is scaled by an integer modulation index that varies with the Nyquist zone (i.e., fold number), allowing the signals to be separated based on the measured modulation index. Unlike many schemes motivated by CS that randomize the RF prior to digitization, the NYFR substantially preserves signal structure. This enables information recovery with very low computational complexity algorithms in addition to traditional CS reconstruction techniques. The paper includes a comparison of seven other A2I architectures with the NYFR.
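
The folding described above can be illustrated with ordinary uniform undersampling, which is the starting point before any sample-time modulation is applied. The sketch below is an assumption-laden illustration rather than the NYFR itself: `fold` is a hypothetical helper, the 1 GHz sample rate is an arbitrary example value, and numbering zones by the nearest harmonic of the sample rate is one convention among several.

```python
import math


def fold(f_rf_hz, fs_hz):
    """Return (zone_index, folded_frequency_hz) for a tone sampled at fs_hz.

    zone_index is the nearest harmonic of the sample rate; the folded
    frequency is the distance from the tone to that harmonic.
    """
    k = math.floor(f_rf_hz / fs_hz + 0.5)
    return k, abs(f_rf_hz - k * fs_hz)


if __name__ == "__main__":
    fs = 1_000_000_000  # illustrative 1 GHz average sample rate (assumption)
    for tone in (700_000_000, 2_300_000_000):
        print(tone, fold(tone, fs))
```

Both example tones fold to the same 300 MHz baseband frequency and differ only in their zone index, which is why a zone-dependent modulation index is needed to tell them apart after digitization.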

Journal ArticleDOI
TL;DR: It is shown that the Random Modulation Pre-Integration (RMPI) architecture and its recently proposed adjustments are probably the most versatile approach, though not always the most economical to implement, and that when 1-bit quantization is sought, dynamically mixing quantization and integration in a randomized ΔΣ architecture helps bring the performance much closer to that of multi-bit approaches.
Abstract: The paper aims to highlight relative strengths and weaknesses of some of the recently proposed architectures for hardware implementation of analog-to-information converters based on Compressive Sensing. To do so, the most common architectures are analyzed when saturation of some building blocks is taken into account, and when measurements are subject to quantization to produce a digital stream. Furthermore, the signal reconstruction is performed by established and novel algorithms (one based on linear programming and the other based on iterative guessing of the support of the target signal), as well as their specialization to the particular architecture producing the measurements. Performance is assessed both as the probability of correct support reconstruction and as the final reconstruction error. Our results help highlight the pros and cons of various architectures and give quantitative answers to some typical design-oriented questions. Among these, we show: 1) that the Random Modulation Pre-Integration (RMPI) architecture and its recently proposed adjustments are probably the most versatile approach, though not always the most economical to implement; 2) that when 1-bit quantization is sought, dynamically mixing quantization and integration in a randomized ΔΣ architecture helps bring the performance much closer to that of multi-bit approaches; 3) for each architecture, the trade-off between number of measurements and number of bits per measurement (given a fixed bit-budget); and 4) pros and cons of the use of Gaussian versus binary random variables for signal acquisition.

Journal ArticleDOI
TL;DR: This paper proposes an alternative compressive sensing based approach to exploit the sparsity of simultaneous touches with respect to the number of sensor nodes to achieve similar levels of responsiveness.
Abstract: Capacitive touch screens are ubiquitous in today's electronic devices. Improved touch screen responsiveness and resolution can be achieved at the expense of the touch screen controller analog hardware complexity and power consumption. This paper proposes an alternative compressive sensing based approach to exploit the sparsity of simultaneous touches with respect to the number of sensor nodes to achieve similar levels of responsiveness. It is possible to reduce the analog data acquisition complexity at the cost of extra digital computations with less total power consumption. Using compressive sensing, in order to resolve the positions of the sparse touches, the number of measurements required is related to the number of touches rather than the number of nodes. Detailed measurement circuits and methodologies are presented along with the corresponding reconstruction algorithm.

Journal ArticleDOI
TL;DR: A very-large-scale integration (VLSI) friendly electrocardiogram (ECG) QRS detector for body sensor networks is presented, in which a mathematical morphological method removes baseline wandering and background noise, and multipixel modulus accumulation acts as a low-pass filter to enhance the QRS complex and improve the signal-to-noise ratio.
Abstract: This paper aims to present a very-large-scale integration (VLSI) friendly electrocardiogram (ECG) QRS detector for body sensor networks. Baseline wandering and background noise are removed from the original ECG signal by a mathematical morphological method. Then multipixel modulus accumulation is employed to act as a low-pass filter to enhance the QRS complex and improve the signal-to-noise ratio. The performance of the algorithm is evaluated with the standard MIT-BIH arrhythmia database and wearable exercise ECG data. A corresponding power and area efficient VLSI architecture is designed and implemented on a commercial nano-FPGA. High detection rate and high speed demonstrate the effectiveness of the proposed detector.
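
Morphological baseline removal of the kind mentioned above is commonly built from erosion and dilation with a flat structuring element. The following pure-Python sketch shows that general technique under stated assumptions: the function names, the opening-then-closing order, and the window `width` are illustrative choices, not the structuring elements or pipeline used in the paper.

```python
def erode(x, width):
    """Sliding-window minimum with a flat structuring element (edges clamped)."""
    half = width // 2
    return [min(x[max(0, i - half): i + half + 1]) for i in range(len(x))]


def dilate(x, width):
    """Sliding-window maximum with a flat structuring element (edges clamped)."""
    half = width // 2
    return [max(x[max(0, i - half): i + half + 1]) for i in range(len(x))]


def opening(x, width):
    """Erosion followed by dilation; suppresses peaks narrower than the window."""
    return dilate(erode(x, width), width)


def closing(x, width):
    """Dilation followed by erosion; fills pits narrower than the window."""
    return erode(dilate(x, width), width)


def remove_baseline(x, width):
    """Estimate the baseline by opening then closing, and subtract it."""
    baseline = closing(opening(x, width), width)
    return [xi - bi for xi, bi in zip(x, baseline)]


if __name__ == "__main__":
    # A narrow spike riding on a constant offset of 1.
    print(remove_baseline([1, 1, 6, 1, 1], width=3))
```

The window must be wider than the QRS complex in samples so that the spike is excluded from the baseline estimate; choosing it depends on the sampling rate, which the abstract does not state.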

Journal ArticleDOI
TL;DR: A sub-Nyquist rate data acquisition front-end based on compressive sensing theory that randomizes a sparse input signal by mixing it with pseudo-random number sequences and exploits the signal sparsity to reconstruct the signal with high fidelity.
Abstract: This paper presents a sub-Nyquist rate data acquisition front-end based on compressive sensing theory. The front-end randomizes a sparse input signal by mixing it with pseudo-random number sequences, followed by analog-to-digital converter sampling at sub-Nyquist rate. The signal is then reconstructed using an L1-based optimization algorithm that exploits the signal sparsity to reconstruct the signal with high fidelity. The reconstruction is based on a priori signal model information, such as a multi-tone frequency-sparse model which matches the input signal frequency support. Wideband multi-tone test signals with 4% sparsity in the 5-500 MHz band were used to experimentally verify the front-end performance. Single-tone and multi-tone tests show maximum signal to noise and distortion ratios of 40 dB and 30 dB, respectively, with an equivalent sampling rate of 1 GS/s. The analog front-end was fabricated in a 90 nm complementary metal-oxide-semiconductor process and consumes 55 mW. The front-end core occupies 0.93 mm².
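
The acquisition step described above, mixing with a pseudo-random sequence and then sampling at a reduced rate, can be modeled in discrete time as follows. This is a behavioral sketch only: `pn_sequence` and `random_demodulate` are hypothetical names, the integrate-and-dump stage between mixer and sampler is an assumption not stated in the abstract, and the L1-based reconstruction is not shown.

```python
import random


def pn_sequence(length, seed=0):
    """Generate a reproducible pseudo-random sequence of +1/-1 chips."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(length)]


def random_demodulate(x, chips, decimation):
    """Mix x with the chip sequence, then sum each block of `decimation` samples.

    Each output value stands for one low-rate ADC sample; trailing samples
    that do not fill a complete block are discarded.
    """
    if len(x) != len(chips):
        raise ValueError("signal and chip sequence must have the same length")
    mixed = [xi * ci for xi, ci in zip(x, chips)]
    return [
        sum(mixed[i:i + decimation])
        for i in range(0, len(mixed) - decimation + 1, decimation)
    ]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7, 8]
    chips = pn_sequence(len(signal), seed=1)
    print(random_demodulate(signal, chips, decimation=4))
```

With a decimation factor of 4, eight input samples yield two measurements; recovering the original signal from them relies on the sparsity model mentioned in the abstract.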

Journal ArticleDOI
TL;DR: The paper describes the different kinds of algorithms featured and the circuitry employed at the top and bottom tiers, and the Gaussian pyramid is implemented with a switched-capacitor network in less than 50 μs, outperforming more conventional solutions.
Abstract: This paper reports a multi-layered smart image sensor architecture for feature extraction based on detection of interest points. The architecture is conceived for 3-D integrated circuit technologies consisting of two layers (tiers) plus memory. The top tier includes sensing and processing circuitry intended to perform Gaussian filtering and generate Gaussian pyramids in a fully concurrent way. The circuitry in this tier operates in the mixed-signal domain. It embeds in-pixel correlated double sampling, a switched-capacitor network for Gaussian pyramid generation, analog memories and a comparator for in-pixel analog-to-digital conversion. This tier can be further split into two for improved resolution; one containing the sensors and another containing a capacitor per sensor plus the mixed-signal processing circuitry. The bottom tier embeds digital circuitry dedicated to the calculation of Harris, Hessian, and difference-of-Gaussian detectors. The overall system can hence be configured by the user to detect interest points using whichever of these three algorithms is best suited to the application at hand. The paper describes the different kinds of algorithms featured and the circuitry employed at the top and bottom tiers. The Gaussian pyramid is implemented with a switched-capacitor network in less than 50 μs, outperforming more conventional solutions.

Journal ArticleDOI
TL;DR: A heterogeneous integration platform for bio-sensing applications is presented, which seamlessly integrates low-power silicon-based circuits with cost-effective printed electronics and enables fast prototyping of customized electrode patterns.
Abstract: In this paper, we present a heterogeneous integration platform for bio-sensing applications, which seamlessly integrates low-power silicon-based circuits with cost-effective printed electronics. A prototype of a wearable Bio-Sensing Node is fabricated to investigate the suitability of this integration approach. A customized mixed-signal system-on-chip (SoC) with a size of 1.5 × 3.0 mm² is utilized to amplify, digitize, buffer, and transmit the sensed bio-signals. Inkjet printing technology is employed to print nano-particle silver ink on a flexible substrate to fabricate chip-on-flex, electrodes as well as interconnections. This additive and digital fabrication technology enables fast prototyping of the customized electrode pattern. Its high accuracy and fine resolution features allow the direct integration of the bare die (the pad size of 65 μm and pitch size of 90 μm) on the flexible substrate, which significantly miniaturizes the wearable system. The optimal size and layout of printed electrodes are investigated through the in vivo test for electrocardiogram recording applications. The total size of the implemented Bio-Sensing Node is 4.5 × 2.5 μm, which is comparable with a commercial electrode. This inkjet printed heterogeneous integration approach offers a promising solution for the next-generation cost-effective personalized wearable healthcare monitoring devices.

Journal ArticleDOI
TL;DR: A receiver architecture suitable for devices in wireless body area networks is presented, and a tailored demodulation structure has been designed to make the digital baseband compact and low power.
Abstract: A receiver architecture suitable for devices in wireless body area networks is presented. Such devices require minimum physical size and power consumption. To achieve this the receiver should, therefore, be fully integrated in state-of-the-art complementary metal-oxide-semiconductor (CMOS) technology, and size and power consumption must be carefully considered at all levels of design. The chosen modulation is frequency shift keying, for which transmitters can be realized with high efficiency and low spurious emissions. A direct-conversion receiver architecture is used to achieve minimum power consumption and a modulation index equal to two is chosen, creating a midchannel notch in the modulated signal. A tailored demodulation structure has been designed to make the digital baseband compact and low power. To increase sensitivity it has been designed to interface with an analog decoder. Implementation in the analog domain minimizes the decoder power consumption. Antenna design and wave propagation are taken into account via simulations with phantoms. The 2.45-GHz ISM band was chosen as a good compromise between antenna size and link loss. An ultra-low power medium access scheme has been designed, which is used both for system evaluation and for assisting system design choices. Receiver blocks have been fabricated in 65-nm CMOS, and a radio-frequency front-end and an analog-to-digital converter have been measured. Simulations of the complete baseband have been performed, investigating impairments due to 1/f noise, frequency and time offsets.

Journal ArticleDOI
TL;DR: This paper investigates alternative interconnect technologies that can be exploited to address the communication challenges in future many-core processors and provides an overview of the different technologies that are available and how they impact the architecture of the on-chip communication and the system design.
Abstract: The continued scaling of transistors has increased the number of cores available in current processors, and the number of cores is expected to keep increasing. In such many-core processors, communication between cores over the on-chip interconnect is becoming a challenge, as the interconnect must not only provide low latency and high bandwidth but also be cost-effective in terms of power consumption. The communication challenge is not confined to a single chip: providing high bandwidth from off-chip memory to the increasing number of cores is also a challenge. The conventional metal interconnect is limited, especially for global communication, and cannot scale efficiently. In this paper, we investigate alternative interconnect technologies that can be exploited to address the communication challenges in future many-core processors. We provide an overview of the different technologies that are available and then investigate how these interconnect technologies impact the architecture of the on-chip communication and the system design.

Journal ArticleDOI
TL;DR: A new hybrid photonic/plasmonic channel that can support WDM through the use of photonic micro-ring resonators as variation tolerant passive filters is proposed and can save from 28% to 45% of total channel energy-cost per bit depending on process variation conditions.
Abstract: Nanophotonic architectures have recently been proposed as a path to providing low latency, high bandwidth network-on-chips. These proposals have primarily been based on micro-ring resonator modulators which, while capable of operating at tremendous speed, are known to have both a high manufacturing-induced variability and a high degree of temperature dependence. The most common solution to these two problems is to introduce small heaters to control the temperature of the ring directly, which can significantly reduce overall power efficiency. In this paper, we introduce plasmonics as a complementary technology. While plasmonic devices have several important advantages, they come with their own new set of restrictions, including propagation loss and lack of wavelength division multiplexing (WDM) support. To overcome these challenges, we propose a new hybrid photonic/plasmonic channel that can support WDM through the use of photonic micro-ring resonators as variation-tolerant passive filters. Our aim is to exploit the best of both technologies: wave-guiding using photonics, and modulating using plasmonics. This channel provides moderate bandwidth with distance-independent power consumption and a higher degree of temperature and process variation tolerance. We describe the state of plasmonics research, present architecturally useful models of many of the most important devices, explore new ways in which the limitations of the technology can most readily be minimized, and quantify the applicability of these novel hybrid schemes across a variety of interconnect strategies. Our link-level analysis shows that the hybrid channel can save from 28% to 45% of total channel energy-cost per bit depending on process variation conditions.

Journal ArticleDOI
TL;DR: A Hopfield-Network-like analog system is proposed as a solution, using the locally competitive algorithm (LCA) to solve an overcomplete l1 sparse approximation problem, and a scalable system architecture using sub-threshold currents is described, including vector matrix multipliers (VMMs) and a nonlinear thresholder.
Abstract: Compressed sensing is an important application in signal and image processing which requires solving nonlinear optimization problems. A Hopfield-Network-like analog system is proposed as a solution, using the locally competitive algorithm (LCA) to solve an overcomplete l1 sparse approximation problem. A scalable system architecture using sub-threshold currents is described, including vector matrix multipliers (VMMs) and a nonlinear thresholder. A 4 × 6 nonlinear system is implemented on the RASP 2.9v chip, a field programmable analog array with directly programmable floating gate elements, allowing highly accurate VMMs. The circuit successfully reproduced the outputs of a digital optimization program, converging to within 4.8% rms, with an objective value only 1.3% higher on average. The active circuit consumed 29 μA of current at 2.4 V and converged on solutions in 240 μs. A smaller 2 × 3 system is also implemented. Extrapolating the scaling trends to an N = 1000 node system, the analog LCA compares favorably with state-of-the-art digital solutions, using a small fraction of the power to arrive at solutions ten times faster. Finally, we provide simulations of large scale systems to show the behavior of the system scaled to nontrivial problem sizes.
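
For readers unfamiliar with the LCA named in the abstract above, the following is a minimal software sketch of its commonly published dynamics, not the paper's analog circuit. It assumes the standard formulation: internal states u evolve as tau * du/dt = b - u - (Phi^T Phi - I) a, with b = Phi^T y and a obtained from u by soft thresholding at lambda. The function names, the Euler step, and the parameter defaults (`lam`, `tau`, `dt`, `steps`) are illustrative assumptions and do not come from the paper.

```python
def soft_threshold(u, lam):
    """Soft-thresholding nonlinearity: shrink u toward zero by lam."""
    if u > lam:
        return u - lam
    if u < -lam:
        return u + lam
    return 0.0


def lca(phi, y, lam=0.1, tau=10.0, dt=1.0, steps=500):
    """Sketch of the locally competitive algorithm (forward-Euler integration).

    phi : dictionary as a list of m rows, each with n entries
    y   : measurement vector of length m
    Returns the thresholded coefficients a (length n).
    All parameter defaults are illustrative assumptions.
    """
    m, n = len(phi), len(phi[0])
    # Driving input b = Phi^T y
    b = [sum(phi[i][j] * y[i] for i in range(m)) for j in range(n)]
    # Lateral inhibition weights G = Phi^T Phi - I
    g = [[sum(phi[i][j] * phi[i][k] for i in range(m)) - (1.0 if j == k else 0.0)
          for k in range(n)] for j in range(n)]
    u = [0.0] * n
    for _ in range(steps):
        a = [soft_threshold(x, lam) for x in u]
        # du/dt = (b - u - G a) / tau
        du = [b[j] - u[j] - sum(g[j][k] * a[k] for k in range(n)) for j in range(n)]
        u = [u[j] + (dt / tau) * du[j] for j in range(n)]
    return [soft_threshold(x, lam) for x in u]
```

With an identity dictionary the inhibition term vanishes, so the output reduces to the soft-thresholded measurements; this makes a convenient sanity check before trying an overcomplete dictionary such as a 4 × 6 matrix.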

Journal ArticleDOI
TL;DR: In this article, the authors propose a framework for the acquisition and reconstruction of multidimensional correlated signals, which can be applied to D-dimensional signals, even if the algorithms they propose to practically implement such architectures apply to 2-D and 3-D signals.
Abstract: Compressed sensing (CS) is an innovative technique that allows signals to be represented through a small number of their linear projections. Hence, CS can be thought of as a natural candidate for the acquisition of multidimensional signals, as the amount of data acquired and processed by conventional sensors could create problems in terms of computational complexity. In this paper, we propose a framework for the acquisition and reconstruction of multidimensional correlated signals. The approach is general and can be applied to D-dimensional signals, even though the algorithms we propose to practically implement such architectures apply to 2-D and 3-D signals. The proposed architectures employ iterative local signal reconstruction based on a hybrid transform/prediction correlation model, coupled with a proper initialization strategy.

Journal ArticleDOI
TL;DR: It is demonstrated that the area overhead of a 3-D power distribution network with via-first TSVs is approximately 9% as compared to less than 2% in via-middle and via-last technologies.
Abstract: Three primary techniques for manufacturing through silicon vias (TSVs), via-first, via-middle, and via-last, have been analyzed and compared to distribute power in a 3-D processor-memory system with nine planes. Due to distinct fabrication techniques, these TSV technologies require significantly different design constraints, as investigated in this paper. A valid design space that satisfies the peak power supply noise while minimizing area overhead is identified for each technology. It is demonstrated that the area overhead of a 3-D power distribution network with via-first TSVs is approximately 9% as compared to less than 2% in via-middle and via-last technologies. Despite this drawback, a via-first based power network is typically overdamped and the issue of resonance is alleviated. A via-last based power network, however, exhibits a relatively low damping factor and the peak noise is highly sensitive to the number of TSVs and decoupling capacitance.

Journal ArticleDOI
Min Wang1, Jun Wu1, Sai Feng Shi1, Chong Luo2, Feng Wu2 
TL;DR: This paper proposes a fast BiCS decoding algorithm and its corresponding partial-parallel hardware design, and proposes a multilevel cyclic-shift approach to generate the CS measurement matrix.
Abstract: Binary-input compressive sensing (BiCS) has recently been applied to wireless communications as a modulated coding scheme for seamless rate adaptation. Unlike conventional channel codes, which generate binary symbols with exclusive-OR (XOR) operations, BiCS generates multilevel symbols through a weighted sum operation. Although BiCS can be decoded by message passing, it needs to compute the convolution of probability functions in each iteration. The high decoding complexity has prevented the technique from being put to practical use. In this paper, we propose a fast BiCS decoding algorithm and its corresponding partial-parallel hardware design. In this algorithm, we first build lookup tables to solve the computationally intensive problem of convolution. Through these tables, we successfully convert the convolution of probabilities into a polynomial of some exponential terms. This key step allows us to use the log-likelihood ratio as the message in message passing decoding, and a fast algorithm is developed by approximate computing. We further design a partial-parallel hardware decoder. To avoid memory collision, we propose a multilevel cyclic-shift approach to generate the CS measurement matrix. We design horizontal unit processors with the proposed tables for iterative computing. Our analyses show that the proposed fast algorithm can reduce multiplications by nearly 90%. The decoding speed of our field-programmable gate array design reaches the range of communication rates in modern wireless networks.

Journal ArticleDOI
TL;DR: This paper investigates the problem of robust H∞ control for uncertain discrete-time Takagi-Sugeno (T-S) fuzzy networked control systems (NCSs) with state quantization and derives a less conservative delay-dependent stability condition for the closed-loop NCSs.
Abstract: This paper investigates the problem of robust H∞ control for uncertain discrete-time Takagi-Sugeno (T-S) fuzzy networked control systems (NCSs) with state quantization. A new model of network-based control that simultaneously considers network-induced delays and packet dropouts is proposed. Using a fuzzy Lyapunov-Krasovskii functional, we derive a less conservative delay-dependent stability condition for the closed-loop NCSs. A robust H∞ fuzzy controller is developed for the asymptotic stabilization of the NCSs and expressed in terms of linear matrix inequality-based conditions. Numerical simulation examples show the feasibility of the developed technique.

Journal ArticleDOI
TL;DR: This paper models the periodic characteristics of body sensor network (BSN) wireless channels measured using custom hardware in the 900-MHz and 2.4-GHz bands to reveal characteristics of BSN channels that can be exploited for reducing the power of wireless communication.
Abstract: This paper models the periodic characteristics of body sensor network (BSN) wireless channels measured using custom hardware in the 900-MHz and 2.4-GHz bands. The hardware logs received signal strength indication (RSSI) values of both bands simultaneously at a sample rate of 1.3 kS/s. Results from a measurement campaign of BSNs are shown and distilled to reveal characteristics of BSN channels that can be exploited for reducing the power of wireless communication. A new channel model is introduced to add periodicity to existing 802.15.6 WBAN path loss equations. New parameters, activity factor and location factor, are introduced to estimate the model parameters. Finally, a strategy for exploiting the periodic characteristics of the BSN channel is presented as an example, along with the power savings from using this strategy.

Journal ArticleDOI
TL;DR: A QDR inductive-coupling interface between 65-nm complementary metal-oxide-semiconductor (CMOS) logic and emulated 100-nm dynamic random access memory (DRAM) is developed.
Abstract: A 1 TB/s, 1 pJ/b, 6.4 mm²/TB/s QDR inductive-coupling interface between 65-nm complementary metal-oxide-semiconductor (CMOS) logic and emulated 100-nm dynamic random access memory (DRAM) is developed. BER < 10^-10 operation is examined in 1024-bit parallel links. Compared to the latest wired 40-nm DRAM interface, the bandwidth is increased to 32×, and the energy consumption and the layout area are reduced to 1/8 and 1/22, respectively.