Showing papers in "IEEE Journal of Solid-state Circuits in 2013"


Journal ArticleDOI
TL;DR: The design requirements for the very demanding target application, the SpiNNaker micro-architecture, are reviewed and the chips are fully operational and meet their power and performance requirements.
Abstract: The modelling of large systems of spiking neurons is computationally very demanding in terms of processing power and communication. SpiNNaker - Spiking Neural Network architecture - is a massively parallel computer system designed to provide a cost-effective and flexible simulator for neuroscience experiments. It can model up to a billion neurons and a trillion synapses in biological real time. The basic building block is the SpiNNaker Chip Multiprocessor (CMP), which is a custom-designed globally asynchronous locally synchronous (GALS) system with 18 ARM968 processor nodes residing in synchronous islands, surrounded by a lightweight, packet-switched asynchronous communications infrastructure. In this paper, we review the design requirements for its very demanding target application, the SpiNNaker micro-architecture and its implementation issues. We also evaluate the SpiNNaker CMP, which contains 100 million transistors in a 102-mm2 die, provides a peak performance of 3.96 GIPS, and has a peak power consumption of 1 W when all processor cores operate at the nominal frequency of 180 MHz. SpiNNaker chips are fully operational and meet their power and performance requirements.

395 citations


Journal ArticleDOI
TL;DR: The studies based on the proposed scaling methodology show that in-plane STT-MRAM will outperform SRAM from 15 nm node, while its perpendicular counterpart requires further innovations in MTJ material in order to overcome the poor write performance scaling from 22 nm node onwards.
Abstract: This paper explores the scalability of in-plane and perpendicular MTJ based STT-MRAMs from 65 nm to 8 nm while taking into consideration realistic variability effects. We focus on the read and write performances of a STT-MRAM based cache rather than the obvious advantages such as the denser bit-cell and zero static power. An accurate MTJ macromodel capturing key MTJ properties was adopted for efficient Monte Carlo simulations. For the simulation of access devices and peripheral circuitries, ITRS projected transistor parameters were utilized and calibrated using the MASTAR tool that has been widely used in industry. 6T SRAM and STT-MRAM arrays were implemented with aggressive assist schemes to mimic industrial memory designs. A constant JC0·RA/VDD scaling scenario was used which to the first order gives the optimal balance between read and write margins of STT-MRAMs. The thermal stability factor ensuring a 10 year retention time was obtained by adjusting the free layer thickness as well as assuming improvement in the crystalline anisotropy. Our studies based on the proposed scaling methodology show that in-plane STT-MRAM will outperform SRAM from 15 nm node, while its perpendicular counterpart requires further innovations in MTJ material in order to overcome the poor write performance scaling from 22 nm node onwards.

322 citations


Journal ArticleDOI
TL;DR: This SoC is designed so the integration and interaction of circuit blocks accomplish an integrated, flexible, and reconfigurable wireless BSN SoC capable of autonomous power management and operation from harvested power, thus prolonging the node lifetime indefinitely.
Abstract: This paper presents an ultra-low power batteryless energy harvesting body sensor node (BSN) SoC fabricated in a commercial 130 nm CMOS technology capable of acquiring, processing, and transmitting electrocardiogram (ECG), electromyogram (EMG), and electroencephalogram (EEG) data. This SoC utilizes recent advances in energy harvesting, dynamic power management, low voltage boost circuits, bio-signal front-ends, subthreshold processing, and RF transmitter circuit topologies. The SoC is designed so the integration and interaction of circuit blocks accomplish an integrated, flexible, and reconfigurable wireless BSN SoC capable of autonomous power management and operation from harvested power, thus prolonging the node lifetime indefinitely. The chip performs ECG heart rate extraction and atrial fibrillation detection while only consuming 19 μW, running solely on harvested energy. This chip is the first wireless BSN powered solely from a thermoelectric harvester and/or RF power and has lower power, lower minimum supply voltage (30 mV), and more complete system integration than previously reported wireless BSN SoCs.

311 citations


Journal ArticleDOI
TL;DR: This paper quantifies the benefits and derives an upper bound on the performance by considering kT/C noise and slewing requirements of the circuit driving the system and a frequency-domain analysis of interleaved converters sheds light on the corruption mechanisms due to interchannel mismatches.
Abstract: Interleaving can relax the power-speed tradeoffs of analog-to-digital converters and reduce their metastability error rate while increasing the input capacitance. This paper quantifies the benefits and derives an upper bound on the performance by considering kT/C noise and slewing requirements of the circuit driving the system. A frequency-domain analysis of interleaved converters is also presented that sheds light on the corruption mechanisms due to interchannel mismatches. A background timing mismatch calibration technique is proposed and experimentally shown to reduce the image to -75 dB for input frequencies exceeding 500 MHz.

264 citations


Journal ArticleDOI
TL;DR: A novel pixel photo sensing and transimpedance pre-amplification stage makes it possible to improve by one order of magnitude contrast sensitivity and power, and reduce the best reported FPN (Fixed Pattern Noise) by a factor of 2, while maintaining the shortest reported latency and good Dynamic Range.
Abstract: Dynamic Vision Sensors (DVS) have recently appeared as a new paradigm for vision sensing and processing. They feature unique characteristics such as contrast coding under wide illumination variation, micro-second latency response to fast stimuli, and low output data rates (which greatly improves the efficiency of post-processing stages). They can track extremely fast objects (e.g., time resolution is better than 100 kFrames/s video) without special lighting conditions. Their availability has triggered a new range of vision applications in the fields of surveillance, motion analyses, robotics, and microscopic dynamic observations. One key DVS feature is contrast sensitivity, which has so far been reported to be in the 10-15% range. In this paper, a novel pixel photo sensing and transimpedance pre-amplification stage makes it possible to improve by one order of magnitude contrast sensitivity (down to 1.5%) and power (down to 4 mW), reduce the best reported FPN (Fixed Pattern Noise) by a factor of 2 (down to 0.9%), while maintaining the shortest reported latency (3 μs) and good Dynamic Range (120 dB), and further reducing overall area (down to 30 × 31 μm per pixel). The only penalty is the limitation of intrascene Dynamic Range to 3 decades. A 128 × 128 DVS test prototype has been fabricated in standard 0.35 μm CMOS and extensive experimental characterization results are provided.

249 citations


Journal ArticleDOI
TL;DR: An 8-channel scalable EEG acquisition SoC is presented to continuously detect and record patient-specific seizure onset activities from scalp EEG.
Abstract: An 8-channel scalable EEG acquisition SoC is presented to continuously detect and record patient-specific seizure onset activities from scalp EEG. The SoC integrates 8 high-dynamic range Analog Front-End (AFE) channels, a machine-learning seizure classification processor and a 64 KB SRAM. The classification processor exploits the Distributed Quad-LUT filter architecture to minimize the area while also minimizing the overhead in power × delay. The AFE employs a Chopper-Stabilized Capacitive Coupled Instrumentation Amplifier to show NEF of 5.1 and noise RTI of 0.91 μVrms for 0.5-100 Hz bandwidth. The classification processor adopts a support-vector machine as a classifier, with a GBW controller that gives real-time gain and bandwidth feedback to AFE to maintain accuracy. The SoC is verified with the Children's Hospital Boston-MIT EEG database as well as with rapid eye blink pattern detection test. The SoC is implemented in 0.18 μm 1P6M CMOS process occupying 25 mm2, and it shows an accuracy of 84.4% in eye blink classification test, at 2.03 μJ/classification energy efficiency. The 64 KB on chip memory can store up to 120 seconds of raw EEG data.

239 citations


Journal ArticleDOI
TL;DR: This paper presents bandgap reference (BGR) and sub-BGR circuits for nanowatt LSIs, which avoid the use of resistors and contain only MOSFETs and one bipolar transistor and can operate at a sub-1-V supply.
Abstract: This paper presents bandgap reference (BGR) and sub-BGR circuits for nanowatt LSIs. The circuits consist of a nano-ampere current reference circuit, a bipolar transistor, and proportional-to-absolute-temperature (PTAT) voltage generators. The proposed circuits avoid the use of resistors and contain only MOSFETs and one bipolar transistor. Because the sub-BGR circuit divides the output voltage of the bipolar transistor without resistors, it can operate at a sub-1-V supply. The experimental results obtained in the 0.18-μm CMOS process demonstrated that the BGR circuit could generate a reference voltage of 1.09 V and the sub-BGR circuit could generate one of 0.548 V. The power dissipations of the BGR and sub-BGR circuits corresponded to 100 and 52.5 nW.

219 citations


Journal ArticleDOI
TL;DR: This paper describes the design of a low power, energy-efficient CMOS smart temperature sensor intended for RFID temperature sensing that employs an energy-efficient 2nd-order zoom ADC, which combines a coarse 5-bit SAR conversion with a fine 10-bit ΔΣ conversion.
Abstract: This paper describes the design of a low power, energy-efficient CMOS smart temperature sensor intended for RFID temperature sensing. The BJT-based sensor employs an energy-efficient 2nd-order zoom ADC, which combines a coarse 5-bit SAR conversion with a fine 10-bit ΔΣ conversion. Moreover, a new integration scheme is proposed that halves the conversion time, while requiring no extra supply current. To meet the stringent cost constraints on RFID tags, a fast voltage calibration technique is used, which can be carried out in only 200 msec. After batch calibration and an individual room-temperature calibration, the sensor achieves an inaccuracy of ±0.15°C (3σ) from -55°C to 125°C. Over the same range, devices from a second lot achieved an inaccuracy of ±0.25°C (3σ) in both ceramic and plastic packages. The sensor occupies 0.08 mm2 in a 0.16 μm CMOS process, draws 3.4 μA from a 1.5 V to 2 V supply, and achieves a resolution of 20 mK in a conversion time of 5.3 msec. This corresponds to a minimum energy dissipation of 27 nJ per conversion.

216 citations
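
The 27 nJ figure quoted for the temperature sensor above can be reproduced from the numbers in the abstract. The following is a minimal sketch, assuming the energy per conversion is simply supply current × supply voltage × conversion time; the function name is illustrative and not taken from the paper.

```python
def energy_per_conversion(supply_current_a, supply_voltage_v, conversion_time_s):
    """Energy drawn from the supply during one conversion, in joules.

    Assumes the supply current is constant over the whole conversion.
    """
    return supply_current_a * supply_voltage_v * conversion_time_s


# Values quoted in the abstract: 3.4 uA, 1.5 V to 2 V supply, 5.3 ms conversion.
e_min = energy_per_conversion(3.4e-6, 1.5, 5.3e-3)  # lowest supply voltage
e_max = energy_per_conversion(3.4e-6, 2.0, 5.3e-3)  # highest supply voltage

print(f"{e_min * 1e9:.1f} nJ at 1.5 V, {e_max * 1e9:.1f} nJ at 2.0 V")
```

Under this assumption the minimum (about 27.0 nJ) occurs at the 1.5 V end of the supply range, which matches the "minimum energy dissipation" wording of the abstract.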


Journal ArticleDOI
TL;DR: A 60-GHz dual-mode power amplifier is implemented in 40-nm bulk CMOS technology and a new transistor layout is proposed to minimize the device and interconnect parasitics while the neutralized amplifier stage is co-optimized with input transformer to improve the power gain and stability.
Abstract: A 60-GHz dual-mode power amplifier (PA) is implemented in 40-nm bulk CMOS technology. To boost the amplifier performance at millimeter-wave (mmWave) frequencies, a new transistor layout is proposed to minimize the device and interconnect parasitics while the neutralized amplifier stage is co-optimized with input transformer to improve the power gain and stability. The transformer-based power-combining PA consists of two unit amplifiers, operating in Class AB for better back-off efficiency. To further reduce the power consumption and hence extend battery lifetime, one unit PA is turned off in low-power mode. A switch is used to short the output of this non-operating unit PA to reduce the combiner loss and improve the efficiency. The PA achieves a measured saturated output power (PSAT) of 17.0 dBm (12.1 dBm) and 1-dB compressed power (P1dB) of 13.8 dBm (9.1 dBm) in the high-power (low-power) mode. The power-added efficiencies (PAEs) at PSAT and P1dB are 30.3% and 21.6% respectively for the high-power mode. Compared to Class A, the PA operating in Class AB shows 5.3% improvement in measured PAE at P1dB with no compromise in linearity. The PA with the power combiner only occupies an active area of 0.074 mm2. The reliability measurements are also conducted and the PA has an estimated lifetime of 80613 hours.

208 citations


Journal ArticleDOI
TL;DR: A CMOS image sensor architecture with built-in single-shot compressed sensing with modest quality loss relative to normal capture and significantly higher image quality than downsampling is described.
Abstract: A CMOS image sensor architecture with built-in single-shot compressed sensing is described. The image sensor employs a conventional 4-T pixel and per-column ΣΔ ADCs. The compressed sensing measurements are obtained via a column multiplexer that sequentially applies randomly selected pixel values to the input of each ΣΔ modulator. At the end of readout, each ADC outputs a quantized value of the average of the pixel values applied to its input. The image is recovered from the random linear measurements off-chip using numerical optimization algorithms. To demonstrate this architecture, a 256x256 pixel CMOS image sensor is fabricated in 0.15 μm CIS process. The sensor can operate in compressed sensing mode with compression ratio 1/4, 1/8, or 1/16 at 480, 960, or 1920 fps, respectively, or in normal capture mode with no compressed sensing at a maximum frame rate of 120 fps. Measurement results demonstrate capture in compressed sensing mode at roughly the same readout noise of 351 μVrms and power consumption of 96.2 mW of normal capture at 120 fps. This performance is achieved with only 1.8% die area overhead. Image reconstruction shows modest quality loss relative to normal capture and significantly higher image quality than downsampling.

204 citations
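
The frame rates quoted for the compressed-sensing image sensor above follow a simple pattern. The sketch below assumes (this is not stated in the abstract) that readout time scales with the number of measurements per frame, so the frame rate is the 120 fps normal-capture rate divided by the compression ratio; all names are illustrative.

```python
from fractions import Fraction

NORMAL_FPS = 120      # maximum frame rate in normal capture mode (from the abstract)
ROWS, COLS = 256, 256  # sensor resolution (from the abstract)
RATIOS = (Fraction(1, 4), Fraction(1, 8), Fraction(1, 16))


def compressed_fps(ratio):
    """Frame rate if readout time is proportional to measurements per frame (assumption)."""
    return NORMAL_FPS / ratio


def measurements_per_frame(ratio):
    """Number of random linear measurements read out for one frame."""
    return int(ROWS * COLS * ratio)


for r in RATIOS:
    print(f"ratio {r}: {int(compressed_fps(r))} fps, "
          f"{measurements_per_frame(r)} measurements per frame")
```

With this assumption the three ratios give 480, 960, and 1920 fps, the same values the abstract reports.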


Journal ArticleDOI
TL;DR: A self-adapting power management unit is proposed for efficient battery voltage down conversion for wide range of battery voltages and load current and adapts itself by monitoring energy harvesting conditions and harvesting sources.
Abstract: A 1.0 mm3 general-purpose sensor node platform with heterogeneous multi-layer structure is proposed. The sensor platform benefits from modularity by allowing the addition/removal of IC layers. A new low power I2C interface is introduced for energy efficient inter-layer communication with compatibility to commercial I2C protocols. A self-adapting power management unit is proposed for efficient battery voltage down conversion for wide range of battery voltages and load current. The power management unit also adapts itself by monitoring energy harvesting conditions and harvesting sources and is capable of harvesting from solar, thermal and microbial fuel cells. An optical wakeup receiver is proposed for sensor node programming and synchronization with 228 pW standby power. The system also includes two processors, timer, temperature sensor, and low-power imager. Standby power of the system is 11 nW.

Journal ArticleDOI
TL;DR: A power-efficient wireless stimulating system for a head-mounted deep brain stimulator (DBS) is presented, which increases the stimulation efficiency up to 30% higher than a fixed supply voltage and achieves high AC-DC power conversion efficiency (PCE) through active synchronous switching.
Abstract: A power-efficient wireless stimulating system for a head-mounted deep brain stimulator (DBS) is presented. A new adaptive rectifier generates a variable DC supply voltage from a constant AC power carrier utilizing phase control feedback, while achieving high AC-DC power conversion efficiency (PCE) through active synchronous switching. A current-controlled stimulator adopts closed-loop supply control to automatically adjust the stimulation compliance voltage by detecting stimulation site potentials through a voltage readout channel, and improve the stimulation efficiency. The stimulator also utilizes closed-loop active charge balancing to maintain the residual charge at each site within a safe limit, while receiving the stimulation parameters wirelessly from the amplitude-shift-keyed power carrier. A 4-ch wireless stimulating system prototype was fabricated in a 0.5-μm 3M2P standard CMOS process, occupying 2.25 mm2. With 5 V peak AC input at 2 MHz, the adaptive rectifier provides an adjustable DC output between 2.5 V and 4.6 V at 2.8 mA loading, resulting in measured PCE of 72 ~ 87%. The adaptive supply control increases the stimulation efficiency up to 30% higher than a fixed supply voltage to 58 ~ 68%. The prototype wireless stimulating system was verified in vitro.

Journal ArticleDOI
TL;DR: A comprehensive study of circuit-to-phase-noise conversion mechanisms of different oscillators' structures shows the proposed class-F exhibits the lowest phase noise at the same tank's quality factor and supply voltage.
Abstract: An oscillator topology demonstrating an improved phase noise performance is proposed in this paper. It exploits the time-variant phase noise model with insights into the phase noise conversion mechanisms. The proposed oscillator is based on enforcing a pseudo-square voltage waveform around the LC tank by increasing the third-harmonic of the fundamental oscillation voltage through an additional impedance peak. This auxiliary impedance peak is realized by a transformer with moderately coupled resonating windings. As a result, the effective impulse sensitivity function (ISF) decreases thus reducing the oscillator's effective noise factor such that a significant improvement in the oscillator phase noise and power efficiency are achieved. A comprehensive study of circuit-to-phase-noise conversion mechanisms of different oscillators' structures shows the proposed class-F exhibits the lowest phase noise at the same tank's quality factor and supply voltage. The prototype of the class-F oscillator is implemented in TSMC 65-nm standard CMOS. It exhibits average phase noise of -136 dBc/Hz at 3 MHz offset from the carrier over 5.9-7.6 GHz tuning range with figure-of-merit of 192 dBc/Hz. The oscillator occupies 0.12 mm2 while drawing 12 mA from 1.25 V supply.

Journal ArticleDOI
TL;DR: This paper presents a power- and area-efficient 24-way time-interleaved successive-approximation-register (SAR) analog-to-digital converter (ADC) that achieves 2.8 GS/s and 8.1 ENOB in 65 nm CMOS.
Abstract: This paper presents a power- and area-efficient 24-way time-interleaved successive-approximation-register (SAR) analog-to-digital converter (ADC) that achieves 2.8 GS/s and 8.1 ENOB in 65 nm CMOS. To minimize the power and the area, the capacitors in the capacitive DAC are sized to meet the thermal noise requirements rather than the matching requirements, leading to the LSB capacitance of 50 aF. An on-chip digital background calibration is used to calibrate the capacitor mismatches in individual ADC channels, as well as the inter-channel offset, gain and timing mismatches. Measurement results at the 2.8 GS/s sampling rate show that the ADC chip prototype consumes 44.6 mW of power from a 1.2 V supply while achieving peak SNDR of 50.9 dB and retaining SNDR higher than 48.2 dB across the entire first Nyquist zone with a 1.8Vpp-diff input signal. The prototype chip occupies an area of 1.03 × 1.66 mm2, including the pads and the testing circuits. The figure of merit (FoM) of this ADC, calculated with the minimum SNDR in the first Nyquist zone, is 76 fJ/conversion-step.
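
The 76 fJ/conversion-step figure for the time-interleaved SAR ADC above can be checked from the abstract's own numbers. This is a minimal sketch, assuming the commonly used Walden figure of merit, FoM = P / (fs · 2^ENOB), with ENOB = (SNDR − 1.76) / 6.02; the abstract does not spell out the formula, so treat that choice as an assumption.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, sample_rate_hz, enob_bits):
    """Walden figure of merit in joules per conversion-step."""
    return power_w / (sample_rate_hz * 2.0 ** enob_bits)


# Values quoted in the abstract: 44.6 mW, 2.8 GS/s, minimum SNDR 48.2 dB.
enob_min = enob_from_sndr(48.2)
fom = walden_fom(44.6e-3, 2.8e9, enob_min)

print(f"ENOB at minimum SNDR: {enob_min:.2f} bit")
print(f"FoM: {fom * 1e15:.1f} fJ/conversion-step")
```

Using the minimum SNDR rather than the 50.9 dB peak is the conservative choice the abstract describes, and it gives a result close to the reported 76 fJ/conversion-step.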

Journal ArticleDOI
Cristiano Niclass1, Mineki Soga1, Hiroyuki Matsubara1, Satoru Kato1, Manabu Kagami1 
TL;DR: A single-photon detection technique for time-of-flight distance ranging based on the temporal and spatial correlation of photons is introduced and experimental results in which the depth sensor was operated in a typical traffic scenario are reported.
Abstract: This paper introduces a single-photon detection technique for time-of-flight distance ranging based on the temporal and spatial correlation of photons. A proof-of-concept prototype achieving depth imaging up to 100 meters with a resolution of 340 × 96 pixels at 10 frames/s was implemented. At the core of the system, a sensor chip comprising 32 macro-pixels based on an array of single-photon avalanche diodes featuring an optical fill factor of 70% was fabricated in a 0.18-μm CMOS. The chip also comprises an array of 32 circuits capable of generating precise triggers upon correlation events as well as of sampling the number of photons involved in each correlation event, and an array of 32 12-b time-to-digital converters. Characterization of the TDC array led to -0.52 LSB and 0.73 LSB of differential and integral nonlinearities, respectively. Quantitative evaluation of the TOF sensor under strong solar background light, i.e., 80 klux, revealed a repeatability error better than 10 cm throughout the distance range of 100 m, thus leading to a relative precision of 0.1%. In the same condition, the relative nonlinearity error was 0.37%. In order to show the suitability of our approach in a real-world situation, experimental results in which the depth sensor was operated in a typical traffic scenario are also reported.
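
The relative precision quoted for the time-of-flight sensor above is a ratio of the repeatability error to the distance range. The sketch below reproduces it and also shows the round-trip photon travel time for the maximum range, using the textbook relation t = 2d / c; the helper names are illustrative and not from the paper.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def round_trip_time_s(distance_m):
    """Time for light to reach a target at distance_m and return."""
    return 2.0 * distance_m / SPEED_OF_LIGHT_M_S


def relative_precision(repeatability_error_m, distance_range_m):
    """Repeatability error expressed as a fraction of the distance range."""
    return repeatability_error_m / distance_range_m


# Values quoted in the abstract: 10 cm repeatability error over a 100 m range.
precision = relative_precision(0.10, 100.0)
t_max = round_trip_time_s(100.0)

print(f"relative precision: {precision * 100:.1f} %")
print(f"round-trip time at 100 m: {t_max * 1e9:.1f} ns")
```

A target at 100 m corresponds to a round trip of roughly 667 ns, which is the time span the time-to-digital converters have to cover.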

Journal ArticleDOI
TL;DR: Schottky-barrier diodes fabricated in CMOS without process modification are shown to be suitable for active THz imaging applications and suggest that an affordable and portable fully-integrated CMOS THz imager is possible.
Abstract: Schottky-barrier diodes (SBDs) fabricated in CMOS without process modification are shown to be suitable for active THz imaging applications. Using a compact passive-pixel array architecture, a fully-integrated 280-GHz 4 × 4 imager is demonstrated. At 1-MHz input modulation frequency, the measured peak responsivity is 51 kV/W with ±20% variation among the pixels. The measured minimum NEP is 29 pW/Hz1/2. Additionally, an 860-GHz SBD detector is implemented by reducing the number of unit cells in the diode, and by exploiting the efficiency improvement of patch antenna with frequency. The measured NEP is 42 pW/Hz1/2 at 1-MHz modulation frequency. This is competitive to the best reported performance of MOSFET-based pixel measured without attaching an external silicon lens (66 pW/Hz1/2 at 1 THz and 40 pW/Hz1/2 at 650 GHz). Given that incorporating the 280-GHz detector into an array increased the NEP by ~20%, the 860-GHz imager array should also have the similar NEP as that for an individual detector. The circuits were utilized in a setup that requires neither mirrors nor lenses to form THz images. These suggest that an affordable and portable fully-integrated CMOS THz imager is possible.

Journal ArticleDOI
TL;DR: A fully electrical startup boost converter for thermal energy harvesting is presented, implemented in a 65-nm bulk CMOS technology and a miniaturized module is demonstrated for energy harvesting applications.
Abstract: A fully electrical startup boost converter for thermal energy harvesting is presented in this paper. The converter is implemented in a 65-nm bulk CMOS technology. With the proposed 3-stage stepping-up architecture, the minimum input voltage for startup is as low as 50 mV while the input voltage required for sustained power conversion is 30 mV. Due to the use of a zero-current-switching (ZCS) converter as the last stage and an automatic shutdown mechanism for the auxiliary converter, conversion efficiency up to 73% is achieved. By incorporating the boost converter and a thermoelectric generator (TEG), a miniaturized module is demonstrated for energy harvesting applications.

Journal ArticleDOI
TL;DR: A custom processor that integrates a CPU with configurable accelerators for discriminative machine-learning functions is presented; an accelerator for embedded active learning enables prospective adaptation of the signal models by utilizing sensed data for patient-specific customization, while minimizing the effort from human experts.
Abstract: Low-power sensing technologies have emerged for acquiring physiologically indicative patient signals. However, to enable devices with high clinical value, a critical requirement is the ability to analyze the signals to extract specific medical information. Yet given the complexities of the underlying processes, signal analysis poses numerous challenges. Data-driven methods based on machine learning offer distinct solutions, but unfortunately the computations are not well supported by traditional DSP. This paper presents a custom processor that integrates a CPU with configurable accelerators for discriminative machine-learning functions. A support-vector-machine accelerator realizes various classification algorithms as well as various kernel functions and kernel formulations, enabling a range of points within an accuracy-versus-energy and -memory trade space. An accelerator for embedded active learning enables prospective adaptation of the signal models by utilizing sensed data for patient-specific customization, while minimizing the effort from human experts. The prototype is implemented in 130-nm CMOS and operates from 1.2 V-0.55 V (0.7 V for SRAMs). Medical applications for EEG-based seizure detection and ECG-based cardiac-arrhythmia detection are demonstrated using clinical data, while consuming 273 μJ and 124 μJ per detection, respectively; this represents 62.4× and 144.7× energy reduction compared to an implementation based on the CPU. A patient-adaptive cardiac-arrhythmia detector is also demonstrated, reducing the analysis-effort required for model customization by 20×.

Journal ArticleDOI
TL;DR: A high-power broadband 260-GHz radiation source using 65-nm bulk CMOS technology is reported, an array of eight harmonic oscillators with mutual coupling through four 130-GHz quadrature oscillators that achieves the optimum conditions for the fundamental oscillation and the 2nd-harmonic generation.
Abstract: A high-power broadband 260-GHz radiation source using 65-nm bulk CMOS technology is reported. The source is an array of eight harmonic oscillators with mutual coupling through four 130-GHz quadrature oscillators. Based on a novel self-feeding structure, the harmonic oscillator simultaneously achieves the optimum conditions for the fundamental oscillation and the 2nd-harmonic generation. The signals at 260 GHz radiate through eight on-chip slot antennas, and are in-phase combined inside a hemispheric silicon lens attached at the backside of the chip. Similar to the laser pulse-driven photoconductive emitter in many THz spectrometers, the radiation of this source can also be modulated by narrow pulses generated on chip, which achieves broad radiation bandwidth. Without modulation, the chip achieves a measured continuous-wave radiated power of 1.1 mW, and an EIRP of 15.7 dBm. Under modulation, the measured bandwidth of the source is 24.7 GHz. This radiator array consumes 0.8-W DC power from a 1.2-V supply.
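
The power figures quoted for the 260-GHz radiator above mix milliwatts, dBm, and watts, so a short conversion helps relate them. The sketch below uses the standard dBm definition and assumes that EIRP (in dBm) equals radiated power (in dBm) plus the effective gain of the antenna and lens (in dB); the implied gain and the DC-to-radiated efficiency are derived here and are not values reported in the abstract.

```python
import math


def mw_to_dbm(power_mw):
    """Convert power in milliwatts to dBm."""
    return 10.0 * math.log10(power_mw)


def dbm_to_mw(power_dbm):
    """Convert power in dBm to milliwatts."""
    return 10.0 ** (power_dbm / 10.0)


# Values quoted in the abstract.
radiated_mw = 1.1   # continuous-wave radiated power
eirp_dbm = 15.7     # effective isotropic radiated power
dc_power_w = 0.8    # DC power consumption
supply_v = 1.2      # supply voltage

radiated_dbm = mw_to_dbm(radiated_mw)
implied_gain_db = eirp_dbm - radiated_dbm          # derived, under the EIRP assumption
dc_to_rf_efficiency = radiated_mw * 1e-3 / dc_power_w
supply_current_a = dc_power_w / supply_v

print(f"radiated power: {radiated_dbm:.2f} dBm")
print(f"implied antenna/lens gain: {implied_gain_db:.1f} dB")
print(f"DC-to-radiated efficiency: {dc_to_rf_efficiency * 100:.3f} %")
print(f"supply current: {supply_current_a:.2f} A")
```

Under these assumptions the 1.1 mW radiated power is about 0.41 dBm, so roughly 15 dB of the reported EIRP comes from the directivity of the eight-antenna array and lens.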

Journal ArticleDOI
TL;DR: A wirelessly powered 0.125 mm2 65 nm CMOS IC for Brain-Machine Interface applications integrates four 1.5 μW amplifiers with power conditioning and communication circuitry; its multi-node backscatter frequency locks to a wireless interrogator using a frequency-domain multiple access communication scheme.
Abstract: A wirelessly powered 0.125 mm2 65 nm CMOS IC for Brain-Machine Interface applications integrates four 1.5 μW amplifiers (6.5 μVrms input-referred noise with 10 kHz bandwidth) with power conditioning and communication circuitry. The multi-node backscatter frequency locks to a wireless interrogator using a frequency-domain multiple access communication scheme. The full system, verified with wirelessly powered in vivo recordings, consumes 10.5 μW and operates at 1 mm range in air with 50 mW transmit power.

Journal ArticleDOI
TL;DR: Bubble Razor, an architecturally independent approach to timing error detection and correction that avoids hold-time issues and enables large timing speculation windows, is proposed and implemented on an ARM Cortex-M3 microprocessor in 45 nm CMOS to demonstrate the technique's automated capability.
Abstract: We propose Bubble Razor, an architecturally independent approach to timing error detection and correction that avoids hold-time issues and enables large timing speculation windows. A local stalling technique that can be automatically inserted into any design allows the system to scale to larger processors. We implemented Bubble Razor on an ARM Cortex-M3 microprocessor in 45 nm CMOS without detailed knowledge of its internal architecture to demonstrate the technique's automated capability. The flip-flop based design was converted to two-phase latch timing using commercial retiming tools; Bubble Razor was then inserted using automatic scripts. This system marks the first published implementation of a Razor-style scheme on a complete, commercial processor. It provides an energy efficiency improvement of 60% or a throughput gain of up to 100% compared to operating with worst case timing margins.

Journal ArticleDOI
TL;DR: A Data-Driven Noise-Reduction method is introduced to selectively enhance the comparator noise performance in a power-efficient 10/12 bit 40 kS/s SAR ADC for sensor applications.
Abstract: This paper presents a power-efficient 10/12 bit 40 kS/s SAR ADC for sensor applications. It supports resolutions of 10 and 12 bit and sample rates from DC up to 40 kS/s to accommodate a variety of sensor applications. A Data-Driven Noise-Reduction method is introduced to selectively enhance the comparator noise performance. In this way, a higher ADC resolution can be achieved with a small increase of the power consumption. A self-oscillating comparator is used to generate the bit-cycling clock internally. In this way, the ADC only requires an external clock at the sample-rate frequency. A segmented capacitive DAC with 250 aF unit elements is applied to save power and to reduce DNL errors at the same time. The implemented prototype in 65 nm CMOS occupies an area of 0.076 mm2. For the two supported resolutions (10/12 bit), the ADC achieves an ENOB of 9.4 and 10.1 bit while consuming 72 and 97 nW from a 0.6 V supply at 40 kS/s. This leads to power efficiencies of 2.7 and 2.2 fJ/conversion-step for 10 bit and 12 bit resolution, respectively. Furthermore, the leakage power, which is below 0.4 nW, ensures that the efficiency can be maintained down to very low sample rates.
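
The two power-efficiency figures for the 10/12 bit SAR ADC above follow directly from the quoted power, ENOB, and sample rate. This is a minimal sketch, assuming the efficiency is computed as P / (fs · 2^ENOB), which is the usual definition of fJ/conversion-step; the function and variable names are illustrative.

```python
def energy_per_conversion_step(power_w, sample_rate_hz, enob_bits):
    """Power efficiency in joules per conversion-step: P / (fs * 2**ENOB)."""
    return power_w / (sample_rate_hz * 2.0 ** enob_bits)


SAMPLE_RATE_HZ = 40e3  # 40 kS/s, from the abstract

# (power in watts, ENOB in bits) for the 10 bit and 12 bit modes, from the abstract.
modes = {"10 bit": (72e-9, 9.4), "12 bit": (97e-9, 10.1)}

efficiency_fj = {
    name: energy_per_conversion_step(power, SAMPLE_RATE_HZ, enob) * 1e15
    for name, (power, enob) in modes.items()
}

for name, value in efficiency_fj.items():
    print(f"{name} mode: {value:.1f} fJ/conversion-step")
```

The 12 bit mode draws more power but gains 0.7 bit of ENOB, which is why its efficiency (about 2.2 fJ) is slightly better than that of the 10 bit mode (about 2.7 fJ).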

Journal ArticleDOI
TL;DR: A design methodology for synthesis of active N-path bandpass filters is introduced and a 0.1-to-1.2 GHz tunable 6th-order N-path channel-select filter in 65 nm LP CMOS is introduced, achieving a “flat” passband shape and high out-of-band linearity.
Abstract: A design methodology for synthesis of active N-path bandpass filters is introduced. Based on this methodology, a 0.1-to-1.2 GHz tunable 6th-order N-path channel-select filter in 65 nm LP CMOS is introduced. It is based on coupling N-path filters with gyrators, achieving a “flat” passband shape and high out-of-band linearity. A Miller compensation method is utilized to considerably improve the passband shape of the filter. The filter has 2.8 dB NF, +25 dB gain, +26 dBm wideband IIP3 ( MHz), an out-of-band 1 dB blocker compression point B1dB,CP of +7 dBm (Δf = +50 MHz) and 59 dB stopband rejection. The analog and digital part of the filter draw 11.7 mA and 3-36 mA from 1.2 V, respectively. The LO leakage to the input port of the filter is ≤-64 dBm at a clock frequency of 1 GHz. The proposed filter only consists of inverters, switches and capacitors and therefore it is friendly with process scaling.

Journal ArticleDOI
TL;DR: A 20-bit incremental ADC for battery-powered sensor applications is presented, based on an energy-efficient zoom ADC architecture, which employs a coarse 6-bit SAR conversion followed by a fine 15-bit ΔΣ conversion.
Abstract: A 20-bit incremental ADC for battery-powered sensor applications is presented. It is based on an energy-efficient zoom ADC architecture, which employs a coarse 6-bit SAR conversion followed by a fine 15-bit ΔΣ conversion. To further improve its energy efficiency, the ADC employs integrators based on cascoded dynamic inverters for extra gain and PVT tolerance. Dynamic error correction techniques such as auto-zeroing, chopping and dynamic element matching are used to achieve both low offset and high linearity. Measurements show that the ADC achieves 20-bit resolution, 6 ppm INL and 1 μV offset in a conversion time of 40 ms, while drawing only 3.5 μA current from a 1.8 V supply. This corresponds to a state-of-the-art figure-of-merit (FoM) of 182.7 dB. The 0.35 mm² chip was fabricated in a standard 0.16 μm CMOS process.
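
The supply figures in this abstract imply a power and an energy per conversion that the abstract does not state directly. The sketch below derives them from the quoted current, supply voltage and conversion time; the variable names are illustrative, and the results are simple products of those three numbers rather than values reported in the paper.

```python
supply_current_a = 3.5e-6    # 3.5 uA, from the abstract
supply_voltage_v = 1.8       # 1.8 V, from the abstract
conversion_time_s = 40e-3    # 40 ms, from the abstract

# Average power drawn during a conversion, and the energy one conversion costs.
power_w = supply_current_a * supply_voltage_v
energy_per_conversion_j = power_w * conversion_time_s

print(f"power: {power_w * 1e6:.1f} uW")
print(f"energy per conversion: {energy_per_conversion_j * 1e9:.0f} nJ")
```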

Journal ArticleDOI
TL;DR: A 288-GHz lens-integrated high-power source implemented in a 65-nm CMOS technology is presented, which is the highest reported radiated power of a single CMOS source beyond 200 GHz.
Abstract: A 288-GHz lens-integrated high-power source implemented in a 65-nm CMOS technology is presented. The source consists of two free-running triple-push ring oscillators locked out of phase by magnetic coupling. The oscillators drive a differential on-chip ring antenna, which illuminates a hyper-hemispherical silicon lens through the backside of the die. An on-wafer breakout of the oscillator core achieves a peak output power of -1.5 dBm with a 275-mW DC power consumption. The radiated power of the packaged source is -4.1 dBm, which is the highest reported radiated power of a single CMOS source beyond 200 GHz. The source die including the antenna occupies only 500 × 570 μm².
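
To put the dBm figures of this abstract on a linear scale, the sketch below converts them to milliwatts and derives the DC-to-RF efficiency of the on-wafer breakout. The efficiency is not quoted in the abstract; it is computed here from the stated -1.5 dBm and 275 mW only, and the helper name `dbm_to_mw` is illustrative.

```python
def dbm_to_mw(level_dbm: float) -> float:
    """Convert a power level in dBm to milliwatts."""
    return 10.0 ** (level_dbm / 10.0)


breakout_mw = dbm_to_mw(-1.5)   # on-wafer oscillator breakout output power
radiated_mw = dbm_to_mw(-4.1)   # radiated power of the packaged source
dc_power_mw = 275.0             # DC consumption quoted for the breakout

# Derived quantity, not stated in the abstract.
breakout_efficiency_pct = 100.0 * breakout_mw / dc_power_mw

print(f"breakout output: {breakout_mw:.3f} mW")
print(f"radiated power: {radiated_mw:.3f} mW")
print(f"breakout DC-to-RF efficiency: {breakout_efficiency_pct:.2f} %")
```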

Journal ArticleDOI
TL;DR: This paper presents a 60-GHz direct-conversion RF front-end and baseband transceiver including analog and digital circuitry for PHY functions, capable of more than 7-Gb/s 16QAM wireless communication for every channel of the 60-GHz standards, which can be extended up to 10 Gb/s.
Abstract: This paper presents a 60-GHz direct-conversion RF front-end and baseband transceiver including analog and digital circuitry for PHY functions. The 65-nm CMOS front-end consumes 319 and 223 mW in transmitting and receiving mode, respectively. It is capable of more than 7-Gb/s 16QAM wireless communication for every channel of the 60-GHz standards, which can be extended up to 10 Gb/s. The 40-nm CMOS baseband including analog, digital, and I/O consumes 196 and 427 mW for 16QAM in transmitting and receiving modes, respectively. In the analog baseband, a 5-b 2304-MS/s ADC consumes 12 mW, and a 6-b 3456-MS/s DAC consumes 11 mW. In the digital baseband integrating all PHY functions, a (1440, 1344) LDPC decoder consumes 74 mW with the low energy efficiency of 11.8 pJ/b. The entire system including both RF and BB using a 6-dBi antenna built in the organic package can transmit 3.1 Gb/s over 1.8 m in QPSK and 6.3 Gb/s over 0.05 m in 16QAM.
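
The LDPC decoder figures in this abstract can be related to each other with a little arithmetic. The sketch below assumes the usual (n, k) block-code notation for "(1440, 1344)" and assumes the quoted energy per bit is decoder power divided by throughput; both readings are assumptions, and the variable names are illustrative.

```python
codeword_bits_n = 1440      # assumed: n in (n, k) notation
info_bits_k = 1344          # assumed: k in (n, k) notation
decoder_power_w = 74e-3     # 74 mW, from the abstract
energy_per_bit_j = 11.8e-12 # 11.8 pJ/b, from the abstract

# Code rate of the block code under the (n, k) reading.
code_rate = info_bits_k / codeword_bits_n

# Throughput implied by power / energy-per-bit (a derived figure).
implied_throughput_gbps = decoder_power_w / energy_per_bit_j / 1e9

print(f"code rate: {code_rate:.4f}")
print(f"implied decoder throughput: {implied_throughput_gbps:.2f} Gb/s")
```

The implied throughput is close to the 6.3 Gb/s 16QAM system rate mentioned at the end of the abstract, which suggests the energy figure was taken at that operating point, although the abstract does not say so explicitly.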

Journal ArticleDOI
TL;DR: The design of the enabling TRX chip is presented: a highly integrated 94 GHz phase-coherent pulsed-radar with on-chip antennas that achieves 10 GHz of frequency tuning range and 300 ps of contiguous pulse position control, enabling its usage in the large-array imager with time-domain TX beamforming.
Abstract: High-resolution mm-wave array beamformers have applications in medical imaging, gesture recognition, and navigation. A scalable array architecture for 3D imaging is proposed in which single-element phase coherent transceiver (TRX) chips, with programmable TX pulse delay capability, are mounted on a common board to realize the array. This paper presents the design of the enabling TRX chip: a highly integrated 94 GHz phase-coherent pulsed-radar with on-chip antennas. The TRX achieves 10 GHz of frequency tuning range and 300 ps of contiguous pulse position control, enabling its usage in the large-array imager with time-domain TX beamforming. The TRX is capable of transmitting and receiving pulses down to 36 ps, translating to 30 GHz of bandwidth. Interferometric measurements show the TRX can obtain single-target range resolution better than 375 μm (limited by equipment). Based on delay measurements, the time of arrival rms error would be less than 1.3 ps which, if used in a 3D imaging array, leads to less than 0.36 mm of RMS error in voxel size and position.

Journal ArticleDOI
TL;DR: A novel pulse-train time amplifier is proposed that achieves linear, accurate, and programmable gain for a wide input range and achieves the fastest conversion rate and the best FoM without any calibration.
Abstract: In this paper, a novel pulse-train time amplifier is proposed that achieves linear, accurate, and programmable gain for a wide input range. Using the proposed pulse-train time amplifier, a 7-bit two-step TDC is implemented. The proposed TDC employs repetitive pulses with gated delay-lines for a calibration-free and programmable time amplification and quantization. The prototype chip fabricated in 65 nm CMOS process achieves 3.75 ps of time resolution at 200 MS/s while consuming 3.6 mW and occupying 0.02 mm² area. Compared to previously reported TDCs, the proposed TDC achieves the fastest conversion rate and the best FoM without any calibration.

Journal ArticleDOI
TL;DR: In this article, a power-scalable SAR ADC for sensor applications is presented, which features a reconfigurable 5-to-10-bit DAC whose power scales exponentially with resolution.
Abstract: A power-scalable SAR ADC for sensor applications is presented. The ADC features a reconfigurable 5-to-10-bit DAC whose power scales exponentially with resolution. At low resolutions where noise and linearity requirements are reduced, supply voltage scaling is leveraged to further reduce the energy-per-conversion. The ADC operates up to 2 MS/s at 1 V and 5 kS/s at 0.4 V, and its power scales linearly with sample rate down to leakage levels of 53 nW at 1 V and 4 nW at 0.4 V. Leakage power-gating during a SLEEP mode in between conversions reduces total power by up to 14% at sample rates below 1 kS/s. Prototyped in a low-power 65 nm CMOS process, the ADC in 10-bit mode achieves an INL and DNL of 0.57 LSB and 0.58 LSB respectively at 0.6 V, and the Nyquist SNDR and SFDR are 55 dB and 69 dB respectively at 0.55 V and 20 kS/s. The ADC achieves an optimal FOM of 22.4 fJ/conversion-step at 0.55 V in 10-bit mode. The combined techniques of DAC resolution and voltage scaling maximize efficiency at low resolutions, resulting in an FOM that increases by only 7x over the 5-bit scaling range, improving upon a 32x degradation that would otherwise arise from truncation of bits from an ADC of fixed resolution and voltage.
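
The 32x figure at the end of this abstract follows from how an energy-per-conversion-step FoM reacts to discarded bits. The sketch below assumes a Walden-style FoM, P / (2^N · f_s), and the standard relation ENOB = (SNDR - 1.76) / 6.02; neither formula is spelled out in the abstract, and the function names are illustrative.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def truncation_penalty(full_bits: int, used_bits: int) -> float:
    """FoM degradation when power and sample rate stay fixed but only
    used_bits of a full_bits converter are kept (Walden-style FoM assumed)."""
    return 2.0 ** (full_bits - used_bits)


enob_10bit_mode = enob_from_sndr(55.0)        # 55 dB Nyquist SNDR from the abstract
fixed_adc_penalty = truncation_penalty(10, 5) # dropping from 10 to 5 bits

print(f"ENOB at 55 dB SNDR: {enob_10bit_mode:.2f} bit")
print(f"FoM penalty for truncating 10 -> 5 bits: {fixed_adc_penalty:.0f}x")
```

Against that 32x baseline, the abstract reports only a 7x FoM increase when DAC resolution and supply voltage are scaled together.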

Journal ArticleDOI
TL;DR: Compared with the commonly used class-B/C architectures, the optimal class-D oscillator produces less phase noise for the same power consumption, at the expense of a higher power supply pushing.
Abstract: This paper presents class-D CMOS oscillators capable of an excellent phase noise performance from a very low power supply voltage. Starting from the recognition of the time-variant nature of the class-D LC tank, accurate expressions of the oscillation frequency, oscillation amplitude, current consumption, phase noise, and figure-of-merit (FoM) have been derived. Compared with the commonly used class-B/C architectures, the optimal class-D oscillator produces less phase noise for the same power consumption, at the expense of a higher power supply pushing. A prototype of a class-D voltage-controlled oscillator (VCO) targeted for mobile applications, implemented in a standard 65-nm CMOS process, covers a 46% tuning range between 3.0 and 4.8 GHz; drawing 10 mA from 0.4 V, the phase noise at 10-MHz offset from 4.8 GHz is -143.5 dBc/Hz, for an FoM of 191 dBc/Hz with less than 1-dB variation across the tuning range. A version of the same VCO with a resonant tail filter displays a lower 1/f³ phase-noise corner and improves the FoM by 1 dB.
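
The FoM and tuning range quoted in this abstract can be reproduced from the other numbers it gives. The sketch below assumes the widely used oscillator figure of merit, FoM = -L(Δf) + 20·log10(f0/Δf) - 10·log10(P / 1 mW), and a tuning range defined relative to the centre frequency; the abstract does not state either definition, so both are assumptions, and the function names are illustrative.

```python
import math


def vco_fom_db(phase_noise_dbc_hz: float, f0_hz: float,
               offset_hz: float, power_w: float) -> float:
    """Oscillator FoM in dBc/Hz (standard definition, assumed here)."""
    return (-phase_noise_dbc_hz
            + 20.0 * math.log10(f0_hz / offset_hz)
            - 10.0 * math.log10(power_w / 1e-3))


def tuning_range_pct(f_min_hz: float, f_max_hz: float) -> float:
    """Tuning range as a percentage of the centre frequency (assumed definition)."""
    return 100.0 * 2.0 * (f_max_hz - f_min_hz) / (f_max_hz + f_min_hz)


power_w = 10e-3 * 0.4                          # 10 mA from 0.4 V
fom = vco_fom_db(-143.5, 4.8e9, 10e6, power_w) # values from the abstract
tuning = tuning_range_pct(3.0e9, 4.8e9)

print(f"FoM: {fom:.1f} dBc/Hz")
print(f"tuning range: {tuning:.1f} %")
```

With these definitions the results agree with the 191 dBc/Hz and 46% stated in the abstract.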