Author

Naresh R. Shanbhag

Other affiliations: Bell Labs, Wright State University, University of Minnesota
Bio: Naresh R. Shanbhag is an academic researcher from the University of Illinois at Urbana–Champaign. The author has contributed to research in topics including adaptive filters and signal processing. The author has an h-index of 49 and has co-authored 325 publications receiving 9202 citations. Previous affiliations of Naresh R. Shanbhag include Bell Labs and Wright State University.


Papers
Journal Article
TL;DR: A high-throughput, memory-efficient decoder architecture for low-density parity-check (LDPC) codes is proposed based on a novel turbo decoding algorithm, and a full decoder architecture is presented.
Abstract: A high-throughput memory-efficient decoder architecture for low-density parity-check (LDPC) codes is proposed based on a novel turbo decoding algorithm. The architecture benefits from various optimizations performed at three levels of abstraction in system design, namely LDPC code design, decoding algorithm, and decoder architecture. First, the interconnect complexity problem of current decoder implementations is mitigated by designing architecture-aware LDPC codes having embedded structural regularity features that result in a regular and scalable message-transport network with reduced control overhead. Second, the memory overhead problem in current-day decoders is reduced by more than 75% by employing a new turbo decoding algorithm for LDPC codes that removes the multiple check-to-bit message update bottleneck of the current algorithm. A new merged-schedule merge-passing algorithm is also proposed that reduces the memory overhead of the current algorithm for low- to moderate-throughput decoders. Moreover, a parallel soft-input-soft-output (SISO) message update mechanism is proposed that implements the recursions of the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm in terms of simple "max-quartet" operations that do not require lookup tables and incur negligible loss in performance compared to the ideal case. Finally, an efficient programmable architecture coupled with a scalable and dynamic transport network for storing and routing messages is proposed, and a full-decoder architecture is presented. Simulations demonstrate that the proposed architecture attains a throughput of 1.92 Gb/s for a frame length of 2304 bits, and achieves savings of 89.13% and 69.83% in power consumption and silicon area over state-of-the-art decoders, with a reduction of 60.5% in interconnect length.
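As a rough illustration of the kind of lookup-table-free message update described above, the sketch below approximates the max* (Jacobian logarithm) operation that appears in BCJR-style recursions with a piecewise-linear correction term. The correction constants and the two-state forward recursion are illustrative assumptions; this is not the paper's exact "max-quartet" formulation.

```python
import math

def max_star_exact(a: float, b: float) -> float:
    """Exact Jacobian logarithm: ln(e^a + e^b)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_star_approx(a: float, b: float) -> float:
    """Lookup-table-free approximation of max*: max plus a piecewise-linear
    correction term (slope/offset constants are illustrative only)."""
    d = abs(a - b)
    return max(a, b) + max(0.0, 0.65 - 0.25 * d)

def forward_step(alpha, gamma, op=max_star_approx):
    """One BCJR forward-recursion update on a two-state trellis:
    alpha'[s] = max*_{s'} (alpha[s'] + gamma[s'][s])."""
    return [op(alpha[0] + gamma[0][s], alpha[1] + gamma[1][s]) for s in range(2)]
```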

612 citations

Journal Article
01 Mar 2019 - Science
TL;DR: It is now possible to fabricate wireless, battery-free vital signs monitoring systems based on ultrathin, “skin-like” measurement modules that can gently and noninvasively interface onto the skin of neonates with gestational ages down to the edge of viability.
Abstract: INTRODUCTION In neonatal intensive care units (NICUs), continuous monitoring of vital signs is essential, particularly in cases of severe prematurity. Current monitoring platforms require multiple hard-wired, rigid interfaces to a neonate’s fragile, underdeveloped skin and, in some cases, invasive lines inserted into their delicate arteries. These platforms and their wired interfaces pose risks for iatrogenic skin injury, create physical barriers for skin-to-skin parental/neonate bonding, and frustrate even basic clinical tasks. Technologies that bypass these limitations and provide additional, advanced physiological monitoring capabilities would directly address an unmet clinical need for a highly vulnerable population.

RATIONALE It is now possible to fabricate wireless, battery-free vital signs monitoring systems based on ultrathin, “skin-like” measurement modules. These devices can gently and noninvasively interface onto the skin of neonates with gestational ages down to the edge of viability. Four essential advances in engineering science serve as the foundations for this technology: (i) schemes for wireless power transfer, low-noise sensing, and high-speed data communications via a single radio-frequency link with negligible absorption in biological tissues; (ii) efficient algorithms for real-time data analytics, signal processing, and dynamic baseline modulation implemented on the sensor platforms themselves; (iii) strategies for time-synchronized streaming of wireless data from two separate devices; and (iv) designs that enable visual inspection of the skin interface while also allowing magnetic resonance imaging and x-ray imaging of the neonate. The resulting systems can be much smaller in size, lighter in weight, and less traumatic to the skin than any existing alternative.

RESULTS We report the realization of this class of NICU monitoring technology, embodied as a pair of devices that, when used in a time-synchronized fashion, can reconstruct full vital signs information with clinical-grade precision. One device mounts on the chest to capture electrocardiograms (ECGs); the other rests on the base of the foot to simultaneously record photoplethysmograms (PPGs). This binodal system captures and continuously transmits ECG, PPG, and (from each device) skin temperature data, yielding measurements of heart rate, heart rate variability, respiration rate, blood oxygenation, and pulse arrival time as a surrogate of systolic blood pressure. Successful tests on neonates with gestational ages ranging from 28 weeks to full term demonstrate the full range of functions in two level III NICUs. The thin, lightweight, low-modulus characteristics of these wireless devices allow for interfaces to the skin mediated by forces that are nearly an order of magnitude smaller than those associated with adhesives used for conventional hardware in the NICU. This reduction greatly lowers the potential for iatrogenic injuries.

CONCLUSION The advances outlined here serve as the basis for a skin-like technology that not only reproduces capabilities currently provided by invasive, wired systems as the standard of care, but also offers multipoint sensing of temperature and continuous tracking of blood pressure, all with substantially safer device-skin interfaces and compatibility with medical imaging. By eliminating wired connections, these platforms also facilitate therapeutic skin-to-skin contact between neonates and parents, which is known to stabilize vital signs, reduce morbidity, and promote parental bonding. Beyond use in advanced hospital settings, these systems also offer cost-effective capabilities with potential relevance to global health.
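The pulse arrival time (PAT) mentioned above as a surrogate of systolic blood pressure is derived from the time-synchronized ECG and PPG streams. Below is a minimal offline sketch of how such a quantity could be computed; the R-peak heuristic, thresholds, refractory period, and search window are illustrative assumptions, not the algorithm used on the sensor platforms.

```python
import numpy as np

def pulse_arrival_times(ecg, ppg, fs, r_thresh=2.0, foot_window=0.4):
    """Estimate pulse arrival time from time-synchronized ECG and PPG.

    PAT is taken here as the delay from each ECG R-peak to the foot
    (minimum) of the PPG waveform within a short search window.
    ecg, ppg : equal-length 1-D sample arrays; fs : sampling rate in Hz.
    """
    ecg = np.asarray(ecg, dtype=float)
    ppg = np.asarray(ppg, dtype=float)
    ecg = (ecg - ecg.mean()) / (ecg.std() + 1e-12)   # z-score the ECG

    # Crude R-peak detection: local maxima above a threshold, with a refractory period.
    refractory = int(0.25 * fs)
    r_peaks, last = [], -refractory
    for i in range(1, len(ecg) - 1):
        if (ecg[i] > r_thresh and ecg[i] >= ecg[i - 1]
                and ecg[i] > ecg[i + 1] and i - last > refractory):
            r_peaks.append(i)
            last = i

    pats, win = [], int(foot_window * fs)
    for r in r_peaks:
        seg = ppg[r:r + win]
        if len(seg) < 2:
            continue
        foot = int(np.argmin(seg))       # PPG foot within the search window
        pats.append(foot / fs)           # seconds from R-peak to PPG foot
    return np.array(pats)
```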

467 citations

Journal Article
TL;DR: New high-speed VLSI architectures for decoding Reed-Solomon codes with the Berlekamp-Massey algorithm are presented, which require approximately 25% fewer multipliers and a simpler control structure than the architectures based on the popular extended Euclidean algorithm.
Abstract: New high-speed VLSI architectures for decoding Reed-Solomon codes with the Berlekamp-Massey algorithm are presented in this paper. The speed bottleneck in the Berlekamp-Massey algorithm is in the iterative computation of discrepancies followed by the updating of the error-locator polynomial. This bottleneck is eliminated via a series of algorithmic transformations that result in a fully systolic architecture in which a single array of processors computes both the error-locator and the error-evaluator polynomials. In contrast to conventional Berlekamp-Massey architectures in which the critical path passes through two multipliers and 1 + ⌈log₂(t+1)⌉ adders, the critical path in the proposed architecture passes through only one multiplier and one adder, which is comparable to the critical path in architectures based on the extended Euclidean algorithm. More interestingly, the proposed architecture requires approximately 25% fewer multipliers and a simpler control structure than the architectures based on the popular extended Euclidean algorithm. For block-interleaved Reed-Solomon codes, embedding the interleaver memory into the decoder results in a further reduction of the critical path delay to just one XOR gate and one multiplexer, leading to speed-ups of as much as an order of magnitude over conventional architectures.
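To make the discrepancy-then-update dependency concrete, here is the classical Berlekamp-Massey iteration written over GF(2), i.e., plain LFSR synthesis rather than the GF(2^m) symbol arithmetic a Reed-Solomon decoder actually uses. It is meant only to show the serial loop structure that the paper's systolic reformulation targets, not the proposed architecture itself.

```python
def berlekamp_massey_gf2(s):
    """Shortest LFSR (connection polynomial C, length L) generating the
    0/1 sequence s, via the Berlekamp-Massey algorithm over GF(2).

    Each iteration computes a discrepancy d from the current locator C
    and then conditionally updates C from the stored copy B; this serial
    dependency is the critical-path bottleneck discussed above.
    """
    C, B = [1], [1]      # current and previously stored connection polynomials
    L, m = 0, 1          # current LFSR length, shift since last length change
    for n in range(len(s)):
        # Discrepancy: d = s[n] + sum_{i=1..L} C[i] * s[n-i]   (mod 2)
        d = s[n]
        for i in range(1, L + 1):
            d ^= C[i] & s[n - i]
        if d == 0:
            m += 1
            continue
        T = C[:]                                # save C before updating it
        C += [0] * (len(B) + m - len(C))        # make room for x^m * B(x)
        for i, b in enumerate(B):               # C(x) <- C(x) + x^m * B(x)
            C[i + m] ^= b
        if 2 * L <= n:
            L, B, m = n + 1 - L, T, 1           # length change: store old C
        else:
            m += 1
    return C, L
```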

335 citations

Journal Article
TL;DR: A prediction-based error-control scheme is proposed to enhance the performance of the filtering algorithm in the presence of errors due to soft computations, and algorithmic noise-tolerance schemes can also be used to improve the performance of DSP algorithms in the presence of bit-error rates of up to 10^-3 due to deep submicron (DSM) noise.
Abstract: In this paper, we propose a framework for low-energy digital signal processing (DSP), where the supply voltage is scaled beyond the critical voltage imposed by the requirement to match the critical path delay to the throughput. This deliberate introduction of input-dependent errors leads to degradation in the algorithmic performance, which is compensated for via algorithmic noise-tolerance (ANT) schemes. The resulting setup, which comprises the DSP architecture operating at a subcritical voltage and the error-control scheme, is referred to as soft DSP. The effectiveness of the proposed scheme is enhanced when arithmetic units with a higher "delay imbalance" are employed. A prediction-based error-control scheme is proposed to enhance the performance of the filtering algorithm in the presence of errors due to soft computations. For a frequency-selective filter, it is shown that the proposed scheme provides a 60-81% reduction in energy dissipation for filter bandwidths up to 0.5π (where 2π corresponds to the sampling frequency f_s) over that achieved via conventional architecture and voltage scaling, with a maximum of 0.5-dB degradation in the output signal-to-noise ratio (SNR_o). It is also shown that the proposed algorithmic noise-tolerance schemes can be used to improve the performance of DSP algorithms in the presence of bit-error rates of up to 10^-3 due to deep submicron (DSM) noise.
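As a sketch of the prediction-based error-control idea described above (with a hypothetical predictor order, coefficients, and threshold rather than the paper's specific scheme), the output of the voltage-overscaled filter is compared against a low-complexity linear prediction formed from previously accepted samples, and samples that deviate too much are replaced by the prediction:

```python
import numpy as np

def ant_prediction_correct(y_soft, pred_coeffs, threshold):
    """Prediction-based algorithmic noise tolerance (ANT), sketched.

    y_soft      : outputs of the main filter running at a sub-critical supply
                  voltage (occasionally corrupted by large soft errors)
    pred_coeffs : coefficients of a short linear predictor; coefficient 0
                  multiplies the most recent accepted output
    threshold   : decision threshold on |output - prediction|
    """
    p = len(pred_coeffs)
    corrected = list(y_soft[:p])              # assume the first p samples are clean
    for n in range(p, len(y_soft)):
        recent = corrected[n - p:n][::-1]     # most recent accepted sample first
        prediction = float(np.dot(pred_coeffs, recent))
        if abs(y_soft[n] - prediction) > threshold:
            corrected.append(prediction)      # flag as a soft error; use the prediction
        else:
            corrected.append(y_soft[n])       # accept the main filter output
    return np.array(corrected)
```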

278 citations

Journal Article
TL;DR: A 14.3-mm² code-programmable and code-rate-tunable decoder chip for 2048-bit low-density parity-check (LDPC) codes is presented. The chip implements the turbo-decoding message-passing (TDMP) algorithm for architecture-aware (AA-)LDPC codes, which has a faster convergence rate and hence a throughput advantage over the standard decoding algorithm.
Abstract: A 14.3-mm² code-programmable and code-rate-tunable decoder chip for 2048-bit low-density parity-check (LDPC) codes is presented. The chip implements the turbo-decoding message-passing (TDMP) algorithm for architecture-aware (AA-)LDPC codes, which has a faster convergence rate and hence a throughput advantage over the standard decoding algorithm. It employs a reduced-complexity message computation mechanism free of lookup tables, and features a programmable network for message interleaving based on the code structure. The chip decodes any mix of 2048-bit rate-1/2 (3,6)-regular AA-LDPC codes in standard mode by programming the network, and attains a throughput of 640 Mb/s at 125 MHz for 10 TDMP-decoding iterations. In augmented mode, the code rate can be tuned up to 14/16 in steps of 1/16 by augmenting the code. The chip is fabricated in 0.18-μm six-metal-layer CMOS technology, operates at a peak clock frequency of 125 MHz at 1.8 V (nominal), and dissipates an average power of 787 mW.
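A quick back-of-the-envelope check of the reported figures, under the assumption (not stated in the abstract) that frames are decoded back to back with no frame-level overlap:

```python
# Consistency check of the reported decoder figures (assumed non-overlapped frames).
frame_bits = 2048       # bits per codeword
throughput = 640e6      # reported throughput, bits per second
f_clk      = 125e6      # peak clock frequency, Hz
iterations = 10         # TDMP decoding iterations per frame

time_per_frame   = frame_bits / throughput        # 3.2 microseconds per frame
cycles_per_frame = time_per_frame * f_clk         # 400 clock cycles per frame
cycles_per_iter  = cycles_per_frame / iterations  # roughly 40 cycles per iteration

print(time_per_frame, cycles_per_frame, cycles_per_iter)
```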

245 citations


Cited by
Journal Article
TL;DR: Focusing on using probabilistic metrics such as average values or variance to quantify design objectives such as performance and power will lead to a major change in SoC design methodologies.
Abstract: On-chip micronetworks, designed with a layered methodology, will meet the distinctive challenges of providing functionally correct, reliable operation of interacting system-on-chip components. A system on chip (SoC) can provide an integrated solution to challenging design problems in the telecommunications, multimedia, and consumer electronics domains. Much of the progress in these fields hinges on the designers' ability to conceive complex electronic engines under strong time-to-market pressure. Success will require using appropriate design and process technologies, as well as interconnecting existing components reliably in a plug-and-play fashion. Focusing on using probabilistic metrics such as average values or variance to quantify design objectives such as performance and power will lead to a major change in SoC design methodologies. Overall, these designs will be based on both deterministic and stochastic models. Creating complex SoCs requires a modular, component-based approach to both hardware and software design. Despite numerous challenges, the authors believe that developers will solve the problems of designing SoC networks. At the same time, they believe that a layered micronetwork design methodology will likely be the only path to mastering the complexity of future SoC designs.

3,852 citations

Journal Article
Shekhar Borkar
TL;DR: This article discusses effects of variability in transistor performance and proposes microarchitecture, circuit, and testing research that focuses on designing with many unreliable components (transistors) to yield reliable system designs.
Abstract: As technology scales, variability in transistor performance continues to increase, making transistors less and less reliable. This creates several challenges in building reliable systems, from the unpredictability of delay to increasing leakage current. Finding solutions to these challenges requires a concerted effort on the part of all the players in a system design. This article discusses these effects and proposes microarchitecture, circuit, and testing research that focuses on designing with many unreliable components (transistors) to yield reliable system designs.

1,421 citations

Proceedings Article
03 Dec 2003
TL;DR: A solution is presented by which the circuit can be operated even below the ‘critical’ voltage, so that no margins are required and more energy can be saved.
Abstract: With increasing clock frequencies and silicon integration, power-aware computing has become a critical concern in the design of embedded processors and systems-on-chip. One of the more effective and widely used methods for power-aware computing is dynamic voltage scaling (DVS). In order to obtain the maximum power savings from DVS, it is essential to scale the supply voltage as low as possible while ensuring correct operation of the processor. The critical voltage is chosen such that under a worst-case scenario of process and environmental variations, the processor always operates correctly. However, this approach leads to a very conservative supply voltage, since such a worst-case combination of different variabilities is very rare. In this paper, we propose a new approach to DVS, called Razor, based on dynamic detection and correction of circuit timing errors. The key idea of Razor is to tune the supply voltage by monitoring the error rate during circuit operation, thereby eliminating the need for voltage margins and exploiting the data dependence of circuit delay. A Razor flip-flop is introduced that double-samples pipeline stage values, once with a fast clock and again with a time-borrowing delayed clock. A metastability-tolerant comparator then validates latch values sampled with the fast clock. In the event of a timing error, a modified pipeline mispeculation recovery mechanism restores correct program state. A prototype Razor pipeline was designed in a 0.18-μm technology and was analyzed. Razor energy overhead during normal operation is limited to 3.1%. Analyses of a full-custom multiplier and a SPICE-level Kogge-Stone adder model reveal that substantial energy savings are possible for these devices (up to 64.2%) with little impact on performance due to error recovery (less than 3%).
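The error-rate-driven voltage tuning that Razor enables can be pictured as a simple control loop: lower the supply while the observed timing-error rate stays below a target, and back off when errors (and their recovery overhead) become too frequent. The thresholds, step size, and voltage bounds below are illustrative assumptions, not values from the paper.

```python
def razor_voltage_controller(error_rates, v_init=1.8, v_min=1.0, v_max=1.8,
                             target=1e-4, step=0.01):
    """Sketch of a Razor-style supply-voltage control loop.

    error_rates : timing-error rate measured over each control interval
    Returns the supply-voltage trajectory chosen by the controller.
    """
    vdd, trace = v_init, []
    for err in error_rates:
        if err < target:
            vdd = max(v_min, vdd - step)      # margin left: scale the voltage down
        elif err > 10 * target:
            vdd = min(v_max, vdd + step)      # recovery overhead too high: back off
        trace.append(vdd)
    return trace
```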

1,137 citations